Re: Large Memory usage while doing median filter

> Hi,
> I was trying median_filter from scipy.ndimage.filters
> on a 1024x1024 array.
>
> What I noticed is that the memory requirement grows really fast as we
> increase the size of the median filter.
> On a machine with 6 GB RAM I could only do a (150, 150) size filter.
> Anything above gives a MemoryError.
>
> On a bigger server I could see it take about 16 GB of RAM with a filter
> size of (200, 200).
>
> I can understand the computation time increasing with the size of the filter,
> but why is the memory use exploding with the size of the median filter?
> Is this expected behaviour?

I guess this is because scipy creates a 1024x1024x40000 array (40000 = 200*200 window elements) to do the sort along the last axis.
Maybe not the best from the memory point of view.
Cheers,
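For scale, here is a back-of-envelope estimate of what such an intermediate array would cost (my own arithmetic, assuming float64; the guess itself is unconfirmed at this point in the thread):

```python
# Hypothetical cost of a (1024, 1024, 200*200) float64 array:
# one slot per pixel for every element of the filter window.
n_pixels = 1024 * 1024
window = 200 * 200              # elements in a (200, 200) median window
bytes_per_elem = 8              # float64
total_bytes = n_pixels * window * bytes_per_elem
print(total_bytes / 2**30)      # 312.5 (GiB)
```

That is far above the 16 GB actually observed, so if a large temporary is the culprit, it is presumably something smaller than a fully materialized window stack.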



Hi Juan,
Thank you for the suggestion, but my data is 32-bit float, and since the precision of the data is important I cannot convert it to uint8.

As Jerome suggested, it might be due to the extra-large array scipy creates to do faster sorting.
In the astronomy applications I typically encounter, our images are bigger than 1k x 1k, so I wonder whether other tools exist for median filtering.
For a moving-window median, only a few pixels leave and enter the window at each step; if we take advantage of that, the sort time required to find the median at each window position shouldn't be very high.
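That incremental idea can be sketched in a few lines. This is a toy 1-D illustration of the principle (my own sketch, not an existing library routine): keep the window as a sorted list, and at each step remove the departing sample and insert the arriving one instead of re-sorting from scratch.

```python
from bisect import insort, bisect_left

def sliding_median_1d(a, w):
    """Median-filter 1-D sequence a with an odd window length w,
    updating a sorted window incrementally."""
    win = sorted(a[:w])                      # initial window, sorted once
    out = [win[w // 2]]
    for i in range(w, len(a)):
        win.pop(bisect_left(win, a[i - w]))  # sample leaving the window
        insort(win, a[i])                    # sample entering the window
        out.append(win[w // 2])
    return out

print(sliding_median_1d([3, 1, 4, 1, 5, 9, 2, 6], 3))  # [3, 1, 4, 5, 5, 6]
```

Each update costs O(w) list work rather than an O(w log w) re-sort; 2-D histogram-based variants of the same idea are what make fast running-median filters possible.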

Does anybody know of any such fast median filter routines in python?
Thanking you,
-cheers
joe

On 11 May 2015 at 11:01, Juan Nunez-Iglesias <[hidden email]> wrote:
If you can cast your image to uint8, try the median filter in scikit-image's filters.rank module. It's very fast and has a minimal memory footprint, but it doesn't work on floats or high-bit-depth integers.
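For reference, the scikit-image routine Juan mentions is used roughly like this (a sketch assuming a recent scikit-image; `disk` just builds a circular window):

```python
import numpy as np
from skimage.filters import rank
from skimage.morphology import disk

# Rank filters work on a per-window histogram, so their memory use
# depends on the bit depth (256 bins for uint8), not the window size.
img = (np.ones((64, 64)) * 100).astype(np.uint8)
out = rank.median(img, disk(15))    # circular window of radius 15
print(out.shape, out.dtype)
```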


Would you report this as an issue on GitHub so that it doesn't get lost?

A second thought: a different implementation of the median filter exists in the scipy.signal package as medfilt and medfilt2d. I haven't ever used these functions, but they might be worth a shot.
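A quick sketch of trying that alternative (small sizes here just to illustrate the call; kernel_size must be odd, and medfilt2d zero-pads the edges, so the borders of a constant image are pulled toward zero):

```python
import numpy as np
from scipy import signal

img = np.ones((32, 32))                  # stand-in for the real float image
out = signal.medfilt2d(img, kernel_size=5)

# Interior windows contain only ones; corner windows are mostly
# zero padding, so the median there drops to 0.
print(out[16, 16], out[0, 0])            # 1.0 0.0
```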


> I guess this is because scipy creates a 1024x1024x(40000) array to do the
> sort along the last axis.
> Maybe not the best from the memory point of view.

Maybe I didn't search hard enough, but I don't see where such an array is allocated. There are several layers of calls, from Python in ndimage/filters.py down to C in ndimage/src/ni_filters.c, so maybe I missed it. Can you point to where such an array is created, or was that really a guess?
Warren


It really is a guess ... I did not look at the source code.

For this kind of filtering, a colleague of mine wrote a CUDA implementation (OpenCL would be much the same), but that is out of scope here.


The really large array is allocated in NI_InitFilterOffsets, on line 518 of ni_support.c, which is called from line 726 of ni_filters.c, in NI_RankFilter.

For me, calling ndimage.median_filter(arr, 150), with arr a (1024, 1024) array of doubles or floats, results in an allocation of 4,050,000,000 bytes (about 3.77 GB), which is rather bigger than we would like here.
-Eric
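Eric's number is self-consistent: it is exactly (150*150)**2 eight-byte entries, i.e. one full set of window offsets for every boundary configuration of the window, which would explain a quadratic blow-up with footprint size. That interpretation is my own reading of the figure, not verified against the C code:

```python
# Check the reported allocation for ndimage.median_filter(arr, 150).
window = 150 * 150                      # elements in the (150, 150) footprint
total_bytes = window * window * 8       # hypothesis: one offset set per window state
print(total_bytes)                      # 4050000000
print(round(total_bytes / 2**30, 2))    # 3.77 (GiB)
```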


Thanks Eric.

Warren



On 11/05/15 18:01, Ralf Gommers wrote:

> Good thing, then, that there's a GSoC project on rewriting ndimage in
> Cython about to start :)

Yes.

However, the world isn't always perfect, and neither is SciPy. But
correctly working code is infinitely better than no code, even if it is
hungry on memory. And the cheapest solution to excessive memory use is
(almost) always to buy more RAM. That tends to be way cheaper than
paying a developer :)