On Fri, Jun 25, 2010 at 10:38 AM, Anne Archibald <aarchiba@physics.mcgill.ca
> wrote:
> Let's drop the MKL discussion completely. It is not a solution to the
> current problem, since some versions of numpy must be compiled without
> it. If you want to discuss the implications of the GPL, please start
> another thread so I can killfile it.
>> On 25 June 2010 11:35, Sturla Molden <sturla@molden.no> wrote:
>> > I also wonder if this in-accuracy is due to FFTPACK's algorithms or the
> > Fortran compiler? NumPy uses fftpack_lite which is a C library. It is
> > easy to compile fftpack_lite for single precision as well (basically
> > just tydef'ing Treal to float). Is that inaccurate too? If not, we have
> > a Fortran compiler problem.
>> It seems clear from the performance that FFTPACK is falling back to an
> O(n**2) algorithm for large prime sizes. This will also result in the
> much worse numerical behaviour that we see.
>> It seems to me that the right short-term solution is to switch to
> double precision for difficult sizes (any non-power-of-two is probably
> fine) and add a warning to the docstring (in particular mentioning the
> horrible performance as well as the roundoff error, which can be an
> issue for doubles too). This we can do for 0.8.
>> In the medium term, I think a python implementation of Bluestein's
> algorithm is the way to go; it's not so difficult, given that we
> already have efficient FFTs, but it's a bit bug-prone for 0.8. (And
> there's some tuning to be done - for example, do we pad up to the next
> power of two, or the next number divisible entirely by 2,3,5 and 7?
> The latter might actually be faster.)
>>As the size of the primes go up, the savings over simple matrix
multiplication (3x3,4x4,7x7) go down, so simply falling back to the
Bluestein algorithm might be faster even for some of these. It needs
benchmarks. I'm a bit concerned with generating the exp(i*x^2) type factors
as roundoff could be a consideration there, and there will also be some
increase in memory needs. Which is to say some work needs to be done.
I suspect the current FFTPACK single precision failings are due more to
errors in generating the elements for the large arrays then in the use of
large arrays in themselves, the observed roundoff errors are larger than I
would expect from matrix multiplication by decently orthogonal matrices.
In the long term, if numpy is to become an array library plus a python
> wrapper, the FFT code must migrate to C. I don't anticipate this being
> too bad either, but it amounts to extending FFTPACK.
>>I haven't checked what is out there lately. Might be worth a look.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-dev/attachments/20100625/77b28a88/attachment.html