From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20041020
Firefox/0.10.1
Description of problem:
It appears that gcc in FC3 miscompiles numerical code.
The problem appears to be with lapack libraries and can be
demonstrated with octave (which uses them):
[dima@localhost ~]$ octave
GNU Octave, version 2.1.57 (i686-pc-linux-gnu).
Copyright (C) 2004 John W. Eaton.
....
octave:1> a=rand(100);
octave:2> tic; eig(a); toc
error: dgeev failed to converge
octave:2>
---------------
Sometimes it just hangs there for a few minutes, after which I kill it.
I tried compiling octave myself against ATLAS (a different, optimized
blas/lapack implementation) libraries, which I also compiled myself.
The result was the same. I also tried different versions of octave.
This all works on RHEL3, RH9, FC2, FC1.
It is possible that the problem is actually with glibc. I was not able
to recompile octave with gcc33 to check that.
Version-Release number of selected component (if applicable):
gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)
How reproducible:
Always
Steps to Reproduce:
1. start octave
2. type at the octave prompt as shown
Actual Results: octave hangs or gives an error
Expected Results: "0.004" (may vary slightly) -- this is the 4 msec that
it took the code to run on FC2.
Additional info:
this is on athlon/xp 2000MHz/500Meg.

Then it is IMHO not a GCC bug. -ffloat-store is not the default on purpose, it is too slow and most of the software out there doesn't need it.
IMHO you want to open a bug against lapack (and/or octave) and request that
it be compiled in two versions on IA-32: -ffloat-store and -mfpmath=sse -msse2
(the latter for P4 & recent AMD CPUs).

The issue here is that lapack has been around for a while and has been
compiled just fine by previous generations of gcc as well as a bunch of
other compilers. Suddenly it breaks, which makes me think that
the default compiler options are not "safe." When a clock chimes 13
times, it is not the 13th chime that is broken, it is the clock that
needs fixing. I did catch lapack, but who knows what else might be
broken?
As for the speed part -- I heard rumors that -ffloat-store is
slow, but in fact all my benchmarks now run at the same speed as
always (and some that involve iterations, like Schur decomposition,
run about 20% faster because of the faster convergence).
Anyway, I did not think of -ffloat-store as a fix, but rather as a
workaround for a potentially more serious problem with gcc.
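
One way an iteration's convergence can depend on where intermediates are kept is sketched below. This is only an illustration, not the code path inside dgeev or the Schur routines: Python floats (64-bit) stand in for the wide x87 registers, 32-bit floats stand in for the declared precision, and the helper names `store_f32` and `newton_sqrt` are hypothetical. If the new iterate is compared while still at the wider precision, the exact-equality test may never succeed, and the loop ends only at the iteration cap.

```python
import struct


def store_f32(x):
    """Round a 64-bit Python float to 32 bits and back (emulates a store to memory)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def newton_sqrt(a, hold_new, max_iter=100):
    """Newton iteration for sqrt(a) with an exact-equality stopping test.

    hold_new decides how the fresh iterate is held at the moment of the test:
    store_f32 rounds it first (the -ffloat-store-like behaviour), while the
    identity function leaves it at the wider "register" precision.
    Returns the number of iterations used, or None if the cap was reached.
    """
    x = store_f32(a)                      # previous iterate always lives in memory
    for i in range(1, max_iter + 1):
        x_new = hold_new(0.5 * (x + a / x))
        if x_new == x:                    # stopping test on exact equality
            return i
        x = store_f32(x_new)              # iterate is rounded when written back
    return None


if __name__ == "__main__":
    print("rounded before the test:", newton_sqrt(2.0, store_f32))
    print("compared at wide precision:", newton_sqrt(2.0, lambda v: v))
```

In the second call the stored iterate settles at the nearest 32-bit value while the freshly computed one keeps extra digits, so the two never compare equal. Whether something like this is what actually happens in lapack is an assumption here, not something established in this report.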

Beg to disagree. For code that relies on computation not being done with extra
precision, -ffloat-store is a must, and lapack clearly relies on it.
Why things worked in this particular case with GCC < 3.4 and don't work anymore
is most probably because GCC is now better at optimizing and likely has fewer spills to memory (on the mis-designed i387 FPU, only spills to memory
round to the declared precision instead of using full long double precision).
If you give me one exact routine in lapack that causes the problems you are seeing, I can look at it in detail and tell exactly what is going on.
But I certainly don't intend to debug half of octave/lapack to figure it out.
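
The effect of spills on a result can be imitated in a few lines. The sketch below is an analogy under stated assumptions, not a reproduction of the lapack failure: 64-bit Python floats play the role of the full-precision registers, 32-bit floats play the role of the declared type, and `round_to_f32` and `smallest_eps` are hypothetical names. The same loop reports a different "machine epsilon" depending on whether the intermediate sum is rounded to the declared precision before it is tested.

```python
import struct


def round_to_f32(x):
    """Round a 64-bit float to 32 bits and back (emulates a spill to memory)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def smallest_eps(hold):
    """Halve eps until 1 + eps/2 is indistinguishable from 1.

    hold is applied to the intermediate sum before the comparison:
    round_to_f32 models a value spilled to its declared precision,
    the identity function models a value that stays in a wide register.
    """
    eps = 1.0
    while hold(1.0 + eps / 2.0) != 1.0:
        eps /= 2.0
    return eps


if __name__ == "__main__":
    print("sum rounded to declared precision:", smallest_eps(round_to_f32))
    print("sum kept at full precision:      ", smallest_eps(lambda v: v))
```

Code that derives tolerances this way and then uses them in a convergence test gets different tolerances depending on how the compiler happened to allocate registers, which is one reason a change in optimization alone can alter behaviour.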

All my comments are gone -- I will try again.
I understand that the problem is (most likely) due to aggressive
optimization. My concern was that it is _too_ aggressive and produces
wrong code.
The error which I posted was in the file dgeev.f (from lapack).
I attached the file for reference.
I am going to write to the octave mailing list about it. Perhaps it would
be good to involve the lapack people as well. I did file this as a bug
against lapack:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=138791
Thanks for your attention and a fast response.
Sincerely,
Dmitri.

Since I do not call it directly (I use octave, which calls this
lapack routine), it is hard for me to tell you how it is called
exactly. I am going to write to the Octave mailing list, and perhaps
John Eaton (Octave author) can give you an authoritative answer.
Sincerely,
Dmitri.

I just created an attachment (sorry that I didn't put all the comments
there, I thought it would bring me back here).
The above line will produce a binary linked against the dynamic
lapack/blas in FC3:
planck[libmwrep]> ldd test_svd_fedora
liblapack.so.3 => /usr/lib/liblapack.so.3 (0x003d6000)
libg2c.so.0 => /usr/lib/libg2c.so.0 (0x00db6000)
libm.so.6 => /lib/tls/libm.so.6 (0x00d30000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x002f9000)
libc.so.6 => /lib/tls/libc.so.6 (0x00c04000)
libblas.so.3 => /usr/lib/libblas.so.3 (0x00101000)
/lib/ld-linux.so.2 (0x00beb000)
If I run this exact same binary on a RedHat9.0 box, it runs fine:
kellogg[libmwrep]> ./test_svd_atlas
Entering dgesvd. If this takes more than a second or two
it means it has hanged. Kill it with Ctrl-C
dgesvd finished
svals:
4. 8.32667268E-17 4.67733824E-51 1.52114348E-84
However, on a Fedora3 machine it hangs forever, with 100% cpu utilization.
I hope this can be fixed soon with a new, correctly compiled lapack
release.
Many thanks in advance,
f.

You can get lapack from rawhide (development tree), which does not have
this problem. I personally added -ffloat-store to the FFLAGS in the
src.rpm and rebuilt the rpm file. Works fine:
[dima@tumbleweed bug]$ ./test_svd_fedora
Entering dgesvd. If this takes more than a second
it means it has hanged. Kill it with Ctrl-C
dgesvd finished
svals:
4. 8.32667268E-17 4.67733824E-51 1.52114348E-84
I still do not understand why the fix is not in the update tree...
Dmitri.

OK, I just fished the -28 blas/lapack RPMs out of a 'development'
fedora repo, and slapped them onto my FC3 box. The problem is indeed
solved by them.
But they should be backported to FC3-updates, please. These corrected
packages have been available since 12/21, and I would not have wasted
a whole day tracking this (which I thought was a bug in my code) if
only they had been made available to updates. Yum would have nicely
picked them up right away.
Please put these corrected -28 blas/lapack packages on the released
updates repos for FC3.
And many thanks to Dmitri for your help!

Fedora Core 3 is now maintained by the Fedora Legacy project for security
updates only. If this problem is a security issue, please reopen and
reassign to the Fedora Legacy product. If it is not a security issue and
hasn't been resolved in the current FC5 updates or in the FC6 test
release, reopen and change the version to match.
Thank you!