DMC has significantly more accurate floating point than other compilers do.
This is particularly apparent in the floating point library, exp() included.
It involves correctly handling things like NaNs and Infinities, which
requires some extra code to be executed. Many C compilers simply ignore
those cases.
-Walter
Laurentiu Pancescu wrote in message <9o86u7$2oki$1 digitaldaemon.com>...

I completely rewrote all the numerically intensive functions,
and I was amazed by the speed of DMC generated code: it's the
best compiler on Win32!! Borland's free compiler generates a
crashing EXE, while Cygwin and MinGW generated code with about
half the speed of DMC's code - unbelievable!! It seems that the
"-ff" switch is very effective (almost doubles execution speed
in this case). Even more, after this code rewrite, the X32
version is exactly as fast as the Win32 version (which is normal, I
must have done some stupid things in the first version). Only
gcc-2.95.2 on Debian GNU/Linux beats DMC, but the difference
is not so much (about 9% faster code)...
Congratulations, Walter!! DMC is really great, and the ability
to handle Infinity and NaN without inline assembly is
extremely useful for mathematical applications.
Laurentiu

Thanks! - but I have to ask, what is gcc-2.95.2 doing to the code that
DMC is not? -Walter
Laurentiu Pancescu wrote in message <9ofker$r28$1 digitaldaemon.com>...

I don't know, the GCC gen'd assembly code is too large for
me... :( But I did more tests (tweaking compiler options only, I
didn't touch the code), and managed to get the code compiled by
gcc-2.95.2 on GNU/Linux to be 22% faster than DMC's code.
Maybe I could get even more with pgcc (Pentium Compiler Group's
patch to gcc, see www.goof.com/pcg).
Actually, I think it's very dependent on the runtime libs:
GNU/Linux has a very highly optimized math library (like most
system code on GNU systems), which also handles Infinity, NaN and
other oddities. I also used gcc-2.95.2 in the DJGPP flavor,
which has its own libm, and the code is just 50% slower than
DMC's, rather than about 100% slower as with MinGW and Cygwin. Cygwin uses Cygnus'
library, while MinGW uses Microsoft's MSVCRT, and it's a little
slower than Cygwin at exp() and friends.
To get a fair comparison, one should probably use "pure" user
code, without any lib calls, so that a weak compiler wouldn't be
advantaged by a highly optimized library (MSVC generates much
slower code than DMC or gcc, but the first version of my app
ran 139% faster than the DMC-compiled version and 52% faster than
MinGW's, probably due to a very good math library).
Regards,
Laurentiu

If you have a billion dollars to spend on engineers, you can task them with
coding the entire rtl in optimized assembly language!
You're right that you have to check if you're testing the rtl speed or the
generated code speed. I was losing a benchmark to gcc once, and couldn't
figure out why because in every case dmc generated better code. Turns out
the time was all being sucked up in a strcpy() of a constant which gcc had
inlined and essentially eliminated.
-Walter
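
To make the strcpy() anecdote concrete, here is a small sketch of the kind of loop where it can matter. It is not the original benchmark; copy_constant() and the string "benchmark" are made up for illustration. Because both the source string and its length are known at compile time, a compiler is free to replace the call with a few inline stores, so the loop ends up measuring the optimizer's treatment of the library call more than the generated code around it.

```c
#include <string.h>

/* Hypothetical benchmark kernel: copy a string constant n times.
   Returns the length of the final contents so the work is observable. */
size_t copy_constant(char *buf, long n)
{
    long i;

    buf[0] = '\0';  /* defined contents even when n is 0 */
    for (i = 0; i < n; i++) {
        /* The source is a constant, so a compiler may expand this
           call inline instead of calling the runtime library. */
        strcpy(buf, "benchmark");
    }
    return strlen(buf);
}
```

Timing this function with a large n, and then comparing the assembly listings, is one way to tell whether a difference comes from the rtl or from the code generator.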
Laurentiu Pancescu wrote in message <9og353$12og$1 digitaldaemon.com>...

I implemented my own exp function, using a Maclaurin series
expansion, and summed 10 million values calculated this way
(just to make sure no rtl is getting in the way). Here are the
results (max optimizations on all compilers):
- bcc32 does it in 92 seconds (I also noticed that bcc32
doesn't handle INFINITY properly, so I modified the test not to get
into any Inf or NaN)
- DMC produces the correct result in 75 seconds
- GCC-2.95.3-6 (MinGW-special) gives correct result in 22 seconds.
I had different arguments for my exp(), so that no smart
compiler optimizes something away.
However, I don't think that my code has any relevance from a
benchmark's point of view - it's too simple... DMC seems to be
by far the best commercial compiler for Win32, no matter which
code I'm trying (33% improvement over BCC 5.5.1 isn't something
any compiler can achieve; usually MSVC-generated code is noticeably
slower than bcc32's).
Laurentiu
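
For readers who want to try a test along the lines described above, here is a minimal sketch of a library-free exp() built from the Maclaurin series, plus a loop that sums it over varying arguments. This is not the original test code, which was not posted. The names maclaurin_exp() and sum_exp(), the number of terms, and the choice of arguments in [0, 1) are all assumptions made for the example.

```c
/* Approximate e^x with the first `terms` terms of the Maclaurin
   series: 1 + x + x^2/2! + x^3/3! + ...  No library calls are used. */
double maclaurin_exp(double x, int terms)
{
    double term = 1.0;  /* current term, starts at x^0 / 0! */
    double sum = 1.0;
    int k;

    for (k = 1; k < terms; k++) {
        term *= x / k;  /* x^k / k! from the previous term */
        sum += term;
    }
    return sum;
}

/* Sum maclaurin_exp() over n arguments cycling through [0, 1),
   so the compiler cannot fold the whole loop into a constant. */
double sum_exp(long n, int terms)
{
    double total = 0.0;
    long i;

    for (i = 0; i < n; i++) {
        double x = (double)(i % 1000) / 1000.0;
        total += maclaurin_exp(x, terms);
    }
    return total;
}
```

Calling sum_exp(10000000L, 20) and timing it would roughly mirror the 10 million evaluations mentioned above. The series converges quickly only for small arguments, so this sketch stays within [0, 1) and never reaches Inf or NaN.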