While browsing Linux.com, I came across this article with benchmarks of 32-bit versus 64-bit Gentoo performance. In some cases, apparently 32-bit is still faster, especially in compile time. Which reminds me, it takes less than a half hour to compile X on an Athlon64 4000+! Takes me a few hours, Gnome not included, on my puny Pentium III.

I was a little disappointed to see some of the author's conclusions:

Quote:

64-bit operating systems may not be practical for simple desktop use at this point, partially because of some of the hassles in setting them up, and partially because they offer little performance increase for most desktop applications.

I plan for my next computer to run an Athlon64 chip, and it's a little disheartening to see (yet another) tech writer hint that decent, widespread 64-bit software adoption is still a ways off. I keep telling myself that when AMD's marketing slogans on their website say the Athlon64 will run "tomorrows 64-bit software", they really mean "tomorrow is today". Bit disillusioned, perhaps.

Furthermore, I was surprised that the article writer concluded with:

Quote:

Sometimes the purpose of a benchmarking project is to show which squeaky wheels need the grease. This benchmarking project has shown that there's still a long way to go for AMD64-specific optimizations in the GNU/Linux world.

I'd always been under the impression--and even spread the word--that Gentoo's implementation of a 64-bit OS was the best available in the Linux world, or at least the most complete. That apart from some scattered 32-bit emulation problems, usually configuration-related, about the most serious issue facing AMD64 users was the lack of a 64-bit Flash plugin for Firefox!

In the end, I guess I've heard another perspective on 64-bit computing, but I did think it was nice that Gentoo got exposure on a major Linux trade publication, and that he seemed to know what he was doing in terms of setting up Gentoo. "My first experience installing Gentoo" articles are all well and good, but I prefer to hear what experienced users are putting their systems through. Hopefully the article gets someone new to try out Gentoo for him/herself, eh?

Last edited by 96140 on Tue Nov 21, 2006 12:54 am; edited 1 time in total

I made the jump from a 2.4GHz P4 to a 2.0GHz Athlon64 and my results are not the same.

All hardware is the same except for the new motherboard and CPU.
My compile times are dramatically shorter.
My ut2004 scores seem a little faster, although I had 8x AGP originally and can only get 4x AGP on the 64 bit motherboard. (I don't know if or how much difference that would make.)
I have also run a 32 bit version of ut2004 in a chroot environment and the frame rates are almost identical to the 64 bit environment.

Maybe the test in the article didn't have a correctly compiled gcc and the newer NVIDIA drivers help a lot?

So the bottom line is that these chips should be faster than what you had before, and you shouldn't see enough of a speed hit to justify staying in 32 bit mode only. More and more apps will start to take advantage of 64 bits, so why wait?

Last edited by Headrush on Thu Jun 16, 2005 12:46 pm; edited 1 time in total

GHz is not everything. Your AMD chip will indeed run faster than that Pentium chip. The number in the chip's name (4000+) indicates that it should run at least as fast as a Pentium chip clocked at that many MHz. It is called the performance rating.

Quote:

GHz is not everything. Your AMD chip will indeed run faster than that Pentium chip. The number in the chip's name (4000+) indicates that it should run at least as fast as a Pentium chip clocked at that many MHz. It is called the performance rating.

And it's marketing, too...
They're different chips, though, and comparing them hertz to hertz doesn't work.

Quote:

And it's marketing, too...
They're different chips, though, and comparing them hertz to hertz doesn't work.

Guys, nobody is comparing these and I know about the performance rating for the AMD chips.
(I was trying to show these chips are indeed fast.)

My point to nightmorph was that running in 64 bit mode should not, on average, be slower than 32 bit, as the article suggested. I'm sure having a completely optimized build and using CFLAGS for an athlon64 instead of a pentium4 will make a difference.

That was the point I was trying to make. Sorry if I was less than clear.

Headrush wrote:

My point to nightmorph was that running in 64 bit mode should not, on average, be slower than 32 bit, as the article suggested. I'm sure having a completely optimized build and using CFLAGS for an athlon64 instead of a pentium4 will make a difference.

The problem is, as far as I know, GCC can optimize for x86 better than it can for x86-64, even with "proper" CFLAGS.

Quote:

The software was Gentoo Linux 2005.0 using the Universal ISOs. I performed a stage 3 installation with no USE flags, the compiler options set for -pipe -O2 -fomit-frame-pointer, and the Pentium 4 and K8 -march options. I used the Pentium 4 option with the Athlon 64 because it has the same technologies (SSE, SSE2, MMX) that the Pentium 4 option provides. This could enhance performance in some tests, as the AMD-specific architectures below the K8 do not include SSE2.

Is this a good idea? I'm getting an X2, and I don't think I'll be using 64 bit just yet. Should I use -march=pentium4?

No, definitely not. The author of the article misunderstood the reason for keeping the proper -march setting. Browse the AMD64 forums for a while, and check out the documentation to see why the Pentium4 setting should not be used with an Athlon64.

That's apparent even to me; I too think the test results were somewhat skewed by the author's -march setting. There's a reason why the -march=k8/-march=athlon64 settings are meant to be used over pentium4. The GCC online documentation does a decent job of explaining why Athlon64 users should use one setting over another. Basically, the writer of the article had some incorrect assumptions about the differences and similarities between the two processors.
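For anyone following along, this is roughly what the -march advice amounts to in /etc/make.conf. It is only a sketch: the -O2 -pipe -fomit-frame-pointer flags are the ones the article says it used, -march=k8 is the setting recommended above, and nobody in this thread has benchmarked this exact combination.

```shell
# /etc/make.conf fragment (illustrative sketch, not taken from the article)
# -march=k8 targets the Athlon64 family; GCC also accepts -march=athlon64.
# The remaining flags are the ones the article reports using.
CFLAGS="-march=k8 -O2 -pipe -fomit-frame-pointer"
# Reuse the same flags for C++ so both compilers target the same CPU.
CXXFLAGS="${CFLAGS}"
```

Keeping CXXFLAGS defined in terms of CFLAGS means a later -march change only has to be made in one place.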

Still, regardless of the -march setting, I was disappointed to see more tests in which 32-bit exceeded 64-bit performance. Even so, if you're running a more recent Athlon64 chip, considering that it supports 32-bit applications at native speeds and is one helluva blazing fast processor, having to run some 32-bit compiled binaries shouldn't really be that much of a bother.

I really should have realized it was a bad idea. I guess I almost fell in the "It's on an official looking site, must be correct" trap.

What I should have done is look at the FAQ in the Gentoo on AMD64 forum. It linked to this thread about x86 installs on AMD64 CPUs. Seems I can keep using my existing install: just change CFLAGS and emerge -e world after I upgrade the hardware. That's nice.

As stability is more important to me than performance I'll probably go with:

Sounds like a good plan. Though on that note, if you change a number of CFLAGS and want to recompile your system with them--which is a good idea, otherwise you aren't getting the benefits of the new flags--it would be a better idea to do the following:
Code:
# emerge -e system
# emerge -e system
# emerge -e world
# emerge -e world

This way you have a complete toolchain (gcc, binutils, etc.) built with the new CFLAGS, and your world packages have been compiled with a properly rebuilt Athlon64 toolchain. It's a common enough technique, usually used when upgrading gcc, but also useful for getting a fast, stable system after CFLAGS and CXXFLAGS changes.
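If you'd rather not babysit four separate runs, the system-twice, world-twice sequence could be wrapped in a small function along these lines. This is just a sketch: rebuild_all and the EMERGE override are names made up here for illustration, and --emptytree is simply the long form of emerge's -e option.

```shell
#!/bin/sh
# Sketch only: rebuild system twice, then world twice, with the new CFLAGS.
# EMERGE is a hypothetical override so the sequence can be previewed
# without building anything (for example EMERGE=echo).
rebuild_all() {
    emerge_cmd="${EMERGE:-emerge}"
    for target in system system world world; do
        # Stop at the first failed pass instead of building on a broken toolchain.
        "$emerge_cmd" --emptytree "$target" || return 1
    done
}
```

Running `EMERGE=echo rebuild_all` just prints the four emerge invocations in order, which is a cheap way to check the sequence before committing to hours of compiling.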

Sure, it might be a bit of compile time, but going by the test results from this article, compiling is a snap for Athlon64 owners. Especially since with a dual core Athlon X2, you will be able to set MAKEOPTS="-j3" or even "-j4", increasing the number of parallel makes. Mmm, dual core goodness.
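As a rough illustration of where "-j3" comes from on a dual core chip: a common rule of thumb is the number of cores plus one. The snippet below is a sketch that derives that value on the machine it runs on; the rule itself is only a starting point, not a guarantee of the fastest build.

```shell
# Sketch: derive a MAKEOPTS value from the number of online cores.
# Falls back to 1 core if getconf cannot report the count.
cores="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
# "cores + 1" rule of thumb: a dual core machine ends up with -j3.
MAKEOPTS="-j$((cores + 1))"
echo "MAKEOPTS=\"${MAKEOPTS}\""
```

The printed line can be pasted into /etc/make.conf as-is.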

Wasn't there a better way to do that? I remember someone having written an emerge wrapper that first emerges the toolchain, then emerges it again with the newly emerged toolchain, and then emerges world twice, fully, except for the toolchain...

Quote:

Sounds like a good plan. Though on that note, if you change a number of CFLAGS and want to recompile your system with them--which is a good idea, otherwise you aren't getting the benefits of the new flags--it would be a better idea to do the following:
Code:
# emerge -e system
# emerge -e system
# emerge -e world
# emerge -e world

Do you think that's necessary? I mean, the only thing that should change is the addition of a few SSE2 optimizations. Ah, what the hell, I'll probably do it anyway just to be on the safe side. After all:

Wasn't there a better way to do that? I remember someone having written an emerge wrapper that first emerges the toolchain, then emerges it again with the newly emerged toolchain, and then emerges world twice, fully, except for the toolchain...

I used to run an Athlon 2500+ OC'd to 2.1GHz w/ a 390MHz FSB running Corsair XMS PC3200. I now run an AMD64 3000+ (also stock rated at 1.8GHz, the same as the 2500+) OC'd to 2.21GHz w/ a 520MHz memory bus running Geil Platinum PC4000 OC'd 20MHz. My compiles take 30% less time on my new 64 bit system, if that tells you anything. I remember compiling kdebase-3.3.x in 1hr 14mins on the 2500+ and was proud of that. The same package took <55 mins on this system. xorg on this 64 bit box:

As some replies to the original article have pointed out, the rather primitive benchmark tests were fatally flawed by incorrect compile-time assumptions, among other things. I do this sort of thing (performance engineering, benchmarking, optimization, and capacity planning) for a living.

The flip side of that is that it is also a mistake to let a *vendor* run your benchmarks; they will be tuned to favor the vendor's environment and will not necessarily be reproducible with a real workload. On top of that, I'm not at all sure the default Gentoo compiler is really tuned to the AMD 64 architecture yet.

So ... my recommendation is that if you can afford it and have a 64-bit workload, by all means get a 64-bit processor and load Gentoo on it. While the performance without attention to tuning might be the same as a 32-bit processor, with a little effort you'll leave a 32-bit machine in the dust.

--
M. Edward (Ed) Borasky
znmeb@borasky-research.net
http://www.borasky-research.net/

I have a Pentium 4 running at 3GHz with 1GB of RAM. I have always felt it as being a very quick computer.

Just recently I built a new computer, based upon an AMD 4000+, again with 1GB of RAM. I have (to my knowledge) installed only 64bit software on this computer.

It was a very long time ago that I had such a speed experience from any PC. I don't even think about doing any measurements to compare speed. It's the feeling in the back. It really is a lot quicker. I feel it in my fingers on the keyboard. I feel it in my backbone while sat on the chair. (Not a network backbone, but the real physical thing. The one that hurts when sitting too long in front of a computer, you know.)

It's like driving a V8 equipped car after having spent months in the modern V6 or four-banger thingies they sell us nowadays.

How often do you get such a speed-trip on a PC? Well, to me it was a loooong time ago.

But with this AMD4000+ thing, I was just blinded.

Unfortunately, it has been in pieces for the last two or three weeks, while I have installed water-cooling in the case. It's undergoing a 24-hour leak test while I'm writing this. If all goes well, it should be up and running again tomorrow.

Regards
Biker

The "Freedom of the Press" belongs to those that have a press.

Quote:

It was a very long time ago that I had such a speed experience from any PC. I don't even think about doing any measurements to compare speed. It's the feeling in the back. It really is a lot quicker. I feel it in my fingers on the keyboard. I feel it in my backbone while sat on the chair. (Not a network backbone, but the real physical thing. The one that hurts when sitting too long in front of a computer, you know.)

It's like driving a V8 equipped car after having spent months in the modern V6 or four-banger thingies they sell us nowadays.

I'm sorry, but these types of "benchmarks" don't really impress me. Subjective experience is simply way too unreliable.

I have little doubt that as the GCC 64bit code generation matures 64bit will be faster overall. I've seen no evidence of this being the case yet though. Perhaps with GCC 4.1?

So far, the benchmarks I've seen show 64bit binaries to be beneficial to apps such as databases. I've not seen any that portray 64bit binaries as being faster overall, though. Quite the opposite, actually. Perhaps this has changed since I last looked into it.

No one would be happier than me if there were reliable benchmarks that showed an overall performance boost from using 64bit binaries. If you know of any, please post them. Until then, I'll go with the greater compatibility and, according to the tests I've seen, performance of a 32bit system.

If you only take the bits into account, the only thing you get as a developer is the possibility of using 64bit pointers to address a wider range of memory __and__ the possibility of using 64bit integers to calculate with larger numbers.

The speed thing is based on a better, up-to-date architecture of the AMD64 processors. They have more registers, so they can calculate faster because they can hold more numbers in fast registers.
The second advantage based on the architecture is (on socket 939) the dual channel memory access, which speeds up things dramatically.

The bad thing about 64bit is that the code will be a bit bigger than 32bit code, because of larger machine instructions (pointers are double the size, and so are some integers, depending on the implementation).
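If you're not sure which kind of userland you are actually running, a quick check like the following shows the word size and the pointer size it implies. It's a minimal sketch that assumes getconf is available and that pointers are as wide as a C long, which holds on Linux.

```shell
# Sketch: report the word size of the running userland and the
# pointer size in bytes that follows from it (assumes pointer == long).
bits="$(getconf LONG_BIT)"
echo "userland: ${bits}-bit, pointers: $((bits / 8)) bytes"
```

On a 32bit install running on an Athlon64 this reports 32, since it describes the installed system rather than the CPU.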

All in all, my conclusion is that AMD was lucky this time and faster than Intel in developing a fast processor architecture. The speed advantage has nothing to do with 64bit and more to do with the great architecture they created. We will see whether Intel wins again in the next generation of chips.

Heaven: The police are British, the chefs Italian, the mechanics German, the lovers French and it's organized by the Swiss.
Hell: The police are German, the chefs British, the mechanics French, the lovers Swiss and it's organized by the Italians.

@Biker: I share this experience - my AMD64 3400+ is more than twice as fast as my old Athlon XP 1800+. But that is not the point here.

The article was not about whether the AMD64 is good or bad, but whether it is better to run 32bit or 64bit Linux on it. I don't have any knowledge about the optimizations, but it is interesting that gcc itself obviously isn't as fast in 64bit as in 32bit. Well, actually it's not that interesting, because gcc is optimized to produce a good result, not to run fast itself. It's probably the same as with the optimizations: compiling with -O3 often takes longer than with -O2/-O1 (tell me if I'm wrong). Maybe it takes more time to generate optimized code for all the 64bit extensions?

But I don't see any reason why I should use a 32bit environment on my AMD64. So far it's running quite well, although there are certainly more problems on the new amd64 branch than on the dusty x86 branch.

So it's really hard to tell which is faster, a 32-bit compiled Linux or a 64-bit one. Most software gains nothing from the wider (32->64 bit) word size, although software that natively uses 64-bit integers, such as scientific computing, should benefit if it has been coded for it! 64-bit addressing should also increase code size. On the other hand, 64-bit compiled software should benefit a lot from the extra registers, which could improve both the speed and the size of the code.

But all this work is done by the compiler, and ours is GCC. So it depends a lot on what kind of optimisations GCC can really handle on the new 64-bit arch.

So, since I don't have a 64-bit system to test on, I'd like to propose some benchmarks and ask for testers with 64-bit systems!
A while ago I worked out a way to benchmark the impact of different kinds of CFLAGS and the slowdown of a hardened-compiled system.

For this I use: app-benchmarks/nbench

We need to run the following tests with the same CFLAGS (it could be interesting to test with both -O2 and -O3 to see how much they differ) and, obviously, the same CPU:

This sure seems to disagree with the other tests I've seen. It portrays 64bit as notably faster. That's encouraging.
Any opinions on methodology and reliability of this test? Pretty slim set of benchmarks imho.

Quote:

If you only take the bits into account, the only thing you get as a developer is the possibility of using 64bit pointers to address a wider range of memory __and__ the possibility of using 64bit integers to calculate with larger numbers.

The speed thing is based on a better, up-to-date architecture of the AMD64 processors. They have more registers, so they can calculate faster because they can hold more numbers in fast registers.
The second advantage based on the architecture is (on socket 939) the dual channel memory access, which speeds up things dramatically.

The bad thing about 64bit is that the code will be a bit bigger than 32bit code, because of larger machine instructions (pointers are double the size, and so are some integers, depending on the implementation).

I think what you are correctly saying is that, as of right now, the advantage of Athlon64s isn't the 64 bits but the other features of the processor and associated motherboards. As gcc matures, we can expect to see more benefits from the use of 64 bits.

I've done various tests on my AMD64, between my old 32-bit Gentoo and the rebuilt 64-bit one. Basically, there is no discernible difference in almost anything, except for one - the OpenSSL speed tests. Not surprising, since crypto is well-known to benefit from 64-bits. The speeds soar by 3x when running in 64-bit, but everything else, from UT2004 to various compiles, is about the same.

------------------------------------------
Alastair Stevens
www.altrux.me.uk
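For anyone who wants to repeat the OpenSSL comparison on their own box, the arithmetic is just a ratio of the two results. The helper below is a sketch: speedup is a name made up here, and the numbers fed to it are placeholders, not measurements from this thread. The inputs would be throughput figures you record yourself, for instance from running "openssl speed" once on the 32-bit install and once on the 64-bit one.

```shell
# Sketch: express a before/after benchmark pair as a speedup factor.
# $1 = 32-bit result, $2 = 64-bit result (same unit, higher is better).
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        # Refuse to divide by a zero or negative baseline.
        if (before <= 0) { print "n/a"; exit 1 }
        printf "%.2f\n", after / before
    }'
}

# Placeholder numbers, not measurements from this thread:
speedup 100 300
```

A result close to 1.00 means no real difference; the 3x described above would show up as roughly 3.00.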