Introduction

Enterprise versions of Linux based on kernel 2.6 and 64 bit database servers are now very mature. Dual core 64 bit Opterons and 64 bit Xeons with 2 MB L2 caches are available. It was definitely time to update our previous Linux Database Server CPU comparison.

In this article, you will find a comparison of the latest Xeon (Irwindale), the previous Xeon (Nocona), the old Xeon (Gallatin), the Dual core Opteron and, of course, the "normal" Opteron. We also included the Pentium-D to get an idea of what a Dual core Xeon could do, although the comparison is not completely fair: the memory subsystem of a Dual core Xeon will have higher latency and slightly lower bandwidth, as it will use ECC buffered DIMMs instead of non-buffered DIMMs.

In our previous article, we used SUSE SLES 8 (kernel 2.4.21), and the Xeon 3.6 GHz "Nocona" matched the performance of the Opteron 250 in 32 bit DB2, but failed to impress in MySQL. However, SLES 8 with kernel 2.4 did not recognize Intel's Xeon as a 64 bit capable CPU, while the Opteron gained 12% (DB2) and 30% (MySQL) when running in 64 bit.

On SLES 9, we can unleash the full 64 bit potential of both the Intel Xeon and the Opteron. Kernel 2.6 includes improved support for NUMA, 64 bit, large memory pages and threading, and fully recognizes EM64T CPUs as 64 bit capable. How do the Xeon and Opteron compare when they both run 64 bit applications on a 64 bit enterprise version of Linux? Should you invest in Dual core CPUs, or are these expensive CPUs beaten by two single core CPUs? Should you wait for Dempsey, the dual core Xeon?

These are a few of the questions that we will answer. While we continue to improve the quality of our benchmarks, we decided to report our first impressions.

The scope and focus of this test

Our last Database server comparison generated quite a bit of very useful and interesting feedback. Living up to the excellent AnandTech tradition, we have read it carefully and taken many suggestions to heart.

However, the Lab of the Technical University of Kortrijk where we performed our tests did not have such an impressive disk array at its disposal, and we were determined to focus on the database performance of the different CPUs and CPU-chipset-memory combinations. All tests were done (99% of the time) with in-memory queries. Investigating the performance of different disk storage systems is a time-consuming and completely different project.

Some of you might still be convinced that in-memory tests are not really relevant. Consider that the availability of cheap 64 bit systems makes it possible to use much more RAM than before. Flat 64 bit addressing of more than 4 GB of RAM used to be a privilege of very expensive servers (Power4, etc.), but this is no longer the case with the introduction of Intel's EM64T Xeons and AMD's AMD64 Opteron.

With the current prices of 1 GB DDR(-II) sticks, it is very easy and inexpensive to build a database server with 8 GB of RAM. Even 16 GB (16x1 GB) is not that expensive, considering the price of a quad Opteron server. As a seasoned sys-admin told me, "the performance of database servers can be brought back to life with some extra RAM." In many cases, a large amount of RAM can do more than very expensive 15,000 RPM SCSI disks.

Again, this article is not about the typical huge central databases of banks that need to handle a large number of transactions, with write operations being very frequent.

We tested on SUSE SLES 9 (SUSE Enterprise Edition) SP1, Linux kernel 2.6.5-151smp. Yes, this is not the latest kernel version, which is 2.6.12 at the time of this article. We used 2.6.5 because it is the most recent kernel available for our enterprise version of SUSE. The very nature of this project also forces us to check our numbers with at least 5 consecutive tests, and a lot of time is spent checking parameters and so on, so we need to "freeze" the kernel version for a few weeks. We did perform a few tests on Gentoo, however, with kernel version 2.6.12.
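The idea of checking numbers over at least 5 consecutive tests can be sketched as below. This is only an illustration, not the harness used for this article: the function name, the 5% tolerance on the relative spread and the sample figures are all assumptions made for the example.

```python
from statistics import mean, stdev

def runs_are_consistent(results, min_runs=5, max_cv=0.05):
    """Return (ok, average, cv) for a list of benchmark results.

    min_runs and max_cv are hypothetical thresholds chosen for this
    sketch; they are not taken from the article's test setup.
    """
    if len(results) < min_runs:
        raise ValueError(f"need at least {min_runs} runs, got {len(results)}")
    avg = mean(results)
    cv = stdev(results) / avg  # coefficient of variation (relative spread)
    return cv <= max_cv, avg, cv

# Made-up queries-per-second figures from five consecutive runs.
ok, avg, cv = runs_are_consistent([412.0, 405.0, 409.0, 415.0, 407.0])
print(f"average: {avg:.1f} q/s, spread: {cv:.1%}, consistent: {ok}")
```

A run set that fails the check would be repeated after re-checking parameters, rather than averaged and reported.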

Nice article. I'd also be interested in PostgreSQL, being the "other" major open source database... specifically, whether it's any better at scaling with multiple CPUs. (Not that I have any practical use for this information, I'm just curious.)

Seriously, mickyb and elmo may be correct about the Intel compilers (I frankly don't have a clue what's used in most shops)...

The real problem is that it's a virtual impossibility to create a "level playing field", but I have to say to the critics of the article that Johan has done a stellar job of coming as close as possible!

mickyb - Thanks for the input! Fair enough...maybe Johan could use both the Pathscale compiler (which is optimized highly for Opteron) and the Intel optimized compiler on his next series of tests?

I disagree with the comment that a large number of people don't use the Intel compiler. I (and other developers and IT shops) only use Intel compilers for Linux. It is the fastest one out there for x86 and Itanium.

If you are running a large database that requires a large server (compared with a desktop loaded with RAM to run a personal blog site) like this article is testing, you will be setting up the environment with a trained IT professional that will use the compiler that is fast and stable.

When we build our product for all the UNIX platforms, we always use the vendor compiler instead of gnu. gnu works great and is free, but it is not optimized nearly as much.

This is like saying the same audience won't recompile Linux on the platform they are going to install it on. This is the first thing you should do....and with an Intel compiler. There should be no real reason why one vendor's Linux is faster than the others except for compile options and loaded modules. You cannot run Linux out of the box, it doesn't come in a box where I get it. :)

C'mon mate...anybody who has read your posts knows you're heavily biased towards Intel, just as people who have read mine know that I am biased towards AMD. The important thing is to try and set aside the bias to look at things from both sides...I do try, but admittedly don't ALWAYS succeed. :-)

I imagine you probably posted before you read the explanation of what a query cache is...understandable.

As to not using an Intel specific compiler, I suppose that if it HAD been used I would be complaining as well. We have to rely on Johan and Anand (who frankly know a Hell of a lot more about this than either of us) to choose based on what the market actually uses...if you can cite impartial industry sources that show otherwise, I'm sure we all (especially the AT staff) would love to see them.
I do know that over the years, Johan and Anand have shown themselves to be quite unbiased in their articles (you should go read some of them on Aces as well)!

There are certainly things that I could pick apart as well..e.g. when he states
"In the second half of 2004, already one million EM64T Xeons were shipped"
Yes they were shipped, but that doesn't mean they were sold. The majority of those shipments were probably to OEMs for inventory buildup. Remember that Intel had a huge inventory write-off at the same time, and this was most likely a shift in inventory.

Regardless, none of this has to do with the validity of the article which is excellent and makes sense. If you think about it, it should have been expected...the only way for AMD to have increased their marketshare in servers is by performance. They certainly don't have the budget or marketing clout that Intel has!

About ISAM and DB/2... ISAM (Indexed Sequential Access Method) is NOT a database! It has no referential integrity nor rollback/commit features (although those can be activated on mainframe). ISAM was popular on mainframe when there wasn't any database (or rather when a database was too massive an application to run!) and even there it was superseded by VSAM. ISAM files are not much different from DOS random access files (an index file pointing to the relative record number on the main file).
And it's no surprise that DB/2 scales well: mainframes rarely feature a single CPU, at least as far as I know.... IBM have had some 20 years to practice on multi-cpu machines!