AMD’s 16-core Bulldozer pushes into the server room

After some delay, AMD has shipped a new Opteron processor with the most cores …

AMD released two new families of Opteron processors based on the company’s “Bulldozer” architecture today, and previewed plans for a third, in the company’s latest effort to take back market share from Intel’s Xeon CPU in the server market.

The Opteron 6200 series processors, previously known by the code-name "Interlagos," integrate two of the dies used for the desktop Bulldozer processor into a multi-chip module packing up to 16 cores, the most ever on an x86 processor. Opteron 6200s are targeted at the high-performance server market. The other new arrival, the Opteron 4200 series (previously known as "Valencia"), is focused on power efficiency: AMD claims it has the lowest power consumption per core of any x86 processor. But while both tout record-breaking features, the question is whether they’re significant enough to get customers to switch their infrastructure away from Intel.

AMD Server Product Marketing Manager Michael Detwiler told Ars that customers could see up to a 35 percent performance increase over AMD’s Opteron 6100 series, the company’s previous top-performance server processor. “It could be anywhere from 25 percent to more than 35 percent, depending on the workload they run,” he said. Detwiler also claimed that on some applications, particularly floating-point-heavy computing, the Opteron 6200 had tested with up to 84 percent better performance in two-socket configurations than Intel’s Xeon 5600 series.

Some older server operating systems will encounter problems on the 6200 series. Detwiler said that the 6200 has been tested as compatible with Windows Server 2003 Release 2 service pack 3 and later. Any Linux based on the 2.6.31 kernel or earlier also fails, including older versions of VMware ESX. Detwiler said that older Windows and Linux versions could still be run as virtualized operating systems on the platform.

The first official benchmarks for the 6200 give it an edge in other departments as well. In a set of SPEC benchmarks released today, run on Hewlett-Packard server platforms head-to-head against similarly configured Xeon systems, a dual-processor configuration of the 2.6 GHz Opteron 6282 SE turned in peak integer rates about 60% higher than those of a dual 2.4 GHz Xeon E5645.

When you count the cores, that performance is a lot less impressive: the Xeons tested had only six cores per processor, against 16 for the Opteron. So the Opteron 6282 SE's performance per core was actually only 61% of that of the Xeon processor.
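
The per-core figure comes down to simple arithmetic. Here is a rough sketch, assuming two-socket systems with 16 cores per Opteron 6282 SE and six per Xeon E5645, and treating the "about 60% higher" result as exactly 1.60. That input is a rounded approximation, and the function name is purely illustrative; the actual SPEC scores are not reproduced here.

```python
# Back-of-the-envelope per-core comparison.
# The 1.60 throughput ratio is the article's rounded "about 60% higher"
# figure, not an actual SPEC score.

def per_core_ratio(throughput_ratio, cores_a, cores_b):
    """Per-core performance of system A relative to system B."""
    return throughput_ratio * cores_b / cores_a

opteron_cores = 2 * 16  # two 16-core Opteron 6282 SE processors
xeon_cores = 2 * 6      # two six-core Xeon E5645 processors

ratio = per_core_ratio(1.60, opteron_cores, xeon_cores)
print(f"Opteron per-core performance: {ratio:.0%} of the Xeon's")
```

With the rounded input this works out to roughly 60 percent, in line with the 61% figure, which presumably reflects the unrounded benchmark scores.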

But AMD's pitch for the Opteron 6200 series is all about price performance. The "top bin" Opteron 6200 processor is priced at $1,019, Detwiler said—nearly $100 less than its 6100 predecessor. And by packing more performance into lower-cost, more densely packed systems, AMD is hoping to win the hearts and minds of companies looking to save on rack space and power usage.

That's also the angle AMD is taking with its upcoming addition to the Opteron line, the 3000 series, code-named "Zurich." Another Bulldozer-based processor, the 3000 series is targeted at hosting companies and big cloud service providers for dedicated Web hosting "micro-servers" and other applications. It will use a new socket type and is designed to work with more desktop-like components, allowing AMD to take on Atom-based and ARM-based servers.

This looks better for Bulldozer - server applications are far more amenable to multi-threading, and thus the many-cores model will shine. It also looks like the prices are very competitive. Shipping half a million before the official release date also isn't to be sniffed at. AMD should have just skipped the desktop Bulldozer release and done a shrink of Thuban at the same time as Llano.

I really wonder how a 32 nm Thuban clocked a bit faster would perform. On the desktop I bet it would be pretty good, especially at the things most desktop users do.

I think that with the new approach taken by AMD with Bulldozer, we'll start to see some confusion over how to compare processors with different architectures. I believe that given how Bulldozer works, we should compare "Modules" with Intel's Core and not Bulldozer's "Core" with Intel's Core. The reason is that in a module, two integer cores and a floating point unit are all snapped together. AMD even touts how this performs better than hyperthreading. So in my opinion we should use Modules as the unit of comparison for Bulldozer. And if we do so, Bulldozer has 20% more performance than Xeon per "module".

NO business is going to upgrade their infrastructure because a new processor is released. It's just another option when what they have can no longer function (usually because of old hardware failure). If AMD wants to gain in the server market, they have to produce quality for years not one chip.

8 modules per processor, with the potential to run 16 threads simultaneously... this would be a virtualization workhorse.

A Mack truck might be slower than a Lamborghini, but if you want to move stuff in bulk (server workloads), it will ultimately be faster (one trip delivering everything simultaneously instead of multiple back-and-forth trips).

I'm more concerned that a Bulldozer core is 2-issue (down from Thuban's 3-issue) when the Core architecture that Sandy Bridge still inherits has been 4-issue for years. With SMT, an Intel core is pretty close to an AMD module. AMD's "core" marketing with Bulldozer is as bad as Intel's "GHz" marketing from the Netburst era.

The biggest problem is that a huge amount of commercial server software is now licensed per core. So being price:performance competitive but needing twice as many cores will cripple you on the software costs. And if you're using stuff like Oracle, WebSphere, and the like, the software cost differential will absolutely destroy any small hardware price difference.

NO business is going to upgrade their infrastructure because a new processor is released. It's just another option when what they have can no longer function (usually because of old hardware failure). If AMD wants to gain in the server market, they have to produce quality for years not one chip.

This is wrong on so many levels. First, businesses often see a benefit to upgrading out of cycle, and no company with any sense waits until a system breaks to replace it. Finally, AMD has been making quality processors for many years. Due to the massive advantage in usable memory bandwidth they had over Intel prior to Nehalem, all of my database servers until this year have been AMD.

Yep, though I may pick up some 16 core servers next year to run my virtualized SQL Server workloads so when we do our new EA we can get 64 cores worth of licenses for SQL2012 (MS is granting EA customers with current SA as many cores of SQL2012 as they have SQL Enterprise in production at their renewal date). In that regard buying as many cores per socket as I can get will be chump change in the long term thanks to software savings.

As I understand it, MS licenses per socket, not per core.

Not with 2012; to be "cloud compatible" they're changing the licensing to per core. Like I said, for existing EA customers with current SA, we'll be able to give MS a core count that we are using under our existing agreement and they will issue us that many cores on our next EA. See here for more info. For us it will actually be kind of nice, as after the conversion we will be able to allow SQL Server VMs to move anywhere in the cluster rather than using VMware affinity rules to limit them to only the two hosts that are licensed.

Not to say this isn't an issue (it is), but it isn't an issue everywhere. Red Hat (and probably others) still licenses per socket, and lots of server loads can be (and are being) run on a FOSS stack. Time will tell how well AMD can carve out market share with what they have.

Why would you run an expensive application like SQL2012 on a processor with relatively weak cores?

It doesn't make sense.

In some cases more threads may give better performance than faster cores.

Compete with what, may I ask? The number of AMD-based servers I see in data centers is less than a few percent, and the percentage has been dropping. Legitimate IT folk don't worry about the sticker on the front of the blade chassis being Intel or AMD, and the people selling the servers (HP, Dell, IBM, etc.) have never been real keen on pimping AMD gear. They do, but almost as an afterthought, and always occupying weird price tiers that the Intel towers and blades easily bracket.

Has nothing to do with the quality factor of AMD chips. The sad fact is, that's the only thing AMD makes. Intel architectures are surrounded with Intel components while the AMD platform is surrounded with (??).

AMD had its glory when the AMD64 was competing against P4Ds with extra cores duct-taped on. That was also when an AMD64-based platform consumed half the power and easily beat the Netburst-based junk Intel held onto for too long just to get it out of their reseller channels. That was last decade... things have changed. Sandy Bridge is murdering AMD on desktop sales, and I simply don't see anything positive for AMD on the horizon, given Intel is not likely to make the same mistake.

We use a FOSS server stack -- CentOS, PostgreSQL, Apache, Perl, KVM, QEMU -- so extra cores work very nicely for server work w/o any worries about licensing fees. And in our environment, CentOS (RHEL) packages are not compiled with an uber Intel compiler, so much of the Sandy Bridge > Thuban/Bulldozer advantage shown in benchmarks across the internet simply doesn't exist. A good example of this is PostgreSQL. On PostgreSQL running on native CentOS, Thuban 3.3ghz runs at about the same speed as Sandy Bridge 3.3ghz. Under CentOS KVM, the gap jumps to +35% in favor of AMD's chips.

Now it would have been nice if AMD had just die shrunk Thuban/Lisbon/Magny-Cours for higher clock rates but the slower performance per mhz per core for Bulldozer is an OK trade-off for more physical cores. Especially since we already make this trade-off for our production servers -- i.e., desktop AMD/Intel CPUs with higher mhz/fewer cores versus Opterons/Xeons with more cores/lower mhz. (Our development Thuban 3.3ghz is actually crazy crazy crazy faster than our production Opterons/Xeons under low concurrency -- I have to downclock my dev server using cpufreq just to make sure code runs OK on slower hardware.)

++

KVM + 16core AMD CPUs = awesome!

We use a form of KVM for our cloud environment (Hexagrid) and everything is optimized for AMD - these 16core CPUs would just be awesome for their compute nodes! Cores are way more important than GHz when it comes to VMs... IMO

Just gaming the system, we'll run on the Interlagos systems for 90 days prior to renewal so that we can maximize the number of cores we receive on our next EA. We don't push our SQL servers all that hard most of the time, our heavy lifting is done by Oracle Enterprise, but getting 64vCPU's worth of 2012 licenses should allow me to avoid any costs for additional licenses during our next 3 year EA contract.

It won't run a lot of older OSes? How the hell did they manage to do such a screwup???

Why is that such a big problem?

I just want to know WHY. What are the technical reasons?

My guess is the OSes in question cannot support that many CPU cores... I mean, it's a lot...

But it's really a non-issue, as this is really designed for large VM clusters, etc... all of which support it...

According to tests, the OSes in question don't even boot. I don't think that's because of the number of cores. While it's a non-issue for me too, I'd still like to know the technical details. Being a geek and all..

I've read a lot of articles and trawled AMD's site, but haven't found anything..