Intel's Xeon 5600 processors

Westmere-EP adds two more cores to an already-potent mix

by Scott Wasson, 12:13 AM on July 23, 2010

As you may know, Intel has enjoyed a resurgence in its server and workstation processor business over the past several years, due in no small part to regular and effective refinements to its core CPU technology. The introduction of the "Nehalem" quad-core Xeons last year was the biggest step forward the firm has taken in many years, with a whole new system architecture nicely complementing a revamped processor microarchitecture. The results were major gains in scalability, performance, and power efficiency compared to the prior generation of Xeons, along with renewed strength for Intel's competitive standing versus its main rival, AMD.

By contrast, this year's revision of the Xeon is comparatively simple, even modest. The new 32-nm Xeons, code-named Westmere-EP, raise the on-chip core count by two, to a total of six cores per chip, while fitting into the same socket and cooling infrastructure as the Nehalem Xeons before them. The Westmere Xeons' clock frequencies are largely similar, as is the per-clock performance of each core.

A change like that is easy to grasp, but it's also easy to underestimate. In the thread-rich realm of server-class applications, with a robust system architecture like this one, adding two more cores can boost performance by nearly 50%. From another angle, that boost could translate into a similarly large increase in energy efficiency, because half again as much work is being accomplished for each watt-hour the system consumes. If talk like that doesn't float your boat, you're probably not a system administrator responsible for a room full of servers. I'd wager most folks in such roles would happily accept a 50% gain in power-efficient performance each year, if they could get it.
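For those who like to see the arithmetic spelled out, here is a minimal Python sketch of that reasoning. It assumes perfectly linear scaling with core count, which real workloads only approach (hence "nearly" 50%). The function names and the 10% power figure are purely illustrative assumptions, not measurements from our test systems.

```python
def core_scaling(old_cores: int, new_cores: int) -> float:
    """Ideal speedup if throughput scaled linearly with core count."""
    return new_cores / old_cores


def efficiency_gain(perf_ratio: float, power_ratio: float = 1.0) -> float:
    """Change in work done per watt-hour, given performance and power ratios."""
    return perf_ratio / power_ratio


# Going from four cores (Nehalem-EP) to six (Westmere-EP).
perf = core_scaling(4, 6)
print(f"Ideal speedup: {perf:.2f}x")

# Same power draw: the efficiency gain equals the performance gain.
print(f"Work per watt-hour at equal power: {efficiency_gain(perf):.2f}x")

# Hypothetical case: the new system draws 10% more power under load.
print(f"Work per watt-hour at +10% power: {efficiency_gain(perf, 1.10):.2f}x")
```

The point of the second function is that the efficiency claim only holds to the extent that power draw stays flat. Any increase in system power eats directly into the gain.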

The question is whether the Westmere-EP Xeons really deliver on their advertised promise. We've had a number of systems cranking away in Damage Labs for the last little while in order to find out, and, without giving away the game entirely, the news is even better than you might think. Since our last look at workstation/server-class processors, the state of the art in such systems has changed on multiple fronts, from the growing prevalence of platforms tailored for power efficiency to the proliferation of solid-state disks. Our revised suite of test systems provides a nice overview of the landscape. Read on to see how it all fits together.

Westmere-EP: both less and more
Intel's 32-nm chip fabrication process is what makes Westmere possible. This relatively new fabrication technology allows substantially more gates (and thus transistors, logic, and ultimately cores) to fit into a given amount of chip area than the 45-nm processes used formerly by Intel and still today by AMD. In this generation of process tech, Intel has carried over its high-k + metal gate transistors, first used at 45 nm, and moved to immersion lithography, in which a liquid medium is used to better focus light, for the first time. By now, Intel is well into ramping its 32-nm production, with the dual-core Clarkdale and six-core Gulftown processors making up a large proportion of its consumer mobile and desktop CPU lineups. In fact, our review of these Xeon processors is rather late; the Westmere-based Xeon 5600 series has been shipping to customers for a number of months, as well.

| Code name | Key products | Cores | Threads | Last-level cache size | Process node (nm) | Estimated transistors (millions) | Die area (mm²) |
|---|---|---|---|---|---|---|---|
| Harpertown | Xeon 5400 | 2 x 2 | 2 x 2 | 2 x 6 MB | 45 | 2 x 410 | 2 x 107 |
| Nehalem-EP | Xeon 5500 | 4 | 8 | 8 MB | 45 | 731 | 263 |
| Westmere-EP | Xeon 5600 | 6 | 12 | 12 MB | 32 | 1170 | 248 |
| Shanghai | Opteron 2300 | 4 | 4 | 6 MB | 45 | 758 | 258 |
| Istanbul | Opteron 2400 | 6 | 6 | 6 MB | 45 | 904 | 346 |
| Lisbon | Opteron 4100 | 6 | 6 | 6 MB | 45 | 904 | 346 |
| Magny-Cours | Opteron 6100 | 2 x 6 | 2 x 6 | 2 x 6 MB | 45 | 2 x 904 | 2 x 346 |

The remarkable thing about the Westmere-EP Xeons, as illustrated in the table above, is that they incorporate two more cores and 50% more cache (L3 size is up from 8MB in Nehalem to 12MB here), yet they are actually smaller chips than their predecessors.
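To put a rough number on that, the transistor and die-area estimates from the table can be turned into a density figure. The short Python sketch below does just that. The dictionary and helper names are our own inventions, and the "ideal" shrink factor assumes area scales with the square of the process node, which is a simplification rather than anything Intel has claimed.

```python
# (estimated transistors in millions, die area in mm²), taken from the table.
CHIPS = {
    "Nehalem-EP": (731, 263),
    "Westmere-EP": (1170, 248),
    "Istanbul": (904, 346),
}


def density(name: str) -> float:
    """Millions of transistors per square millimeter for the named chip."""
    transistors, area = CHIPS[name]
    return transistors / area


for name in CHIPS:
    print(f"{name:12s} {density(name):.2f} M transistors/mm²")

# How much denser is Westmere-EP than its 45-nm predecessor?
gain = density("Westmere-EP") / density("Nehalem-EP")
print(f"Westmere-EP vs. Nehalem-EP density: {gain:.2f}x")

# Simplifying assumption: a perfect shrink scales area with the node squared.
ideal = (45 / 32) ** 2
print(f"Ideal 45-nm to 32-nm scaling:       {ideal:.2f}x")
```

By these estimates, the achieved density gain falls somewhat short of the idealized figure, which is no surprise, since not every structure on a chip shrinks in lockstep with the process node.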

A close-up of a Westmere-EP wafer. Source: Intel.

AMD hasn't made a process transition lately, and GlobalFoundries currently lags behind Intel by roughly a year, if not more. Thus, Westmere's competition is a much larger chip, at 346 mm², with the same core count. In fact, the most direct competition for the Westmere Xeons is arguably the Opteron 6000 series, which is based on two of those larger chips packaged together in each socket. The contrasts here are stark enough to incite me to use italics twice in two paragraphs, so we're not talking small potatoes. Smaller chips, of course, are generally more desirable for a number of reasons, including lower manufacturing costs and typically lower power draw with tamer thermals.

By and large, Westmere-EP is essentially a Nehalem Xeon that's been ported over to the new 32-nm process, but it has received a host of notable tweaks along the way, not least of which is the aforementioned addition of 50% more cores and cache. Thanks to Intel's version of simultaneous multithreading, known as Hyper-Threading, a six-core Xeon can track and execute 12 hardware threads. Two Westmere Xeons in a 2P system present an imposing total of 24 threads to the OS.
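The thread math is simple multiplication, sketched below in Python. The function name is hypothetical; the last line merely reports how many logical CPUs the operating system exposes on whatever machine runs the script, which is the same number a 2P Westmere box would report as 24 with Hyper-Threading enabled.

```python
import os


def hardware_threads(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total hardware threads presented to the OS."""
    return sockets * cores_per_socket * threads_per_core


# Hyper-Threading gives each core two hardware threads.
print("One Westmere-EP Xeon: ", hardware_threads(1, 6, 2))
print("2P Westmere-EP system:", hardware_threads(2, 6, 2))
print("2P Nehalem-EP system: ", hardware_threads(2, 4, 2))

# For comparison, the logical CPU count on the machine running this script.
print("Logical CPUs here:    ", os.cpu_count())
```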

The other modifications in Westmere-EP are minor but numerous. Some of them boost performance in various ways. A suite of seven new instructions, collectively dubbed AES-NI, can accelerate cryptography. The chip's integrated memory controller now supports two DIMMs per channel at 1333MHz, raising the limit from 1066MHz in Nehalem. Also, the number of memory buffers has risen from 64 to 88, offering the potential for higher peak bandwidth at a given memory frequency. And, as is almost customary these days, certain latencies have been reduced in the CPU's virtualization hardware, potentially enhancing performance for consolidated servers.
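To get a feel for what the memory speed bump is worth, here is a back-of-the-envelope Python sketch of theoretical peak bandwidth per socket. Two assumptions are ours rather than anything stated above: three DDR3 channels per socket, and a standard 64-bit (8-byte) channel width. The speeds quoted as "MHz" are treated as millions of transfers per second. Real-world throughput will land well below these peaks.

```python
BYTES_PER_TRANSFER = 8   # assumption: 64-bit DDR3 channel
CHANNELS_PER_SOCKET = 3  # assumption: not stated in the text above


def peak_bandwidth_gbs(mega_transfers: float, channels: int = CHANNELS_PER_SOCKET) -> float:
    """Theoretical peak memory bandwidth per socket, in GB/s."""
    return mega_transfers * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


old = peak_bandwidth_gbs(1066)  # Nehalem's limit with two DIMMs per channel
new = peak_bandwidth_gbs(1333)  # Westmere-EP's limit with two DIMMs per channel

print(f"At 1066: {old:.1f} GB/s per socket")
print(f"At 1333: {new:.1f} GB/s per socket")
print(f"Increase: {100 * (new / old - 1):.0f}%")
```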

Another set of changes in this new silicon focuses on advancing power efficiency. The Nehalem Xeons introduced a gate capable of shutting off power to idle cores; Westmere adds a power gate for the "uncore" portion of the chip capable of reducing the voltage to the memory controller, L3 cache, and QuickPath interconnect when both sockets in a 2P system are idle. Another potential heavy hitter for server installations will be the memory controller's ability to support low-voltage DDR3 memory, which has become available in recent months. The chip's APIC timer now continues running when the CPU goes into a deep sleep state, too.

A pair of Xeon X5670 processors

From this Westmere-EP silicon, Intel has spun an entire range of new Xeons dubbed the 5600 series. We detailed the various models when those products were first introduced. The 5600 lineup and its pricing appear to have remained largely static since then.