About Michael J. Miller

Miller, who was editor-in-chief of PC Magazine from 1991 to 2005, authors this blog for PC Magazine to share his thoughts on PC-related products. No investment advice is offered in this blog. All duties are disclaimed. Miller works separately for a private investment firm which may at any time invest in companies whose products are discussed in this blog, and no disclosure of securities transactions will be made.

High-End Server Chips Debut at Hot Chips '12

While Intel and AMD account for most of the server volume, a fair number of servers (which make up a substantial part of revenue) run other chip architectures. At the annual Hot Chips conference yesterday, representatives from IBM, Oracle, and Fujitsu discussed the architectures their organizations will use in creating high-end microprocessors, typically used in applications ranging from mainframes to high-performance computing. All of these are large processors and tend to use the most advanced techniques. In addition, Applied Micro showed off what is likely to be the first 64-bit ARM server chip to ship, while Intel detailed its existing Xeon E5 processor. (I wrote about AMD's and Intel's other offerings yesterday.)

IBM's Scott Taylor discussed the next-generation Power 7+ microprocessor, which goes into the company's Power systems, typically used for ERP, OLTP, and Java workloads. This chip features eight processor cores with four-way simultaneous multithreading, allowing for a total of 32 threads per chip. Each core has 256KB of level 2 cache, and, in a concept introduced with the earlier Power 7, the chip has 80MB of embedded DRAM as a shared level 3 cache (in contrast with the SRAM used as cache in most chips). It is designed to support systems of up to 32 sockets. As you might expect, this is a big chip, measuring 567 square millimeters and manufactured on a 32nm SOI process.

What's new in this version? It has 2.5 times the amount of level 3 cache, an improved core that allows for up to a 25 percent improvement in frequency, a doubling of single-precision floating-point performance, dedicated accelerators to speed up tasks like SSL and file encryption and random number generation, and more power gating for better power management. The Power 7+ is meant to be socket-compatible with the older Power 7, including support for a dual-chip module, which would give each socket twice the number of cores and five times the amount of L3 cache compared with the earlier chip.
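The Power 7+ figures above hang together arithmetically, and a quick sketch makes that easier to see. The only inputs are numbers quoted in the talk summary; the Power 7's L3 size is not stated here, so it is inferred from the "2.5 times" claim, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the Power 7+ figures quoted above.
CORES = 8
THREADS_PER_CORE = 4      # four-way multithreading
L3_MB = 80                # shared embedded-DRAM L3 cache
L3_GROWTH = 2.5           # "2.5 times" the L3 of the earlier Power 7

threads_per_chip = CORES * THREADS_PER_CORE

# Assumption: the earlier chip's L3 size is inferred, not quoted in the article.
implied_power7_l3_mb = L3_MB / L3_GROWTH

# Dual-chip module: two Power 7+ dies sharing one Power 7 socket.
dcm_cores = 2 * CORES
dcm_l3_mb = 2 * L3_MB
dcm_l3_ratio = dcm_l3_mb / implied_power7_l3_mb

print("threads per chip:", threads_per_chip)
print("implied Power 7 L3 (MB):", implied_power7_l3_mb)
print("dual-chip module cores:", dcm_cores)
print("dual-chip module L3 vs. Power 7:", dcm_l3_ratio)
```

In other words, two 80MB dies in one socket give 160MB of L3, which is where the "five times" figure comes from if the earlier chip had the implied 32MB.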

For its mainframe business, IBM showed off the zNext, its third-generation high-frequency microprocessor, capable of running at up to 5.5GHz, the highest rated speed I've seen for a mass-produced server chip. This is the heart of the new zEnterprise EC12 mainframe.

Compared with the previous version (the z196), the zNext has six cores instead of four and runs at 5.5GHz rather than 5GHz. IBM's Chung-Lung (Kevin) Shum touted advances in the cores themselves, including improved out-of-order operations and a streamlined pipeline, as well as improvements in the cache subsystem. The chip has 48MB of embedded DRAM as a shared level 3 cache, twice as much as the previous version. IBM says this will be the first general-purpose microprocessor to support hardware transactional memory, and that it adds self-directed runtime profiling to help the system tune itself.

Overall, the processor uses 2.75 billion transistors on a 597 square millimeter chip produced on a 32nm SOI process.

In the SPARC family, both Fujitsu and Oracle showed new 16-core architectures, as each takes the architecture originally designed by Sun in different directions.

Fujitsu's Takumi Maruyama showed off the SPARC64 X, a new-generation processor aimed at UNIX servers. The idea here is to combine features from the company's previous generations aimed at UNIX servers (SPARC64 VII+, which offered a high frequency, up to 3GHz) and high-performance computing (SPARC64 VIIIfx, which had a higher memory bandwidth). In addition, it adds new hardware acceleration for decimal operations, encryption, and database functions.

The SPARC64 X has 16 cores with two threads each, 24MB of level 2 cache, and runs at up to 3GHz, with over 100GB/s peak memory bandwidth. Overall, it uses 2.95 billion transistors on a 540 square millimeter die on a 28nm CMOS process. Fujitsu claims peak performance of 382 gigaflops (billions of floating-point operations per second), and says it has shown seven times the throughput of the SPARC64 VII+.
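As a rough sanity check on Fujitsu's numbers, the peak figure can be divided back down to a per-core, per-cycle rate. This is only a sketch: it assumes the 382-gigaflop claim is quoted at the 3GHz top clock, which the presentation summary does not confirm.

```python
# Rough per-core throughput implied by the SPARC64 X figures quoted above.
CORES = 16
THREADS_PER_CORE = 2
CLOCK_GHZ = 3.0        # assumption: peak gigaflops are quoted at the top clock
PEAK_GFLOPS = 382.0

hardware_threads = CORES * THREADS_PER_CORE

# Gigaflops divided by (cores x GHz) gives floating-point operations
# per core per clock cycle.
flops_per_core_per_cycle = PEAK_GFLOPS / (CORES * CLOCK_GHZ)

print("hardware threads:", hardware_threads)
print("flops per core per cycle:", round(flops_per_core_per_cycle, 2))
```

Under that assumption, the claim works out to roughly eight floating-point operations per core per cycle.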

Oracle's Sebastian Turullols and Ram Sivaramakrishnan showed off the SPARC T5, a 16-core processor produced at 28nm and optimized for Oracle workloads and engineered systems. The processor includes 16 of the SPARC S3 cores running at 3.6GHz. The cores seem to be shrinks from the 40nm T4, with dynamic threading allowing up to eight "strands" per core and with advanced on-chip encryption acceleration designed to work with the Solaris ZFS file system for faster system encryption. Each core has a 128KB level 2 cache, and a crossbar interconnect links the 16 cores to an 8MB, 16-way shared level 3 cache.

This mostly seems to be a shrink of the SPARC T4, but what's different is that the chips are specifically designed to allow for eight-way "glueless scalability." (In other words, designers can connect up to eight chips without any other logic chips controlling them, compared with four chips in the previous generation.) To do this, the design uses a directory to track all of the level 3 caches in the system and maintain coherency among them, with a high-speed internode coherency fabric. Oracle didn't disclose the chip size.
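The directory idea mentioned above can be illustrated with a toy model. This is a generic, heavily simplified sketch of directory-based coherence, not Oracle's protocol (which the session summary doesn't detail); the class, method names, and addresses are hypothetical.

```python
class Directory:
    """Toy directory: for each cache line, remember which chips hold a copy."""

    def __init__(self):
        # Maps a cache-line address to the set of chip ids caching that line.
        self.sharers = {}

    def read(self, chip, line):
        # A read just records the requester as a sharer of the line.
        self.sharers.setdefault(line, set()).add(chip)

    def write(self, chip, line):
        # A write must invalidate every other copy. Because the directory
        # knows exactly who holds the line, no broadcast to all chips is needed.
        to_invalidate = self.sharers.get(line, set()) - {chip}
        self.sharers[line] = {chip}
        return sorted(to_invalidate)


directory = Directory()
directory.read(0, 0x40)                   # chip 0 caches line 0x40
directory.read(3, 0x40)                   # chip 3 caches the same line
invalidated = directory.write(5, 0x40)    # chip 5 writes the line

print("chips to invalidate:", invalidated)
print("sharers after write:", directory.sharers[0x40])
```

The point of the design is that only the chips listed in the directory entry are contacted, which matters more as the socket count grows from four to eight.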

Perhaps the most interesting architecture, and certainly the newest, comes from Applied Micro, which showed off its upcoming 64-bit ARM-based CPU known as X-Gene, available later this year. Gaurav Singh and Greg Favor talked about how many workloads are moving from being CPU-intensive to being data-movement-intensive. Applied Micro has an ARM architectural license, so it has designed its own cores around the ARMv8 instruction set. The company says this will result in a high-performance but low-power microarchitecture aimed at balancing performance, power, and size.

The design is built on compute modules, each of which has two 64-bit cores (a four-way out-of-order superscalar microarchitecture) and a shared L2 cache. Each of these cores has its own 128-bit SIMD floating-point unit. Multiple modules (the diagram showed three) would be connected to a shared L3 cache, and the system is designed to have multiple chips all connected on an interconnect bus with DRAM and 10Gbit networking. Applied Micro calls it the "world's first true low power server on chip." A number of other companies have started showing ARM-based server chips, including Marvell and Calxeda (with indications that Cavium, Nvidia, and Samsung are working on their own), but it looks like Applied Micro will be the first to bring a 64-bit ARM server chip to market.

Also in this session, Intel talked about its more mainstream Xeon E5 processor, known as Sandy Bridge-EP. This is an eight-core, 16-thread chip produced on a 32nm process with a high-bandwidth ring interconnect; it is already shipping as part of the company's Romley platform.

Intel's Jeff Gilbert and Mark Rowland described how the changes from the Nehalem/Westmere architecture to the Sandy Bridge cores, plus the addition of the on-die ring interconnect, additional memory channels and faster inter-socket communications (QPI) helped improve performance, while changes such as running average power limiting helped reduce power consumption.

This has quickly become the mainstream server processor, and on a volume basis I wouldn't be surprised if it outsold all the others combined. Still, it's interesting to see how other vendors continue to find niches and applications for a variety of other designs.
