New "Xeon Phi" will ship this year—but exascale is still years away.


We told you today about the newly crowned world’s fastest supercomputer, which used IBM Blue Gene/Q chips and 1.6 million cores to hit a record 16 petaflops—a petaflop being one quadrillion, or a thousand trillion, calculations per second. As if that isn’t fast enough, researchers are trying to build new architectures that would bring high-performance computing into the exascale range, 1,000 times faster than petascale.

Intel, whose chips are used in the majority of the world’s 500 fastest supercomputers, claimed today that the newly named “Xeon Phi” line of chips (out later this year) is an early stepping stone toward exascale. The Xeon Phi processors are built with the same 22nm 3D tri-gate transistors used in the consumer-focused Ivy Bridge chips. Xeon Phi will act in a similar way to the NVIDIA GPUs that speed up many of the world’s fastest clusters. That is, it works as a “co-processor” alongside a server CPU to accelerate workloads.

You might be able to get to an exaflop just by connecting enough of today’s chips—but it wouldn’t be cost-efficient or energy-efficient, so a new architecture is needed. A total of 40 to 50 gigaflops of performance per watt is needed for exascale, John Hengeveld, director of marketing for high-performance computing at Intel, told the IDG News Service. The first Xeon Phi chip, code-named Knights Corner, will have more than 50 cores and deliver four or five gigaflops per watt. Intel says it’ll hit a teraflop in a single processor.
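Those efficiency figures translate directly into a power budget. A quick back-of-envelope sketch, taking the numbers above at face value:

```python
# Back-of-envelope power budget for sustaining one exaflop, using the
# efficiency figures quoted above (treated as exact for illustration).

EXAFLOP = 1e18  # calculations per second

def power_megawatts(gigaflops_per_watt: float) -> float:
    """Total power draw, in megawatts, needed to sustain one exaflop."""
    watts = EXAFLOP / (gigaflops_per_watt * 1e9)
    return watts / 1e6

# At Knights Corner's ~5 gigaflops per watt, an exaflop would draw 200 MW,
# roughly the output of a mid-sized power plant.
print(power_megawatts(5))   # 200.0

# At the 50 gigaflops per watt Intel cites for exascale, the budget falls
# to a far more plausible 20 MW.
print(power_megawatts(50))  # 20.0
```

That factor-of-ten gap is why Hengeveld frames today’s chips as a stepping stone rather than the destination.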

Clearly, exascale is still a ways away—that’s why Intel is targeting 2018 as the year it becomes reality. Petascale computing was first hit in 2008, and the world’s biggest clusters have soared more than an order of magnitude past that mark. But Moore’s Law alone won’t be enough to take HPC much further, some experts believe.

"We're at the point where the processors themselves aren't really getting any faster," Michael Papka of the Argonne National Laboratory—home of the third-fastest supercomputer—told Computerworld. Instead, increasing the size of clusters and improving the use of parallel processing are responsible for much of the speed gains we see each time a new version of the Top 500 supercomputers list is announced. Single-core performance has stagnated on the consumer side too, with speed gains coming from adding cores, running multiple threads per core, and other clever strategies.

Knights Corner was actually previewed one year ago, as Ars reported at the time. What’s new is the Xeon Phi name and Intel’s promise that it will be delivered this year—although an exact release date and pricing weren’t announced. The chips are far enough along that Intel says they’ll be used in a supercomputer called Stampede to be deployed next year at the Texas Advanced Computing Center, the IDG News Service reported. The Xeon Phi and its "Many Integrated Cores" (MIC) technology also make an appearance on the Top 500 list, in a 119-teraflop cluster ranked 150th in the world.

Promising exascale computing in 2018 is rather easy in 2012. But the race to exascale is on, and Intel will have plenty of competition.

Promoted Comments

They're called "three year old GPUs". My gaming box is made with two of them, a pair of Radeon HD 5750s. They're rated for 82 watts each and so deliver 12 gigaflops per watt flat out.
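For what it's worth, the commenter's figure checks out if you take the HD 5750's commonly quoted single-precision peak of about 1,008 GFLOPS (that spec is an outside assumption; the comment only gives the wattage):

```python
# Sanity check on the comment's "12 gigaflops per watt" figure.
# The ~1,008 GFLOPS single-precision peak is an assumed spec for the
# Radeon HD 5750, not a number from the comment itself.
peak_gflops = 1008   # assumed theoretical peak, single precision
board_power = 82     # watts, as stated in the comment

gflops_per_watt = peak_gflops / board_power
print(round(gflops_per_watt, 1))  # 12.3
```

That's theoretical peak, of course; sustained throughput on real workloads is lower.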

I imagine something modern would be even more efficient.

Yup. Video cards have been able to claim this kind of performance for a couple of years, which is why they've been used in supercomputer clusters and why NVIDIA and AMD have divisions targeting this market. The problem is that they're less flexible in how they can be used, so not all calculations map well onto them, and it typically takes more effort to get code running well on them. The Intel Xeon Phi chips will use the x86 architecture, making them more capable and more familiar for people to work with.

This is the general trade-off, and it's almost always present. The more focused a chip's design is, the faster and/or more efficient it can be at those tasks; the more flexible it is, the bigger and slower it will be. There are many points on this spectrum that are useful in different situations and profitable for different companies to pursue.