The performance of the systems on the Top500 list continues to improve exponentially. (Note that it's a logarithmic scale on the vertical axis, with each line marking a tenfold improvement.) The three series of dots here represent the total performance of the top 500 systems at the top, the performance of the No. 1 system in the middle, and the performance of the 500th system at the bottom.
Top500.org

Performing more than 33 quadrillion calculations per second, a new Chinese supercomputer called Tianhe-2 arrived two years earlier than expected to claim the top spot in a list of the 500 most powerful supercomputers in the world.

The Tianhe-2 has 32,000 Xeon processors boosted by 48,000 Xeon Phi accelerator processors for a total of 3.12 million processor cores linked together with a Chinese interconnect called TH Express-2. It's also got 1 petabyte of memory (that's about 12,500 times as much as in an ordinary personal computer), runs the Kylin Linux operating system developed at China's National University of Defense Technology, and sucks down 17.8 megawatts of power.

All that means the machine's sustained performance is 33.86 petaflops, which is to say 33.86 quadrillion floating-point mathematical calculations per second. That figure is all the more notable given that the researchers who compile the Top500 list hadn't expected Tianhe-2, also called Milky Way-2, to be deployed for another two years.

Its performance is nearly double that of the machine now bumped to second place, the Cray XK7 system called Titan at Oak Ridge National Laboratory, with a speed of 17.59 petaflops. Third place went to Sequoia, an IBM BlueGene/Q system installed at Lawrence Livermore National Laboratory with a speed of 17.17 petaflops.

Tianhe-2 is the successor to the Tianhe-1A supercomputer that topped the list in 2010. The systems indicate the Chinese supercomputing center's growing ability not just to capably assemble hardware from elsewhere but also to design much of it in-house.

"Most of the features of the system were developed in China, and they are only using Intel for the main compute part," said Top500 editor and computer researcher Jack Dongarra. "The interconnect, operating system, front-end processors, and software are mainly Chinese."

The two Chinese systems spotlight a broader trend in supercomputing, too: the use of special-purpose accelerator processors. Moore's Law -- strictly defined, the doubling of the number of transistors on a processor every two years -- is steadily marching on. But the clock speed of chips stalled years ago, redirecting a lot of chip development into parallelism: splitting work into lots of small jobs that run simultaneously rather than fewer jobs that run sequentially.
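The decomposition behind that shift can be sketched in a few lines of Python. This is an illustrative toy, not the Linpack benchmark the Top500 actually runs: one big job (a dot product) is carved into chunks that a pool of workers handles, then the partial results are combined.

```python
# Toy illustration of parallel decomposition: split one big job (a dot
# product) into small chunks, hand them to a pool of workers, and combine
# the partial results. On real accelerator hardware each chunk would run
# on its own core; CPython threads here merely show the structure.
from concurrent.futures import ThreadPoolExecutor

def partial_dot(chunk):
    # Dot product of one (sub_a, sub_b) slice of the vectors.
    sub_a, sub_b = chunk
    return sum(x * y for x, y in zip(sub_a, sub_b))

def parallel_dot(a, b, workers=4):
    # Carve both vectors into `workers` contiguous chunks.
    step = (len(a) + workers - 1) // workers
    chunks = [(a[i:i + step], b[i:i + step]) for i in range(0, len(a), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_dot, chunks))

a = list(range(1_000))
b = list(range(1_000))
assert parallel_dot(a, b) == partial_dot((a, b))  # chunked == sequential
```

The point is that the answer is identical either way; what changes is that the chunks are independent, so extra cores can work on them at the same time.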

General-purpose chips have moved to parallelism by adopting multicore designs, but a newer trend is to offload work onto special-purpose chips that use more extreme parallelism. Intel's Xeon Phi chips are one example, but Nvidia's graphics chips -- repurposed for number crunching -- are more widely used on the Top500.

On the Top500 list, 39 systems use Nvidia chips, 11 use Xeon Phi, and three use ATI Radeon chips. The top machine uses Xeon Phi, the No. 2 uses Nvidia chips, and Tianhe-1A, now bumped down to No. 10, uses Nvidia chips.

Supercomputers these days are getting a speed boost from special-purpose helper processors called accelerators. Nvidia's graphics processing unit (GPU) chips, repurposed to perform numeric computations, are the most common.
Top500.org

Expect the accelerators to spread further.

"I'm willing to bet that by 2015, all top 10 systems on the Top500 list will be GPU/accelerator-based," Horst Simon, who is Lawrence Berkeley National Laboratory's deputy director, a computer scientist, and one of the four Top500 editors, said in a May interview.

Simon also believes it won't be possible to reach a performance level of a quintillion floating-point operations per second, or 1 exaflops, by 2020. One big limit he pointed out is power consumption:

The increasing trend in power efficiency, though it might look like a gradual slope over time, is really a one-time gain that came from switching to accelerator/manycore [architectures] in 2010. This is not a sustainable trend in the absence of other new technology. There is no more magic -- we're maxed out. Right now, the most efficient system needs 1 to 2 megawatts per petaflops. Multiply that by 1,000 to get to exascale and the power is simply unaffordable.
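Simon's arithmetic is easy to check. A minimal sketch, using the 1-to-2-megawatt-per-petaflops efficiency figure from his quote:

```python
# Scale today's best power efficiency (per Simon: 1 to 2 megawatts per
# petaflops) up to an exaflops machine. 1 exaflops = 1,000 petaflops.
PFLOPS_PER_EFLOPS = 1_000

def exascale_power_mw(mw_per_pflops):
    # Megawatts needed for a 1-exaflops system at the given efficiency.
    return mw_per_pflops * PFLOPS_PER_EFLOPS

print(exascale_power_mw(1))  # 1000 MW: a full gigawatt
print(exascale_power_mw(2))  # 2000 MW: on the order of two power plants
```

For comparison, Tianhe-2's 17.8 megawatts at 33.86 petaflops already sits near the low end of that efficiency range.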

Still, the list's growth rate has remained remarkably steady over the years, and extrapolating past trends into the future suggests the No. 1 machine will cross the exaflops threshold in 2018.
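That kind of extrapolation amounts to fitting a straight line to performance on a log scale and projecting it forward. A hedged sketch of the idea -- `fit_log_trend` is an illustrative helper, not the Top500 editors' actual method, and the 2008 data point is an assumed round number; only the 33.86-petaflops figure comes from this article:

```python
import math

def fit_log_trend(years, flops):
    # Least-squares fit of log10(flops) = a * year + b.
    logs = [math.log10(f) for f in flops]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
         / sum((x - mean_x) ** 2 for x in years))
    b = mean_y - a * mean_x
    return a, b

def project(a, b, year):
    # Performance (in flops) the trend line predicts for a given year.
    return 10 ** (a * year + b)

# Assumed two-point history: ~1.1 petaflops No. 1 in 2008,
# 33.86 petaflops in 2013 (Tianhe-2, from the article).
a, b = fit_log_trend([2008, 2013], [1.1e15, 33.86e15])
print(project(a, b, 2018))  # roughly 1e18 flops: an exaflops around 2018
```

The straight line on a log plot is exactly the exponential curve visible in the Top500 chart above, which is why the extrapolation is so tempting despite the power-consumption objection.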

Other items of note from the latest list:

Sixty-five of the top 500 systems are in China, a number that has leveled off at least for now. The United States was home to 253 systems.

Intel processors are used on 80 percent of the systems.

Sixty-seven percent of systems use processors with eight or more cores.

A total of 180 systems dropped off the list from the previous one released half a year ago. The minimum level of performance needed to rate among the top 500 supercomputers increased from 76.5 teraflops to 96.6 teraflops.

Hewlett-Packard was the top supplier, with 189 systems. IBM was next with 160, though IBM built four of the top 10 systems.