Inside IBM's Sequoia: 16 thousand trillion calculations per second

Fujitsu's K Computer is no longer the fastest supercomputer in the world. According to the latest Top500 list of the world's fastest supercomputers, that honour now goes to IBM's Sequoia system, which beats the K's 10.5 petaflop Linpack benchmark with a sustained world record speed of 16.32 petaflops. That's over 16 thousand trillion calculations per second.

Sequoia is a Blue Gene/Q system built around IBM's 64-bit, 18-core PowerPC A2 processors, 16 cores of which are used for computation. There are 98,304 of these 1.6GHz chips housed in its 96 racks, which gives Sequoia just over 1.57 million parallel computing cores.
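The core count follows from simple arithmetic on the figures above. The short Python sketch below reproduces it, assuming that 16 of each chip's 18 cores run application code; the function and constant names are purely illustrative.

```python
# Figures quoted in the article
CHIPS = 98_304
RACKS = 96

# Assumption: 16 of the 18 cores on each chip are used for computation
COMPUTE_CORES_PER_CHIP = 16


def total_compute_cores(chips=CHIPS, cores_per_chip=COMPUTE_CORES_PER_CHIP):
    """Return the number of cores available for parallel computation."""
    return chips * cores_per_chip


def chips_per_rack(chips=CHIPS, racks=RACKS):
    """Return how many processor chips sit in each rack."""
    return chips // racks


if __name__ == "__main__":
    print(f"Compute cores: {total_compute_cores():,}")  # Compute cores: 1,572,864
    print(f"Chips per rack: {chips_per_rack():,}")      # Chips per rack: 1,024
```

Under that assumption the total comes to 1,572,864 cores, which matches the "just over 1.57 million" figure.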

Like the giant redwood that gives Sequoia its name, the world's fastest supercomputer isn't small. It occupies 3,422 square feet, and LLNL (the Lawrence Livermore National Laboratory in California) had to strengthen the floor of its building before it was installed. The 96 racks reportedly weigh as much as 30 adult elephants.

Sequoia was designed with a theoretical top speed of 20 petaflops. That is roughly 40 times the benchmarked speed of LLNL's previous number cruncher, Blue Gene/L, which was measured at 478.2 teraflops and topped the Top500 list from late 2004 until mid-2008.
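As a rough check on how these numbers relate, the Python sketch below works out the ratios implied by the quoted figures: how much of the theoretical peak the Linpack run achieved, and how Sequoia compares with Blue Gene/L. It uses only the values given in this article, and the helper names are illustrative.

```python
PETA = 10**15  # one petaflop = a thousand trillion calculations per second
TERA = 10**12  # one teraflop = a trillion calculations per second

# Figures quoted in the article
SEQUOIA_LINPACK = 16.32 * PETA
SEQUOIA_PEAK = 20 * PETA
BLUE_GENE_L_LINPACK = 478.2 * TERA


def efficiency(sustained, peak):
    """Fraction of the theoretical peak reached in the benchmark."""
    return sustained / peak


def speedup(new, old):
    """How many times faster the new system is than the old one."""
    return new / old


if __name__ == "__main__":
    print(f"Linpack efficiency: {efficiency(SEQUOIA_LINPACK, SEQUOIA_PEAK):.1%}")
    print(f"Benchmark vs Blue Gene/L: {speedup(SEQUOIA_LINPACK, BLUE_GENE_L_LINPACK):.1f}x")
    print(f"Peak vs Blue Gene/L benchmark: {speedup(SEQUOIA_PEAK, BLUE_GENE_L_LINPACK):.1f}x")
```

On these figures, the Linpack run reaches about 82 per cent of the design peak. Sequoia's benchmarked speed is about 34 times that of Blue Gene/L, and its peak about 42 times.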

The National Nuclear Security Administration uses this processing power to run simulations that assess the safety and reliability of the US nuclear stockpile, which removes the need for underground testing. According to LLNL, "Sequoia will provide a more complete understanding of weapons performance, notably hydrodynamics and properties of materials at extreme pressures and temperatures."

The June 2012 Top500 rankings were revealed at the International Supercomputing Conference in Hamburg. Sequoia knocked the 705,024-core Japanese K computer (10.5 petaflops) into second place, with another Blue Gene/Q system, Mira (8.2 petaflops), in third.

The IBM-built, Intel-powered SuperMUC iDataPlex system debuted in fourth place (2.9 petaflops), while the Chinese Tianhe-1A (2.6 petaflops) completed the top five.

Sequoia cements IBM's position as the world's premier supercomputer provider. There are 213 IBM-built systems listed in the Top500, including 21 Blue Gene/Q clusters, four of which rank in the top 10.

In addition to Sequoia and Mira, the Italian Fermi cluster at CINECA (1.7 petaflops) slots in at number seven, while the German JuQUEEN (1.4 petaflops) is number eight. The question is: how long will Sequoia be able to hold on to the top spot?

China has two systems in the top 10 and, according to The Register, it is planning a 100+ petaflop supercomputer for 2015. The Tianhe-1A wasn't just a statement of intent. With its combination of 7,168 NVIDIA Tesla M2050 GPUs and 14,336 Intel Xeon processors, it marked a significant change in the way supercomputers might be built in the future.

According to the Top500 data, "58 systems use accelerators or co-processors (up from 39 six months ago), 53 of these use Nvidia chips, two use Cell processors, two use ATI Radeon and there is one new system with Intel MIC technology."

Intel's Many Integrated Core (MIC) technology has launched as Xeon Phi, and appears in a supercomputer ranked 150th on the Top500. It's a low-key debut for a co-processor that can deliver a teraflop of performance and fit into a single PCIe slot, especially given that Intel's first teraflop computer, built in 1996, consisted of 9,298 CPUs and occupied 72 server cabinets.

Teraflop graveyard

What happened to that first teraflop supercomputer? It's probably buried in a landfill. According to Gary Grider, deputy leader of the High Performance Computing Division at the US Department of Energy's Los Alamos National Laboratory in New Mexico, supercomputers are often disposed of in this way when they are decommissioned.

But some can have their lives extended with chip upgrades. The Jaguar system at the Oak Ridge National Laboratory is a case in point. Its AMD chips were recently upgraded to 16-core Opteron 6274 processors running at 2.2 GHz. Of its 18,688 computing nodes, 960 now incorporate an Nvidia GPU for extra processing power.

This improved Jaguar is rated at 1.9 petaflops and appears in sixth place on the Top500 list, and that is only the first phase of its overhaul. Once the rest of the upgrade is complete, it will evolve into 'Titan' - a supercomputing cluster with an estimated 10-20 petaflops of computing performance.