World's first petaflops super dumped on scrap heap

Roadrunner, the first supercomputer to break through the petaflops barrier and the first capability-class machine to demonstrate the viability of using specialized coprocessors in conjunction with processors, is having its plug pulled at Los Alamos National Laboratory.

It's time to hack in and play Crysis while you still can.

The massively parallel hybrid supercomputer, built for the US Department of Energy by IBM, was commissioned in September 2006 and was designed to pit Big Blue against Cray, which at the time was working on a petaflops-busting Opteron machine at Oak Ridge National Laboratory.

Uncle Sam likes to have at least two dominant supercomputer suppliers and it likes to hedge its bets on multiple architectures – or more precisely, it used to until someone filed the budget for an exascale system. At the moment, given that multi-billion-dollar commitment to develop the processing, interconnect, and packaging technologies that will be necessary to get an exaflops machine with an exabyte of storage to market in the dreamed-of 20 megawatt power envelope, the US government labs are trying to think about how to get the bill for a couple of 100 petaflops systems through Congress.

The Roadrunner design took a pair of dual-core Opteron processors and slapped them on a BladeCenter LS21 blade server. Each LS21 blade had two QS22 blade servers, each with two of IBM's PowerXCell 8i processors, attached to it through PCI-Express x8 links. So you had two Cell chips that were acting like coprocessors for each Opteron chip – or, more accurately, one Cell chip for each Opteron core.
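To make that ratio concrete, here is a minimal sketch that tallies the chips in one tri-blade node as described above. The `TriBlade` class and its field names are hypothetical, invented purely for illustration. The counts come straight from the description in this article, and the sketch stops at the single-node level rather than trying to work out system-wide totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriBlade:
    """One Roadrunner node as described in the article (hypothetical model)."""

    opteron_chips: int = 2      # a pair of Opterons on the LS21 blade
    cores_per_opteron: int = 2  # each Opteron is dual-core
    qs22_blades: int = 2        # two QS22 blades hang off each LS21
    cells_per_qs22: int = 2     # two PowerXCell 8i chips per QS22

    @property
    def opteron_cores(self) -> int:
        return self.opteron_chips * self.cores_per_opteron

    @property
    def cell_chips(self) -> int:
        return self.qs22_blades * self.cells_per_qs22


node = TriBlade()

# Ratios of Cell coprocessors to x86 chips and to x86 cores within one node.
cells_per_opteron_chip = node.cell_chips / node.opteron_chips
cells_per_opteron_core = node.cell_chips / node.opteron_cores

print(f"Opteron cores per node: {node.opteron_cores}")
print(f"Cell chips per node:    {node.cell_chips}")
print(f"Cells per Opteron chip: {cells_per_opteron_chip:g}")
print(f"Cells per Opteron core: {cells_per_opteron_core:g}")
```

Under these counts each node works out to four Opteron cores and four Cell chips, which is where the one-Cell-per-core pairing comes from.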

The Cell chip married a Power4-derived core to eight specialized processors that could be used for lots of different tasks, such as doing math or rendering images. The flexibility of the Cell chip is one reason Sony chose it for its PlayStation 3 – and to be fair, it was designed as a game chip first and as a supercomputing coprocessor second, just as has been the case with the transformation of GPUs from graphics chips to numeric coprocessors. The Roadrunner machine had 6,948 tri-blade nodes, with a total of 13,896 x86 cores and 101,520 vector units on the Cell chips.

The Roadrunner supercomputer at Los Alamos

IBM originally had pretty big plans for the Cell chips, and it never did explain why it defunded Cell development, even when it stopped selling the last Cell-based blade server in January 2012.

The rumor going around several years back was that IBM heir apparent Robert Moffat, who was busted in a Wall Street insider trading scam and who was in charge of servers before that happened, was trying to shave costs. And in doing so, if this rumor is true, Moffat left the way wide open for Nvidia to walk in with its GPU coprocessors, which inevitably also led to Sony moving to AMD processors for its PlayStation 4.

This was not a particularly long view to take for a company that has an expensive foundry in East Fishkill, New York, to feed. IBM seemed to be on the right track with Roadrunner, using cheap chips and InfiniBand to make a very dense and at the time very power-efficient machine. All it had to do was ramp up the Cell roadmap and keep pushing the packaging envelope. Instead, IBM made a big deal about Cell, and messed it up – at least when it came to servers.

Think of all the copper...

IT is always about politics as much as it is about technology, and in the case of supercomputing this is doubly so because outside politics and nationalism play such big parts in decisions. A company such as IBM, which is trying to buy back billions of dollars in shares every year to keep Wall Street addicted, can't try to pursue too many different designs, so it has to make choices.

To El Reg's thinking, IBM should have thrown the "Blue Waters" Power 775 system under the bus a long time before it even tried to come to market, and should have done more to keep Cell alive if it wanted to keep its Power chip franchise alive outside of its commercial Power Systems servers.

It didn't take a supercomputer to figure that out. Or rather, it did, and IBM had already built it and should have seen the future coming that it helped build. What looked like a science project – hybrid computing – may be the only way to get to exascale.

Los Alamos officially decommissioned Roadrunner on Sunday, nearly five years after it first broke through a sustained petaflops of operation on the Linpack Fortran benchmark test in May 2008.

The Department of Energy did not often talk specifically about the work that Roadrunner did, but generally speaking the system (and a number of others) was designed to simulate nuclear explosions so that US weapons designers could redesign nukes without having to violate the Nuclear Test Ban Treaty. In its statement, Los Alamos said that Roadrunner was tasked with trying to correlate the energy flow in a nuke and how changes in that flow improved or decreased its explosive yield.

For the next month or so, researchers at Los Alamos will be able to play around with the system now that it has been wiped and declassified. One such job will be looking at memory-compression techniques on a massively parallel system, and another will look at how to optimize data routing on a large cluster to boost performance.

During its life, before it was put behind the firewall to do its nuke work, Roadrunner was also used during a shakedown period to run all kinds of codes and simulations, with research covering nanowire material behavior, magnetic reconnection, laser backscatter, HIV phylogenetics, and a simulation of the universe at a 70-billion-particle scale.