US weather boffins tap IBM for 1.6 petaflops super

The US National Center for Atmospheric Research has fallen behind in the race for more powerful parallel supercomputers, and it needs a petaflops-class machine to run its weather and climate modeling simulations. The only trouble is, there's no power at its Boulder, Colorado facility to juice up such a behemoth.

And so, NCAR is building its new 1.6 petaflops supercomputer center under the big sky in Cheyenne, Wyoming, and of course it is nicknaming the machine "Yellowstone" after the nearby national park that is the home of the Old Faithful geyser.

NCAR's Wyoming Supercomputing Center

There are a few interesting things about this new Yellowstone supercomputer. First, Cray did not land the deal, IBM did. And you might have been thinking that Cray was being given the inside track when, in April 2010, NCAR bought an XT5m mini super from Cray. Moreover, NCAR researchers have access to the "Jaguar" supercomputer at the US Department of Energy's Oak Ridge National Laboratory and the "Kraken" machine located at the University of Tennessee: two of the largest systems ever built by Cray. It sure looked like Cray might be able to land a deal at NCAR. But this might have been more about leverage than anything else.

NCAR was founded in 1960 and is funded by the National Science Foundation to do weather and climate modeling. Historically, NCAR was on the very bleeding edge of supercomputing, and it was fond of designs from Seymour Cray, the supercomputer designer from Control Data Corp who eventually left to found Cray. Between 1963 and 1975, NCAR was a CDC shop, but in 1976 NCAR got the very first Cray-1A vector supercomputer. Over the years, NCAR has monkeyed around with Thinking Machines massively parallel computers (it got the first one of those, too) and used some of the early IBM PowerParallel RISC boxes in the early 1990s. In the 1990s, NCAR caused a huge kerfuffle when it brought in NEC, Fujitsu, and Hitachi on a bid for its next-generation machine, which got the Japanese IT giants accused of dumping, and eventually huge import tariffs were slapped on them. Ironically, Cray didn't get the NCAR deal either, and NCAR has more or less been an IBM shop since then.

We said a year-and-a-half ago that Big Blue would be working hard to keep Cray out of NCAR, and it has obviously succeeded. In fact, NCAR confirmed to El Reg that Big Blue kept out three competitors for the Yellowstone deal; it could not name names.

But the other odd twist is that NCAR is currently using IBM's Power-based machines to run its weather simulations, and it did not choose the Power7-based iron used in the multi-petaflops "Blue Waters" behemoth. That machine was intended to be installed at the National Center for Supercomputing Applications at the University of Illinois, but IBM pulled the plug on it back in August because, at the price that NCSA and IBM agreed to back in 2007, the company could not make any money building it. At the list prices that IBM is charging for the Power 775 server nodes on which Blue Waters was based, you would pay around $150m per petaflops of peak theoretical performance. That ain't cheap.

The current "Bluefire" super at NCAR is a Power6-based Power 575 cluster running AIX; it has 4,064 cores and was also the first cluster of the water-cooled Power 575 that IBM sold. So you might think that moving to a Power 775 cluster would be the natural move for NCAR to make. But at the prices IBM is trying to charge for the commercialized Blue Waters iron, it is perhaps not much of a surprise that the Yellowstone super is based on x86 processors.

NCAR's Wyoming data center awaiting the Yellowstone super from IBM

The feeds and speeds of the new machine are not being divulged, but the machine is based on IBM's iDataPlex hybrid blade-rack designs. Like other supercomputer deals announced ahead of the SC11 supercomputing conference next week, Yellowstone will be based on the "Sandy Bridge-EP" Xeon E5 processors from Intel, which are due early next year. The machine will have a total of 74,592 cores and a total of 149.2TB of main memory. Assuming that NCAR is using eight-core Xeon E5s and two-socket server nodes, that is 4,662 servers with 32GB of main memory per node. Yellowstone will have a 17PB file system, too, and the overall system will have 27 times the oomph and 12 times the disk capacity of the Bluefire system back in Colorado. By the way, the machine has 9.7 million times the oomph of that original Cray-1A vector system. The nodes in the Yellowstone machine will use an InfiniBand network from Mellanox Technologies to speak to each other and share work, just like the current Bluefire system does.
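The back-of-the-envelope sums in that paragraph are easy to check. Here is a minimal Python sketch that redoes them; the core count, memory, and the 27x and 9.7 million x multipliers come from the figures above, while the eight-core, two-socket node layout is only the guess made here, not something NCAR or IBM has confirmed. The function and variable names are made up for illustration, and the Bluefire and Cray-1A numbers it prints are merely what the quoted ratios imply, not official ratings.

```python
# Figures quoted in the article.
TOTAL_CORES = 74_592
TOTAL_MEMORY_TB = 149.2
PEAK_PETAFLOPS = 1.6
SPEEDUP_VS_BLUEFIRE = 27
SPEEDUP_VS_CRAY_1A = 9.7e6

# Assumptions (the article's guess, not confirmed by NCAR or IBM).
CORES_PER_SOCKET = 8
SOCKETS_PER_NODE = 2


def estimate_nodes(total_cores, cores_per_socket, sockets_per_node):
    """Return the node count, insisting the cores divide evenly across nodes."""
    cores_per_node = cores_per_socket * sockets_per_node
    nodes, leftover = divmod(total_cores, cores_per_node)
    if leftover:
        raise ValueError("core count does not divide evenly into nodes")
    return nodes


nodes = estimate_nodes(TOTAL_CORES, CORES_PER_SOCKET, SOCKETS_PER_NODE)

# Decimal units: 1TB = 1,000GB.
memory_per_node_gb = TOTAL_MEMORY_TB * 1000 / nodes

# Performance implied by the quoted multipliers (derived, not official).
peak_flops = PEAK_PETAFLOPS * 1e15
bluefire_teraflops = peak_flops / SPEEDUP_VS_BLUEFIRE / 1e12
cray_1a_megaflops = peak_flops / SPEEDUP_VS_CRAY_1A / 1e6

print(f"nodes: {nodes}")
print(f"memory per node: {memory_per_node_gb:.1f} GB")
print(f"implied Bluefire peak: {bluefire_teraflops:.1f} teraflops")
print(f"implied Cray-1A peak: {cray_1a_megaflops:.0f} megaflops")
```

Under those assumptions the sums come out at 4,662 nodes with roughly 32GB apiece, and the multipliers imply something like 59 teraflops for Bluefire and about 165 megaflops for the Cray-1A. Change the cores per socket and the node count moves with it, which is why the layout is flagged as a guess.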

According to a spokesman for NCAR, depending on the features installed on the system, the Yellowstone machine is expected to cost somewhere between $25m and $35m. IBM expects to begin installation of the iDataPlex nodes early in 2012 and the plan is for NCAR to be up and running in the summer of next year. The University of Wyoming is, according to The Billings Gazette, ponying up $1m a year for the next two decades to get access to 20 per cent of the compute capacity of the machine.
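To put that price tag next to the $150m per petaflops list price cited earlier for the Power 775, here is a rough Python sketch of the cost per petaflops. It uses only numbers quoted in this story; the variable names are invented for illustration, and it treats the whole $25m to $35m as buying peak flops, which is a simplification since the budget presumably also covers storage and other gear.

```python
# Figures quoted in the article (all in millions of US dollars).
YELLOWSTONE_COST_LOW_M = 25.0
YELLOWSTONE_COST_HIGH_M = 35.0
YELLOWSTONE_PETAFLOPS = 1.6
POWER_775_COST_PER_PF_M = 150.0
UW_PAYMENT_PER_YEAR_M = 1.0
UW_YEARS = 20


def cost_per_petaflops(cost_m, petaflops):
    """Return millions of dollars per petaflops of peak performance."""
    return cost_m / petaflops


# Simplification: the whole budget is counted against peak flops.
low_per_pf = cost_per_petaflops(YELLOWSTONE_COST_LOW_M, YELLOWSTONE_PETAFLOPS)
high_per_pf = cost_per_petaflops(YELLOWSTONE_COST_HIGH_M, YELLOWSTONE_PETAFLOPS)

# What 1.6 petaflops would cost at the Power 775 list price quoted earlier.
power_775_equivalent_m = POWER_775_COST_PER_PF_M * YELLOWSTONE_PETAFLOPS

# University of Wyoming's total outlay over the two decades.
uw_total_m = UW_PAYMENT_PER_YEAR_M * UW_YEARS

print(f"Yellowstone: ${low_per_pf:.1f}m to ${high_per_pf:.1f}m per petaflops")
print(f"Power 775 at list for the same flops: ${power_775_equivalent_m:.0f}m")
print(f"University of Wyoming total: ${uw_total_m:.0f}m")
```

On those numbers Yellowstone works out to roughly $15.6m to $21.9m per petaflops, against about $240m for the same peak performance at Power 775 list prices, and the University of Wyoming's contribution adds up to $20m over the two decades.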

Wouldn't it be funny if NCAR used geothermal power to run Yellowstone, and did outside air-cooling, too? ®