Supercomputers look beyond power

San Jose, Calif. -- Cray Inc. and IBM Corp. will split nearly half a billion dollars as part of a government contract announced late last month to develop a class of supercomputers that are not only more powerful but also easier to use than today's systems. A third competitor, Sun Microsystems, was dropped.

Cray and IBM were selected for Phase III of the High Productivity Computing Systems program (HPCS) managed by the Defense Advanced Research Projects Agency (Darpa). Cray will receive $250 million and IBM $244 million to develop prototype systems by 2010.

The prototypes will have to show a path to computing more than 1 quadrillion floating-point operations per second (petaflops) and improve application development time tenfold over the typical development time in 2002, when the HPCS program began. The two winners will also have to show Darpa officials a business plan for developing systems based on the prototypes for government and commercial users.

"The key here is delivering new levels of productivity. Anyone can deliver a large system, but not everyone can make it truly useful," said William Harrod, the HPCS program manager at Darpa.

Supercomputers today are typically measured by their raw CPU performance on the Linpack benchmark, which is used to rank the Top500 list of the world's most powerful computers. The HPCS program promotes a more nuanced range of seven benchmarks, including Linpack as well as measures of a system's bandwidth and memory capabilities.

The Cray and IBM machines will make marked advances in raw performance over today's supers, but they are expected to deliver even bigger leaps in other areas, making it easier to develop and port software to massive systems and to recover quickly from a hardware failure.

HPCS machines should be capable of 2 to 4 petaflops, compared with more than 200 teraflops for the IBM Blue Gene/L system that leads the Top500 list today. But measured in giga-updates per second (gups)--or how fast a system can update a random part of its memory--the new machines should hit between 8,000 and 64,000 gups, compared with just 35 for today's Blue Gene/L.
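To make the gups metric concrete, here is a minimal single-machine sketch in Python. It is an illustration only, not the HPCS benchmark code: the table size, update count, XOR update rule and the function names `measure_gups` and `improvement_factor` are all assumptions chosen for the example.

```python
import random
import time


def measure_gups(log2_table_size=16, num_updates=200_000, seed=1):
    """Time random read-modify-write updates on an in-memory table.

    Returns giga-updates per second. All parameters are illustrative
    assumptions, not values taken from the HPCS benchmark suite.
    """
    size = 1 << log2_table_size          # power of two, so masking picks an index
    mask = size - 1
    table = list(range(size))
    rng = random.Random(seed)
    # Generate the random values up front so only the updates are timed.
    values = [rng.getrandbits(64) for _ in range(num_updates)]

    start = time.perf_counter()
    for v in values:
        table[v & mask] ^= v             # update one randomly chosen entry
    elapsed = time.perf_counter() - start

    elapsed = max(elapsed, 1e-12)        # guard against a zero timer reading
    return num_updates / elapsed / 1e9


def improvement_factor(new_value, old_value):
    """How many times larger new_value is than old_value."""
    return new_value / old_value


if __name__ == "__main__":
    print(f"measured: {measure_gups():.6f} gups")
    print(f"low end:  {improvement_factor(8_000, 35):.0f}x")
    print(f"high end: {improvement_factor(64_000, 35):.0f}x")
```

Applied to the figures quoted above, the gups target works out to roughly 230 to 1,830 times today's Blue Gene/L, while 2 to 4 petaflops against a little over 200 teraflops is an improvement of at most about 10 to 20 times.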

The new benchmark suite has "seen a lot of interest," said Jack Dongarra, a University of Tennessee researcher who is co-author of the Top500 list and who helped develop the HPCS benchmark suite.

Listed on the benchmark Web site (http://icl.cs.utk.edu/hpcc/) are performance stats for 137 systems tested on the suite. The National Science Foundation referenced the benchmarks in a recent call for proposals for a $200 million petaflops-class system it plans to buy.

The new suite will not rank systems. But researchers plan to augment the suite with a tool that would let users weight the seven benchmarks to create customized tests that best correspond to the needs of their applications.
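One way such a weighting tool might work is sketched below in Python. This is a guess at the idea, not the researchers' design: the scoring formula, the function name `weighted_score`, the benchmark labels, and every score and weight shown are hypothetical.

```python
def weighted_score(normalized, weights):
    """Combine per-benchmark scores into one weighted average.

    `normalized` maps benchmark name -> score already scaled so that
    higher is better (for example, relative to a reference system).
    `weights` maps the same names -> non-negative importance values.
    """
    if set(normalized) != set(weights):
        raise ValueError("scores and weights must cover the same benchmarks")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(normalized[name] * weights[name] for name in normalized) / total


# Hypothetical normalized results for two made-up systems.
system_a = {"hpl": 1.0, "dgemm": 0.9, "stream": 0.5, "ptrans": 0.5,
            "random_access": 0.2, "fft": 0.6, "bandwidth_latency": 0.5}
system_b = {"hpl": 0.6, "dgemm": 0.6, "stream": 0.8, "ptrans": 0.7,
            "random_access": 0.9, "fft": 0.7, "bandwidth_latency": 0.8}

# A user whose application resembles Linpack weights that test heavily.
linpack_heavy = {name: 1.0 for name in system_a}
linpack_heavy["hpl"] = 4.0

# A user whose application is bound by memory access weights those tests.
memory_heavy = {name: 1.0 for name in system_a}
memory_heavy["random_access"] = 4.0
memory_heavy["stream"] = 2.0

if __name__ == "__main__":
    for label, w in (("linpack-heavy", linpack_heavy),
                     ("memory-heavy", memory_heavy)):
        a = weighted_score(system_a, w)
        b = weighted_score(system_b, w)
        print(f"{label}: system A {a:.2f}, system B {b:.2f}")
```

With these made-up numbers, system A comes out ahead under the Linpack-heavy weights and system B under the memory-heavy ones, which is the point of letting users pick weights rather than publishing a single ranking.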

Tennessee's Dongarra praised the HPCS program because "it has injected about $1 billion over 10 years into high-performance computing. That's got to have a tremendous impact."

Nonetheless, "most of the money has been spent on industry," Dongarra said. "It's unfortunate more money hasn't gone into academic computing, because we need to develop the centers and attract the students who will be the next-generation leaders in this field."

Cray received $25 million and IBM $12.2 million in November as the first payouts for the Phase III contract. The rest of the awards will be doled out as the companies hit milestones. Those include a software design review in 18 months, a hardware review in 30 months, a subsystem prototype in late 2009 and a one-quarter-scale system prototype in 2010.

"This agreement marks one of the largest investments for the U.S. government in a next-generation supercomputer and one of the most watched procurements in high-performance computing," said Peter Ungaro, chief executive of Cray. "We are talking about a system that will scale well north of 10 petaflops."

Cray's win is a bright spot for a company that has been struggling. At the end of June 2005, Cray laid off 10 percent of its employees, or about 90 people. Many of those who weren't cut remained under a salary-reduction program until late last year. "Whether Cray could have survived without this [win] was unclear," said Tennessee's Dongarra, but the company "is much more focused now than it has been in the past."

"This will help us maintain the very high rate of R&D spending that we have had for the last few years," Cray's Ungaro said.

The impact of the loss on Sun may not be so great, given its R&D budget of nearly $2 billion a year. Still, it too has struggled to regain profitability since the dot-com crash.

Sun recently won large supercomputing deals in Japan and at the University of Texas at Austin. It also recently picked up market share in servers overall, coming within a percentage point of rival Dell, which sits in third place behind leaders IBM and Hewlett-Packard.

Sun's Fortress programming language for supercomputers is still part of an HPCS evaluation (see story, page 1). The company also continues to pursue work in silicon optics and on a novel capacitive coupling chip-to-chip interconnect called Proximity. Both technologies were part of its Phase III proposal.

"We will keep investing in [Proximity] because it has value in board-level modules and high-speed switches. I believe you will see products using Proximity before the HPCS systems emerge in about four and a half years," said Jim Mitchell, a Sun research manager who led the company's HPCS effort.

Finding a low-cost and reliable way to align capacitive pads accurately is the big hurdle to achieving the kind of two-nanosecond chip-to-chip links Proximity promises, said Mitchell.

Fast track to petaflops

The 2010 HPCS systems are not expected to be the first to break the petaflops barrier. That honor is more likely to go to one of two systems slated to be constructed by 2008.

IBM is developing its first super to use both its Cell processor and Opteron CPUs from Advanced Micro Devices. Code-named Roadrunner, the system will be deployed at Los Alamos National Laboratory. "It could require heroic work to program such a hybrid system. It's unclear how the Cell processor will be used in this combination, so there are still a lot of unanswered questions," said Dongarra.

A system being deployed by Cray at Oak Ridge National Lab is also a contender for the first petaflops machine, he said.

IBM has been tight-lipped about the details of its HPCS proposal. The company did disclose that the system will be based on a Power7 microprocessor, its AIX operating system and General Parallel File System. The CPU will enhance floating-point performance and will have some form of interconnect integrated on the die, said Anthony Befi, vice president for high-performance computing at IBM.

Cray has been more candid about the Cascade system it proposed to Darpa planners. Cascade is essentially a cluster-in-a-box that will deliver a mix of scalar, FPGA and hybrid vector/massively multithreaded processor boards in a single system.

Cascade will use Opteron/Linux boards to handle overall systems services and act as applications processors. A new board will be based on a hybrid ASIC that will shift on the fly between modes for vector processing and massive multithreading. Cray expects to design an FPGA accelerator board for Cascade based on its XD1 system.

The toughest innovation for Cascade is in developing compiler software that can handle a mix of applications calling for scalar, vector or massively multithreaded processing with minimal guidance from the programmer.