Meet Gordon, the World’s First Flash Supercomputer

Gordon: The world's first supercomputer built with flash storage rather than spinning hard disks (Photo: Alan Decker)

Supercomputers aren’t what they used to be. The Chinese are building a supercomputer with their own microprocessors, shunning American chip giants Intel and AMD. The Spanish are building one with cellphone chips. And this week, the San Diego Supercomputer Center (SDSC) officially plugged in the first supercomputer that uses flash storage rather than good old-fashioned spinning disks.

Naturally, they call it Gordon. As in Flash Gordon.

Gordon uses 300 terabytes of flash, spread across 1,024 high-performance Intel 710 series drives. The system also includes new software that aggregates resources from multiple physical server nodes into “supernodes,” so users have immediate access to data rather than waiting for the system to access particular drives. Allan Snavely, the SDSC’s associate director, describes it as the world’s largest thumb drive. Flash memory is the same technology found in USB thumb drives, cell phones, and digital cameras.

According to Snavely, Gordon can run massive databases up to 10 times faster than systems built on traditional spinning disks, and it now ranks 48th on the official Top500 list of the fastest supercomputers in the world. The project is part of a larger trend in the supercomputer game, where systems are moving away from traditional components and toward new types of hardware that can improve speed, cost, efficiency, and, in the case of the Chinese, independence from the West.

With Gordon, the big deal is its ability to handle data, says Nicholas Schork, a professor at the Scripps Research Institute, who helped build the first high-density map of the human genome 10 years ago and is now the director of bioinformatics and biostatistics at Scripps Translational Science Institute.

“We’ve been anticipating a deluge of data and it is here,” Schork says. “In no time at all, the six billion sequences of the human genome can be done, in no time at all. The ability to sequence has outpaced the ability to interpret the data. Interpreting the genome is where the action is – you have to annotate the data, find patterns in it.”

When it officially becomes a research tool on New Year’s Day, Gordon will have 16,384 compute cores and a theoretical peak performance of 340 teraflops. Its aggregate flash memory will be able to read and write at just over 200GB per second.
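To put those headline numbers in per-component terms, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this article; the decimal unit conversions (1 terabyte = 1,000 gigabytes, 1 teraflop = 1,000 gigaflops), the even split across drives and cores, and all variable names are assumptions made for illustration, not SDSC specifications.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
# Assumptions: decimal units and an even split across drives and cores.

FLASH_TOTAL_TB = 300          # total flash capacity quoted
FLASH_DRIVES = 1024           # number of Intel 710 series drives quoted
COMPUTE_CORES = 16384         # compute cores quoted
PEAK_TERAFLOPS = 340          # theoretical peak performance quoted
FLASH_BANDWIDTH_GB_S = 200    # aggregate flash read/write rate ("just over" 200)

# Capacity each drive would hold if the flash were spread evenly.
gb_per_drive = FLASH_TOTAL_TB * 1000 / FLASH_DRIVES

# Peak compute per core if the peak were divided evenly.
gflops_per_core = PEAK_TERAFLOPS * 1000 / COMPUTE_CORES

# Share of the aggregate bandwidth attributable to each drive.
mb_per_s_per_drive = FLASH_BANDWIDTH_GB_S * 1000 / FLASH_DRIVES

# Time to stream the entire flash pool once at the aggregate rate.
full_sweep_minutes = FLASH_TOTAL_TB * 1000 / FLASH_BANDWIDTH_GB_S / 60

print(f"Flash per drive:      {gb_per_drive:.0f} GB")
print(f"Peak per core:        {gflops_per_core:.1f} gigaflops")
print(f"Bandwidth per drive:  {mb_per_s_per_drive:.0f} MB/s")
print(f"Full sweep of flash:  {full_sweep_minutes:.0f} minutes")
```

Under those assumptions, each drive holds a little under 300GB and contributes a bit less than 200MB per second, and the whole 300-terabyte pool could in principle be read end to end in about 25 minutes.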

Before building the system themselves, the SDSC wizards sought help from Cray and the other big supercomputing companies, but the vendors didn’t want to play. “We said: ‘Can we get something like this?’ And they said: ‘Take a hike,’” says Snavely. So the center pursued grants, landed a five-year, $20 million NSF grant, and set up an in-house skunk works, a small group dedicated to the project. “We did massive amounts of testing. As soon as we could test anything, we tested.”

They worked with Intel Chief Technology Officer for High-Performance Computing Ecosystems Mark Seager, who predicted that “this kind of technology is going to be adapted into the wider market.”

Gordon uses an unusual architecture, designed by ScaleMP, in which a supernode aggregates 32 of Gordon’s servers and two I/O servers into a single virtual cache, so “it can be used without putting too much brain into using it,” according to Rob Pennington of the National Science Foundation.

Bob Sinkovits, applications lead for the Gordon project at SDSC, says that using flash memory is just a better idea. “Flash memory has a number of advantages over traditional hard drives, including higher bandwidths or the rate at which large blocks of data can be read or written, lower power consumption, and greater mechanical stability owing to the lack of moving parts. For data-intensive applications, though, the biggest advantage is much lower latency, or the delay between a request for data and the delivery of the first byte.”