Is Gordon the future of HPC?

Not that Gordon

HPD (High Performance Data): just what we need in the computing industry – another acronym. But it is the term that Michael Norman, interim director of the San Diego Supercomputer Center (SDSC), is using to describe how high-performance data goes hand in hand with HPC.

If you have both HPC and HPD, then you’ve got something that can be an order of magnitude faster on some apps. Then you’re cooking with gas.

The proof of concept will be a 245 TFlop supercomputer that goes by the slightly eccentric name “Gordon”. When completed, the system should rank somewhere in the top 30. Gordon will sport 64TB of DRAM and a massive 256TB of flash memory, taking advantage of the speed of SSDs.

Gordon will be kept busy doing things like predicting earthquake damage (and, perhaps, even earthquakes) along with extensive data mining focused on the predictive side of things. In the article, Norman makes the point that it’s not just FLOPS that are important in computing, but also IOPS, and that IOPS will become increasingly important in the future. It’s hard to argue with that; I expect that over time we’ll see more attention paid to IOPS as data sets continue to grow and IOPS become more of a choke point.

One of the more interesting aspects here is that Gordon is an example of large-node supercomputing. It’s architected as a collection of 32 “supernodes” – each of which has 32 compute nodes containing 64GB of DRAM and 4TB of SSD.

They’re going to use ScaleMP’s memory virtualization technology to tie the compute nodes in each supernode together into a single shared-memory system capable of 7.7 TFlops. This is a pretty big win for Intel, since it’s providing both the processors and the SSDs. They’ll be using InfiniBand (at 16Gb/s) to tie the supernodes together into the completed 245 TFlop system.
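The headline numbers can be sanity-checked with a bit of arithmetic. The sketch below is only a back-of-the-envelope check, not anything from SDSC: it assumes the 64GB DRAM figure is per compute node and that 1TB = 1024GB, and the variable names are made up for illustration. The 4TB SSD figure does not multiply out the same way, so flash per supernode is derived from the 256TB total instead.

```python
# Figures quoted in the article.
SUPERNODES = 32
NODES_PER_SUPERNODE = 32
DRAM_PER_NODE_GB = 64      # assumption: the 64GB figure is per compute node
TOTAL_TFLOPS = 245
TOTAL_FLASH_TB = 256

GB_PER_TB = 1024           # assumption: binary units

total_nodes = SUPERNODES * NODES_PER_SUPERNODE
total_dram_tb = total_nodes * DRAM_PER_NODE_GB / GB_PER_TB
dram_per_supernode_tb = NODES_PER_SUPERNODE * DRAM_PER_NODE_GB / GB_PER_TB
tflops_per_supernode = TOTAL_TFLOPS / SUPERNODES
flash_per_supernode_tb = TOTAL_FLASH_TB / SUPERNODES

print(f"compute nodes:         {total_nodes}")
print(f"total DRAM:            {total_dram_tb:.0f} TB")
print(f"DRAM per supernode:    {dram_per_supernode_tb:.0f} TB")
print(f"TFlops per supernode:  {tflops_per_supernode:.1f}")
print(f"flash per supernode:   {flash_per_supernode_tb:.0f} TB")
```

Under those assumptions the totals line up with the article: 1,024 compute nodes, 64TB of DRAM overall, and roughly 7.7 TFlops per supernode.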

There are a lot of applications that can’t be efficiently parallelized, and Gordon is designed expressly to handle those kinds of workloads. When you can load an entire massive database directly into shared memory, you avoid going to disk at all, cutting data access latency by orders of magnitude, and things get much speedier.
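To put “orders of magnitude” in perspective, here is a rough sketch using ballpark access latencies. These numbers are illustrative assumptions, not measurements from Gordon: about 10 ms for a random read on a spinning disk, about 100 µs for flash, and about 100 ns for local DRAM. Memory reached across the interconnect in an aggregated supernode would be slower than local DRAM.

```python
import math

# Ballpark latencies in nanoseconds (illustrative assumptions, not Gordon data).
LATENCY_NS = {
    "disk": 10_000_000,   # ~10 ms random read on a spinning disk
    "flash": 100_000,     # ~100 us random read on an SSD
    "dram": 100,          # ~100 ns local DRAM access
}

def orders_of_magnitude(slow_ns: int, fast_ns: int) -> float:
    """Return log10 of the latency ratio between two storage tiers."""
    return math.log10(slow_ns / fast_ns)

def serial_iops(latency_ns: int) -> float:
    """Operations per second if each access waits for the previous one."""
    return 1e9 / latency_ns

for tier, ns in LATENCY_NS.items():
    print(f"{tier:5s}: {ns:>10,d} ns, about {serial_iops(ns):,.0f} serial IOPS")

print("disk vs flash:", round(orders_of_magnitude(LATENCY_NS["disk"], LATENCY_NS["flash"])))
print("disk vs dram: ", round(orders_of_magnitude(LATENCY_NS["disk"], LATENCY_NS["dram"])))
```

With these assumed figures, flash is about two orders of magnitude quicker than disk and local DRAM about five, which is the gap that makes IOPS rather than FLOPS the choke point for data-heavy jobs.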

Combining ScaleMP’s shared-memory software with high-performance InfiniBand yields a broad system interconnect that’s faster than a lot of the SMP buses and crossbars of yesteryear.

While these architectures work great for HPC-like workloads, they still have too much latency for things like transactional databases and other commercial applications. However, I can see a day soon when general business customers will be installing architectures much like this (albeit on a much smaller scale) to handle their own data mining and predictive analytics tasks. ®