For HPC and Deep Learning, GPUs are here to stay

In this special guest feature from Scientific Computing World, David Yip, HPC and Storage Business Development at OCF, provides his take on the place of GPU technology in HPC.

There was an interesting story published earlier this week in which NVIDIA’s founder and CEO, Jensen Huang, said: ‘As advanced parallel-instruction architectures for CPU can be barely worked out by designers, GPUs will soon replace CPUs’.

There are only so many processing cores you can fit on a single CPU chip. Some optimized applications take advantage of several cores, but CPUs are typically used for sequential processing (although Intel is doing an excellent job of adding more and more cores to its CPUs and getting developers to program multicore systems). By contrast, a GPU has a massively parallel architecture consisting of many thousands of smaller, more efficient cores designed for handling multiple tasks simultaneously.

Here’s the thing though: a GPU is never going to run an accounting package, for example, or Microsoft Word; it’s always going to be used as an accelerator for computational calculations. I think that needs clarifying. A GPU is only an accelerator, not the main computation component of a system, so there’ll always be a need for traditional CPUs.

And, of course, GPUs aren’t the only accelerators out there either – there are a number of different technologies such as the Xeon Phi from Intel (which can also be classed as a processor in its own right), FPGAs from Altera and Xilinx, and even DSPs such as those from Texas Instruments.

It is unlikely that we will see GPUs much outside of HPC and gaming. You aren’t likely to see them in enterprise data centres just yet, but as people find more and more uses for them, who knows? We already have GPU-accelerated databases. You will, however, see them in the hyperscale data centres of this world, where GPUs are put to use for artificial intelligence behind services such as Siri, Google Home and Amazon Alexa – the grunt work is done by the accelerators.

Truth be told, NVIDIA’s Huang does actually agree. He went on to say that GPUs are the perfect solution for AI-based applications and that they are set to play a larger role in certain aspects of computing but, importantly, won’t be replacing desktop CPUs anytime soon.

Datacenter dominance?

If we look at the list of TOP500 supercomputers, many of the systems use either GPUs or Xeon Phi to boost performance. Although a little dated, an Intersect360 Research report published in 2015 found that a third of HPC systems were equipped with accelerators (I’ll wager that figure is much higher now). What is interesting about this research is that 80 per cent of the accelerators used were GPUs, with NVIDIA having 78 per cent of that market and AMD the remaining 2 per cent. Intel and Xeon Phi had 10 per cent of that market each.
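To put those survey figures in context, the short Python sketch below converts each share of the accelerator market into a share of all HPC systems. It is only a back-of-the-envelope illustration: the percentages and labels are the ones quoted above, while the function name `share_of_all_systems` and the assumption that each accelerated system uses a single accelerator type are mine, not the report's.

```python
# Figures as quoted above from the 2015 Intersect360 Research report.
ACCELERATED_FRACTION = 1 / 3  # fraction of HPC systems with any accelerator

# Share of the accelerator market by type, in per cent (labels follow the text).
ACCELERATOR_SHARE = {
    "NVIDIA GPU": 78,
    "AMD GPU": 2,
    "Intel": 10,
    "Xeon Phi": 10,
}


def share_of_all_systems(accelerator_pct: float) -> float:
    """Convert a share of accelerators (per cent) into a share of all systems (per cent).

    Assumption: each accelerated system uses exactly one accelerator type.
    """
    return ACCELERATED_FRACTION * accelerator_pct


# The quoted shares should account for the whole accelerator market.
total_pct = sum(ACCELERATOR_SHARE.values())

# GPUs of any vendor, as a share of all HPC systems.
gpu_pct = ACCELERATOR_SHARE["NVIDIA GPU"] + ACCELERATOR_SHARE["AMD GPU"]
gpu_share_of_all = share_of_all_systems(gpu_pct)

print(f"Accelerator shares sum to {total_pct} per cent")
print(f"GPU-equipped systems: about {gpu_share_of_all:.1f} per cent of all HPC systems")
for name, pct in ACCELERATOR_SHARE.items():
    print(f"  {name}: about {share_of_all_systems(pct):.1f} per cent of all systems")
```

On these numbers, roughly a quarter of all HPC systems in the survey would have been GPU-equipped, almost all of them with NVIDIA hardware.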

NVIDIA is a dominant force in gaming technology and, with GPUs now supporting a huge range of High Performance Computing (HPC) applications, the company is omnipresent, with millions of CUDA cores already in the wild. But the real boon for NVIDIA, ARM and their GPUs is the growth of Artificial Intelligence and Machine Learning, where GPUs have become the go-to technology for accelerating the algorithms. So much so, in fact, that NVIDIA is prepared to bet billions on its technology driving this new era of data-intensive computing.

CPUs vs GPUs

Whilst it’s true to say that you can replace CPUs with GPUs, it’s not simply a case of swapping one for the other – there are power requirements to consider. It’s something of a trade-off and depends on constraints such as the size of the HPC system, the physical space in the data centre and the power available to the system – modern data centres typically only provide around 5-10 kW of power per rack. Using GPUs in the HPC data centre in place of CPUs can dramatically increase the power required, but if your computational performance goes through the roof, then I’d argue it’s a trade-off worth making.

Alongside this, the bottleneck between CPU and GPU – providing data to the accelerator fast enough – is another challenge to contend with. NVLink goes some way towards solving this problem. Traditionally, data has to come over the PCIe bus to the accelerator at 32 GB/s; NVLink provides 80 GB/s between the GPU and CPU, 2.5 times the bandwidth of traditional PCIe.
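As a rough illustration of what that difference means in practice, the Python sketch below estimates how long it would take to move a dataset to the accelerator over each link. It reads 32 GB/s as the PCIe figure and 80 GB/s as the NVLink figure, treats both as ideal peak rates, and uses a hypothetical 16 GB working set; the function names are my own, and sustained throughput on real hardware will be lower.

```python
# Peak bandwidth figures as quoted in the text, in GB/s.
PCIE_GB_PER_S = 32.0
NVLINK_GB_PER_S = 80.0


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealised time to move size_gb gigabytes at the given bandwidth.

    Ignores latency, protocol overhead and contention, so real transfers
    will take longer.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


def bandwidth_ratio(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """How many times more bandwidth the faster link offers."""
    return fast_gb_per_s / slow_gb_per_s


DATASET_GB = 16.0  # hypothetical working set, chosen only for illustration

pcie_time = transfer_seconds(DATASET_GB, PCIE_GB_PER_S)
nvlink_time = transfer_seconds(DATASET_GB, NVLINK_GB_PER_S)
ratio = bandwidth_ratio(NVLINK_GB_PER_S, PCIE_GB_PER_S)

print(f"PCIe:   {pcie_time:.2f} s to move {DATASET_GB:.0f} GB")
print(f"NVLink: {nvlink_time:.2f} s to move {DATASET_GB:.0f} GB")
print(f"NVLink offers {ratio:.1f}x the bandwidth of PCIe")
```

The absolute times matter less than the ratio: if an application is bound by host-to-device transfers, the faster link shortens that phase by the same factor of 2.5.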

But even with these challenges, GPUs are essential in HPC. Scientists, researchers, universities and research institutes all know that speeding up applications is nothing but good for business – and research – so GPUs are here to stay. However, they won’t replace CPUs for everything, and they’re not the only accelerators around.
