The great race to power machine learning

Since the birth of the modern era of computing, an arms race between CPU manufacturers has pushed computer capabilities ever higher, a trend characterised by Moore’s Law.

This era’s computer technology can be characterised as running sophisticated but essentially dumb applications. A new era is beginning that will drive microprocessor manufacturers to support intelligent applications, such as those based on newly emerged deep learning and other machine learning algorithms.

Deep learning is the umbrella term for a set of techniques for architecting and training neural networks, techniques that in recent years have made huge leaps forward in accuracy.

For example, deep learning neural networks are at the root of the most successful technologies for natural language understanding, image recognition, advanced game playing (such as Go), and others. However, deep learning requires a lot of processing power to train the neural networks.

Ovum believes that the need for faster processing and higher resolution analysis will drive a new arms race for next-generation microprocessors that are designed to support artificial intelligence (AI) powered applications, bringing new players into the market.

New players have emerged in a bid to satisfy demand for AI/machine learning/cognitive computing applications. These include startups Knupath and Nervana, and a major player on the software side, Google, which is now entering the hardware domain.

They join AMD, Intel, and Nvidia. To date, high-end Nvidia GPUs have dominated this nascent market for supporting intelligent applications.

At the Google I/O conference in May 2016, Google CEO Sundar Pichai announced the Tensor Processing Unit (TPU). The TPU is an application-specific integrated circuit (ASIC) that Google designed for its TensorFlow deep learning library.

According to Google, the device delivers an order of magnitude (10x) higher performance per watt than its nearest competitors, such as GPUs and FPGAs. It achieves this partly by reducing computational precision (single precision rather than double precision, for example), which requires fewer transistors per operation.
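The precision trade-off described above can be illustrated with a small sketch. It uses only the Python standard library and shows how much storage each number format needs and how much accuracy a single value loses. The sample weight, the 8-bit integer step, and the fixed scale are illustrative assumptions, not details of the TPU design given in this article.

```python
import struct

def round_to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def quantize_int8(x: float, scale: float) -> int:
    """Map x to a signed 8-bit integer using a fixed scale (hypothetical scheme)."""
    q = round(x / scale)
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize_int8(q: int, scale: float) -> float:
    """Recover an approximate real value from the 8-bit integer."""
    return q * scale

# Storage cost per value, in bytes, for each format.
bytes_double = struct.calcsize("<d")
bytes_single = struct.calcsize("<f")
bytes_int8 = struct.calcsize("<b")

# A hypothetical network weight, assumed to lie in [-1, 1].
weight = 0.1234567890123
single = round_to_single(weight)

scale = 1.0 / 127  # assumption: weights in [-1, 1] map onto [-127, 127]
q = quantize_int8(weight, scale)
approx = dequantize_int8(q, scale)

print(f"bytes per value: double={bytes_double}, single={bytes_single}, int8={bytes_int8}")
print(f"single-precision error: {abs(single - weight):.3e}")
print(f"int8 error:             {abs(approx - weight):.3e}")
```

Each step down in precision shrinks the number of bits that must be stored and processed per value, at the cost of a larger rounding error. Neural networks often tolerate that error, which is what makes the trade attractive for dedicated hardware.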

TPUs powered Google DeepMind’s AlphaGo, which beat world Go champion Lee Sedol 4 to 1 in a five-game match. The TPU is available to users of Google Cloud Platform through its machine learning API but is not sold as a separate product.

Knupath is a startup that offers the Hermosa processor, which it describes as “an architecture based on neurological design to deliver acceleration of targeted workloads”. Nervana has an ASIC, Nervana Engine, planned for launch in 2017 to power its Neon deep learning library.

With custom chips for deep learning emerging from startups, it is likely that Google wishes to demonstrate it is ahead of the curve by going public about the deep learning hardware that it has been using for some time.

Earlier this year Nvidia launched the Pascal GPU, its most powerful GPU designed for machine learning applications. AMD has announced that it has developed a software library, Heterogeneous-Compute Interface for Portability (HIP), that converts CUDA software written for Nvidia GPUs into C++ software that runs on AMD GPUs.

Meanwhile, Intel announced its next-generation Xeon Phi “Knights Landing” chips, which will target machine learning workloads, especially the neural network training phase. According to Intel’s analysis, deep learning libraries optimised for the Xeon Phi achieve an order of magnitude higher performance than the same libraries run out of the box.