GPUs Power Facebook’s New Deep Learning Machine

Today NVIDIA announced that Facebook will power its next-generation computing system with Tesla GPUs, enabling a broad range of new machine learning applications.

While training complex deep neural networks to conduct machine learning can take days or weeks on even the fastest computers, the Tesla platform can cut that training time by 10-20x. As a result, developers can innovate more quickly and train networks that are more sophisticated, delivering improved capabilities to consumers.

Facebook is the first company to adopt NVIDIA Tesla M40 GPU accelerators, introduced last month, to train deep neural networks. They will play a key role in the new “Big Sur” computing platform, Facebook AI Research’s (FAIR) purpose-built system designed specifically for neural network training.

“Deep learning has started a new era in computing,” said Ian Buck, vice president of accelerated computing at NVIDIA. “Enabled by big data and powerful GPUs, deep learning algorithms can solve problems never possible before. Huge industries from web services and retail to healthcare and cars will be revolutionized. We are thrilled that NVIDIA GPUs have been adopted as the engine of deep learning. Our goal is to provide researchers and companies with the most productive platform to advance this exciting work.”

In addition to reducing neural network training time, GPUs offer a number of other advantages. Their architectural compatibility from generation to generation provides seamless speed-ups for future GPU upgrades. And the Tesla platform’s growing global adoption facilitates open collaboration with researchers around the world, fueling new waves of discovery and innovation in the machine learning field.

Big Sur Optimized for Machine Learning

NVIDIA worked with Facebook engineers on the design of Big Sur, optimizing it to deliver maximum performance for machine learning workloads, including the training of large neural networks across multiple Tesla GPUs. Two times faster than Facebook’s existing system, Big Sur will enable the company to train twice as many neural networks – and to create neural networks that are twice as large – which will help develop more accurate models and new classes of advanced applications.

“The key to unlocking the knowledge necessary to develop more intelligent machines lies in the capability of our computing systems,” said Serkan Piantino, engineering director for FAIR. “Most of the major advances in machine learning and AI in the past few years have been contingent on tapping into powerful GPUs and huge data sets to build and train advanced models.”

The addition of Tesla M40 GPUs will help Facebook make new advancements in machine learning research and enable teams across its organization to use deep neural networks in a variety of products and services.

First Open Sourced AI Computing Architecture

Big Sur represents the first time a computing system specifically designed for machine learning and artificial intelligence research will be released as an open source solution.

Committed to doing its AI work in the open and sharing its findings with the community, Facebook intends to work with its partners to open source Big Sur specifications via the Open Compute Project. This unique approach will make it easier for AI researchers worldwide to share and improve techniques, enabling future innovation in machine learning by harnessing the power of GPU-accelerated computing.
