Building Your Own GPU-Powered Machine Learning Server, Part 1: Building the Server

Preface – Understanding the Graphics Processing Unit (GPU)

The Current Standard – Cloud Computing

Cloud Computing services are abundant these days. You can set up simple Linux or Windows servers for your development or testing environments quickly and easily. Most of the services have a “free tier” of single or dual vCore virtual machines that you can tunnel into and host your Jupyter Notebooks or Flask apps with relative ease. Some even come with packaged instances ready-made for us Data Nerds; AWS and Google Cloud are among the more popular options.

When you start getting into Neural Networks with a ton of Matrix Operations and constant gradient-based loss optimization over many Epochs, those processors simply can’t hold up, and this is where GPUs come in handy. To fill the need, most of the cloud computing providers have implemented GPU-powered cloud services, like AWS’s p2.xlarge instances, that host your processing-heavy ML experiments.
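To make that concrete, here is a toy sketch of what one training pass looks like under the hood. It runs on the CPU with plain NumPy, and the array shapes are invented purely for illustration; the point is just that every epoch boils down to a couple of large matrix multiplies, which is exactly the kind of work a GPU parallelizes well.

```python
import numpy as np

# Toy linear model trained by gradient descent; shapes are made up for illustration.
n_samples, n_features = 10_000, 1_000
X = np.random.randn(n_samples, n_features)
y = np.random.randn(n_samples)
w = np.zeros(n_features)
lr = 0.01

for epoch in range(10):                    # real training runs for many more epochs
    preds = X @ w                          # big matrix-vector multiply
    grad = X.T @ (preds - y) / n_samples   # another big multiply to get the gradient
    w -= lr * grad                         # one gradient step
    loss = np.mean((preds - y) ** 2)
    print(f"epoch {epoch}: loss {loss:.4f}")
```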

A brief CPU/GPU overview:

CPUs are great at handling a few tasks quickly.

Essentially, a CPU can do as many tasks simultaneously as the number of cores it has.

Double that if it has Hyper-Threading.

Example: An Intel Quad Core CPU with Hyper-Threading can do 8 tasks at once (you can verify the counts on your own machine with the quick check after this overview).

Modern CPUs also have built-in GPU capabilities (Intel Integrated Graphics), memory controllers, PCI controllers, and generally handle the majority of low-level hardware communication and control on the chip.

They stream data from memory, work on it, and pass it back, but it is generally only a handful of operations at a time, done very quickly.

GPUs have a ton of cores. They can do many simpler operations simultaneously, which is great for neural networks and calculating your gradients with a large feature size.

Consider that rendering 3D Graphics is also a series of many, many simultaneous matrix operations.

They have incredibly fast memory. The communication between GPU memory and the GPU’s processors is much faster than between the CPU and system memory.

Consider the textures it loads and transforms from memory when it renders 3D Graphics.
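If you want to see those core counts for yourself, here is a minimal sketch. It assumes the psutil package is installed (pip install psutil); os.cpu_count() is standard library.

```python
import os
import psutil  # assumption: installed via `pip install psutil`

physical = psutil.cpu_count(logical=False)  # real hardware cores
logical = os.cpu_count()                    # doubled by Hyper-Threading where present

print(f"Physical cores: {physical}")
print(f"Logical cores:  {logical}")
# A quad-core i7 with Hyper-Threading reports 4 physical and 8 logical cores.
```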

CUDA (Compute Unified Device Architecture) is the parallel computing architecture from Nvidia that allows software to hook into these awesome GPU capabilities for things outside of gaming.

CUDA was traditionally used by graphics-heavy software like Adobe’s creative apps and DaVinci Resolve to do color correction and add effects to images and video.

Nvidia’s recent stock boom is largely thanks to Machine Learning and blockchain mining, but their advances in the field owe a lot to the gaming industry’s need for faster and better GPUs.

The accessibility of these awesome GPUs (and most fast computers) is largely thanks to the gaming industry.

An awesome gaming computer will be just as capable of running your web of Tensors as it is of running Overwatch on Epic settings at 4K and 120Hz.

Why Ubuntu? You need a Linux distro to run nvidia-docker in order to build and launch GPU-based Docker images like Keras and TensorFlow-GPU. At the time of this writing, macOS and Windows are NOT supported by nvidia-docker.
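Once you have nvidia-docker and a TensorFlow-GPU image (or a local tensorflow-gpu install) up and running, a quick sanity check like this one, sketched against the TensorFlow 1.x API that was current at the time, confirms the card is actually visible through CUDA:

```python
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; a working CUDA setup shows a GPU entry.
for device in device_lib.list_local_devices():
    print(device.device_type, device.name)
# Expect a line like: GPU /device:GPU:0 alongside the CPU entry.
```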

Consumer vs ‘Pro’ Graphics Cards

So now that we understand a bit about why GPUs are used in machine learning, let’s explore our options. I recently picked up an EVGA GTX 1080 for $550 at BestBuy. If you are reading this in mid-May 2018, this is hands down the best price you are going to find for this card anywhere. “But wait,” you say. “None of the cloud computing guys use GTX 1080s! They use Tesla or Volta GPUs.” You’re right, Amazon and Google don’t slap gaming-level GPUs into their servers. They use Nvidia’s pro line. But the numbers don’t lie.

That’s right. That’s an extra 0. The Nvidia Tesla P100 is less than 25% faster at training an LSTM than a stock Nvidia GTX 1080, but almost 1,000% of the price. Nvidia also offers a Ti variant of the card for $800 which actually outperforms the P100 in most benchmarks except one: heat!

Naturally, if you are running 1,000 of these in a server room, that 35 degree difference is no joke. Also, big names don’t slap ‘gaming’ gear into their servers.
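Branding and thermals aside, the price/performance math is worth spelling out. The figures below are just the rough ratios quoted above (roughly 25% faster for roughly 10x the price), not benchmark data:

```python
# Rough price/performance ratio using the figures quoted above.
gtx_price, gtx_speed = 550, 1.00          # $550 GTX 1080, baseline training speed
p100_price, p100_speed = 550 * 10, 1.25   # "an extra 0" on the price, "less than 25% faster"

gtx_value = gtx_speed / gtx_price
p100_value = p100_speed / p100_price
print(f"GTX 1080 delivers ~{gtx_value / p100_value:.0f}x the training speed per dollar")
# => roughly 8x the speed per dollar for the consumer card
```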

If you are reading this, chances are you are somewhat familiar with cloud computing for Data Science. You might also be somewhat familiar with how much it costs to train a big model for a day or two. If your employer is footing the bill, good on you, but if you are like me, an enthusiast and tinkerer, then you start to wonder how much it would cost and how much time it would take to build and understand the inner workings of these servers.

My Plan

I have an aging 2008 MacBook Pro with an SSD and 8GB of RAM, but I also have an amazing hexa-core audio workstation / gaming rig.

The system cost about $2,500 spread over the course of 4 years with little upgrades here and there. An AWS p2.xlarge (the cheapest GPU-powered P2 instance as of the time of this writing) is $0.90/hour. I had everything but the card, and after a few days of struggling with the P2 (no Python 3.6 on the stock AMIs), I ran to BestBuy to get the 1080. It was the last one in the SF area. Would AWS be cheaper? Maybe initially, but those training sessions can quickly add up. Besides, I wouldn’t get Overwatch on Epic settings. And there is something about building your own system that is ultimately incredibly satisfying.
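For what it’s worth, the break-even math is easy to run yourself. This ignores electricity, the rest of the rig, and spot or reserved pricing, and just uses the two numbers above:

```python
# How many hours of p2.xlarge time equal the price of the card?
card_price = 550.00   # GTX 1080 at BestBuy
aws_hourly = 0.90     # p2.xlarge on-demand price at the time of writing

break_even_hours = card_price / aws_hourly
print(f"Break-even after ~{break_even_hours:.0f} hours of training")
print(f"That is about {break_even_hours / 24:.0f} days of continuous use")
# => roughly 611 hours, or about 25 days of round-the-clock training
```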

I have built literally hundreds of systems for my business, www.KoreTechs.com, from ZFS servers to $10k color-correction workstations, and let me tell you, it isn’t as scary or complicated as it looks. I’m not going to write a tutorial on how to build a PC because there are hundreds out there, but I will give you some pointers below.

My Recommendations:

Purchasing or building a computer can seem overwhelming. Here are some tips:

Intel i7s are all great.

From the early Core i7s to the latest 8th-generation “Coffee Lake” chips, the differences in actual performance have been negligible from generation to generation.

If you are on a budget, you can get a used i7 gaming rig for as little as $300.