Inference systems: The 2nd Piece of the Deep Learning Puzzle

We continue our five-part series on the steps to take before launching a machine learning startup. The complete report, available here, covers how to get started, choose a framework, decide what applications and machine learning technology to use, and more. This post explores inference systems and how they apply a trained network’s capabilities to data.

Inference systems provide the second piece of the deep learning puzzle by applying capabilities to the data.

What do they do?

Deep learning can be broken down into two parts: training and inference. Once a neural network has been trained on what to look for, the inference system makes predictions, or ‘infers’ results, from new input data. Netflix’s recommendation engines are a prime example of the power of inference.
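To make the split between training and inference concrete, here is a minimal sketch in Python. It assumes training has already happened elsewhere and produced a set of weights; the weight values, the feature vector and the function names `infer` and `recommend` are all hypothetical, chosen only for illustration. Inference is then just a forward pass with those frozen weights, with no further learning.

```python
import math

# Hypothetical weights, standing in for the output of a finished training run.
WEIGHTS = [0.8, -0.4, 0.3]
BIAS = -0.1


def infer(features):
    """Forward pass only: score the input using the frozen weights."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)


def recommend(features, threshold=0.5):
    """Turn the score into a yes/no decision, e.g. 'suggest this title'."""
    return infer(features) >= threshold


if __name__ == "__main__":
    viewer = [1.0, 0.2, 0.5]  # made-up feature vector for one viewer
    print(f"score={infer(viewer):.3f} recommend={recommend(viewer)}")
```

Nothing in `infer` updates `WEIGHTS`, which is the point: the expensive learning step is done once, and the inference step can then be run cheaply and repeatedly on new data.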

Tech specs and further features

A great example of an inference system is NVIDIA’s TensorRT™. This high-performance deep learning inference engine maximizes inference throughput and efficiency, and can take advantage of the fast reduced-precision instructions provided in Pascal GPUs. TensorRT v2 delivers up to 45x faster inference under 7 ms real-time latency with INT8 precision.
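The idea behind reduced precision can be sketched in a few lines of Python. This is a generic symmetric quantization scheme with one shared scale per list of values, offered as an illustration only; it is not TensorRT’s calibration procedure, and the function names are hypothetical. Floats are mapped to 8-bit integer codes, the multiply-accumulate runs on integers, and the result is rescaled to a float once at the end.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one shared scale (symmetric scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Clamp to [-127, 127] so every code fits in a signed 8-bit integer.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the int8 codes."""
    return [c * scale for c in codes]


def int8_dot(a_codes, a_scale, b_codes, b_scale):
    """Integer multiply-accumulate, rescaled to float once at the end."""
    acc = sum(x * y for x, y in zip(a_codes, b_codes))
    return acc * a_scale * b_scale


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25]      # made-up layer weights
    activations = [1.0, 2.0, -0.5]   # made-up layer inputs
    w_codes, w_scale = quantize_int8(weights)
    a_codes, a_scale = quantize_int8(activations)
    exact = sum(w * a for w, a in zip(weights, activations))
    approx = int8_dot(w_codes, w_scale, a_codes, a_scale)
    print(f"float dot={exact:.4f} int8 dot={approx:.4f}")
```

The integer result differs slightly from the floating-point one. That small loss of accuracy is the price paid for the speed of narrower arithmetic, which is why an inference engine also needs a validation step.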

How will it help me?

Inference systems will optimize, validate and deploy your trained neural network, regardless of how demanding your throughput requirements might be. Multiple network topologies, such as AlexNet or CaffeNet, are typically supported. With TensorRT, developers can spend less time performance tuning for inference deployment and focus instead on developing novel AI-powered applications.
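Part of the performance tuning mentioned above is trading throughput against latency. The sketch below, in Python, picks the batch size that gives the highest throughput while staying inside a latency budget. The latency figures are invented for illustration, and `best_batch_size` is a hypothetical helper, not part of any inference engine’s API; in practice the numbers would come from benchmarking your own network on your own hardware.

```python
def best_batch_size(latency_ms_by_batch, budget_ms):
    """Return (batch, inferences_per_second) for the fastest batch size
    whose latency fits the budget, or None if nothing fits."""
    best = None
    for batch, latency_ms in latency_ms_by_batch.items():
        if latency_ms > budget_ms:
            continue  # this batch size would miss the latency target
        throughput = batch / (latency_ms / 1000.0)
        if best is None or throughput > best[1]:
            best = (batch, throughput)
    return best


if __name__ == "__main__":
    # Made-up measurements: batch size -> time to process one batch (ms).
    measured = {1: 2.0, 2: 3.0, 4: 5.0, 8: 9.0}
    print(best_batch_size(measured, budget_ms=7.0))
```

Larger batches usually raise throughput but also raise the time each request waits, so the budget decides where to stop.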

Deep learning hardware options

We can’t look at the latest hardware for deep learning without giving a nod to the Dell EMC PowerEdge R730 and R740 servers.

Dell EMC PowerEdge R730

In just 2U of rack space, the PowerEdge R730 server packs a punch, thanks to a combination of powerful processors, large memory, fast storage options and GPU accelerator support. It’s scalable and configurable, enabling you to adapt to virtually any workload. Vital statistics include the Intel Xeon processor E5-2600 v4 product family, and up to 24 DIMMs of DDR4 RAM.

Its highly scalable storage features up to 16 x 12Gb SAS drives, while the high-performance 12Gb PowerEdge RAID Controller (PERC9) is an ideal tool for your virtualized environment. Data access can be boosted further by the optional SanDisk DAS Cache application acceleration technology.

Dell EMC PowerEdge R740

The equally impressive new PowerEdge R740 offers an ideal balance of accelerator cards, storage and compute resources in a 2U, 2-socket platform. The R740 boasts up to 16 x 2.5″ or 8 x 3.5″ drives and iDRAC9, as well as up to three 300W accelerator cards or six 150W cards. It’s scalable, versatile and can simplify the entire IT lifecycle.

Future articles in the insideHPC guide on launching a machine learning startup will cover additional topics.
