Why smart processor choices are key to AI success

“If all you have is a hammer, everything looks like a nail.”
-Traditional proverb

“It’s not a one-size-fits-all world. There are a lot of different ways to solve a whole variety
of AI problems.”– Lisa Spelman, Vice President, Data Center Group and General Manager, Intel Xeon
Processors

AI is here to stay. It’s already everywhere: tagging photos, answering voice commands, guiding
financial advisors, reading X-rays, reshaping thousands of business applications – even helping
to locate missing children. But the head-snapping velocity and variety that’s made artificial
intelligence a technology and investment phenomenon in just a few short years has now produced
growing pains.

Early enterprise adopters of AI and other AI builders are seeing model sizes grow, algorithms
become more complex, and the volume and variety of data explode, requiring new technologies
from devices to data centers. As the scope and complexity of AI-enabled workloads expand, so
does the need for workload-optimized solutions to fuel them. So right when more data scientists
are sharing the wheel with everyday business leaders to find new horizons, the computing
industry is challenged to ensure that the underlying infrastructure meets the requirements
of this rapidly advancing innovation.

The solution: Ditch outdated reflexes like looking at AI as a monolithic workload requiring one
specific hardware solution. Instead, take a fresh, holistic look at the opportunity that
hardware, software, and ecosystems represent when working in tandem across a wide range of
workloads, algorithms and customer advancements from data center to edge.

Organizations that will succeed with AI in the new era already upon us will be those that create
the most cost-efficient, capable, scalable silicon infrastructures that can provide a solid
foundation for advancing AI. The logical beginning: understanding the importance of a portfolio
approach to AI chip architecture – from AI-optimized CPUs, to general-purpose accelerators like
GPUs and FPGAs, to purpose-built ASICs including Intel’s forthcoming neural network processors.
As Wei Li, Vice President of Intel Architecture, Graphics and Software, and General Manager of
Machine Learning and Translation at Intel, puts it: “AI problems demand a variety of silicon.”

To better understand why multiple architectures are a strategic key to AI success – today and
tomorrow – consider the biggest business and technology factors shaping adoption and
implementation, and how modern options can help enterprises handle these powerful global trends.

The growth of data

Organizations are drowning in data — an estimated 90% generated in just the past two years.
Analysts forecast worldwide data will grow tenfold by 2025, reaching 163 zettabytes. Yet only an
estimated 2% has been analyzed, leaving a great untapped opportunity to propel business and fuel
societal insights. In fact, much of the interest and activity in AI is driven by the desire to
unlock business value from these growing torrents. According to Gartner, through 2023,
computational resources used in AI will increase 5x from 2018, making AI the top category of
workloads driving infrastructure decisions. “All apps,” says Lisa Spelman, VP of the Intel Data
Center Group and General Manager of Intel Xeon Systems, “will have AI built in.”

Indeed, a Deloitte & Touche global survey of 1,900 IT and line-of-business leaders in seven
countries found that 61% are using machine learning, 60% have adopted natural language
processing, 56% are using computer vision, and 51% are using deep learning.

Inside the toolbox of new AI chips

Making smart, strategic architecture choices starts with understanding the full range of modern
silicon options available for enterprise AI tasks. Here’s a quick rundown of major choices and
how each helps AI.

Central Processing Units (CPUs) are super-fast generalists. Traditionally, they’re faster at
executing a wide variety of tasks, but don’t have as many parallel execution units: Manage all
input and output. NEXT! Run virtual memory! NEXT! Send files to a disk! NEXT! These flexible,
multi-tasking generalists can be programmed to do basically anything very, very quickly. In AI,
these traits make CPUs ideal for inference tasks and the complete pipeline of data processing
pre- and post-ML/DL inference, as well as for the application that uses the inference results.
CPUs scale throughout the compute infrastructure, from workstations to affordable public cloud
instances to deployments on edge servers, PCs and devices.
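The pre- and post-inference pipeline described above can be sketched in miniature. This is a hypothetical image-classification flow, not from the article: the model is a stand-in linear layer, and the labels and shapes are illustrative assumptions.

```python
import numpy as np

def preprocess(images: np.ndarray) -> np.ndarray:
    """CPU-side pre-processing: scale raw pixel bytes into the model's input range."""
    return (images.astype(np.float32) / 255.0 - 0.5) * 2.0

def infer(batch: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Stand-in for a trained model: one linear layer followed by a softmax."""
    logits = batch.reshape(batch.shape[0], -1) @ weights
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def postprocess(probs: np.ndarray, labels: list) -> list:
    """CPU-side post-processing: map scores back to application-level labels."""
    return [labels[i] for i in probs.argmax(axis=1)]

# The whole pipeline -- pre-processing, inference, post-processing -- stays on CPU.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 8, 8), dtype=np.uint8)   # 4 tiny "images"
weights = rng.standard_normal((64, 3)).astype(np.float32)       # hypothetical model
labels = ["cat", "dog", "bird"]
print(postprocess(infer(preprocess(images), weights), labels))
```

The point of the sketch is that every stage is ordinary general-purpose code, which is why inference integrates naturally into CPU-hosted applications.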

Graphics Processing Units (GPUs) offer clock speeds that actually run slower than
a CPU’s, but they contain thousands of cores (compared to tens of cores for a server CPU). So
they’re very good at rapidly running a single mathematical operation over and over, on many,
many, many pieces of data, making GPUs ideal for video rendering, gaming and, in AI, a host of
functions, especially deep learning training.
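The “one operation over many pieces of data” pattern can be illustrated with a vectorized array operation. NumPy on a CPU is standing in for the idea here, not for actual GPU execution; the sizes are arbitrary.

```python
import numpy as np

x = np.arange(10_000, dtype=np.float32)

# Data-parallel form: one multiply-add applied uniformly to every element at
# once -- the shape of work GPUs are built around.
y = 2.0 * x + 1.0

# Equivalent scalar loop, the style a single general-purpose core expresses:
# one element at a time, in sequence.
y_loop = np.empty_like(x)
for i in range(x.shape[0]):
    y_loop[i] = 2.0 * x[i] + 1.0

# Same answer either way; the difference is how much of it can happen in parallel.
assert np.array_equal(y, y_loop)
```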

Custom ASICs are architectures highly optimized for deep learning to improve performance and
minimize power. Indeed, the AI landscape has evolved to the point where the industry is seeing
strong demand for special-purpose ASICs as AI is implemented. Moving forward, the diversity of
applications incorporating AI will require a diversity of solutions. The custom ASIC category
includes the forthcoming Intel Nervana Neural Network Processors for training (NNP-T) and
inference (NNP-I), both of which Intel is on track to deliver to customers later this year. The
category also features vision processing units (VPUs) like Intel’s Movidius solutions, which
deliver extreme low-power inference for cameras and enable AI at the edge.

FPGAs (Field-Programmable Gate Arrays) provide excellent throughput and low latency for
real-time inference. They offer potentially more compute power for lower-precision data types,
flexibility for custom operations and data types, and some memory advantages. The FPGA is one of
the fastest-growing architectures, according to McKinsey, especially for edge inference and
training.

Spotlight on the CPU

Indeed, CPUs are the leading architecture for AI inference; Facebook, for example, runs nearly
all of its inference on CPUs. And because inference is being integrated across a large array of
workloads, it is a naturally occurring workload for CPU architectures.

For organizations looking for modern AI platforms, updated general-purpose microprocessors
provide an incredibly robust foundation. New, more powerful chips like the 2nd Generation Intel
Xeon Scalable processors are ready and optimized for AI with large memory capacity, more cores,
AI acceleration instructions, and increasingly optimized software.

AI-enabled.
The 2nd Generation Intel Xeon Scalable processors contain Deep Learning Boost (DL
Boost) technology for built-in inference acceleration, reducing the need to bolt on additional
accelerators. For example, using Intel DL Boost, Microsoft has seen a 3.4x boost in image
recognition, Target a 4.43x improvement in ML inference, and JD.com a 2.4x boost in text
detection. Popular AI frameworks, like TensorFlow, PyTorch, Caffe, and MXNet, are being
optimized for Intel DL Boost. Additionally, Cooper Lake, the Intel Xeon Scalable processor
following the 2nd Generation Xeon Scalable (Cascade Lake), will be the first Xeon processor to
deliver built-in high-performance AI training acceleration through new bfloat16 support added to
Intel DL Boost – further improving training performance.
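For context on why bfloat16 helps training: it keeps float32’s 8-bit exponent but truncates the mantissa to 7 bits – in effect, the top 16 bits of an IEEE float32 – preserving dynamic range while halving storage and bandwidth. A minimal sketch of the conversion (illustrative only; real hardware support is built into the instruction set, and hardware typically rounds to nearest-even rather than truncating):

```python
import numpy as np

def float32_to_bfloat16_bits(x: np.ndarray) -> np.ndarray:
    """Convert float32 to bfloat16 bit patterns by keeping the top 16 bits
    (round-toward-zero for simplicity)."""
    return (x.astype(np.float32).view(np.uint32) >> 16).astype(np.uint16)

def bfloat16_bits_to_float32(b: np.ndarray) -> np.ndarray:
    """Re-expand bfloat16 bit patterns to float32 by zero-filling the low 16 bits."""
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.140625, 1e30, -2.5e-12], dtype=np.float32)
roundtrip = bfloat16_bits_to_float32(float32_to_bfloat16_bits(x))
# Range survives (even 1e30), but only ~2-3 decimal digits of precision remain.
print(roundtrip)
```

Because the exponent field is unchanged, float32 values convert without overflow or underflow – the property that makes bfloat16 attractive for training, where gradients span a wide dynamic range.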

Real-world benefits

New technology advances, along with a fast-widening set of use cases, have prompted many leading
organizations to choose modern CPUs as their foundational AI processor of choice. Here are
innovative examples of how CPUs are meeting the real-world demands of deep learning
applications.

For iFLYTEK’s Cloud Computing Research Institute, openness to a new way of expanding AI capacity
produced a pleasant surprise: Intel Xeon Scalable processors with DL Boost enabled matched or
beat the performance of the institute’s general-purpose GPU solution on a real AI cloud
workload, saving money in the process, according to Zhang Zhiajang, vice dean.

For others, the payoff is doing heavy-duty AI processing without buying new GPUs or other
accelerators. Siemens Healthineers, for instance, leveraged its existing Intel CPU
infrastructure to run AI inference workloads, including segmentation and analysis, in new models
for diagnosing cardiovascular disease. “We can now develop multiple real-time, often critical
medical imaging use cases, such as cardiac MRI and others, using Intel Xeon Scalable processors,
without the added cost or complexity of hardware accelerators,” explains Dorin Comaniciu, senior
vice president, Siemens Healthineers.

The ultimate show of confidence in a modern AI upgrade may be bringing it to a customer-facing
software service. That’s
what Taboola did
for its publishing, marketing, and advertising clients, optimizing and speeding a custom
TensorFlow Serving application with the Intel Math Kernel Library for Deep Neural Networks
(Intel MKL-DNN) on Intel Xeon Scalable processors. Taboola evaluated GPUs and CPUs side by side
as the company planned to scale out inference across seven data centers. Taboola found that even
though performance was comparable, it lost precious time transferring data back and forth
between GPU and CPU, and keeping everything on CPUs cost less overall. The result: optimized,
much faster delivery of AI apps via SaaS cloud platforms.

And the ability to use large-capacity memory and performant compute can open new deep learning
training doors for enterprises. Pharmaceutical maker Novartis, for instance, wanted to use deep
learning to accelerate the analysis of cell culture microscopy images, used to study the effects
of various treatments and to discover new therapies. The images were more than 26 times larger
than those used in common deep learning benchmarks. Despite the size, the system sustained more
than 120 3.9-megapixel images per second – due in part to memory capacity, a key Xeon advantage.
Overall, the team achieved a 20x improvement in time to train.

Close up: DataCubes

New architectures also give enterprises another choice: creating sophisticated AI on standard,
general-purpose CPU architectures, then moving to more specialized hardware when and if it makes
sense. That’s especially appealing for tech-driven start-ups like DataCubes, Inc.

The Illinois company, founded in 2015 by an insurance industry executive, wanted to modernize
commercial property and casualty insurance for small and medium businesses. “Underwriting is
soul-sucking work,” says Phil Alampi, vice president of customer engagement. “If you walked into
the typical office writing commercial P&C insurance today it wouldn’t look too different from
the 1990s. All manual.” Errors were common; the process could take weeks – or longer.

So DataCubes began building a “frictionless” AI system that would help carriers and agents speed
the slow, tedious process of manually gathering, inputting and processing far-flung data needed
to complete an application.

In 2018 the company created an AI system atop existing Intel Xeon CPU infrastructure, using
sophisticated algorithms to automate data collection and the application process. DataCubes'
platform, d3 CORE®, automates the intake of submission documents including PDFs, scans and other
forms of unstructured data using machine learning models pretrained on thousands of sources.
More than 4 billion data points are included in DataCubes' data lake of information on
businesses drawn from government entities, public records, company websites and other sources.
Results are delivered to clients via an online portal, email or API.

The system can automatically answer underwriting questions and provide risk assessments, all
powered by machine learning. Carriers like that they can write policies quickly, which helps
make them the carrier of choice, while improving their customer experience, underwriting
productivity, and profitability. Today, DataCubes serves national and regional operators in 50
states under the motto: “Commercial Underwriting Powered by Data Science.” As the business grows
over the next few years beyond hundreds of customers, DataCubes expects to build on the
foundation created by Intel Xeon processors.

“We’ve been really successful with a generalist approach. We don’t have a hardware optimization
team,” Alampi says. “But we’ve been able to be part of transforming an entire industry on top of
our existing infrastructure. That’s enabled a small start-up to make a big impact. You don’t
need a big company with a huge technical staff to do something like this.”

Conclusion: One size does not fit all

“It’s not an A or B world. It’s A and B. There are at least 3, 4, 5 ways of doing what you are
trying to do.” – Maribel Lopez, Principal, Lopez Research

While AI has enjoyed astonishing growth over the last decade, it’s still in its infancy. The way
we create and run AI systems today won’t efficiently take us into tomorrow.

For enterprises and the industry to advance and derive maximum benefit will require fresh
approaches to new challenges. Success in the new, data-driven, “AI everywhere” environment will
go to organizations that ignore simplistic and outdated “rules” about processor architecture,
treat AI infrastructure as a strategic advantage, and adopt a flexible, modern, portfolio
approach to building a solid silicon foundation.