The global electronic design automation (EDA) market relies heavily on Computer Aided Design (CAD) software to aid in the creation and production of semiconductor devices, integrated circuits (ICs) and printed circuit boards (PCBs). This multi-billion-dollar industry is fueled by the need for faster development cycles, low-power designs, lower costs and the complexities driven by Moore’s Law, which observes that transistor density doubles roughly every two years. The need to rapidly develop new systems-on-chips (SoCs), embedded microprocessors and analog circuits for industrial and consumer products is critical to EDA and creates significant engineering costs. In chip design, EDA gives architects and designers the ability to virtually create, build and test simulations of their end products before committing to the cost of validating a physical prototype (semiconductor wafer starts, packaging and physical test). To accomplish this, IT must provide computational, networking and storage capacity capable of handling not only millions to billions of files and fast processing, but also many iterations of a design throughout its development cycle.

THE EDA STORAGE BOTTLENECK

It is well documented that storage is the biggest bottleneck for EDA application performance. Common EDA deployments take advantage of large compute farms, using a high performance computing (HPC) approach to distribute parallel simulation runs that verify logic designs, which creates intense demands on storage. When ASIC designers open an IT support ticket for slow performance, the cause is sometimes a problem with the EDA application software, but most of the time the issue is an infrastructure bottleneck caused by the type and volume of I/O traffic generated. Modern EDA applications are file-based and best deployed on network file system (NFS) based storage solutions. To help build an appropriate infrastructure for EDA applications, the Standard Performance Evaluation Corporation (SPEC) created standardized benchmarks profiling workloads such as Database, Software Build, Video Data Acquisition and Virtual Desktop Infrastructure, and proposals are under way to add EDA workloads to SPEC SFS® 2014 SP2. The SPEC working group has profiled many EDA vendors’ products and found heavily metadata-intensive workloads. As shown in the EDA Workload diagram the working group presented at the 2016 Storage Developer Conference, EDA software jobs run concurrently, producing a combined and random mix of workloads across multiple design phases in which roughly 60% of NFS operations are metadata, approximately 15% are reads and approximately 25% are writes. Armed with this knowledge, Research and Development (R&D) IT teams can take advantage of DataSphere’s performance-enhanced metadata engine, scale-out NAS and data tiering features for EDA workloads.
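
To make the profiled mix concrete, the following minimal Python sketch (not a SPEC tool; the RPC names and weights are illustrative assumptions derived from the approximate 60/15/25 split above) samples a synthetic NFS operation stream:

import random
from collections import Counter

# Weighted NFS operation mix approximating the SPEC EDA profile:
# ~60% metadata RPCs, ~15% reads, ~25% writes (weights are assumptions).
OP_MIX = {
    "GETATTR": 0.25, "LOOKUP": 0.15, "ACCESS": 0.10,   # metadata
    "SETATTR": 0.05, "READDIR": 0.05,                  # metadata
    "READ": 0.15,                                      # data read
    "WRITE": 0.25,                                     # data write
}

def sample_ops(n, seed=42):
    """Draw n operation names according to the weighted mix."""
    rng = random.Random(seed)
    ops, weights = zip(*OP_MIX.items())
    return rng.choices(ops, weights=weights, k=n)

if __name__ == "__main__":
    counts = Counter(sample_ops(100_000))
    for op, c in counts.most_common():
        print(f"{op:8s} {c / 100_000:6.2%}")

A stream like this can drive a load generator against a test mount to see how a storage system behaves under a metadata-dominated mix before committing to a production deployment.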

OPTIMIZING FOR EDA WORKLOADS

Unlike traditional corporate IT, R&D IT must tailor compute, networking and storage configurations to handle environments with millions of small and large files, according to the type of R&D workflow being deployed during chip development. The workloads vary depending not only on the design process, but also on which EDA software products have been installed. The NFS protocol and NAS storage are used in the majority of EDA application deployments, and although workloads differ with the design and the development phase, EDA applications ubiquitously use NFS remote procedure calls (RPCs) to access NAS storage. The challenge is not only the serialization of RPCs, but also the mixing of metadata and file requests and the constant demand for low-latency, high-bandwidth file access, as the classifier sketch below illustrates.
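
Because the RPC stream interleaves both kinds of traffic, it helps to separate which procedures touch only the namespace and which move file data. A small illustrative classifier follows; the procedure buckets are assumptions based on common NFSv3 procedures, not a definitive list:

# Bucket NFSv3 procedure names into metadata vs. data operations, the
# split that matters when sizing EDA storage (set membership is assumed).
METADATA_OPS = {"GETATTR", "SETATTR", "LOOKUP", "ACCESS", "READDIR",
                "READDIRPLUS", "CREATE", "REMOVE", "RENAME", "FSSTAT"}

def classify(op_trace):
    """Split a trace of RPC names into metadata, read and write counts."""
    meta = sum(1 for op in op_trace if op in METADATA_OPS)
    reads = sum(1 for op in op_trace if op == "READ")
    writes = sum(1 for op in op_trace if op in {"WRITE", "COMMIT"})
    return meta, reads, writes

Applied to the synthetic stream from the earlier sketch, classify(sample_ops(100_000)) recovers roughly the 60/15/25 split.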
Generically speaking, the design process can be broken into two development workflows, frontend and backend, as discussed in the sections below.

FRONTEND IC DEVELOPMENT FLOW (GENERIC)

The frontend integrated circuit (IC) design workflow describes the process of translating a product’s requirements into a set of specifications, which engineers then use to develop a functional design in a specialized hardware description language (HDL) that can be synthesized, or compiled, into the logic design.

The last stage of the frontend workflow places the most demand on storage. In this verification stage, an engineer must verify the design in simulation. The storage demands of frontend activity consist of millions of small files requiring fast storage for the transient output of large amounts of data, which can reach terabytes in size. Again, the SPEC working group profiled several EDA vendor solutions, each consistently showing similar storage transaction attributes: a high percentage of write over read file operations, with nearly 50% of transactions relating to metadata operations. Metadata operations are easily blocked behind file accesses in the storage controller’s queue. To balance these storage calls, R&D IT will spread load across multiple storage controllers and multiple storage arrays, an approach that is costly, does not guarantee consistent metadata performance and turns each storage array into its own storage island.
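
A scaled-down sketch of this frontend I/O pattern, assuming a local scratch directory and far smaller counts than a real verification run (which produces millions of files and terabytes of transient output), shows the create-and-stat churn that makes the workload metadata-bound:

import os
import tempfile
import time

def small_file_burst(n_files=10_000, size=4096):
    """Write many small transient files and stat each one, mimicking
    the per-file metadata traffic of a verification run."""
    payload = os.urandom(size)
    with tempfile.TemporaryDirectory() as root:
        t0 = time.perf_counter()
        for i in range(n_files):
            path = os.path.join(root, f"sim_{i:07d}.out")
            with open(path, "wb") as f:
                f.write(payload)      # data op: one small write per file
            os.stat(path)             # metadata op: a GETATTR-like call
        elapsed = time.perf_counter() - t0
    print(f"{n_files} files in {elapsed:.1f}s "
          f"({n_files / elapsed:,.0f} create+stat pairs/s)")

if __name__ == "__main__":
    small_file_burst()

Pointed at an NFS mount instead of local scratch, the same loop exposes how quickly metadata operations queue behind file accesses on a busy controller.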

ACCELERATING METADATA ACCESSES

DataSphere is a metadata engine that collects performance telemetry from clients accessing data within its global namespace. With this information, DataSphere runs analytics to drive data management decisions and ensure that business objectives for applications are always met. By separating the metadata from the data itself, DataSphere is able to offload metadata activity from NAS storage arrays, allowing the arrays to dedicate all of their resources to file accesses. This out-of-band approach solves a key performance bottleneck for frontend EDA applications, or any metadata-heavy workload, as the conceptual sketch below illustrates.
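
The sketch is conceptual only; every class and method name here is a hypothetical illustration, not DataSphere’s actual API. It shows the out-of-band idea: clients resolve metadata through a separate service, then perform file I/O directly against the NAS array, so data paths never queue behind metadata calls.

class MetadataService:
    """Hypothetical: answers namespace queries, never touches payloads."""
    def __init__(self):
        self._namespace = {}          # logical path -> (array, physical path)

    def record(self, logical_path, array, physical_path):
        self._namespace[logical_path] = (array, physical_path)

    def resolve(self, logical_path):
        return self._namespace.get(logical_path)

class Client:
    """Hypothetical client: one metadata hop, then direct data I/O."""
    def __init__(self, metadata_service):
        self.mds = metadata_service

    def read(self, logical_path):
        location = self.mds.resolve(logical_path)   # out-of-band metadata hop
        if location is None:
            raise FileNotFoundError(logical_path)
        array, physical_path = location
        return array.read(physical_path)            # data goes straight to storage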
Once a functional design is complete, it moves to the next phase of development: the physical design and manufacturing, or backend, flow.

BACKEND IC DEVELOPMENT FLOW (GENERIC)

In contrast to the frontend process, the backend IC development workflow moves a design from the virtual to the physical world. In other words, it translates the schematic of a digital design, or the netlist of logic, into transistors, gates, memory, I/O cells and so on. The backend process is similar to building a house. First, you start with a floorplan. Floorplanning determines where to best place the functional logic blocks on a silicon chip for speed and the smallest die; in the house example, it is where the architect places the entrance, living room, dining room, bathrooms, kitchen and bedrooms for the best look and use of the square footage. After this, the logic needs to be electrically connected for communications and power, which in a house is analogous to adding plumbing and electrical outlets. At this point, engineering must make sure everything theoretically works, and begins the process of physical verification: checking that the design fits into the floorplan, that all the logic and I/O are connected correctly, and then estimating power consumption and timing for the targeted operating speeds.

The SPEC group also profiled typical EDA backend storage operations for the purpose of simulating loads (source: SNIA.org). Across several vendors’ EDA solutions, file sizes were two to three times larger for backend than for frontend loads, with more sequential access patterns and heavier write loads. The backend flow therefore points R&D IT toward a storage solution with large bandwidth and huge write caches. However, IT budgets usually cannot support dedicated resources for each phase of development, so systems end up overprovisioned in performance or capacity to handle the varying demands across an engineering project. Fixing this requires the ability to combine a broad range of storage solutions, from high-performance flash-based storage arrays to low-cost object storage in the cloud, and to manage the changing demands of an EDA workload automatically.
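
A toy placement policy reflecting this frontend/backend contrast might look like the sketch below; the tier names and thresholds are invented for illustration and are not DataSphere defaults:

from dataclasses import dataclass

@dataclass
class FileProfile:
    size_bytes: int
    sequential: bool
    write_fraction: float     # fraction of ops that are writes, 0.0-1.0
    days_since_access: int

def choose_tier(p: FileProfile) -> str:
    """Map an access profile to an (invented) storage tier name."""
    if p.days_since_access > 90:
        return "cloud-object"            # cold data: lowest cost
    if p.size_bytes < 1 << 20 and p.write_fraction < 0.5:
        return "all-flash-nas"           # small, latency-sensitive frontend files
    if p.sequential and p.write_fraction >= 0.5:
        return "high-bandwidth-hybrid"   # backend-style streaming writes
    return "hdd-nas"                     # everything else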


MEETING DATA DEMANDS IN REAL-TIME

Figure 3 — DataSphere enables IT to easily tier data to the resource that meets current application demands.

With its metadata engine and DSX extended services, DataSphere can match the performance, cost and reliability attributes of a storage resource with an application’s requirements in real time. This enables IT to overlay various storage options and configurations to meet EDA development needs. For example, R&D IT can easily tier storage and, through objectives, allow automated movement of latency-sensitive data accesses to a high-performance all-flash array. Transient datasets can be placed on PCI Express-attached or NVMe-based flash drives. With many EDA jobs running in parallel, clusters of storage devices under DataSphere allow scale-out configurations for file load balancing across any number of NAS arrays, giving EDA applications simultaneous access to multiple files. Less frequently used data can remain on lower-cost, centralized hard disk drive-based or hybrid NAS arrays. Snapshots, older simulations and verification results can be pushed to the cloud for the lowest cost of ownership. DataSphere’s unique architecture even allows file movement between tiers while files are open and data is actively being accessed; DSX Data Movers ensure that all reads and writes complete atomically even while the data is in flight to the target storage tier.
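
Objectives of the kind shown in Figure 4 can be thought of as declarative per-dataset attributes. The schema below is an assumption for illustration, not DataSphere’s actual objective format:

# Hypothetical per-dataset objectives (names and values are invented).
OBJECTIVES = {
    "frontend/verification": {"latency_ms": 1, "min_iops": 100_000},
    "backend/place_and_route": {"bandwidth_MBps": 2_000},
    "archive/snapshots": {"cost": "lowest"},
}

def objective_for(dataset):
    """Return the objective a data mover would try to satisfy."""
    return OBJECTIVES.get(dataset, {"cost": "lowest"})   # default: cheapest tier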

Figure 4 — DataSphere enables admins to set objectives across attributes including performance, cost and reliability to ensure application service levels are met automatically.

RIGHT DATA, RIGHT PLACE, RIGHT TIME WITH DATASPHERE

With DataSphere, admins can ensure the varying demands of EDA applications are met throughout the engineering development project by offloading metadata operations. This delivers higher performance with existing storage, as well as with new scale-out NAS solutions built from existing scale-out NAS deployments or tiered storage types. DataSphere makes it possible to combine NAS arrays from different vendors for cost savings and agility, create logical storage tiers for improved capacity efficiency, define performance tiers for increased application throughput, automate demands using objectives, upgrade storage without disruption and leverage the cloud today, seamlessly and without changing applications. DataSphere expands architectural storage choices to meet both IT’s budget constraints and the application demands of the business.