How the ‘C’ in HPC Can Now Stand for Cloud

Most IaaS (infrastructure as a service) vendors, such as Rackspace, Amazon and Savvis, use various virtualization technologies to manage the underlying hardware they build their offerings on. Unfortunately, these technologies vary from vendor to vendor and are sometimes kept secret. The question of virtual machines versus physical machines for high performance computing (HPC) applications is therefore germane to any discussion of HPC in the cloud.

This whitepaper from IBM examines aspects of computing important in HPC (compute and network bandwidth, compute and network latency, memory size and bandwidth, I/O, and so on) and how they are affected by various virtualization technologies. The benchmark results presented will illuminate areas where cloud computing, as a virtualized infrastructure, is sufficient for some workloads and inappropriate for others. In addition, it will provide a quantitative assessment of the performance differences between a sample of applications running on various hypervisors so that data-based decisions can be made for datacenter and technology adoption planning.

The conversation starts with a business case for HPC clouds

HPC architects have been slow to adopt virtualization technologies for two reasons:

The common assumption that virtualization impacts application performance so severely that any gains in flexibility are far outweighed by the loss of application throughput.

Utilization on traditional HPC infrastructure is already very high (between 80 and 95 percent).

In many cases, however, HPC architects would be willing to lose some small percentage of application performance to achieve the flexibility and resilience that virtual machine based computing would allow.

Security

Application stack control

High value asset maximization

Utilization improvement

Large execution time jobs

Increases in job reliability

Here’s the important takeaway: as with most legends, there is some truth to the notion that VMs are inappropriate for HPC applications. The benchmark results demonstrate that latency-sensitive and I/O-bound applications would perform at levels unacceptable to HPC users. However, the results also show that CPU-bound and memory-bound applications, as well as parallel applications that are not latency sensitive, perform well in a virtual environment.
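The rule of thumb in that takeaway can be sketched as a simple placement check. This is only an illustration of the reasoning, not anything taken from the whitepaper or from IBM Platform software: the `Workload` type, its fields, the `suggest_placement` function and the sample job names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an HPC job (not an IBM Platform type)."""
    name: str
    latency_sensitive: bool  # e.g. tightly coupled, message-heavy parallel code
    io_bound: bool           # e.g. dominated by storage or network I/O


def suggest_placement(workload: Workload) -> str:
    """Apply the article's rule of thumb.

    Latency-sensitive or I/O-bound jobs stay on physical machines;
    everything else (CPU-bound, memory-bound, loosely coupled parallel
    jobs) is treated as a candidate for virtual machines.
    """
    if workload.latency_sensitive or workload.io_bound:
        return "physical"
    return "virtual"


if __name__ == "__main__":
    # Illustrative jobs only; the names and flags are made up.
    jobs = [
        Workload("tightly-coupled-solver", latency_sensitive=True, io_bound=False),
        Workload("checkpoint-heavy-pipeline", latency_sensitive=False, io_bound=True),
        Workload("parameter-sweep", latency_sensitive=False, io_bound=False),
    ]
    for job in jobs:
        print(f"{job.name}: {suggest_placement(job)}")
```

A real decision would rest on measured benchmark results for the specific hypervisor and application, rather than on two boolean flags.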

HPC architects who dismiss virtualization technology entirely may therefore be missing an enormous opportunity to inject flexibility and even a performance edge into their HPC designs. Download this whitepaper today to learn how Platform Cluster Manager – Advanced Edition and IBM Platform LSF work in concert to manage both of these types of workload simultaneously in a single environment.

These tools allow their users to maximize resource utilization and flexibility through provisioning and control at the physical and virtual levels. Only IBM Platform Computing technology allows for environment optimization at the job-by-job level, and only Platform Cluster Manager – Advanced Edition continues to optimize that environment after jobs have been scheduled and new jobs have been submitted. Such an environment could realize orders-of-magnitude increases in efficiency and throughput while reducing the overhead of IT maintenance.
