Filling in the gaps among Linux clusters, part 2

While Linux clusters can handle the demands of high performance computing, they're still lacking in some key features, said Eric Pitcher, vice president of technical marketing for Linux Networx, a cluster systems provider in Bluffdale, Utah. In part one of this interview, he championed Linux clusters, citing their productivity and scalability. In part two, he discusses pricing and points out Linux clusters' current shortcomings.

What shortcomings exist in Linux HPC clustering technologies today?
: Although clusters are enjoying increasing popularity, there are several issues that could use improvement, such as cluster-wide job control, checkpoint/restart and job migration. However, these issues can be overcome through additional software development.

Linux Networx and the broader open source community are actively developing solutions to increase the usability of clusters. For example, we are a key contributor to several open source projects, such as LinuxBIOS, Lustre and SLURM, to increase the manageability, efficiency and capabilities of Linux clusters.

Application availability on a Linux cluster is another area that can be improved. In this regard, Linux Networx partners with several ISVs [independent software vendors], such as Fluent Inc., MSC.Software and many others, to develop cluster solutions optimized for specific applications.

Besides price, why would a company choose a Linux-clustered HPC solution over a supercomputer-based solution?
: Flexibility in the architecture is one of the main reasons companies choose Linux clusters. Traditional supercomputers provide great performance for some applications, but they don't provide the price/performance that many customers need to support the majority of their applications. Clusters do excel in certain applications, such as Fluent [computational fluid dynamics], where they routinely exceed 70% parallel efficiency. This matches or exceeds the parallel efficiency of Fluent on SMP systems, which allows customers to achieve HPC performance at greatly reduced cost.
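Parallel efficiency is simply speedup dividedded by processor count: if a job runs S times faster on N processors than on one, its efficiency is S/N. A minimal sketch of that arithmetic (the timings below are hypothetical illustrations, not Fluent benchmark figures):

```python
def parallel_efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Parallel efficiency: speedup (t_serial / t_parallel) divided by processor count."""
    speedup = t_serial / t_parallel
    return speedup / n_procs

# Hypothetical example: a job taking 1000 s on one processor
# and 22 s on 64 processors.
eff = parallel_efficiency(1000.0, 22.0, 64)
print(f"parallel efficiency: {eff:.1%}")  # about 71%, i.e. above the 70% mark
```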

Familiarity with Linux, including the availability of source code, also makes Linux clusters a popular choice. Many companies don't like being locked into a proprietary OS (like Unix or Windows) and have the technical expertise to modify the OS to meet their specific needs.

Does the need for value-added software, training and technical services when implementing an HPC Linux cluster reduce the difference in price between it and a supercomputer solution?
: No. These services are required on traditional supercomputers as well, and they are typically more expensive than support for a Linux cluster system. For example, MPI is a popular library for parallelizing applications on both Linux clusters and supercomputers. If a company is deploying an application that uses MPI, a technical resource who understands MPI is necessary in both cases. System administrators also need to be familiar with the system regardless of the platform.

Also, the price of a Linux cluster system is often comparable to the annual maintenance fees charged by traditional supercomputer vendors. Supercomputers are known to be expensive to acquire and even more expensive to maintain.

What factors differentiate cluster vendors from one another, besides the prices of their systems?
: The ability to provide total cluster management; to validate the latest components and integrate them successfully; to design the optimal system for the customer's application profile; and to offer cluster expertise and training: these are some of the items that differentiate a white-box solution from a high-productivity machine.

The real value of an HPC system is the amount of production an organization can get from the cluster during its lifetime. For example, the LANL [Los Alamos National Laboratory] Lightning system (2,816 processors) was delivered just three months after it was ordered, and the lab was soon running production jobs on the cluster. To guarantee high productivity and maximum sustained performance, organizations need to look for vendors that offer these value-added items, along with Linux and clustering expertise.

Are there enough high-powered, high-performance applications for Linux, as opposed to the broad assortment for supercomputers?
: For the last decade, distributed-memory systems have dominated HPC. As a result, more HPC codes are written in portable MPI and able to run on a clustered architecture (whether proprietary or x86) than on any other architecture. Similarly, Linux is explicitly supported on more hardware platforms than any other operating system in common use today. Indeed, the ease of porting applications to Linux clusters is a key reason that Linux clusters are the fastest-growing segment of the HPC industry today.
