Creating virtual clusters with Rocks

Administering Virtual Compute Nodes

After you create a virtual compute node, you can query its state from the front-end node to make sure the virtual machine booted successfully. To query the state of the compute node, execute the command rocks list host compute-0-0-1; output similar to Listing 1 provides information about the currently installed virtual machines and their states on all VM containers. (The status output is blank because processes have not been scheduled to run on the newly created VM.) You can also check the state of the VMs on the VM container itself with the Xen xm commands (Listing 2).

Listing 1

Checking Virtual Machine Status

Listing 2

The xm Command
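If you want to check VM state from a script rather than by eye, the output of xm list can be parsed into structured records. The following is a minimal sketch, not part of Rocks or Xen: the function name parse_xm_list and the sample text are hypothetical, and the code assumes every row carries all six columns (Name, ID, Mem, VCPUs, State, Time(s)). A State column with no flag set corresponds to a domain that is not currently scheduled.

```python
# Flags that can appear in the State column of `xm list`.
XM_STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def parse_xm_list(text):
    """Parse `xm list` output into a list of dicts, one per domain.

    Assumes each data row has all six columns; rows that omit a column
    would need extra handling.
    """
    domains = []
    lines = [line for line in text.splitlines() if line.strip()]
    for line in lines[1:]:  # skip the header row
        # Split from the right so the domain name is kept in one piece.
        name, dom_id, mem, vcpus, state, seconds = line.rsplit(None, 5)
        flags = [XM_STATE_FLAGS[ch] for ch in state if ch in XM_STATE_FLAGS]
        domains.append({
            "name": name.strip(),
            "id": int(dom_id),
            "mem_mb": int(mem),
            "vcpus": int(vcpus),
            "state": flags,  # empty list: no flag set, not scheduled
            "time_s": float(seconds),
        })
    return domains


# Illustrative, made-up sample in the usual `xm list` column layout.
SAMPLE = """\
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1024     2     r-----    512.3
compute-0-0-1                                3  1024     1     ------      0.0
"""

if __name__ == "__main__":
    for domain in parse_xm_list(SAMPLE):
        print(domain["name"], domain["state"] or "not scheduled")
```

On a real VM container, the text would come from running xm list rather than from the sample string.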

The rocks command is the primary administration command for the entire Rocks management system. The basic command-line structure is rocks <command> <arguments>. A full list of Rocks commands is available at the rocksclusters.org website or by typing rocks list help at the command line. The rocks command allows you to start and stop compute nodes, change configurations, and query configuration entries.
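Because every invocation follows the rocks <command> <arguments> pattern, it is easy to wrap in a small helper for scripting on the front-end node. This is a sketch under assumptions: rocks_argv and run_rocks are hypothetical names, not part of Rocks, and run_rocks only works where the rocks binary is on the PATH.

```python
import shlex
import subprocess


def rocks_argv(command, *arguments):
    """Build the argument vector for `rocks <command> <arguments>`."""
    # The command part may be several words, e.g. "list host".
    return ["rocks", *shlex.split(command), *arguments]


def run_rocks(command, *arguments):
    """Run a rocks command and return its standard output as text.

    Raises subprocess.CalledProcessError if rocks exits with an error.
    """
    result = subprocess.run(
        rocks_argv(command, *arguments),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only builds and prints the command line; nothing is executed here.
    print(" ".join(rocks_argv("list host", "compute-0-0-1")))
```

Keeping the argv construction separate from execution makes it possible to log or review the exact command before running it against the cluster.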

Rocks includes distributed administration utilities for executing commands against an entire cluster or group of cluster nodes. The default command for distributed command execution is tentakel, which is described as a program for executing the same command on many hosts in parallel. Tentakel is simple to use, and Rocks automatically adds all cluster nodes to /etc/tentakel.conf in various groupings. To execute a command against all nodes in /etc/tentakel.conf, type tentakel <command>. To execute a command against a group of nodes, execute tentakel -g <group_name> <command>.
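The two tentakel forms described above can be generated by one helper. Again, this is only a sketch: tentakel_argv is a hypothetical name, and the group name "compute" is an assumption for illustration; use whichever group names appear in your /etc/tentakel.conf.

```python
def tentakel_argv(*command_words, group=None):
    """Build the argument vector for a tentakel invocation.

    Without a group: tentakel <command>
    With a group:    tentakel -g <group_name> <command>
    """
    if not command_words:
        raise ValueError("a command to run on the nodes is required")
    argv = ["tentakel"]
    if group is not None:
        argv += ["-g", group]
    argv.extend(command_words)
    return argv


if __name__ == "__main__":
    # Run against every node listed in /etc/tentakel.conf.
    print(" ".join(tentakel_argv("uptime")))
    # Run against one group only ("compute" is an assumed group name).
    print(" ".join(tentakel_argv("uptime", group="compute")))
```

The resulting list can be handed to subprocess.run on the front-end node once you have confirmed that the group exists.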

Clustered Applications

To use all of the cluster nodes in a single application, the application must be designed for the cluster. Open MPI is a software library commonly used to build distributed applications that run on clusters. Sun Grid Engine and Torque are queuing systems for distributing jobs among the nodes of the cluster. Although designing and implementing applications to use all cluster resources is an extensive topic, you will find resources online about the use of Sun Grid Engine, Torque, and Condor for these purposes.
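To give a sense of what "designed for the cluster" means, the sketch below shows block decomposition, a common way to split a data set so that each process handles its own slice and the partial results are combined at the end. It is plain Python that simulates the workers in a loop and does not use Open MPI; in a real MPI program, the rank and the number of processes would come from the MPI runtime. The names block_range and partial_sum are hypothetical.

```python
def block_range(n_items, n_workers, rank):
    """Return the [start, stop) slice of n_items owned by worker `rank`.

    Items are divided as evenly as possible; the first
    (n_items % n_workers) workers each receive one extra item.
    """
    base, extra = divmod(n_items, n_workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def partial_sum(values, n_workers, rank):
    """Sum only the slice of `values` that belongs to worker `rank`."""
    start, stop = block_range(len(values), n_workers, rank)
    return sum(values[start:stop])


if __name__ == "__main__":
    data = list(range(10))
    workers = 3
    # Each iteration stands in for one process on one compute node.
    partials = [partial_sum(data, workers, rank) for rank in range(workers)]
    # Combining the partial results plays the role of a reduction step.
    print(partials, sum(partials))
```

The same partitioning logic applies whether the workers are MPI ranks or independent jobs submitted through a queuing system.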

Conclusion

Rocks' ease of use and support of the most common distributed applications make it a favorite among scientific supercomputing facilities working on unlocking the mysteries of the universe.

Matthew Sacks is a Systems Administrator and writer from Los Angeles, CA. Check out his blog at http://matthewsacks.com/techblog/. Special thanks go to the Rocks teams at NSF and to the UCSD Supercomputing Center.

Cloud computing has become a viable option for high-performance computing. In this article, we discuss the use case for cloud-based HPC, introduce the StarCluster toolkit, and show how to build a custom machine image for compute nodes.