ElastiCluster is an open source tool to create and manage compute clusters on
cloud infrastructures. The project was originally created by the Grid
Computing Competence Center from the University of Zurich.

SLURM is a highly scalable cluster management and job scheduling system, used
by many of the world’s supercomputers and computer clusters (it is the workload
manager on about 60% of the TOP500 supercomputers).

The following video outlines what you will learn in this tutorial. It shows a
SLURM HPC cluster being deployed automatically by ElastiCluster on the Catalyst
Cloud, a data set being uploaded, the cluster being scaled on demand from 2 to
10 nodes, the execution of an embarrassingly parallel job, the results being
downloaded, and finally, the cluster being destroyed.

Warning

This tutorial assumes you are starting with a blank project and using your VPC
only for ElastiCluster. If you are working in a shared VPC, you may need to
adjust things (e.g. create a dedicated elasticluster security group).
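
For the shared VPC case, a dedicated security group could be created along the
lines of the sketch below. It assumes the openstack command line client is
installed and your cloud credentials are already loaded. The group name, the
SSH source range and the set of rules are illustrative assumptions, not
requirements stated by ElastiCluster. The script only prints the commands
unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: create a dedicated security group for the cluster nodes.
# GROUP and SSH_CIDR are placeholder values, not taken from the tutorial.
GROUP="${GROUP:-elasticluster}"
SSH_CIDR="${SSH_CIDR:-203.0.113.0/24}"   # documentation range; use your own

# Print each command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

create_cluster_security_group() {
    run openstack security group create "$GROUP"
    # Allow SSH from your workstation or office network (assumed rule).
    run openstack security group rule create --ingress --protocol tcp \
        --dst-port 22 --remote-ip "$SSH_CIDR" "$GROUP"
    # Allow TCP traffic between instances in the same group (assumed rule).
    run openstack security group rule create --ingress --protocol tcp \
        --dst-port 1:65535 --remote-group "$GROUP" "$GROUP"
}

create_cluster_security_group
```

Review the printed commands first, then run the script again with DRY_RUN=0
to apply them. The group would then need to be referenced from your
ElastiCluster configuration.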

The following commands are provided as examples of how to use ElastiCluster to
create and interact with a simple SLURM cluster. For more information on
ElastiCluster, please refer to https://elasticluster.readthedocs.org/.

Deploy a SLURM cluster on the cloud using the configuration provided:

elasticluster start slurm -n cluster

List information about the cluster:

elasticluster list-nodes cluster

Connect to the front-end node of the SLURM cluster over SSH:

elasticluster ssh cluster
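
The start and list-nodes steps above can also be chained in a small wrapper
script. This is a minimal sketch: the template name (slurm) and cluster name
(cluster) are the ones used in this tutorial, while the function name and the
DRY_RUN switch are hypothetical helpers added for illustration. By default the
script only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch: deploy the cluster, then list its nodes.
TEMPLATE="${TEMPLATE:-slurm}"
CLUSTER="${CLUSTER:-cluster}"

# Print each command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

deploy_and_inspect() {
    # Stop early if the deployment fails, so we do not list a broken cluster.
    run elasticluster start "$TEMPLATE" -n "$CLUSTER" || return 1
    run elasticluster list-nodes "$CLUSTER"
}

deploy_and_inspect
```

Set DRY_RUN=0 to actually execute the commands once the printed output looks
right.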

Connect to the front-end node of the SLURM cluster over SFTP, to upload (put
file-name) or download (get file-name) data sets:

elasticluster sftp cluster

There is an option to use ElastiCluster with server group anti-affinity to
ensure the best load distribution across an OpenStack cluster.
To use this feature, clone ElastiCluster from the repository shown below. This
is a temporary step until the feature is merged upstream.