At RIT’s Research Computing, we use a piece of software called SLURM to manage the many users contending for access to our limited physical resources. SLURM has a number of commands you may be unfamiliar with; the purpose of this document is to introduce you to their basic usage.

Prerequisites:

- Getting a Research Computing Account – http://apply.rc.rit.edu
- Connecting to Research Computing Systems with SSH
- File Management – Getting your files to and from RC systems
- Using the bash shell and running software

Note

You might also want to check out these screencasts of the workshops we ran introducing how to use tropos:

Here you can see that there are 27 jobs currently known to the SLURM scheduler.

The 12 jobs that have an R in the ST column are currently in the running state. They are executing on three different remote nodes: h1, h2, and h3. The rest of the jobs have a PD in the ST column, meaning they are in a pending state. They are pending for different reasons – some do not yet have sufficient priority to run, while another is waiting on resources that are not yet available.

The USER column (perhaps obviously) indicates which user owns the submitted job. Here the abc1234 user owns most of the submitted jobs; another user, xyz5678, owns one that is waiting for access.

The TIME column indicates how long the jobs have been running.
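Because squeue's columns are plain text, you can slice them with standard shell tools. The snippet below works on an illustrative sample of squeue-style output (the job IDs, users, and node names are made up) and counts how many jobs are in each state:

```shell
#!/bin/bash
# Illustrative squeue-style output. On the cluster you would pipe
# the real thing instead, e.g.  squeue | awk ...
cat > /tmp/queue.txt <<'EOF'
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
101 work sim abc1234 R 2:01:33 1 h1
102 work sim abc1234 R 2:01:30 1 h2
103 work sim abc1234 PD 0:00 1 (Priority)
104 work big xyz5678 PD 0:00 2 (Resources)
EOF

# Count jobs per state: ST is column 5, and NR>1 skips the header row
awk 'NR>1 {count[$5]++} END {for (s in count) print s, count[s]}' /tmp/queue.txt
```

On the live system, squeue can also do much of this filtering itself: `squeue -u abc1234` narrows the list to a single user's jobs, and `squeue -t PD` shows only pending jobs (both are standard squeue flags).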

Note

The squeue command is one command of many provided by the SLURM scheduler. You can break it down phonetically as s-queue. All SLURM commands begin with an s (for SLURM). The queue part means that this command will display the queue of jobs waiting for or currently consuming resources provisioned by the scheduler.

So the cluster at this point looks ‘pretty full’: there are jobs waiting in the queue for access, and the computing resources appear fully occupied.

You may have noticed that there was a PARTITION column in the squeue output and that all the jobs listed were marked as under the work partition.

We’ll use another command, the sinfo command, to get another look at the cluster’s status and find out more about these partitions:

Here we see that the cluster is divided into two partitions, debug and work.

The debug partition has a small time limit and a small number of cores (8). Its purpose (perhaps obviously) is to handle jobs for debugging – for when you are first writing your scripts to submit work.

The work partition is the main partition. The time limit for jobs is 14 days. You can see from the sinfo output above that all of the nodes in that partition are currently allocated.
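Like squeue, sinfo produces plain columnar text that is easy to post-process. The sample below is illustrative (the node names are made up to mirror the description above) and shows how you might pull out each partition's time limit:

```shell
#!/bin/bash
# Illustrative sinfo-style output. On the cluster you would run `sinfo`.
# The * after a partition name marks the default partition.
cat > /tmp/partitions.txt <<'EOF'
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug up 1:00:00 1 idle h0
work* up 14-00:00:00 3 alloc h[1-3]
EOF

# Print each partition with its time limit (column 3), skipping the header
awk 'NR>1 {print $1, $3}' /tmp/partitions.txt
```

Note the time limit format: SLURM displays limits as days-hours:minutes:seconds, so 14-00:00:00 is the 14-day limit on the work partition.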

Submitting your first job

Submitting jobs to the cluster requires you to have written a script that defines your workload and metadata about it. Lucky for you, we’ve written a handy example-creator called slurm-make-examples.sh. It will copy some examples into your home directory.

Note

This script is currently broken, but the unmodified files live in /shared/slurm/examples. Be sure to edit these examples and replace any instance of USER with your username!

The file we’re going to be working with first is slurm-single-core.sh. It is a SLURM job file that describes:

- Metadata about the job we’re going to submit
- The payload of the job – the actual work we want to get done

Let’s take a look at it. Run the following command, less slurm-single-core.sh:

[abc1234@tropos example-1-simple-jobs]$ less slurm-single-core.sh
#!/bin/bash -l
# NOTE the -l flag!
#
# This is an example job file for a single core CPU bound program
# Note that all of the following statements below that begin
# with #SBATCH are actually commands to the SLURM scheduler.
# Please copy this file to your home directory and modify it
# to suit your needs.
#
# If you need any help, please email rc-help@rit.edu
#
# Name of the job - You'll probably want to customize this.
#SBATCH -J test
# Standard out and Standard Error output files
#SBATCH -o test.output
#SBATCH -e test.output
#SBATCH --mail-user abc1234@rit.edu
# notify on state change: BEGIN, END, FAIL or ALL
#SBATCH --mail-type=ALL
# Request 5 minutes run time MAX, anything over will be KILLED
#SBATCH -t 0:5:0
# Put the job in the "debug" partition and request one core
# "debug" is a limited partition. You'll likely want to change
# it to "work" once you understand how this all works.
#SBATCH -p debug -n 1
# Job memory requirements in MB
#SBATCH --mem=300
#
# Your job script goes below this line.
#
echo "(${HOSTNAME}) sleeping for 1 minute to simulate work (ish)"
sleep 60
echo "(${HOSTNAME}) Ahhh, alarm clock!"

You’ll see from the first line, #!/bin/bash -l, that this is a bash script. As you might already know, any line in a bash script that begins with a # is a comment and is therefore disregarded when the script runs.

However, in this context, any line that begins with #SBATCH is actually a meta-command to the SLURM scheduler that informs it how to prioritize, schedule, and place your job.
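You can see this dual reading for yourself: if you run a job file directly with bash, the #SBATCH lines are silently skipped, because to bash they are nothing but comments. A minimal sketch:

```shell
#!/bin/bash
#SBATCH -J demo      # to bash this is just a comment; to sbatch it is a directive
#SBATCH -t 0:5:0
msg="bash ignored the #SBATCH lines above"
echo "$msg"
```

Only when the file is handed to the scheduler via sbatch are those comment lines scanned for directives.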

The last three lines are the ‘payload’ of the job. In this case it just prints out a statement, goes to sleep for 60 seconds (pretending to work) and then wakes up and prints one last statement. Very important scientific work, don’t you agree?

Let’s give this script a run. We’ll submit it to the SLURM scheduler using the sbatch command, but we need one more piece of information before we do.
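One quick scripting aside: when sbatch accepts a job it prints a confirmation line such as "Submitted batch job 12345", and with the --parsable flag it prints just the job ID, which is handy in scripts. The snippet below simulates that confirmation message (the job ID is made up) to show how you might capture the ID:

```shell
#!/bin/bash
# On the cluster you could run:  jobid=$(sbatch --parsable slurm-single-core.sh)
# Here we simulate sbatch's classic confirmation line instead.
msg="Submitted batch job 12345"          # illustrative job ID
jobid=$(echo "$msg" | awk '{print $4}')  # the ID is the fourth word
echo "captured job id: $jobid"
# You could then watch that specific job with:  squeue -j "$jobid"
```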

Research Computing divvies out resources to users by way of Qualities-of-Service (or QOSes). If you don’t know what QOS your account is in, you can run the show-my-qos command. If things are still unclear, you can email rc-help@rit.edu to ask, but you are most likely in the rc or free QOS. For each grouping of users, we define two different priority levels under which you can submit jobs.
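Once you know your QOS, you can request it explicitly with sbatch's standard --qos option, either inside the job file or on the command line. The QOS name below is only an illustration – run show-my-qos to see which ones your account can actually use:

```shell
# Inside your job file, alongside the other #SBATCH directives:
#SBATCH --qos=free

# Or at submission time instead:
#   sbatch --qos=free slurm-single-core.sh
```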

Neat! This is the output that would normally be printed to the screen, captured instead in the output file we specified in our SLURM job script, slurm-single-core.sh. Our code was executed on the remote compute node called einstein, and its results were written back to us over NFS.