Queueing System

All jobs on AHPCC clusters that require a significant amount of CPU or memory should be submitted through the queueing system. In general, two types of jobs may be submitted to the queue:

A batch job - a specific command is executed on the node(s) assigned to the job without the need for user interaction. The vast majority of jobs run on the HPC clusters are batch jobs.

An interactive job - a login shell is started on the first node assigned to the job. The user then enters the commands to execute at the command prompt.

A compute node is an individual computer which can be used to execute jobs. Compute nodes are grouped into queues. All nodes assigned to a particular queue are identical. The queues differ from each other by the following factors:

type of CPU and number of cores on each node

number of nodes assigned

the maximum number of nodes allowed to be used by a single job

amount of memory

walltime - the maximum amount of execution time for a single job
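The factors listed above can be pictured as one small record per queue. The following is a minimal sketch only: the queue names, CPU labels, and all numbers are hypothetical and do not describe actual AHPCC queues, and the helper functions job_fits and eligible_queues are made up for illustration. It shows how the per-job node limit and the walltime limit together determine which queues could accept a given request.

```python
from dataclasses import dataclass

# NOTE: every queue name and number below is hypothetical, chosen only to
# illustrate the factors that distinguish queues. They are not real AHPCC values.


@dataclass(frozen=True)
class Queue:
    name: str
    cpu_type: str            # type of CPU on each node
    cores_per_node: int      # number of cores on each node
    nodes_assigned: int      # number of nodes assigned to the queue
    max_nodes_per_job: int   # maximum number of nodes a single job may use
    memory_gb_per_node: int  # amount of memory
    walltime_hours: int      # maximum execution time for a single job


def job_fits(queue: Queue, nodes: int, hours: float) -> bool:
    """Return True if a request for `nodes` nodes and `hours` hours
    stays within the queue's per-job node limit and walltime limit."""
    return 1 <= nodes <= queue.max_nodes_per_job and 0 < hours <= queue.walltime_hours


QUEUES = [
    Queue("short_example", "cpu_a", 16, 48, 16, 32, 6),
    Queue("long_example", "cpu_a", 16, 24, 4, 32, 72),
]


def eligible_queues(nodes: int, hours: float) -> list:
    """List the names of the (hypothetical) queues whose limits allow the request."""
    return [q.name for q in QUEUES if job_fits(q, nodes, hours)]


print(eligible_queues(8, 4))   # → ['short_example']
print(eligible_queues(2, 48))  # → ['long_example']
```

In this sketch a wide but short job only fits the queue with the higher node limit, while a narrow but long job only fits the queue with the longer walltime. Whether nodes are actually free at submission time is a separate question that this model does not address.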

Node to Queue Assignment

All compute nodes are divided into groups called partitions. A node can belong to only one partition. A queue is made up of a collection of partitions, and a given partition can be assigned to multiple queues. As a result, most nodes are not assigned exclusively to a single queue but are shared among several queues. This configuration makes the queues more flexible, but it also makes the queueing system harder for the user to reason about: it is difficult to predict how many free nodes a given queue has. To help determine the number of available nodes per queue, a script max_job_size is available: