Advanced SGE: Submitting Mixed Mode Jobs

The ‘mixed mode’ (MPI+OpenMP) programming model is supported on ARC2, ARC3 and Polaris (the N8 shared cluster). This typically involves MPI processes running across nodes and OpenMP threads on each node, with the total number of threads (MPI processes * OpenMP threads per process) equalling the number of physical processor cores.

Your code will need to call MPI_Init and make use of OpenMP directives. Compile it using an MPI wrapper with OpenMP support enabled, for example:

mpif90 -openmp example.f90 -o mixed.exe

You will need to determine ppn, the number of MPI processes per node, and tpp, the number of OpenMP threads per MPI process.

Additionally, you can ask either for a given number of nodes, nodes, or for the total number of MPI processes, np. Note that ppn is related to np, since ppn=np/nodes.

Your submission script would then need to contain the following, where the $-prefixed names stand for the values you have chosen:

#$ -V
#$ -l h_rt=01:00:00
#$ -l nodes=$nodes,ppn=$ppn,tpp=$tpp
mpirun ./a.out

or

#$ -V
#$ -l h_rt=01:00:00
#$ -l np=$np,ppn=$ppn,tpp=$tpp
mpirun ./a.out

Given that there are 16 cores per node, you would typically ensure that ppn*tpp=16.
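The constraint above leaves only a handful of valid layouts. As a minimal sketch, assuming 16 cores per node as stated, the following lists every ppn/tpp pair that fills a node exactly. The names CORES_PER_NODE and fills_node are hypothetical and are not part of the batch system.

```shell
#!/bin/bash
# Assumption: 16 cores per node, as stated in the text; adjust if your nodes differ.
CORES_PER_NODE=16

# Hypothetical helper: succeeds when ppn MPI processes with tpp threads each
# use exactly every core on a node.
fills_node() {
    local ppn=$1 tpp=$2
    (( ppn * tpp == CORES_PER_NODE ))
}

# Try every ppn from 1 to 16; integer division gives a candidate tpp,
# and fills_node rejects the values of ppn that do not divide 16.
for (( ppn = 1; ppn <= CORES_PER_NODE; ppn++ )); do
    tpp=$(( CORES_PER_NODE / ppn ))
    if fills_node "$ppn" "$tpp"; then
        echo "ppn=$ppn tpp=$tpp"
    fi
done
```

Run as written, this prints the five pairs (1,16), (2,8), (4,4), (8,2) and (16,1). Which of these performs best depends on your code, so it is worth timing more than one.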

Example

To run an MPI+OpenMP executable mixed.exe with 64 MPI processes, each launching 4 OpenMP threads, the following submission script would be needed:

#$ -V
#$ -cwd
#$ -b y
#$ -l h_rt=01:00:00
#$ -l np=64,ppn=4,tpp=4
mpirun ./mixed.exe

This will allocate 16 nodes (16*16=256 cores).
Each node will run 4 MPI processes, each of which will have 4 OpenMP threads, giving 4*4=16 threads per node and 16*16=256 threads in total (64 MPI processes * 4 OpenMP threads).

Alternatively, the same effect can be achieved by:

#$ -V
#$ -cwd
#$ -b y
#$ -l h_rt=01:00:00
#$ -l nodes=16,ppn=4,tpp=4
mpirun ./mixed.exe
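Both forms of the request describe the same layout, and the arithmetic linking them can be checked before submitting. The sketch below assumes 16 cores per node and that every node is filled, so that ppn*tpp=16. The function name layout and the variable CORES_PER_NODE are hypothetical and are used only for illustration.

```shell
#!/bin/bash
# Assumption: 16 cores per node, with every node fully used (ppn * tpp = 16).
CORES_PER_NODE=16

# Hypothetical helper: given the total number of MPI processes (np) and the
# threads per process (tpp), derive ppn, the node count and the core count.
layout() {
    local np=$1 tpp=$2
    local ppn=$(( CORES_PER_NODE / tpp ))   # MPI processes per node
    local nodes=$(( np / ppn ))             # from ppn = np / nodes
    echo "np=$np nodes=$nodes ppn=$ppn tpp=$tpp cores=$(( nodes * CORES_PER_NODE ))"
}

layout 64 4
```

For the example above this prints np=64 nodes=16 ppn=4 tpp=4 cores=256, matching both the np=64 and the nodes=16 requests. The sketch does no validation, so it only makes sense for values of tpp that divide 16 and values of np that are a multiple of ppn.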

Note that the OMP_NUM_THREADS environment variable is set automatically by the batch system, so you do not need to set it in your environment.
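If you want to confirm the value your job actually received, one option is to print it from the submission script before the mpirun line. This is only a sketch: report_threads is a hypothetical helper name, and the "unset" fallback exists purely to make a missing value visible in the job output.

```shell
#!/bin/bash
# Hypothetical helper: print the thread count seen by the job script.
# ${VAR:-unset} expands to "unset" when the variable is empty or not defined.
report_threads() {
    echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
}

report_threads
```

With tpp=4 in the resource request, the job output would be expected to contain OMP_NUM_THREADS=4.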