GridEngine FAQ

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

This is a short introduction to the HPCVL installation of Grid
Engine and its basic usage commands. It shows you how to use
Sun Grid Engine (GE) to submit and control jobs on the HPCVL Sun
Fire cluster. Note that the use of this software is
mandatory. Please familiarize yourself with Grid Engine by
reading this FAQ file and the documentation listed in it.

1 General Overview of Grid Engine

What is Grid Engine?

Sun Grid Engine is a Load Management System
(LMS) that allocates resources such as processors (CPUs),
memory, disk space, and computing time. Like other LMSs,
Grid Engine enables transparent load sharing, controls the
sharing of resources, and implements utilization and
site policies.

Its features include batch queuing and load balancing. It
also gives users the ability to suspend and resume jobs and
to check the status of their jobs.

Grid Engine can be used through the command line or
through a Graphical User Interface (GUI) called
qmon. Both provide the same set of commands.

Additional information about Grid Engine features will
follow in the next sections and the documents referenced in
this FAQ.

Which version of Grid Engine is currently in use on HPCVL
machines?

The present version of Grid Engine on HPCVL machines is:

Sun Grid Engine 6.1 (update 4)

This version of the software was
designed for grid computing, i.e., to allow the
distribution of a workload over a network of computers
that extends over several sites. This version allows
the allocation of priorities and the implementation
of utilization policies. Please check this FAQ
occasionally to stay informed about changes in the
usage of Grid Engine.

How do I set up my environment to use Grid Engine?

When you first log in, you will already
have the proper setup for using Grid Engine. This is because
Grid Engine is included in the default settings for
usepackage. If for some reason Grid Engine
seems not to be part of your environment setup, you can add
it by issuing the

use sge6

command.

Part of the setup that is done automatically by
usepackage is to source a setup-script that is
located in the directory

/opt/n1ge6/default/common/

Depending on your login shell, you can also
"source" those scripts manually:

For csh or tcsh:

source /opt/n1ge6/default/common/settings.csh

For ksh, bash, etc:

. /opt/n1ge6/default/common/settings.sh

The setup script modifies your search
PATH and sets other environment variables
that are required to get Grid Engine running. One of those
variables is SGE_ROOT which contains the
directory in which the Grid Engine-related programs are
located.
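
As a rough illustration of the manual setup, the following bash
sketch sources the setup script if it exists and then reports
whether the Grid Engine environment appears to be in place. The
path is the one given in this FAQ; the variable names
SGE_SETTINGS and sge_status are only illustrative.

```shell
#!/bin/bash
# Path of the setup script as given in this FAQ; adjust it if your
# installation differs.
SGE_SETTINGS=/opt/n1ge6/default/common/settings.sh

# Source the setup script only if it is readable on this machine.
if [ -r "$SGE_SETTINGS" ]; then
    . "$SGE_SETTINGS"
fi

# The environment counts as ready when SGE_ROOT is set and qsub
# can be found in the search PATH.
if [ -n "${SGE_ROOT:-}" ] && command -v qsub >/dev/null 2>&1; then
    sge_status="ready"
else
    sge_status="not configured"
fi

echo "Grid Engine environment: $sge_status (SGE_ROOT=${SGE_ROOT:-unset})"
```

If the script reports "not configured", issuing the use sge6
command described above is the first thing to try.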

How do I start using Grid Engine?

Grid Engine provides two ways to run your
jobs: directly from the command line or
through the QMON GUI. It is up to the user to
choose whichever is more convenient.

However, if the job is simple and consists of only a few
commands, then the submission is more easily done via
the command line. If the job requires the setup of many
options and special requests, the use of the GUI is helpful
(at least the first time, when you are writing your script),
and facilitates the navigation through the available options.

What are the most commonly used Grid Engine commands?

Sun Grid Engine has a large set of
programs that let the user submit and delete jobs, check job
status, and obtain information about available queues and
environments. For the normal user, knowledge of the
following basic commands should be sufficient to get
started with Grid Engine and have full control of their jobs:

qconf:

Shows (-s) the user the
configurations and access permissions only.

For example, qconf -sql will
give you a list of all available queues.

qdel:

Gives users the ability to delete their own
jobs only.

qhost:

Displays status information about Sun Grid
Engine execution hosts.

qmod:

Modify the status of your jobs (like
suspend/resume).

qmon:

Provides the X-windows GUI command
interface.

qstat:

Provides a status listing of all jobs and
queues associated with the cluster.

qsub:

Is the user interface for submitting a job to
Grid Engine.

All these commands come with many options and switches
and are also available through the GUI QMON. They all
have detailed man pages (e.g. "man qsub"),
and are documented in the Sun Grid Engine 6 User's Guide (about 2.2 MB).

2 Submitting your job with Grid Engine

What are the different kinds of jobs that I can run with Grid
Engine?

You can submit to Grid Engine all kinds of
jobs, from a simple UNIX command like
date to more elaborate batch scripts such as
shared-memory parallel jobs or MPI jobs.

You can also open interactive sessions to use e.g.
visualization programs.

You can also submit an array of jobs, which is a job
consisting of a range of independent identical tasks. This
may be helpful in certain applications that involve
repeated execution of the same set of tasks.

What are the Grid Engine queues in the HPCVL system?

Grid Engine uses the notion of a
queue to distinguish between the different
types of jobs and the different components of the HPCVL
cluster. Grid Engine queues can allow execution of many
jobs concurrently, and Grid Engine tries to start new jobs
in the queue that is most suitable and least loaded.

Note that a job is always associated with its queue
and depends on the status of this queue, but users do not
need to submit jobs directly to a queue. You only need to
specify the requirement profile of the job, which includes
memory, available software, and type of job (parallel or
not, MPI, ...).

Although you don't submit jobs directly to a queue, you
still need to know which queue is handling your job and
what the characteristics of this queue are. On the HPCVL
system, we have presently five different queues that are
used for different purposes. If you type

qconf -sql

you will see a list of all available queues. In
particular, you'll find the following:

m9k.q
This is the default queue. All jobs other than simple short
test jobs are sent to this queue automatically. It is
associated with the M9000 Cluster of Fujitsu Sparc64-VII
based Sun Enterprise M9000 servers. It is used to schedule
serial and parallel jobs to these high-memory
dually-threaded nodes m9k00[1-8].

vf.q
This queue is associated with the Victoria Falls Cluster of
Niagara-2 based Sun T5140 servers. It is used to schedule
serial and parallel jobs to these highly multi-threaded
nodes vf00[01-73].

abaqus.q
This queue is exclusively used to run Version 6.7,
6.8, or 6.9 of the finite-element software Abaqus
on a "mini-cluster" of five 8-core Sunfire X4140 Opteron
nodes running Linux. It is used to schedule serial and
parallel jobs to these specialized nodes
sw000[1-4,10].

How do I write and submit batch jobs?

To run a job with Grid Engine, you have to
submit it from the command line or the GUI. But first, you
have to write a batch script file that contains all the
commands and environment requests that you want for this
job. If, for example, test.sh is the name
of the script file (a sample script file can be found
here), then use the command qsub to submit the job:

qsub test.sh

If the submission of the job is successful, you
will see this message:

Your job 1 ("test.sh") has been submitted

After that, you can monitor the status of your job
with the command qstat or the GUI qmon.

When the job is finished, you will have two output files
called "test.sh.o1" and "test.sh.e1". They contain the
standard output and the standard error of the job,
respectively; the trailing number is the job number.

Now, let's take a look at the structure of a Grid Engine
batch job script. Recall that a batch job is a
UNIX shell script consisting of a sequence of UNIX
command-line instructions (or interpreted scripts such as
perl, ...) assembled in a file.

In Grid Engine, a batch script contains, in addition
to normal UNIX commands, special comment lines
marked by the leading prefix "#$".

The first line of the batch file starts with

#! /bin/bash

which is the default shell interpreter for Grid Engine. You
can also force Grid Engine to use your preferred shell
interpreter (bash, for example) by adding this line to your
script file:

#$ -S /bin/bash

To tell GE to run the job from the current working
directory, add this script line:

#$ -cwd

If you want to pass some environment variable VAR (or a
list of variables separated by commas), use the -v
option. To pass your whole environment, use -V:

#$ -v VAR
or
#$ -V

The former passes a specific variable, while the latter
passes all variables listed in env.

Insert the full path names of the
files to which you want to redirect the standard
output and standard error, respectively (the
full path name is actually not necessary if the #$
-cwd option was used).

#$ -o {file for standard output}
#$ -e {file for standard error}

Lines with the #$ prefix accept the same options as the
qsub command line, so check the qsub man
pages for a list of those options.

Here is a serial sample script that has to be modified to
fit your case. All entries enclosed in {} must be
replaced.

Insert your email address after #$ -M.

Note that qsub usually
expects shell scripts, not executable files. To
submit the job you simply type

qsub serial.sh

Note that you can also issue options on the
command line, for instance

qsub -cwd -v VAR=value -o /home/tmp -e /home/tmp serial.sh
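
As a rough sketch of how the directives described above fit
together, a serial job script might look like the following. It
is not the HPCVL sample script: the output file names serial.out
and serial.err and the payload command (date) are placeholders,
and the entry in {} must be replaced as described above.

```shell
#!/bin/bash
# serial.sh: minimal serial job sketch (illustrative only).
# Lines starting with "#$" are read by Grid Engine and are plain
# comments to the shell.
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -M {your_email_address}
#$ -m be
#$ -o serial.out
#$ -e serial.err

# Placeholder payload: replace with your own program,
# for example ./my_program (hypothetical name).
job_output=$(date)
echo "Job ran on $(hostname) at: $job_output"
```

Because -cwd is set, the relative names given to -o and -e are
resolved in the directory from which the job was submitted.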

How do I submit an Array of Jobs?

An array of jobs is a job consisting of a
range of independent identical tasks.

You submit an array of jobs by using the
qsub command with the -t
option like this:

qsub -t 2-10:2 serial.sh

where the -t option defines the task
index range (check qsub manpages for more
details).
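
In the example above, the range 2-10:2 starts at 2, ends at 10,
and uses a step of 2, so the tasks are numbered 2, 4, 6, 8, and
10. Grid Engine passes the index of each task to the script in
the environment variable SGE_TASK_ID. The sketch below shows one
way a script might use it; the input file naming scheme
(input_N.dat) is purely hypothetical, and the fallback value 2 is
only there so the script can be tried outside Grid Engine.

```shell
#!/bin/bash
# Array job sketch (illustrative only).
#$ -S /bin/bash
#$ -cwd
#$ -t 2-10:2

# SGE_TASK_ID is set by Grid Engine for every task of the array.
# Outside Grid Engine it is unset, so fall back to 2 for testing.
task_id=${SGE_TASK_ID:-2}

# Hypothetical naming scheme: one input file per task.
input_file="input_${task_id}.dat"

echo "Task $task_id would process $input_file"
```

Each task runs the same script independently, so the task index
is the only thing that distinguishes one task from another.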

How do I submit jobs to queues other than the default?

Our main production environment consists of 8 Sun Enterprise M9000 servers.
When you submit jobs, this is by default the set of machines on
which your job will run. The associated queue is m9k.q.

However, we have two other large clusters, namely
the Victoria Falls cluster and the Sunfire 25K
cluster. Both of these have their own
queues, vf.q and production.q,
respectively.

It is possible that code compiled on the login node
(and therefore optimized for the US IV+ chip) will not
run efficiently on the Niagara 2 chips of the VF
cluster, or on the Sparc64-VII chips of the M9000
cluster. See our Parallel Programming FAQ for
suggestions on how to optimize
code for architectures other than US IV+.

How to add another cluster to your job request

Only do this if your code can run on either machine!

Let's say you want to include the Sunfire 25K machines in
the list of possible machines to run your job. Here is
what you can do:

(simplest) the job can run on either machine:

#$ ... other directives ...
#$ -q production.q

To ensure that the job runs on the newly
added cluster (in this case, the Sunfire cluster):

#$ -clear
#$ ... other directives ...
#$ -q production.q

The -clear removes any defaults for subsequent Grid Engine
directives in this job (and only in this job), in particular
the default production queue setup.

In this case, the defaults are not removed, but the
-l option selects the M9000 queue m9k.q as the only
acceptable choice, and ensures that the job is
scheduled there.

3 Monitoring and Controlling Jobs

How do I monitor my jobs?

After submitting your job to Grid
Engine, you may track its status by using the
qstat command, the GUI QMON, or email.

Monitoring with qstat

The qstat command provides the status of all
jobs and queues in the cluster. The most useful options
are:

qstat: Displays a list of all jobs of the current user,
with no queue status information.

qstat -u hpc1***: Displays a list of all jobs
belonging to user hpc1***.

qstat -u '*': Displays a list of all jobs
belonging to all users.

qstat -f: Gives full information about jobs and
queues.

qstat -j [job_id]: Gives the
reason why the pending job (if any) is not being
scheduled.

You can refer to the man pages for a complete
description of all the options of the qstat
command.

Monitoring Jobs by Electronic Mail

Another way to monitor your jobs is
to have Grid Engine notify you by email about the status of
the job.

In your batch script or from the command line, use the
-m option to request that an email
be sent and the -M option to specify the
email address to which it should be sent. This will look
like:

#$ -M myaddress@work
#$ -m beas

The -m option selects the events after
which you want to receive an email. In particular,
you can choose to be notified at the beginning (b) or end (e)
of the job, or when the job is aborted (a) or suspended (s)
(see the sample script lines above).

From the command line you can use the same options,
for example:

qsub -M myaddress@work -m beas job.sh

How do I control my jobs?

Based on the status of the job displayed, you can
control the job by the following actions:

Modify a
job: As a user, you have certain rights that
apply exclusively to your jobs. The Grid Engine
command used is qmod. Check
the man pages for the options that you are allowed
to use.

Suspend (or resume) a job: This uses
the UNIX kill command, and applies
only to running jobs. In practice you type

qmod -s (or -r) job_id

where job_id is given by qstat or qsub.

Delete a job: You can delete a job
that is running or spooled in the queue by using
the qdel command like this

qdel job_id

where job_id is given by
qstat or qsub.

Note that if your job is not on the waiting
queue, but is already executing, you need to issue
the -f (force) option with the
qdel job_id command to terminate
the job.

Monitoring and controlling with QMON

You can also use the GUI QMON, which gives a
convenient window dialog specifically designed for
monitoring and controlling jobs. Its buttons are
self-explanatory.

4 Parallel jobs with Grid Engine

What are the Parallel Environments available under HPCVL Grid
Engine?

A Parallel Environment is a programming
environment designed for parallel computing in a network of
computers, which allows execution of shared-memory and
distributed-memory parallelized applications. The most
commonly used parallel environments are Message Passing
Interface (MPI) for distributed-memory machines, and OpenMP
for shared-memory machines.

For MPI, there is a Sun implementation which is part
of Sun HPC ClusterTools. It is located under the
/opt/SUNWhpc directory (check the HPCVL Parallel
Programming FAQ for more details).

For OpenMP, no separate runtime environment is
required. Details about shared-memory programming and
multi-threading with OpenMP may be found in the HPCVL Parallel
Programming FAQ.

Grid Engine provides an interface to handle parallel
jobs running on top of these parallel
environments. For the users' convenience, HPCVL has
predefined parallel environment interfaces for them.
You can check the list of available PEs with the command
qconf -spl, which gives the environments
described hereafter:

# qconf -spl
dist.pe
shm.pe

dist.pe

This environment is intended for distributed-memory
applications using the Sun HPC ClusterTools libraries, in
particular MPI. Grid Engine will assign the
dist.pe jobs to the
production.q queue and try to use the fastest
connection available between the slots and nodes. Although
the system will try to allocate processes on as few nodes
as possible, it will be allowed to spread them out over the
cluster, since this parallel environment is meant to handle
distributed-memory jobs.

shm.pe

This environment is intended for shared-memory
applications. Grid Engine will assign the processors
in a single node to take advantage of the fastest
connection available between the slots.

shm.pe jobs are submitted to the
production.q queue, i.e. to nodes
hpcvl[0-9]. It is permissible to use
shm.pe for distributed-memory (e.g. MPI)
jobs, if the intention is to keep them within a single
node. Note that this might speed up communication, but also
lead to longer waiting periods.

vfdist.pe

This environment serves the same purpose
as dist.pe, but is designed for the Victoria
Falls cluster, and restricts the scheduling of
processes to a 40-node sub-cluster that is internally
connected through 10 Gig Ethernet.

m9kdist.pe

This environment serves the same purpose
as dist.pe, but is designed for the M9000
cluster. It will employ 10 Gig Ethernet.

abaqus.pe, fluent.pe, matlab.pe

These are specialized environments that are used
for parallel runs of the application software
packages Abaqus, Fluent,
and Matlab, respectively. These
applications need their own parallel environments to
keep track of available licenses, and to run auxiliary
commands.

How do I submit a multi-threaded job?

You need to specify the parallel environment to use,
which is shm.pe in our case, and how many
processors are going to be used. This is done via the
script line:

#$ -pe shm.pe 16

if you want to use 16 processors. This sets an
environment variable
NSLOTS and requests the corresponding
number of processes.

There is no need to request parallel queues or
special complexes, but, as in an interactive run of a
multi-threaded program, you need to set the variables
PARALLEL and also
OMP_NUM_THREADS (in the case of OpenMP
applications) to the number of processors to be used. Add
the following lines to your mt_job.sh
script file (bash syntax):

export PARALLEL=$NSLOTS
export OMP_NUM_THREADS=$NSLOTS

Here is a multi-threaded sample script with these
environment variables predefined, in which all entries
enclosed in {} need to be replaced by the appropriate
values (for instructions, see the serial job section). In
that case, to run the job you simply type

qsub mt_job.sh
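
For orientation, a multi-threaded job script along these lines
might be put together as in the sketch below. It is not the
HPCVL sample script: the program name ./my_openmp_program is
hypothetical, and the fallback value 1 for NSLOTS is only there
so the script can be tried outside Grid Engine.

```shell
#!/bin/bash
# mt_job.sh: multi-threaded job sketch (illustrative only).
#$ -S /bin/bash
#$ -cwd
#$ -pe shm.pe 16

# NSLOTS is set by Grid Engine from the -pe request above.
# Outside Grid Engine it is unset, so fall back to 1 for testing.
export PARALLEL=${NSLOTS:-1}
export OMP_NUM_THREADS=${NSLOTS:-1}

echo "Running with $OMP_NUM_THREADS threads"

# Replace with your own multi-threaded program, for example:
# ./my_openmp_program   (hypothetical name)
```

Taking the thread count from NSLOTS keeps the program and the
-pe request in step if the number of processors is changed later.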

How do I submit a parallel MPI job?

A specific parallel environment needs to be
specified, to let the system know which environment and how
many processors are going to be used. This is done via the
script line:

#$ -pe dist.pe 16

where the number of processors is 16 in this case.

In the standard mpirun command, you do not
have to specify the number of processes through
the -np option, because the Cluster
Tools runtime system knows that resource allocation
will be done by Grid Engine and determines the number
of processes from the -pe directive.

Here is an MPI sample script, in which all entries
enclosed in {} need to be replaced by the appropriate
values (for instructions, see the serial job section).

To run this job you simply type

qsub mpi_job.sh
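
A minimal sketch of such an MPI job script is shown below. It is
not the HPCVL sample script: the program name ./my_mpi_program is
hypothetical. Following the description above, mpirun is called
without the -np option. The check before the call only prints the
command when mpirun or the program is not available, so the
script can be tried outside the cluster.

```shell
#!/bin/bash
# mpi_job.sh: MPI job sketch (illustrative only).
#$ -S /bin/bash
#$ -cwd
#$ -pe dist.pe 16

# Hypothetical program name: replace with your own MPI executable.
MPI_PROGRAM=./my_mpi_program

# No -np option: per this FAQ, the ClusterTools runtime takes the
# number of processes from the -pe directive.
mpi_command="mpirun $MPI_PROGRAM"

if command -v mpirun >/dev/null 2>&1 && [ -x "$MPI_PROGRAM" ]; then
    $mpi_command
else
    echo "Would run: $mpi_command"
fi
```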

5 Where can I get more help and documentation?

Grid Engine
has many more options and possibilities for every kind of
job. Here, we have given only the basic steps to get started
using GE. Detailed documentation is available. First, there is
the User's Guide, which should answer almost all of your questions.

For specific commands, the man pages are very comprehensive and
should be consulted.

HPCVL also offers user support; for questions about this FAQ and
the usage of Grid Engine on HPCVL machines, contact us.