The current production version of the Gaussian 09 software package is installed
on lewis under /share/apps/gaussian/g09.
Example input files for the Gaussian 09 test jobs are in
/share/apps/gaussian/g09/tests/com/.

Before you can run Gaussian 09 jobs on lewis, you must be authorized as a member of the
gaussian group. To determine whether you are authorized, run the
id command and verify that gaussian is listed among
the groups to which your account belongs.
If you are not a member of the
gaussian
group, please contact the system administrator via email at
support@rnet.missouri.edu.

Gaussian 09 on lewis must be invoked via the run_g09
command, and
Gaussian 09 jobs must be run as batch jobs under LSF, submitted via the
bsub
command. For general information about how to
submit LSF batch jobs on lewis, see Submitting
Jobs via LSF.

A special LSF queue, named "gaussian", has been created for Gaussian jobs.
The compute nodes that are scheduled by the gaussian queue have been configured
specifically to support efficient execution of Gaussian jobs, including large
locally attached scratch disk space for fast I/O to the work files used by Gaussian jobs.
Gaussian jobs can be submitted to other queues, including the idle queue,
but users are urged to use the gaussian queue.

You can submit jobs to the gaussian queue by using the "-q"
option of the bsub command:

#BSUB -q gaussian (in the job script file)

or

bsub -q gaussian ... (on the command line)

When you submit Gaussian jobs to the gaussian queue, please be aware
that each of the gaussian nodes has its own unique /scratch
file system that is used by Gaussian for temporary work files. Each gaussian
node has about 11 TB of usable locally attached disk space for /scratch.
There is a global /scratch file system that is accessible from the lewis login node
and from all of the other compute nodes, but it is not the one that is used by
the gaussian nodes. If you need to access any of the work files in
/scratch belonging to a Gaussian job that was submitted to the gaussian
queue, while that job is running or after it has run, you must log in via SSH to
the specific compute node where the job ran in order to access the files. The node where
the job ran is displayed by the bjobs command, and it is also shown in the
standard output file for the job. For example, if the job runs on compute node
c14a-05, then you can log in to node c14a-05 from the lewis
login node using the following command:

$ ssh c14a-05

When you are finished, log out of the compute node to return to the lewis login node.
Remember that each gaussian node's /scratch file system is
separate from the global /scratch file system used by all other nodes.

An LSF job script file (named "jobfile" in the example below)
for a Gaussian 09 job should look something like this:
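Based on the parameters described below, such a script might look like the following sketch (the output-file naming via %J, LSF's job-number placeholder, is an assumption; the exact form on your system may differ):

```shell
#!/bin/bash
#BSUB -J test1                              # job name
#BSUB -o test1.o%J                          # standard output file (%J = LSF job number)
#BSUB -e test1.e%J                          # error output file
#BSUB -n 2                                  # number of CPUs
#BSUB -R "span[hosts=1] rusage[mem=2048]"   # 2 CPUs on one host, 2048 MB of memory
run_g09 test001.com
```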

This job script specifies that the job name will be
test1, and that the standard
output and error output files will be named
test1.onnnnnn and
test1.ennnnnn, respectively,
where nnnnnn is the job
number assigned by LSF. It also specifies that the job needs 2 CPUs on the same
host, and 2048 MB of memory.
The run_g09
command sets up the Gaussian 09 environment, and then invokes
g09 with the specified input file,
in this example test001.com.
See
Submitting Jobs via LSF for a more detailed description of the
#BSUB parameters and how to
submit the job via the
bsub command.

The -n parameter specifies how many
CPUs should be used for the job.
You must not specify anything in your Gaussian 09 input file about how many
processors your job will use.
See the section
Running Gaussian 09 with Linda below
for the proper way to specify the use of multiple CPUs/nodes for a Gaussian 09 job on lewis.
The mem= parameter specifies how much memory
(in megabytes) LSF should allocate for the job. This amount of memory must match
the amount of memory specified in the Gaussian 09 input file.
(Note: The value specified for mem=
when submitting the job to LSF must always
be given as the number of megabytes required, regardless of what units are used to
specify the memory amount in the Gaussian 09 input file.)
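For example, if the Gaussian input file requests memory in gigabytes, you must convert it to megabytes for LSF yourself. A quick sketch of the conversion (the 2 GB figure is hypothetical):

```shell
# A Gaussian input containing %Mem=2GB needs mem= given in MB when submitting to LSF.
GAUSSIAN_MEM_GB=2
LSF_MEM_MB=$((GAUSSIAN_MEM_GB * 1024))   # 2 GB = 2048 MB
echo "$LSF_MEM_MB"
```

The job script would then contain `#BSUB -R "rusage[mem=2048]"`.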

All of the parameters on the
#BSUB lines can also be specified with the
bsub command on the command line, but it is
convenient to put them in the job script file so that they are not forgotten.

Then, assuming you have prepared your input file properly, you can use the
following command to submit your job on lewis:
bsub < jobfile

Use the bjobs command
to see which jobs are queued/executing/ended:

bjobs -a

The directory to use for temporary work files on lewis is /scratch. The
run_g09 command uses that directory by default. If
you explicitly specify the pathnames for work files in your .com files, please
modify the pathnames accordingly. Files in the /scratch
directory will be automatically deleted if they have not been accessed in more than 5 days.

If you run Gaussian 09 utilities such as newzmat
or formchk, you will need to set up the
Gaussian 09 environment for your interactive login session.
Place the following lines in the .bash_profile file in your home directory:
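A typical Gaussian 09 environment setup, assuming the installation path given at the top of this page, would look like the lines below; the GAUSS_SCRDIR value and the profile location are assumptions, so confirm them with the system administrator:

```shell
# Assumed values based on the install path /share/apps/gaussian/g09; verify locally.
export g09root=/share/apps/gaussian
export GAUSS_SCRDIR=/scratch
source $g09root/g09/bsd/g09.profile
```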

The version of Gaussian 09 on lewis includes Linda, which allows Gaussian 09 jobs
to be run across multiple nodes in the cluster, instead of being limited to just a single node.
Gaussian 09 with Linda is still invoked by executing the
run_g09 command, as you have been doing
previously on lewis. But there are some things that you will need to change.

First, if you intend to run Gaussian 09 jobs across multiple nodes, you will need to
modify your account's SSH configuration, because of the unique way (compared to other
parallel applications on lewis) that Linda uses SSH to launch processes on multiple nodes.
Do the following:

Copy the system-level SSH configuration to your SSH configuration directory with the following commands:

$ cd ~/.ssh
$ cp /etc/ssh/ssh_config config

Edit the config
file that you just copied and make the following change: uncomment
(remove the # at the beginning of)
the StrictHostKeyChecking line and change
"ask" to "no".
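After the edit, the relevant line in your ~/.ssh/config should read:

```
StrictHostKeyChecking no
```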

You must not specify anything in your Gaussian 09 input file about how many
processors or Linda workers your job will use. So, do not specify any of the
following Link 0 commands in the input file:

%NProc
%NProcShared
%NProcLinda
%LindaWorkers

The number of processors to be used and the number and names of Linda worker
nodes will be passed from LSF to Gaussian 09 via the
Default.Route file that
will be created by the run_g09
command in the current working directory from
where the job is submitted. That means that you cannot use the
Default.Route
file for your own purposes, because the run_g09
command will overwrite your file
with a new one. Any option that you might want to specify in the
Default.Route
file must be put into your Gaussian 09 input file instead.

There are now three ways to run Gaussian 09 using multiple processors (CPUs) on lewis:

On a single node, using multiple threads with shared memory,
and with all processors allocated on the same node.

Across multiple nodes, with multiple processes communicating via TCP over the
high speed Infiniband network, and with processors being allocated wherever they are available.
Even if multiple processors happen to be allocated on the same node, the processes
on that node will not use shared memory for communication.

Across multiple nodes, using multiple threads with shared memory on each node,
and an equal number of processors allocated on each node. Communication between
threads on the same node will be via shared memory,
and communication between processes on different nodes will be via TCP
over the high speed Infiniband network.

In general, communication between threads using shared memory is faster than
communication between processes via TCP. So, option 3 should perform better than
option 2 for the same number of processors. However, option 2 provides LSF with
more scheduling flexibility than option 3, and jobs using option 3 may have to wait
longer in the queue for nodes and processors to become available.

The option to be used is determined entirely by how you specify the number of processors
and the processor spanning requirements to LSF when you submit the job.
Remember, you must not specify the number of processors to be used in your Gaussian 09 input file.

For option 1, with all processors on the same node, you would specify something like the following in your job script file:

#BSUB -n 4
#BSUB -R "span[hosts=1] rusage[mem=2048]"

For option 2, there is no particular spanning requirement, so you omit the
span specification.
In the following example, 16 processors will be allocated across multiple nodes, wherever they are available:

#BSUB -n 16
#BSUB -R "rusage[mem=2048]"

For option 3, you must specify how many processors are to be allocated on each node via
the ptile spanning specification.
In the following example, a total of 12 processors will
be allocated on 3 nodes, with 4 processors on each node:

#BSUB -n 12
#BSUB -R "span[ptile=4] rusage[mem=2048]"

Please note that for option 3, the number of processors must be an exact multiple of the
ptile value.
If the value of ptile is equal to the number of
processors requested, then the effect is the same as option 1.

You may see messages like the following in the job's error output file:

These messages are not indicative of a problem.
They indicate that the Linda work in a Gaussian link is finished, and that Gaussian
is continuing with a new link. They can be ignored.

Not all Gaussian 09 calculations can be parallelized via Linda.
HF, CIS=Direct, and DFT calculations on molecules are Linda parallel, including energies,
optimizations and frequencies. TDDFT energies and gradients and MP2 energies and gradients
are also Linda parallel. Portions of MP2 frequency and CCSD calculations are Linda parallel,
but others are only SMP (shared memory) parallel, so they see some speedup from using a few
nodes but no further improvement from larger numbers of nodes.

Also, the amount of speedup that you will see depends upon how much parallelism
can be used for various types of calculations.
Gaussian 09 may not be able to keep all processors busy all of the time.
There is also additional overhead due to network communications between nodes.
So, doubling the number of nodes used on a job will not reduce the execution time by half.
In general, as you increase the number of processors and nodes used for a job, you can
expect to see diminishing returns on the amount of speedup achieved.
You will need to run tests with the types of calculations that you normally perform
in order to determine how much attempted parallelism will result in the most effective use of resources.
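One way to run such scaling tests is to submit the same input at several processor counts and then compare the elapsed times reported in the output files. A sketch, with hypothetical job names and the option 2 submission style (no span requirement):

```shell
# Submit the same Gaussian job at increasing processor counts,
# then compare run times in the scale*.o* output files afterwards.
for n in 2 4 8 16; do
  bsub -q gaussian -n $n -R "rusage[mem=2048]" \
       -J scale$n -o scale$n.o%J -e scale$n.e%J run_g09 test001.com
done
```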