The GC3Utils are lower-level commands, provided to perform common
operations on jobs, regardless of their type or the application they run.

For instance, GC3Utils provide commands to obtain the list and status
of computational resources (gservers); to clear the list of jobs from
old and failed ones (gclean); to get detailed information on a
submitted job (ginfo, mainly for debugging purposes).

This chapter is a tutorial for the GC3Utils command-line utilities.

If you find a technical term whose meaning is not clear to you, please
look it up in the Glossary. (But feel free to ask on the
GC3Pie mailing list if it’s still unclear!)

All jobs managed by one of the GC3Pie scripts are grouped into
sessions; information related to a session is stored in a
directory. The gsession command allows you to show
the jobs belonging to a specific session, to abort the session, or to
delete it completely.

The gsession command accepts two mandatory arguments: command and
session. command must be one of:

list

list jobs related to the session.

log

show the session history.

abort

kill all jobs related to the session.

delete

abort the session and delete the session directory from disk.

For instance, if you want to check the status of the main tasks of a
session, just run:
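For example (gsession takes the path to the session directory; SESSION_DIR
below is a placeholder for your actual session directory):

```shell
# Show the main tasks of the session and their current state
gsession list SESSION_DIR
```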

If you have never submitted any job, or if you have cleared
your job list with the gclean command, then gstat will
print nothing to the screen!

A job can be in one and only one of the following states:

NEW

The job has been created but not yet submitted: it only exists on
the local disk.

RUNNING

The job is currently running – there’s nothing to do but wait.

SUBMITTED

The job has been sent to a compute resource for execution – it
should change to RUNNING status eventually.

STOPPED

The job was sent to a remote cluster for execution, but it is stuck
there for some unknown reason. There is no automated recovery
procedure in this case: the best thing you can do is to contact the
systems administrator to determine what has happened.

UNKNOWN

No information about the job can be found, possibly because the remote
resource is currently not accessible due to a network error or a
misconfiguration, or because the remote resource is no longer
available. Once the root cause is fixed and the resource is reachable
again, the job should automatically move to another state.

TERMINATED

The job has finished running; now there are three things you can do:

Use the gget command to get the command output files back
from the remote execution cluster.

Use the gclean command to remove this job from the list.
After issuing gclean on a job, any information on it is
lost, so be sure you have retrieved any interesting output with
gget before!

If something went wrong during the execution of the job (it did
not complete its execution, or possibly did not even
start), you can use the ginfo command to try to debug the
problem.
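Putting the first two options together: a typical sequence for a finished
job might be (job.15 is a placeholder job ID):

```shell
# Retrieve the output files first, then remove the job from the list;
# gclean discards all information about the job, so gget must come first.
gget job.15 && gclean job.15
```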

The list of submitted jobs persists from one login session to the next:
you can log off, shut your computer down, then turn it on again the next
day, and you will see the same list of jobs.

Note

Completed jobs persist in the gstat list until they are
cleared off with the gclean command.

Once a job has reached RUNNING status (check with gstat), you
can also monitor its progress by looking at the last lines in the job
output and error stream.

An example might clarify this: assume you have submitted a
long-running computation as job.16 and you know from gstat that
it got into RUNNING state; then to take a peek at what this job is
doing, you issue the following command:

gtail job.16

This would produce output similar to the following, from which you can
deduce how far GAMESS has progressed in the computation:

By default, gtail only outputs the last 10 lines of a job
output/error stream. To see more, use the command line option -n;
for example, to see the last 25 lines of the output, issue the command:

gtail -n 25 job.16

The command gtail is especially useful for long computations: you
can see how far a job has gotten and, e.g., cancel it if it has gotten
stuck in an endless or unproductive loop.

To “keep an eye” over what a job is doing, you can add the -f option
to gtail: this will run gtail in “follow” mode, i.e.,
gtail will continue to display the contents of the job output and
update it as time passes, until you hit Ctrl+C to interrupt it.
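For example, to watch the output of job.16 as it grows:

```shell
# Follow the job's output stream; press Ctrl+C to stop watching.
# Interrupting gtail does not affect the job itself.
gtail -f job.16
```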

Once a job has reached TERMINATED status (check with gstat),
you can retrieve its output files with the gget command. For
instance, to download the output files of job.15 you would use:

gget job.15

This command will print out a message like:

Job results successfully retrieved in '/path/to/some/directory'

If you are not running the gget command on your computer, but
rather on a shared front-end like ocikbgtw, you can copy+paste the
path within quotes to the sftp command to get the files to your
usual workstation. For example, you can run the following command in
a terminal on your computer to get the output files back to your
workstation:

sftp ocikbgtw:'/path/to/some/directory'

This will take you to the directory where the output files have been stored.

In case a job failed for accidental causes (e.g., the site where it
was running went unexpectedly down), you can re-submit it with the
gresub command.

Just call gresub followed by the job identifier
job.NNN. For example:

gresub job.42

Resubmitting a job that is not in a terminal state (i.e.,
TERMINATED) results in the job being killed (as with gkill)
before being submitted again. If you are unsure what state
a job is in, check it with gstat.
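For example, a cautious resubmission might check the job's state first
(job.42 as above):

```shell
# Verify the job is in a terminal state, then resubmit it;
# if it were still SUBMITTED or RUNNING, gresub would kill it first.
gstat job.42
gresub job.42
```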

The gservers command prints out information about the configured resources.
For each resource, a summary of the information recorded in the configuration
file and the current resource status is printed. For example:

The title of each box is the “resource name”, as you would write it
after the -r option to gsub.

Access mode / type: it is the kind of software that is used for
accessing the resource; consult Section Configuration File for more
information about resource types.

Authorization name / auth: this is paired with the Access mode /
type, and identifies a section in the configuration file where authentication information for this
resource is stored; see Section Configuration File for more
information.

Accessible? / updated: whether you are currently authorized to
access this resource; note that if this turns False or 0 for
resources that you should have access to, then something is wrong
either with the state of your system, or with the resource itself.
(The procedure on how to diagnose this is too complex to list here;
consult your friendly systems administrator :-))

Total number of cores: the total number of cores present on the
resource. Note this can vary over time as cluster nodes go in and
out of service: computers break, then are repaired, then break
again, etc.

Total queued jobs: number of jobs (from all users) waiting to be
executed on the remote compute cluster.

Own queued jobs: number of jobs (submitted by you) waiting to be
executed on the remote compute cluster.

Own running jobs: number of jobs (submitted by you) currently
executing on the remote compute cluster.

Max cores per job: the maximum number of cores that you can
request for a single computational job on this resource.

Max memory per core: maximum amount of memory (per core) that you
can request on this resource. The amount shows the maximum
requestable memory in MB.

Max walltime per job: maximum duration of a computational job on
this resource. The amount shows the maximum time in seconds.

The whole point of GC3Utils is to abstract job submission and
management from detailed knowledge of the resources and their hardware
and software configuration, but it is sometimes convenient and
sometimes necessary to get into this level of detail...

The gselect command allows you to select Job IDs from a
GC3Pie session that satisfy the selected criteria. This command is
usually used in combination with gresub, gkill,
ginfo, gget or gclean, for instance:

$ gselect -l STOPPED | xargs gresub

The output of this command is a list of Job IDs, one per line. The
criteria specified by command-line options will be AND’ed together,
i.e., a job must satisfy all of them in order to be selected.
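The pipeline pattern itself is plain shell: gselect prints one job ID per
line, and xargs appends those IDs as arguments to the chosen command. You
can see the mechanics with echo standing in for a real GC3Utils command:

```shell
# Simulated gselect output: one job ID per line.
# xargs collects the IDs and appends them to the given command,
# exactly as in `gselect -l STOPPED | xargs gresub`.
printf 'job.1\njob.2\n' | xargs echo gresub
```

This prints gresub job.1 job.2, i.e., the command line that the real
pipeline would execute.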

Use option --state STATE[,STATE...] to select jobs in one of the
specified states; for instance, to select jobs in either STOPPED
or SUBMITTED state, run gselect --state STOPPED,SUBMITTED.

Exit status

You can select jobs that terminated with exit status equal to 0
with the --ok option. To select failed jobs instead (exit status
different from 0), use option --failed.

Submission time

Use options --submitted-before DATE and --submitted-after DATE to select
jobs submitted before or after a specific date. DATE must be in
a human-readable format recognized by the parsedatetime module
(https://pypi.python.org/pypi/parsedatetime/), for
instance in 2 hours, yesterday, or 10 November 2014, 1pm.
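Since the criteria are AND'ed together, they can be combined; for
instance (a hypothetical combination of the options above), to fetch the
output of all successful jobs submitted in the last day:

```shell
# Select jobs that exited with status 0 and were submitted after
# yesterday, then download their output files with gget.
gselect --ok --submitted-after 'yesterday' | xargs gget
```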

The gcloud list command will show various information, if available,
including the number of jobs currently running (or in TERMINATED
state) on each VM, so that you can easily identify whether a VM is
not used by any of your scripts and can safely be terminated.

If you want to terminate a VM, run the gcloud terminate command. In
this case, however, you also have to specify the name of the resource
with the option -r, as well as the ID of the VM you want to terminate:

$ gcloud terminate -r hobbes i-0000053e

An empty output indicates that the VM has been terminated.

The EC2 backend keeps track of all the VMs it creates, so that when a
VM is no longer needed it can terminate it
automatically. Sometimes, however, you may need to keep a VM up and
running, and thus need to tell the EC2 backend to ignore that VM.

This is possible with the gcloud forget command. You must supply the
correct resource name with -r RESOURCE_NAME and a valid VM ID; if
the command succeeds, the VM will no longer be used by the EC2
backend. Note also that after running gcloud forget, the VM
will not be shown in the output of gcloud list:

$ gcloud list -r hobbes
====================================
VMs running on EC2 resource `hobbes`
====================================
no known VMs are currently running on this resource.

You can also create a new VM with the default settings using the
gcloud run command. In this case too, you have to specify the -r
command-line option. The output of this command contains some basic
information about the newly created VM: