PBS Introduction

PBS is an alternative to our current method of managing machines on the cluster: jobs are managed by a scheduler, in the hope of better resource utilization. Because it can handle the submission of a large number of jobs, it will enable a number of new workflows. It should also be harder to overload a machine's CPU or RAM (assuming cooperative participation). We are experimenting with using PBS to manage a portion of the nodes on the cluster (currently some subset of the jude nodes). Since PBS doesn't know about non-PBS usage of a machine, you will not be able to ssh to machines managed by PBS.

nlpsub --tail xyzzy (run 'xyzzy' in batch mode but tail the output as it appears -- note that NFS will cause the output to lag, so it will not be realtime)

Queues and priorities can be abbreviated, so "nlpsub -ppreemptable -qverylong xyzzy" could also be typed as:

nlpsub -pp -qv xyzzy

Other critical commands:

showq (display status of the cluster)

qdel <jobnumber> (kill a running job or unqueue a scheduled job)

Basics

There are a few commands you should know: nlpsub (submitting jobs), showq (monitoring the cluster), and qdel/qhold/qrls (managing queued and running jobs). All commands can be run from any machine on the cluster.

nlpsub

nlpsub helps you submit jobs to the grid. For those who remember it, it is similar to qqqsub (though it is not a drop-in replacement). nlpsub includes a help menu which you should familiarize yourself with (type nlpsub -h or just nlpsub with no arguments to display it).

There are two classes of jobs -- interactive and batch. Interactive jobs are just like ssh'ing to a machine -- if you type nlpsub -i, nlpsub will put you on a free core. Note that you do not get exclusive use of a machine unless you ask for it. Batch jobs run outside of a terminal. For these, nlpsub will create a directory inside the current directory to store the stdout/stderr of your command.

To run a command on PBS, just type nlpsub [command] [arguments to command]. Note that if you're running a Java command, nlpsub will automatically detect memory use from -Xmx flags. For non-Java commands, you will need to specify memory use yourself.
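To make the -Xmx detection concrete, here is a minimal sketch of scanning a command line for such a flag. This is not nlpsub's actual code, and its real behavior may differ. The function name detect_xmx is made up for illustration, and taking the last -Xmx flag when several are present is an assumption of this sketch.

```shell
# Hypothetical helper, NOT part of nlpsub: print the value of the last
# -Xmx flag among the arguments (e.g. "4g"), or nothing if there is none.
detect_xmx() {
    found=""
    for arg in "$@"; do
        case "$arg" in
            # Match -Xmx followed by at least one character, keep the rest.
            -Xmx?*) found="${arg#-Xmx}" ;;
        esac
    done
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
    fi
    return 0
}
```

For example, detect_xmx java -Xmx4g -cp foo.jar Main prints 4g, while a command with no -Xmx flag prints nothing, which is the case where you have to state the memory requirement yourself.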

At present, nlpsub only supports submitting a single command at a time (except for [array jobs]), so you'll need to run nlpsub once for each command. If there's demand, support for submitting a whole list of commands at once will be added.
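Until then, running nlpsub over each command can be scripted. The following is a minimal sketch, assuming one command per line in a plain text file. The function name submit_all and the SUBMIT_CMD variable are made up for illustration and are not part of nlpsub.

```shell
# Submit every non-empty, non-comment line of a file as its own job.
# SUBMIT_CMD defaults to nlpsub; set SUBMIT_CMD=echo for a dry run.
submit_all() {
    file="$1"
    submit="${SUBMIT_CMD:-nlpsub}"
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$line" in
            ''|'#'*) continue ;;
        esac
        # Intentionally unquoted: the shell splits the line into a command
        # plus arguments (quoted arguments are not supported, and glob
        # characters in the line will be expanded).
        # stdin comes from /dev/null so the submit command cannot
        # swallow the rest of the command list.
        $submit $line < /dev/null
    done < "$file"
}
```

Running SUBMIT_CMD=echo submit_all commands.txt first prints what would be submitted without touching the queue (commands.txt is a placeholder file name).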

qdel/qhold/qrls

These commands operate on PBS job IDs. When you submit a job with nlpsub, nlpsub will report the job's PBS job ID. [showq] lists these IDs as well.

qdel deletes jobs from the queue and kills running jobs. qhold will put a hold on a job in the queue (thus causing it to not be run). This can be used when you want to let someone else run ahead of you or if you've made a mistake in your job that you'd like to correct first. qrls removes the hold on a queued job.
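The "hold, correct the mistake, release" workflow can be wrapped in a small helper. This is only a sketch: the function name fix_while_held and the QHOLD_CMD/QRLS_CMD variables are made up for illustration, and the job ID in the usage note is a placeholder. It assumes qhold and qrls take the job ID as their argument, as described above.

```shell
# Hold a queued job, run a fix-up command, and release the job only if
# the fix-up succeeded. QHOLD_CMD and QRLS_CMD default to the real PBS
# commands; override them (e.g. QHOLD_CMD="echo qhold") for a dry run.
fix_while_held() {
    jobid="$1"
    shift
    # Unquoted on purpose so an override like "echo qhold" splits in two.
    ${QHOLD_CMD:-qhold} "$jobid" || return 1
    if "$@"; then
        ${QRLS_CMD:-qrls} "$jobid"
    else
        # Leave the job held so it does not run with the mistake in place.
        echo "fix-up failed; job $jobid left on hold" >&2
        return 1
    fi
}
```

For example, fix_while_held 12345 cp fixed.conf job.conf would hold job 12345, copy the corrected file into place, and then release the job. If the copy fails, the job stays on hold until you run qrls or qdel yourself.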