Accounting for Bridges use

Accounting for Bridges use varies with the type of node used, which is determined by the type of allocation you have: "Bridges regular", for Bridges' RSM (128GB) nodes; "Bridges GPU", for Bridges' K80 and P100 GPU nodes; "Bridges GPU-AI", for Bridges' Volta GPU nodes and DGX-2 system; or "Bridges large", for Bridges' LSM and ESM (3TB and 12TB) nodes.

Usage is defined in terms of "Service Units" or SUs. The definition of an SU varies with the type of node being used.

Bridges regular

The RSM nodes are allocated as "Bridges regular". This does not include Bridges' GPU nodes. Each RSM node holds 28 cores, each of which can be allocated separately. Service Units (SUs) are defined in terms of "core-hours": the use of one core for one hour.

1 core-hour = 1 SU

Because the RSM nodes each hold 28 cores, if you use one entire RSM node for one hour, 28 SUs will be deducted from your allocation.

28 cores x 1 hour = 28 core-hours = 28 SUs

If you use 2 cores on a node for 30 minutes, 1 SU will be deducted from your allocation.

2 cores x 0.5 hours = 1 core-hour = 1 SU
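The arithmetic above can be sketched as a small shell helper (the function name is ours for illustration, not a Bridges command):

```shell
# Sketch: SUs charged for a job on the RSM ("Bridges regular") nodes.
# SUs = cores requested x wall-clock hours.
rm_sus() {
  awk -v cores="$1" -v hours="$2" 'BEGIN { print cores * hours }'
}

rm_sus 28 1     # one full node for one hour -> 28
rm_sus 2 0.5    # two cores for 30 minutes   -> 1
```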

Bridges large

The LSM and ESM nodes are allocated as "Bridges large". Accounting for the LSM and ESM nodes is done by the memory requested for the job. Service Units (SUs) are defined in terms of "TB-hours": the use of 1TB of memory for one hour. Note that because the memory requested for a job is set aside for your use when the job begins, SU usage is calculated based on memory requested, not on how much memory is actually used.

1 SU = 1 TB-hour

If your job requests 3TB of memory and runs for 1 hour, 3 SUs will be deducted from your allocation.

3TB x 1 hour = 3TB-hours = 3 SUs

If your job requests 8TB and runs for 6 hours, 48 SUs will be deducted from your allocation.

8TB x 6 hours = 48 TB-hours = 48 SUs
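The same pattern applies here, using requested memory rather than cores (again, the function name is ours, not a Bridges command):

```shell
# Sketch: SUs charged for an LSM/ESM ("Bridges large") job.
# SUs = TB of memory *requested* x wall-clock hours (not memory actually used).
lm_sus() {
  awk -v tb="$1" -v hours="$2" 'BEGIN { print tb * hours }'
}

lm_sus 3 1    # 3TB for 1 hour  -> 3
lm_sus 8 6    # 8TB for 6 hours -> 48
```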

Bridges GPU

Bridges contains two kinds of GPU nodes: NVIDIA Tesla K80s and NVIDIA Tesla P100s. Service Units (SUs) for GPU nodes are defined in terms of "gpu-hours": the use of one GPU unit for one hour.

Because of the difference in the performance of the nodes, SUs are calculated differently for the two types of nodes.

K80 nodes

The K80 nodes each hold 4 GPU units, which can be allocated separately. Service Units (SUs) are defined in terms of gpu-hours:

For K80 GPU nodes, 1 GPU-hour = 1 SU

If you use 2 entire K80 nodes for 1 hour, 8 SUs will be deducted from your allocation.

4 GPU units/node x 2 nodes x 1 hour = 8 gpu-hours = 8 SUs

If you use 2 GPU units for 3 hours, 6 SUs will be deducted from your allocation.

2 GPU units x 3 hours = 6 gpu-hours = 6 SUs
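For the K80 nodes the rate is one SU per gpu-hour, so the calculation is direct (the function name is ours for illustration):

```shell
# Sketch: SUs charged on the K80 GPU nodes (1 SU per gpu-hour).
k80_sus() {
  awk -v gpus="$1" -v hours="$2" 'BEGIN { print gpus * hours }'
}

k80_sus 8 1    # 2 full nodes (4 GPU units each) for 1 hour -> 8
k80_sus 2 3    # 2 GPU units for 3 hours                    -> 6
```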

P100 nodes

The P100 nodes each hold 2 GPU units, which can be allocated separately. Service Units (SUs) are defined in terms of gpu-hours. Because the P100s are more powerful than the K80s, the SU definition is different.

For P100 GPU nodes, 1 GPU-hour = 2.5 SUs

If you use an entire P100 node for one hour, 5 SUs will be deducted from your allocation.

2 GPU units/node x 1 node x 1 hour = 2 gpu-hours

2 gpu-hours x 2.5 SUs/gpu-hour = 5 SUs

If you use 1 GPU unit on a P100 for 8 hours, 20 SUs will be deducted from your allocation.

1 GPU unit x 8 hours = 8 gpu-hours

8 gpu-hours x 2.5 SUs/gpu-hour = 20 SUs
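The 2.5 SUs/gpu-hour rate can be folded into a one-line helper (the function name is ours, not a Bridges command):

```shell
# Sketch: SUs charged on the P100 GPU nodes (2.5 SUs per gpu-hour).
p100_sus() {
  awk -v gpus="$1" -v hours="$2" 'BEGIN { print gpus * hours * 2.5 }'
}

p100_sus 2 1    # full node (2 GPU units) for 1 hour -> 5
p100_sus 1 8    # 1 GPU unit for 8 hours             -> 20
```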

Bridges GPU-AI

Service Units (SUs) for GPU-AI nodes (Bridges' Volta GPU nodes and the DGX-2 system) are defined in terms of "gpu-hours": the use of one GPU unit for one hour.

DGX-2 node

The DGX-2 node holds 16 GPU units, each of which can be allocated separately. Service Units (SUs) are defined in terms of gpu-hours:

For the DGX-2 node, 1 GPU-hour = 1 SU

If you use 2 GPUs on the DGX-2 node for 1 hour, 2 SUs will be deducted from your allocation.

2 GPU units x 1 hour = 2 gpu-hours = 2 SUs

If you use the entire DGX-2 for 3 hours, 48 SUs will be deducted from your allocation.

16 GPU units x 3 hours = 48 gpu-hours = 48 SUs

Volta 16 nodes

The Volta 16 nodes each hold 8 GPU units, which can be allocated separately. Service Units (SUs) are defined in terms of gpu-hours.

For Volta 16 GPU nodes, 1 GPU-hour = 1 SU

If you use an entire Volta 16 node for one hour, 8 SUs will be deducted from your allocation.

8 GPU units/node x 1 node x 1 hour = 8 gpu-hours = 8 SUs

If you use 4 GPU units on a Volta 16 for 48 hours, 192 SUs will be deducted from your allocation.

4 GPU units x 48 hours = 192 gpu-hours = 192 SUs

Accounting for file space

Every Bridges grant has a pylon storage allocation associated with it. If you exceed your storage quota, you will not be able to submit jobs to Bridges.

Each grant has a Unix group associated with it. Every file is "owned" by a Unix group, and that file ownership determines which grant is charged for the file space. See "Managing multiple grants" for a further explanation of Unix groups, and how to manage file ownership if you have more than one grant.

Managing multiple grants

If you have multiple grants on Bridges, you should ensure that the work you do under each grant is correctly assigned to that grant. Files created under a given grant should belong to it, so that they are easily available to others on the same grant.

There are two fields associated with each grant for these purposes: a SLURM account id and a Unix group.

Unix groups determine which pylon5 allocation the storage space for files is deducted from, and who owns and can access a file or directory.

For a given grant, the SLURM account id and the Unix group are identical strings.

One of your grants has been designated as your default grant, and the account id and Unix group associated with the grant are your default account id and default Unix group.

When a Bridges job runs, any SUs it uses are deducted from the default grant. Any files created by that job are owned by the default Unix group.

Find your default account id and Unix group

To find your SLURM account ids, use the projects command. It will display all the grants you belong to. It will also list your default account id (called charge id in the projects output) at the top. Your default Unix group is the same.

For example, a user with two grants might see account ids account-1 and account-2 in the projects output, with account-2 listed as the charge id; account-2 is then the default account id and the default Unix group.

Use a secondary (non-default) grant

To use a grant other than your default grant on Bridges, you must specify the appropriate account id with the -A option to the SLURM sbatch command. See the Running Jobs section of this Guide for more information on batch jobs, interactive sessions and SLURM.
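A minimal sketch of a batch script that charges a secondary grant follows; the account id account-1, the partition name, and the script contents are placeholders, so substitute your own:

```shell
#!/bin/bash
#SBATCH -A account-1        # charge this grant instead of the default
#SBATCH -p RM               # partition (example only)
#SBATCH -t 01:00:00         # wall time
#SBATCH -N 1                # number of nodes

# your commands here
echo "Running under account-1"
```

Equivalently, the option can be given on the command line: sbatch -A account-1 myjob.sh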

Note that using the -A option does not change your default Unix group. Any files created during a job are owned by your default Unix group, no matter which account id is used for the job, and the space they use will be deducted from the pylon allocation for the default Unix group.

Change your Unix group for a login session

To temporarily change your Unix group, use the newgrp command. Any files created subsequently during this login session will be owned by the new group you have specified. Their storage will be deducted from the pylon allocation of the new group. After logging out of the session, your default Unix group will be in effect again.

newgrp unix-group

Note that the newgrp command has no effect on the account id in effect. Any Bridges usage will be deducted from the default account id or the one specified with the -A option to sbatch.

Change your default account id and Unix group permanently

You can permanently change your default account id and your default Unix group with the change_primary_group command. Type:

change_primary_group -l

to see all your groups. Then type

change_primary_group account-id

to set account-id as your default.

Your default account id changes immediately. Bridges usage by any batch jobs or interactive sessions started after this command is deducted from the new account by default.

Your default Unix group does not change immediately; it takes about an hour for the change to take effect. After that, you must log out and log back in for the new Unix group to become your default.

Tracking your usage

There are several ways to track your Bridges usage: the xdusage command, the projects command, and the Grant Management System.

The projects command shows information on all Bridges grants, including usage and the pylon directories associated with the grant.

For more detailed accounting data you can use the Grant Management System. You can also track your usage through the XSEDE User Portal. Note that the xdusage and projects commands and the XSEDE User Portal accurately reflect the impact of a grant renewal, but the Grant Management System currently does not.

Managing your XSEDE allocation

Most account management functions for your XSEDE grant are handled through the XSEDE User Portal. You can search the Knowledge Base for help; some common questions are addressed below.

Changing your default shell

The change_shell command allows you to change your default shell. This command is only available on the login nodes.

To see which shells are available, type

change_shell -l

To change your default shell, type

change_shell newshell

where newshell is one of the choices output by the change_shell -l command. You must use the entire path output by change_shell -l, e.g. /usr/psc/shells/bash. You must log out and back in again for the new shell to take effect.
