An Incomplete Guide to ICS-ACI for SCRiM Users

Robert Nicholas (ren10@psu.edu), Earth and Environmental Systems Institute, Penn State University

updated 19 October 2018

NOTE: This guide documents the ICS-ACI 3.1 system, which was brought online in late May 2017. ICS-ACI 1.0 has been retired and is no longer accessible.

Systems and Services

SCRiM’s resources with ICS-ACI comprise a computing allocation of 400 “standard-memory” cores and a 400 TB storage allocation, as codified in a 5-year service level agreement (SLA) that ends on 31 January 2020. Services available to SCRiM researchers under this agreement include:

ICS-ACI 3.1 batch system (cluster) aci-b:

replacement for now-retired ACI 1.0 and legacy Lion-X systems

shell access via SSH: ssh -Y username@aci-b.aci.ics.psu.edu

time-averaged usage of up to 400 cores (averaged over a 90-day moving window) and instantaneous (a.k.a. “burst”) usage of up to 1600 cores with a guaranteed response time of one hour or less on the “kzk10_a_g_sc_default” queue (this is a change from ACI 1.0)

unlimited use of the open queue up to 100 cores (total, not per job) with a maximum wall clock time of 48 hours and a maximum of 100 queued jobs (these jobs may be preempted at any time)

up to 400 TB in our group storage pool, /gpfs/group/kzk10/default, shared with all other SCRiM users (this is a change from ACI 1.0)

up to 1 million files in your scratch directory, /gpfs/scratch/<username>; this storage resource is intended for temporary files and is not backed up; files residing here for more than 30 days may be automatically deleted

all filesystems are accessible from both aci-b and aci-i

move files to and from these storage pools via the dedicated file transfer node datamgr.aci.ics.psu.edu, using scp from the command line or a graphical SFTP client (note that transfers to/from woju should use the hostname woju-rn.scrim.psu.edu to bypass a currently-unresolved network issue)
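As a sketch, a transfer through datamgr might look like the following (the filenames and the username are placeholders; substitute your own PSU access account ID and paths):

```shell
# Push a local file to the SCRiM group storage pool via the transfer node:
scp myresults.nc username@datamgr.aci.ics.psu.edu:/gpfs/group/kzk10/default/

# Pull a file from your scratch directory back to your local machine:
scp username@datamgr.aci.ics.psu.edu:/gpfs/scratch/username/output.log .
```

A graphical SFTP client would use the same hostname (datamgr.aci.ics.psu.edu) and credentials.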

These systems are managed separately from other SCRiM computing resources (woju, mizuna, napa, RStudio Server, and the web-based download server), which are hosted by the Penn State Meteorology Computing Facility and documented separately.

Getting an Account

The above resources are available to all SCRiM researchers, including non-PSU SCRiM participants. Contact SCRiM Managing Director Robert Nicholas (ren10@psu.edu) for details on how to gain access.

Submitting Jobs on the Batch System

qsub -A kzk10_a_g_sc_default <yourrunscript>

alternately, add #PBS -A kzk10_a_g_sc_default to your run script

to use the open queue, replace kzk10_a_g_sc_default with open
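Putting the above together, a minimal run script might look like the following sketch (the resource requests and the program name are illustrative, not recommendations):

```shell
#!/bin/bash
# Minimal example PBS run script for aci-b (resource requests are illustrative).
#PBS -A kzk10_a_g_sc_default   # charge to the SCRiM allocation (replace with "open" for the open queue)
#PBS -l nodes=1:ppn=4          # request one node with four cores
#PBS -l walltime=02:00:00      # request two hours of wall clock time
#PBS -j oe                     # merge stdout and stderr into a single output file

cd "$PBS_O_WORKDIR"            # start in the directory from which qsub was run
./myprogram                    # replace with your actual executable
```

Saved as, say, myrunscript.pbs, this would be submitted with qsub myrunscript.pbs (no -A flag needed, since the allocation is set inside the script).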

Using the Modules System

The modules system controls which software packages are available in your current environment.

to list all available modules: module avail

to search available modules: module avail <keyword>

to load a specific software package: module load <modulename>

to show modules currently loaded in your environment: module list

to show paths for executables and libraries associated with a particular module (whether it is loaded or not): module show <modulename>
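A typical session combining these commands might look like the following (gcc is used here only as an illustration; run module avail on aci-b to see which modules and versions are actually installed):

```shell
module avail gcc    # search for available GCC modules
module load gcc     # load the default GCC module
module list         # confirm which modules are now loaded
module show gcc     # inspect the paths and variables the module sets
module unload gcc   # remove the module from the environment again
```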