1.Pegasus Supercomputer

The Pegasus cluster is the University of Miami’s 350-node high-performance supercomputer, available to all University of Miami employees and students. Pegasus resources such as hardware (login and compute nodes) and system software are shared by all users.

We encourage new users to carefully read our documentation on Pegasus and available resources, especially users who may be unfamiliar with high-performance computing, Unix-based systems, or batch job scheduling. Understanding what your jobs do on the cluster helps keep Pegasus running smoothly for everyone.

Do not run resource-intensive jobs on the Pegasus login nodes. Submit your production jobs to LSF, and use the interactive queue (link) – not the login nodes – for resource-intensive command-line processes. You may compile and test jobs on login nodes. However, any jobs exceeding 30 minutes of run time or using excessive resources on the login nodes will be terminated and the CCS account responsible for those jobs may be suspended.

Reserve an appropriate amount of resources through LSF for your jobs. If you do not know the resources your jobs need, use the debug queue to benchmark your jobs. More on Pegasus Queues (link) and LSF Job Scripts (link) below. Jobs with insufficient resource allocations interfere with cluster performance and the CCS account responsible for those jobs may be suspended.

Stage data for running jobs exclusively in the /scratch file system, which is optimized for fast data access. Any files used as input for your jobs must first be transferred to /scratch. The /nethome file system is optimized for mass data storage and is therefore slower-access. Using /nethome while running jobs degrades the performance of the entire system and the CCS account responsible may be suspended.

1.a.Pegasus Environment

Connecting to Pegasus (link): To access the Pegasus supercomputer, open a secure shell (SSH) connection to pegasus.ccs.miami.edu and log in with your active CCS account. Once authenticated, you should see the Pegasus welcome message — which includes links to Pegasus documentation and information about your disk quotas — then the Pegasus command prompt.

The landing location on Pegasus is your home directory, which corresponds to /nethome/username. As shown in the Welcome message, Pegasus has two parallel file systems available to users: nethome and scratch.

Do not stage job data in the /nethome file system. If your jobs read files from Pegasus, put those files exclusively in the /scratch file system.

File System | Description | Notes
/nethome | permanent, quota'd, not backed-up | directories are limited to 250GB and intended primarily for basic account information, source codes, and binaries
/scratch | high-speed purged storage | directories should be used for compiles and run-time input & output files

Transferring files (link): Transfer data between Pegasus file systems (nethome or scratch) and local machines with secure copy (SCP) or secure FTP (SFTP). Use Pegasus login nodes for these types of transfers. See the link for more information about transferring large amounts of data from systems outside the University of Miami.

Software on Pegasus (link): To use system software on Pegasus, first load the software using the module load command. Some modules are loaded automatically when you log into Pegasus. The modules utility handles any paths or libraries needed for the software to run. You can view currently loaded modules with module list and check available software with module avail package.

Do not run production jobs on the login nodes. Once your preferred software module is loaded, submit a job to the Pegasus job scheduler to use it.

Job submissions (link): Pegasus cluster compute nodes are the workhorses of the supercomputer, with significantly more resources than the login nodes. Compute nodes are grouped into queues and their available resources are assigned through scheduling software (LSF). To do work on Pegasus, submit either a batch or an interactive job to LSF for an appropriate queue.

In shared-resource systems like Pegasus, you must tell the LSF scheduler how much memory, CPU, time, and other resources your jobs will use while they are running. If your jobs use more resources than you requested from LSF, those resources may come from other users’ jobs (and vice versa). This not only negatively impacts everyone’s jobs, it degrades the performance of the entire cluster. If you do not know the resources your jobs will use, benchmark them in the debug queue.

To test code interactively or install extra software modules at a prompt (such as with Python or R), submit an interactive job to the interactive queue in LSF. This places you on a compute node for your work, and you will be returned to a login node upon exiting the job. Use the interactive queue for resource-intensive command-line jobs such as sort, find, awk, sed, and others.

Mac: Connect with X11 flag

Using either the Mac Terminal or the xterm window, connect using the -X flag:

bash-4.1$ ssh -X username@pegasus.ccs.miami.edu

Launch a graphical application

Use & after the command to run the application in the background, allowing continued use of the terminal.

[username@pegasus ~]$ firefox &

Connecting to Pegasus from offsite

Pegasus and other CCS resources are only available from within the University’s secure campus networks (wired or SecureCanes wireless). To access Pegasus while offsite, open a VPN connection first. CCS does not administer VPN accounts.

1.c.Transferring Files to Pegasus

Pegasus supports multiple file transfer programs such as FileZilla and PSFTP, and common command line utilities such as scp and rsync. Use Pegasus head nodes (login nodes) for these types of file transfers. For transferring large amounts of data from systems outside the University of Miami, CCS AC also offers a gateway server that supports SFTP and Globus.

Using command line utilities

Use cp to copy files within the same computation system. Use scp, sftp, or rsync to transfer files between computational systems (e.g. Pegasus scratch to Visx project space). When executing multiple instantiations of command line utilities like rsync and scp, please limit your transfers to no more than 2-3 processes at a time.

rsync

The rsync command is another way to keep data current. In contrast to scp, rsync transfers only the changed parts of a file (instead of transferring the entire file). Hence, this selective method of data transfer can be much more efficient than scp. The following example demonstrates usage of the rsync command for transferring a file named “firstExample.c” from the current location to a location on Pegasus.
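
A minimal sketch of that transfer (the destination directory on Pegasus is a hypothetical placeholder):

[localmachine: ~]$ rsync -av firstExample.c username@pegasus.ccs.miami.edu:/scratch/username/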

An entire directory can be transferred from source to destination with rsync. For directory transfers, the options -atvr will transfer the files recursively (-r option), preserving modification times (-t option), in archive mode (-a option), with verbose output (-v option). Consult the Linux man pages for more information on rsync.
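
A corresponding directory-transfer sketch (the directory name and destination are hypothetical placeholders):

[localmachine: ~]$ rsync -atvr myProject username@pegasus.ccs.miami.edu:/scratch/username/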

FileZilla

Selecting Logon Type: "Ask for password" will prompt for a password on each connection.

Click the “Connect” button. Once connected, drag and drop files or directories between your local machine and the server.

Using the gateway server

To transfer large amounts of data from systems outside the University of Miami, use the gateway server. This server supports Globus and SFTP file transfers. Users must be a member of a project to request access to the gateway server. E-mail hpc@ccs.miami.edu to request access.

SFTP

Open an SFTP session to the gateway server using your CCS account credentials: gw.ccs.miami.edu

[localmachine: ~]$ sftp username@gw.ccs.miami.edu
...
sftp>

Globus

Users can log into Globus with UM CaneID account credentials (previous Globus IDs can also be linked with UM Globus accounts). Once logged in, create an endpoint with our Globus Server information to begin transferring files. Our Globus Server requires authentication: the identity provider is MyProxy, and the Host and DN are the same as our Globus Server's.

1.d.Projects on Pegasus

Access to CCS Advanced Computing resources is managed on a project basis. This allows us to better support interaction between teams (including data sharing) at the University of Miami regardless of group, school, or campus. Project-based resource allocation also gives researchers the ability to request resources for short-term work. Any University of Miami faculty member or Principal Investigator (PI) can request a new project. All members of a project share that project’s resource allocations.

To join a project, contact the project owner. Details can be found on the CCS Portal.

Using projects in computing jobs

To run jobs using your project’s resources, submit jobs with your assigned projectID using the -P argument to bsub: bsub -P projectID. For more information about LSF and job scheduling, see Scheduling Jobs on Pegasus (link).

For example, if you were assigned the project id “abc”, a batch submission from the command line would look like:

$ bsub -P abc < JOB_SCRIPT_NAME

and an interactive submission from the command line would look like:

$ bsub -P abc -Is -q interactive -XF command

When your job has been submitted successfully, the project and queue information will be printed on the screen.

The cluster scheduler will only accept job submissions to active projects. The CCS user must be a current member of that project.

2.Software on the Pegasus Cluster

CCS continually updates applications, compilers, system libraries, etc. To facilitate this task and to provide a uniform mechanism for accessing different revisions of software, CCS uses the modules utility. At login, module commands set up a basic environment for the default compilers, tools, and libraries by setting environment variables such as $PATH, $MANPATH, and $LD_LIBRARY_PATH. There is no need to set or update these variables when system and application software is updated.

From Pegasus, users can view currently loaded modules with module list and check available software with module avail package (omitting the package name will show all available modules). Some modules are loaded automatically upon login:
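
As a quick illustration of these commands (the package name here is a hypothetical example; check module avail for what is actually installed):

[username@pegasus ~]$ module list                # show currently loaded modules
[username@pegasus ~]$ module avail               # show all available modules
[username@pegasus ~]$ module avail python        # show available versions of one package
[username@pegasus ~]$ module load python         # load the default version of that package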

2.a.Application Development

A list of software modules installed on Pegasus, including MP libraries, is available on the CCS Portal. MPI and OpenMP modules are listed under Intel and GCC compilers. These MP libraries have been compiled and built with either the Intel compiler suite or the GNU compiler suite.

The following sections present the compiler invocations for serial and MP executions. All compiler commands can be used either to compile only (with the -c flag, producing ".o" object files) or to compile and link (producing executables). To use a different (non-default) compiler, first unload intel, swap the compiler environment, and then reload the MP environment if necessary. Only one MP module should be loaded at a time.

Compiling Serial Code

Pegasus has Intel and GCC compilers.

Vendor | Compiler | Module Command | Example
intel | icc (default) | module load intel | icc -o foo.exe foo.c
intel | ifort (default) | module load intel | ifort -o foo.exe foo.f90
gnu | gcc | module load gcc | gcc -o foo.exe foo.c
gnu | gfortran | module load gcc | gfortran -o foo.exe foo.f90

Configuring MPI on Pegasus

There are three ways to configure MPI on Pegasus. Choose the option that works best for your job requirements.

Add the module load command to your startup files.
This is most convenient for users requiring only a single version of MPI. This method works with all MPI modules.

Load the module in your current shell.
For current MPI versions, the module load command does not need to be in your startup files. Upon job submission, the remote processes will inherit the submission shell environment and use the proper MPI library. This method does not work with older versions of MPI.

Load the module in your job script.
This is most convenient for users requiring different versions of MPI for different jobs. Ensure your script can execute the module command properly. For job script information, see Scheduling Jobs on Pegasus (direct link).

Compiling Parallel Programs with MPI

Pegasus supports Intel MPI and OpenMP for Intel and GCC compilers.

The Message Passing Interface (MPI) library allows processes in a parallel application to communicate with one another. There is no default MPI library in your Pegasus environment. Choose the desired MPI implementation for your applications by loading an appropriate MPI module. Recall that only one MPI module should be loaded at a time.
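
A hedged sketch of compiling an MPI program with the Intel toolchain (the MPI module name is an assumption; check module avail for the exact name on Pegasus):

[username@pegasus ~]$ module load intel           # Intel compiler suite
[username@pegasus ~]$ module load impi            # assumed Intel MPI module name
[username@pegasus ~]$ mpiicc -o foo_mpi.exe foo_mpi.c       # Intel MPI wrapper for icc
[username@pegasus ~]$ mpiifort -o foo_mpi.exe foo_mpi.f90   # Intel MPI wrapper for ifort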

2.c.Installing Software on Pegasus

Pegasus users are free to compile and install software in their own home directories by following the software's source-code or local-installation instructions.

Source code software installations ("compilations") can only be performed in your local directories. Users of Pegasus are not administrators of the cluster and therefore cannot install software with the sudo command (or with package managers like yum / apt-get). If the software publisher does not provide compilation instructions, look for instructions on installing to a non-standard location.

In general, local software installation involves:

confirming pre-requisite software & library availability, versions

downloading and extracting files

configuring the installation prefix to a local directory (compile only)

compiling the software (compile only)

updating PATH and creating symbolic links (optional)

Confirm that your software’s pre-requisites are met, either in your local environment or on Pegasus as a module. You will need to load any Pegasus modules that are pre-requisites and install locally any other pre-requisites.

Downloading and extracting files

If necessary, create software directories under your home directory:

[username@pegasus ~]$ mkdir ~/local ~/src

We suggest keeping your compiled software separate from any downloaded files. Consider keeping downloaded binaries (pre-compiled software) separate from source files if you will be installing many different programs. These directories do not need to be named exactly as shown above.

Navigate to the src directory and download files:
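
A minimal sketch, assuming a source tarball downloaded over HTTP (the URL and package name are hypothetical):

[username@pegasus ~]$ cd ~/src
[username@pegasus src]$ wget http://example.org/downloads/software-1.0.tar.gz
[username@pegasus src]$ tar -xvzf software-1.0.tar.gz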

Some programs require configuration and compilation (like autoconf). Other programs are pre-compiled and simply need to be extracted (like Firefox). Read and follow all instructions provided for each program.

Make and install the software:
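
A minimal sketch of a typical configure-and-make build, continuing the hypothetical software-1.0 package extracted above and using the ~/local prefix created earlier:

[username@pegasus src]$ cd software-1.0
[username@pegasus software-1.0]$ ./configure --prefix=$HOME/local/software-1.0
[username@pegasus software-1.0]$ make
[username@pegasus software-1.0]$ make install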

If there are dependencies or conflicts, investigate the error output and try to resolve each error individually (install missing dependencies, check for specific flags suggested by software authors, check your local variables).

Updating PATH

PATH directories are searched in order. To ensure your compiled or downloaded software is found and used first, prepend the software executable location (usually in software/bin or software directories) to your PATH environment variable. Remember to add :$PATH to preserve existing environment variables.
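
For example, to prepend the hypothetical installation above to your PATH for the current session (add the same line to your ~/.bash_profile to make it permanent):

[username@pegasus ~]$ export PATH=$HOME/local/software-1.0/bin:$PATH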

Check software version:
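
For example (the executable name is hypothetical; many programs accept --version or -v):

[username@pegasus ~]$ which software
[username@pegasus ~]$ software --version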

Creating symbolic links

To maintain multiple different versions of a program, use soft symbolic links to differentiate between the installation locations. Make sure the link and the directory names are distinct (example below). If local software has been kept in subdirectories with application names and version numbers, symlinks are not likely to conflict with other files or directories.

Create a distinctly-named symlink:
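
Based on the firefox example described below, the command would look like:

[username@pegasus ~]$ ln -s ~/local/firefox/36/firefox ~/local/firefox36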

This symbolic link should point to the local software executable. The first argument is the local software executable location (~/local/firefox/36/firefox). The second argument is the symlink name and location (~/local/firefox36).

2.d.Using Python Modules on Pegasus

Users are free to compile and install Python modules in their own home directories on Pegasus. Most Python modules can be installed with the --user flag using PIP, easy_install, or the setup.py file provided by the package. If you need a specific version of a Python module, we suggest using PIP with a direct link or downloading, extracting, and installing using setup.py. If you need to maintain multiple versions, see Python Virtual Environments (below).

The --user flag will install Python 2.7 modules here: ~/.local/lib/python2.7/site-packages
Note the default location ~/.local is a hidden directory. If the Python module includes executable programs, they will usually be installed into ~/.local/bin.

To specify a different location, use --prefix=$HOME/local/python2mods (or another path).
The above prefix flag example will install Python 2.7 modules here: ~/local/python2mods/lib/python2.7/site-packages
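
For example, using the munkres module mentioned later in this guide (the module choice is arbitrary, and the second line assumes you have downloaded and extracted the module's source):

[username@pegasus ~]$ pip install --user munkres
[username@pegasus ~]$ python setup.py install --prefix=$HOME/local/python2mods    # run from the extracted source directory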

Checking Module Versions
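
One way to check an installed module's version (a sketch; munkres is an arbitrary example, and not every module defines __version__):

[username@pegasus ~]$ pip freeze | grep -i munkres
[username@pegasus ~]$ python -c "import munkres; print munkres.__version__"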

2.d.e.Python Virtual Environments

Users can create their own Python virtual environments to maintain different module versions for different projects. Virtualenv is available on Pegasus for Python 2.7.3. By default, virtualenv does not include packages that are installed globally. To give a virtual environment access to the global site packages, use the --system-site-packages flag.

Creating Virtual Environments

These example directories do not need to be named exactly as shown.

Create a project folder, cd to the new folder (optional), and create a virtualenv:
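
A minimal sketch (the folder name is arbitrary; the environment name test1 matches the prompts shown later in this section):

[username@pegasus ~]$ mkdir ~/projects && cd ~/projects      # optional project folder
[username@pegasus projects]$ virtualenv test1
New python executable in test1/bin/python
Installing setuptools, pip...done.
[username@pegasus projects]$ source test1/bin/activate
(test1)[username@pegasus projects]$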

Installing Python modules in Virtual Environments

Once the virtual environment is active, install Python modules normally with PIP, easy_install, or setup.py. Any package installed normally will be placed into that virtual environment folder and isolated from the global Python installation. Note that using --user or --prefix=... flags during module installation will place modules in those specified directories, NOT your currently active Python virtual environment.

(test1)[username@pegasus ~]$ pip install munkres

Deactivating Virtual Environments

(test1)[username@pegasus ~]$ deactivate
[username@pegasus ~]$

Comparing two Python Virtual Environments

PIP can be used to save a list of all packages and versions in the current environment (use freeze). Compare using sdiff to see which packages are different.

List the current environment, deactivate, then list the global Python environment:
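
A minimal sketch (the file names are arbitrary):

(test1)[username@pegasus ~]$ pip freeze > ~/test1_packages.txt
(test1)[username@pegasus ~]$ deactivate
[username@pegasus ~]$ pip freeze > ~/global_packages.txt
[username@pegasus ~]$ sdiff ~/test1_packages.txt ~/global_packages.txt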

Python virtual environment wrapper

Users can install virtualenvwrapper in their own home directories to facilitate working with Python virtual environments. Once installed and configured, virtualenvwrapper can be used to create new virtual environments and to switch between your virtual environments (switching will deactivate the current environment). Virtualenvwrapper reads existing environments located in the WORKON_HOME directory.

Install a local copy of virtualenvwrapper with --user:
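
A minimal sketch, assuming PIP is used for the installation:

[username@pegasus ~]$ pip install --user virtualenvwrapper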

Recall that --user installs Python 2.7 modules in ~/.local/lib/python2.7/site-packages
To specify a different location, use --prefix=$HOME/local/python2mods (or another path).

Set virtual environment home directory and source:
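
A sketch, assuming the --user installation above placed virtualenvwrapper.sh in ~/.local/bin and using ~/envs as the environment directory (both paths are your choice; add these lines to your ~/.bashrc to make them permanent):

[username@pegasus ~]$ export WORKON_HOME=$HOME/envs
[username@pegasus ~]$ source ~/.local/bin/virtualenvwrapper.sh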

WORKON_HOME should be the parent directory of your existing Python virtual environments (or another directory of your choosing). New Python virtual environments created with virtualenv will be stored under this path. Source virtualenvwrapper.sh from the location where it was installed.

Create a virtual environment using virtualenvwrapper:

This will also activate the newly-created virtual environment.

[username@pegasus ~]$ mkvirtualenv test3
PYTHONHOME is set. You *must* activate the virtualenv before using it
New python executable in test3/bin/python
Installing setuptools, pip...done.
(test3)[username@pegasus ~]$

Activate or switch to a virtual environment:
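
With virtualenvwrapper, the workon command activates an environment and switches between environments (the environment names here match earlier examples):

[username@pegasus ~]$ workon test1
(test1)[username@pegasus ~]$ workon test3
(test3)[username@pegasus ~]$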

Deactivate the virtual environment:
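
As with a plain virtualenv, use deactivate:

(test3)[username@pegasus ~]$ deactivate
[username@pegasus ~]$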

2.e.Using Perl Modules on Pegasus

Users are free to compile and install Perl modules in their own home directories. Most Perl modules can be installed into a local library with CPAN and cpanminus. If you need a specific version, we suggest specifying the version or downloading, extracting, and installing using Makefile.PL.

Configuring a Local Library

Local libraries can be configured during initial CPAN configuration and by editing shell configuration files after installing the local::lib module.
By default, local::lib installs here: ~/perl5.

[username@pegasus ~]$ cpan
...
Would you like to configure as much as possible automatically? [yes] yes
...
Warning: You do not have write permission for Perl library directories.
...
What approach do you want? (Choose 'local::lib', 'sudo' or 'manual')
[local::lib] local::lib...
local::lib is installed. You must now add the following environment variables
to your shell configuration files (or registry, if you are on Windows) and
then restart your command line shell and CPAN before installing modules:
PATH="/nethome/username/perl5/bin${PATH+:}${PATH}"; export PATH;
PERL5LIB="/nethome/username/perl5/lib/perl5${PERL5LIB+:}${PERL5LIB}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/nethome/username/perl5${PERL_LOCAL_LIB_ROOT+:}${PERL_LOCAL_LIB_ROOT}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/nethome/username/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/nethome/username/perl5"; export PERL_MM_OPT;
Would you like me to append that to /nethome/username/.bashrc now? [yes] yes
...
cpan[1]> quit...
*** Remember to restart your shell before running cpan again ***
[username@pegasus ~]$ source ~/.bashrc

Configure after local::lib module installation:
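
A minimal sketch using the one-line bootstrap documented by local::lib (this assumes the default ~/perl5 location; the environment-variable block shown above works equally well):

[username@pegasus ~]$ echo 'eval "$(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)"' >> ~/.bashrc
[username@pegasus ~]$ source ~/.bashrc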

If CPAN has already been configured, ensure local::lib is installed and the necessary environment variables have been added to your shell configuration files. Source your shell configuration before running cpan again.

Installing Perl Modules

Once a local library has been installed and configured, CPAN modules will install to the local directory (default ~/perl5). The format for installing Perl modules with CPAN or cpanminus is Module::Name.

Install with CPAN:

Install from the Pegasus prompt or the CPAN prompt. Run cpan -h or perldoc cpan for more options.
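
For example, installing a module from either prompt (Module::Name is a placeholder):

[username@pegasus ~]$ cpan Module::Name
cpan[1]> install Module::Name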

Deactivating Local Library Environment Variables

To remove all directories added to search paths by local::lib in the current shell’s environment, use the --deactivate-all flag. Note that environment variables will be re-enabled in any sub-shells when using .bashrc to initialize local::lib.
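
A sketch of the deactivation command, following the local::lib documentation:

[username@pegasus ~]$ eval "$(perl -Mlocal::lib=--deactivate-all)"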

2.f.Running MATLAB on Pegasus

Once the interactive MATLAB graphical desktop is loaded, you can run MATLAB commands or scripts in the MATLAB command window. Results are shown in the MATLAB command window, and figures/plots are displayed in new graphical windows on your computer. See the examples below.

Graphical Interactive Mode with no graphical desktop window

Running MATLAB in full graphical mode may be slow depending on network load. Running it with the -nodesktop option uses your current terminal window (in Linux/Unix) as the desktop, while still allowing you to use graphics for figures and the editor.
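
A hedged sketch of launching MATLAB this way through the interactive queue (this assumes the matlab module is already loaded and that X forwarding is set up as described earlier; the projectID is a placeholder):

[username@pegasus ~]$ bsub -Is -q interactive -XF -P projectID matlab -nodesktop -nosplash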

To exit, type exit or quit. Again, remember to import the prepared LSF configuration file mentioned above if you want to use MATLAB parallel computing.

Batch Processing

For off-line non-interactive computations, submit the MATLAB script to the LSF scheduler using the bsub command. For more information about job scheduling, see Scheduling Jobs (link). Example single-processor job submission:
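
A hedged sketch of such a submission (the job name, output file, and projectID are placeholders; my_script.m should end with exit or quit so MATLAB terminates when the script finishes):

[username@pegasus ~]$ bsub -J example -o example.o%J -q general -P projectID matlab -nodisplay -nosplash -r my_script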

In this example, “my_script” corresponds to “my_script.m” in the current working directory.

After the job is finished, the results will be saved in the output file named “example.o######” where “######” is a jobID number assigned by LSF when you submit your job.

Parallel Computations

MATLAB has software products to enable parallel computations for multi-core computers as well as for multiple-node computer clusters. The latter scenario requires a job scheduler, such as LSF on the Pegasus cluster.

The MATLAB product for parallel processing on the cores of a single node is the "Distributed Computing Toolbox" (DCT), which also appears in MATLAB documentation as the "Parallel Computing Toolbox". The licensed MATLAB product for a computer cluster is the "Distributed Computing Engine" (DCE), which also appears in documentation as the "MATLAB Distributed Computing Server".

Single-node parallel MATLAB jobs (up to 16 cpus)

For a single-node parallel job, the MATLAB Distributed Computing Toolbox (licensed software) is used. It has a built-in default MATLAB cluster profile, 'local', from which the pool of MatlabWorkers can be reserved for computations. The default number of MatlabWorkers is 12. You can specify up to 15 on a single Pegasus node using the general queue, and 16 cpus using the parallel queue. For more information about queue and parallel resource distribution requirements, see Scheduling Jobs (link).

Refer to MATLAB documentation on the ways to adapt your script for multi-processor calculations. One of the parallel tools in MATLAB is the parfor loop, which replaces the regular for loop.

Multi-node parallel MATLAB jobs (16-32 cpus)

For running multi-processor MATLAB jobs that involve 16+ cpus and more than a single node, the MATLAB Distributed Computing Engine (licensed software) is used, with 32 licenses currently available on Pegasus. These jobs must be submitted to the parallel queue with the appropriate ptile resource distribution. For more information about queue and resource distribution requirements, see Scheduling Jobs (link).

The parallel LSF MATLAB cluster also needs to be configured. After loading the matlab module, import the default LSF parallel configuration as follows:

This command only needs to be run once. It imports the cluster profile named 'LSF1', which is configured to use up to 32 MatlabWorkers and to submit MATLAB jobs to the parallel pegasus queue. This profile does not have a projectID associated with the job, so you may need to set the project name for the LSF job submission. This can be done by running the following script (only once!) during your MATLAB session:

%% conf_lsf1_project_id.m
%% Verify that LSF1 profile exists, and indicate the current default profile:
[allProfiles,defaultProfile] = parallel.clusterProfiles()
%% Define the current cluster object using LSF1 profile
myCluster=parcluster('LSF1')
%% View current submit arguments:
get(myCluster,'SubmitArguments')
%% Set new submit arguments, change projectID below to your current valid project:
set(myCluster,'SubmitArguments','-q general -P projectID')
%% Save the cluster profile:
saveProfile(myCluster)
%% Set the 'LSF1' to be used as a default cluster profile instead of a 'local'
parallel.defaultClusterProfile('LSF1');
%% Verify the current profiles and the default:
[allProfiles,defaultProfile] = parallel.clusterProfiles()

The above script also reviews your current settings of the cluster profiles. You can now use the cluster profile for distributed calculations on up to 32 cpus, for example, to create a pool of MatlabWorkers for a parfor loop:

License usage information is listed under "Users of MATLAB_Distrib_Comp_Engine", "Users of MATLAB", and "Users of Distrib_Computing_Toolbox".

Note on Matlab cluster configurations

After importing the new cluster profile, it will remain in your available cluster profiles. Validate using the parallel.clusterProfiles() function. You can create, change, and save profiles using the saveProfile and saveAsProfile methods on a cluster object. In the examples, "myCluster" is the cluster object. You can also create, import, export, delete, and modify profiles through the "Cluster Profile Manager", accessible from the MATLAB graphical interface: on the "HOME" tab of the GUI desktop window, under the "ENVIRONMENT" section, choose "Parallel" -> "Manage Cluster Profiles".

You can also create your own LSF configuration from the Cluster Profile Manager: choose "Add" -> "Custom" -> "3RD PARTY CLUSTER PROFILE" -> "LSF", then configure it to your needs.

2.g.Running R on Pegasus

To run the R programming language on Pegasus, first load an R module with module load R/version. Use the command module avail to see a list of available software, including R versions. For more information about Pegasus software, see Software on the Pegasus Cluster (link).

Batch R

To run a batch R file on Pegasus compute nodes, submit the file to LSF with R CMD BATCH filename.R. Remember to include the -P flag and your project, if you have one.
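
A hedged sketch of such a submission (the projectID and output file name are placeholders; R CMD BATCH also writes filename.Rout in the working directory):

[username@pegasus ~]$ bsub -P projectID -o Rjob.o%J R CMD BATCH filename.R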

R packages

From the R prompt, install any R package to your personal R library with the standard install.packages() R command. Choose y when asked about using a personal library, and y again if asked to create one.

> install.packages("doParallel", repos="http://R-Forge.R-project.org")
Warning in install.packages("doParallel", repos = "http://R-Forge.R-project.org") :
  'lib = "/share/opt/R/3.1.2/lib64/R/library"' is not writable
Would you like to use a personal library instead? (y/n) y
Would you like to create a personal library
~/R/x86_64-unknown-linux-gnu-library/3.1
to install packages into? (y/n) y
...

Subversion Client

For Windows, we recommend TortoiseSVN. This Subversion client integrates seamlessly into Windows Explorer; a right-click provides access to the most common Subversion commands.

For Mac, download the latest Subversion client from collab.net. Extract the SVN binary to /usr/local/bin. Consider SvnX, a good open-source front-end for Subversion. As with most Mac apps, download the dmg file, double-click it if it does not auto-mount, then drag the SvnX application to your system's Applications directory. See this tutorial for help configuring SvnX: Getting Started with SvnX.

Most Linux distributions already include the SVN client. If not, run sudo yum install subversion (more information here: CentOS Subversion HowTo). Note that this may take a good ten minutes.

Basic Subversion commands

Include the -m flag and a log message with SVN commits and imports. For more information about Subversion commands, run svn help at the command-line prompt.

Command | Description
svn list repo_address | List files in a repository.
svn import /path/to/directory repo_address -m 'tree description' | Add and commit all content under directory to the specified repo, with a log message (-m flag). Run svn checkout after import to create working copies on your machine.
svn checkout repo_address (or svn co repo_address) | Check out a repository by creating a working copy (snapshot) on your machine. A repository must be checked out before running the commands below.
svn add filename_or_directory | Add a new file or the contents of a new directory to the current working copy. Commit (check in) after adding files to update the repo.
svn delete filename_or_directory | Delete a file or directory from the current working copy. Commit (check in) after deleting files to update the repo.
svn status [filenames or directories] | Review all modified files, or specify multiple file or directory names. Add the --verbose flag to see details.
svn diff [filenames or directories] | Review differences between your current working copy and the snapshot, or specify revision numbers with -r rev1:rev2.

Basic SVN Usage

Once your repository is available, use svn import to populate it with content from a directory on your local machine. Remember to check out the repository after, to create SVN-managed working copies on your machine.
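
A minimal sketch of that sequence (repo_address and the directory name are placeholders):

[username@pegasus ~]$ svn import ~/myproject repo_address -m 'initial project import'
[username@pegasus ~]$ svn checkout repo_address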

To create a working copy (private snapshot) of all files in a repository on your local computer (in a directory with the repository name) use svn checkout. This initial copy is your snapshot. Subversion will keep track of changes in your working copy, which are pending with respect to the repository until committed with svn commit (svn ci). It is good practice to review your local changes with svn status before committing them.

After editing your working copy of repository files, commit (upload) all or some changes to the Subversion server with svn commit (or svn ci). Take the time to write a decent comment explaining your changes.

Add your own files or directories to your local working copy with svn add. Run svn ci after adding, to commit the changes to the repository. Take the time to write a decent comment explaining your changes.

Delete files or directories from your local working copy with svn delete. Run svn ci after deleting, to commit the changes to the repository. Take the time to write a decent comment explaining your changes.

Review modifications made to your local working copy with svn status. Use the --verbose flag to show details, including revision and owner information. Specify files or directories with optional arguments.

In this example, file1.test has been modified (M), file3.test has been added to the working copy (not the repo), and file4.test has not been added to the working copy (?). file2.test matches the repository version (all files are shown with --verbose flag and no arguments):
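
A sketch of what that output might look like (the revision numbers and owner are hypothetical):

[username@pegasus myproject]$ svn status --verbose
M               2        2 username     file1.test
                2        2 username     file2.test
A               0        ?     ?        file3.test
?                                       file4.test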

Revert any un-committed changes to your local working copy with svn revert. This will return the specified files or directories in the working copy to the checked-out snapshot. Revert all with -R . (this will not delete any new files with ? status).

Update your local working copy to the current repository version with svn update (svn up). SVN will attempt to merge any changes on the server with committed changes to your local working copy. Specify files with optional arguments.

Review differences between two versions with svn diff. Without arguments, this shows the differences between your local working copies and the snapshot (your most recent retrieval from the repository). Specify revisions with -r rev1:rev2 and files or directories with optional arguments. Revision order matters for svn diff -r output.

In this example, file2.test starts empty. A line has been added to the local working copy. The differences between the local working copy and the snapshot are shown:
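
A sketch of the diff output (the revision number and the added line are hypothetical):

[username@pegasus myproject]$ svn diff file2.test
Index: file2.test
===================================================================
--- file2.test	(revision 2)
+++ file2.test	(working copy)
@@ -0,0 +1 @@
+This line was added to the working copy.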

2.j.Allinea - debugging and profiling

Profile and Debug with Allinea Forge, the new name for the unified Allinea MAP and Allinea DDT tools. See the user guide PDFs below for Allinea Forge and Performance Reports, available as modules on Pegasus.

3.Scheduling Jobs on Pegasus

Pegasus currently uses the LSF resource manager to schedule all compute resources. LSF (Load Sharing Facility) supports over 1500 users and over 200,000 simultaneous job submissions. Jobs are submitted to queues, categories we define in the scheduler to organize work more efficiently. LSF distributes jobs submitted by users to our over 340 compute nodes according to queue, user priority, and available resources. You can monitor your job status, queue position, and progress using LSF commands (link).

Reserve an appropriate amount of resources through LSF for your jobs. If you do not know the resources your jobs need, use the debug queue to benchmark your jobs. More on Pegasus Queues (link) and LSF Job Scripts (link). Jobs with insufficient resource allocations interfere with cluster performance and the CCS account responsible for those jobs may be suspended (Policies).

Stage data for running jobs exclusively in the /scratch file system, which is optimized for fast data access. Any files used as input for your jobs must first be transferred to /scratch. See Pegasus Resource Allocations (link) for more information. The /nethome file system is optimized for mass data storage and is therefore slower-access. Using /nethome while running jobs degrades the performance of the entire system and the CCS account responsible may be suspended (Policies).

Do not background processes with the & operator in LSF. These spawned processes cannot be killed with bkill after the parent is gone. Using the & operator while running jobs degrades the performance of the entire system and the CCS account responsible may be suspended (Policies).

Submitting Batch Jobs to LSF – Overview

Batch jobs are self-contained programs that require no intervention to run. Batch jobs are defined by resource requirements such as how many cores, how much memory, and how much time they need to complete. These requirements can be submitted via command line flags or a script file. Detailed information about LSF commands and example script files can be found later in this guide.

Create a job scriptfile

Include a job name -J, the information LSF needs to allocate resources to your job, and names for your output and error files.

3.b.Pegasus Job Queues

Pegasus queues are organized using limits like job size, job length, job purpose, and project. In general, users run jobs on Pegasus with equal resource shares. Current or recent resource usage lowers the priority applied when LSF assigns resources for new jobs from a user’s account.

Parallel jobs are more difficult to schedule as they are inherently larger. Serial jobs can “fit into” the gaps left by larger jobs if serial jobs use short enough run time limits and small enough numbers of processors.

The parallel queue is available for jobs requiring 16 or more cores. Submitting jobs to this queue requires resource distribution -R "span[ptile=16]".

The bigmem queue is available for jobs requiring nodes with expanded memory. Submitting jobs to this queue requires project membership. Do not submit jobs that can run on the general and parallel queues to the bigmem queue. Jobs using less than 1.5G of memory per core on the bigmem queue are in violation of acceptable use policies and the CCS account responsible for those jobs may be suspended (Policies).

Scheduling Jobs

The command bsub will submit a job for processing. You must include the information LSF needs to allocate the resources your job requires, handle standard I/O streams, and run the job. For more information about flags, type bsub -h at the Pegasus prompt. Detailed information can be displayed with man bsub. On submission, LSF will return the job id which can be used to keep track of your job.

For details about a particular job, issue the command bjobs -l jobID, where jobID is obtained from the JOBID field of bjobs output. To display a specific user's jobs, use bjobs -u username. To display all user jobs in paging format, pipe output to less:
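
For example (the jobID here matches the bkill example below and is otherwise arbitrary):

[username@pegasus ~]$ bjobs -l 4225
[username@pegasus ~]$ bjobs -u username
[username@pegasus ~]$ bjobs -u all | less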

bhist

bhist displays information about your recently finished jobs. CPU time is not normalized in bhist output. To see your finished and unfinished jobs, use bhist -a.

bkill

bkill kills the last job submitted by the user running the command, by default. The command bkill jobID will remove a specific job from the queue and terminate the job if it is running. bkill 0 will kill all jobs belonging to the current user.

[username@pegasus ~]$ bkill 4225
Job <4225> is being terminated

On Pegasus (Unix), SIGINT and SIGTERM are sent to give the job a chance to clean up before termination, then SIGKILL is sent to kill the job.

bhosts

bhosts displays information about all hosts, such as host name, host status, job state statistics, and job slot limits. bhosts -s displays information about numeric resources (shared or host-based) and their associated hosts. bhosts hostname displays information about an individual host, and bhosts -w displays more detailed host status. closed_Full means the configured maximum number of running jobs has been reached; running jobs are not affected, but no new jobs will be assigned to this host.

bpeek

Use bpeek jobID to monitor the progress of a job and identify errors. If errors are observed, valuable user time and system resources can be saved by terminating an erroneous job with bkill jobID. By default, bpeek displays the standard output and standard error produced by one of your unfinished jobs, up to the time the command is invoked. bpeek -q queuename operates on your most recently submitted job in that queue, and bpeek -m hostname operates on your most recently submitted job dispatched to the specified host. bpeek -f jobID displays live output from a running job; it can be terminated with Ctrl-C (Windows and most Linux) or Command-C (Mac).

Examining Job Output

Once your job has completed, examine the contents of your job’s output files. Note the script submission under User input, whether the job completed, and the Resource usage summary.

3.d.LSF Job Scripts

The command bsub < ScriptFile will submit the given script for processing. Your script must contain the information LSF needs to allocate the resources your job requires, handle standard I/O streams, and run the job. For more information about flags, type bsub -h or man bsub at the Pegasus prompt. Example scripts and descriptions are below.

You must be a member of a project to submit jobs to it. If you are not a member of the specified project, your job will be submitted to the ‘default’ project which is assigned limited cluster resources. See Projects (link) for more information.

On submission, LSF will return the jobID which can be used to track your job.
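
As a sketch, here is a serial job script assembled from the line-by-line breakdown that follows (the -o output file line and the final program command are assumptions, since they are not part of the breakdown):

#!/bin/bash
#BSUB -J serialjob
#BSUB -P myproject
#BSUB -o %J.out
#BSUB -e %J.err
#BSUB -W 1:00
#BSUB -q general
#BSUB -n 1
#BSUB -R "rusage[mem=128]"
#BSUB -B
#BSUB -u example@miami.edu
#BSUB -N

# commands to run follow the #BSUB directives; the program name is a placeholder
./my_serial_program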

Here is a detailed line-by-line breakdown of the keywords and their assigned values listed in this script:

ScriptFile_keywords
#!/bin/bash
specifies the shell to be used when executing the command portion of the script.
The default is Bash shell.
#BSUB -J serialjob
assign a name to the job. The job name will show in the bjobs output.
#BSUB -P myproject
specify the project to use when submitting the job. This is required when a user has more than one project on Pegasus.
#BSUB -e %J.err
redirect std error to a specified file
#BSUB -W 1:00
set a wall clock run time limit of 1 hour; otherwise the queue-specific default run time limit will be applied.
#BSUB -q general
specify the queue to be used. Without this option, the default 'general' queue will be applied.
#BSUB -n 1
specify the number of processors. In this job, a single processor is requested.
#BSUB -R "rusage[mem=128]"
specify that this job requests 128 megabytes of RAM per core. Without this, a default setting of 1500MB per core will be applied.
#BSUB -B
send mail to specified email when the job is dispatched and begins execution.
#BSUB -u example@miami.edu
send notification through email to example@miami.edu.
#BSUB -N
send job statistics report through email when job finishes.

Example scripts for parallel jobs

We recommend using Intel MPI unless you have a specific reason for using OpenMP. Intel MPI scales better and has better performance than OpenMP.

Submit parallel jobs to the parallel job queue with -q parallel.

For optimum performance, the default resource allocation on the parallel queue is ptile=16. This directs the LSF job scheduler to allocate 16 processors per host, ensuring all processors on a single host are used by that job. Without prior authorization, any jobs using a number other than 16 will be rejected from the parallel queue. Reserve enough memory for your jobs; memory reservations are per core. Parallel job performance may be affected, or even interrupted, by other badly-configured jobs running on the same host.
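
A hedged sketch of a parallel job script following these rules (the project, core count, memory, wall time, and MPI launch line are placeholders; the exact launcher may differ depending on the MPI module you load):

#!/bin/bash
#BSUB -J paralleljob
#BSUB -P myproject
#BSUB -o %J.out
#BSUB -e %J.err
#BSUB -q parallel
#BSUB -n 32
#BSUB -R "span[ptile=16]"
#BSUB -R "rusage[mem=1500]"
#BSUB -W 4:00

# launch the MPI program (program name is a placeholder)
mpirun -np 32 ./my_mpi_program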

3.e.LSF Interactive Jobs

HPC clusters primarily take batch jobs and run them in the background; users do not need to interact with the job during execution. However, sometimes users do need to interact with the application. For example, the application may need input from the command line or may wait for a mouse event in X windows. Use bsub -Is -q interactive command to launch interactive work on Pegasus.
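
A minimal sketch of launching an interactive shell on a compute node (the projectID is a placeholder; substitute your own command for bash as needed):

[username@pegasus ~]$ bsub -Is -q interactive -P projectID bash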

Upon exiting the X11 interactive job, you will be returned to one of the login nodes.

To run an X11 application, establish an X tunnel with SSH when connecting to Pegasus. For example,

ssh -X username@pegasus.ccs.miami.edu

Note that by default, the auth token is good for 20 minutes. SSH will block new X11 connections after 20 minutes. To avoid this on Linux or OS X, run ssh -Y instead, or set the option ForwardX11Trusted yes in your ~/.ssh/config.

In Windows, use Cygwin/X to provide a Linux-like environment. Then run ssh -Y or set the option in your ~/.ssh/config file.
