Maverick, an HP/NVIDIA Interactive Visualization and Data Analytics System, is TACC's latest addition to its suite of advanced computing systems. It combines capabilities for interactive advanced visualization and large-scale data analytics with traditional high performance computing. Recent exponential increases in the size and quantity of digital datasets necessitate systems such as Maverick that are capable of fast data movement and advanced statistical analysis. Maverick debuts the new NVIDIA K40 GPU for remote visualization and GPU computing to the national community.

Maverick is intended primarily for interactive visualization and data analysis jobs, allowing interactive query of large-scale data sets. The normal batch queues enable run times of up to 6 hours for interactive jobs and 24 hours for GPGPU and HPC jobs. Jobs requiring longer run times or more cores than the normal queues allow will be run in a special queue after approval by TACC staff. Users can run jobs using 132 of the NVIDIA Tesla K40s for both interactive graphics and for GPGPU jobs (at a lower priority).

Maverick has several file systems with distinct storage characteristics, each containing predefined directories for you to store your data. Since these file systems are shared with other users, they are managed by quota limits. There is no purge policy on Maverick.

Two file systems are available: $HOME, an NFS file system, and $WORK, a Lustre file system hosted on Stockyard, the TACC storage backbone. The $HOME directory has a 10GB quota. All file systems also impose an inode limit, which caps the number of files allowed.

The $WORK filesystem on Maverick is shared with Stampede, though a user's $WORK directory path on Stampede differs from that on Maverick. For example, a user's $WORK directory on Stampede will have a path similar to "/work/01158/janeuser", and on Maverick similar to "/work/01158/janeuser/maverick".

Maverick is accessed either with the secure shell program ssh (primarily for batch-mode access, though it can also be used to initiate interactive VNC access) or via the TACC Visualization Portal (formerly the Longhorn Visualization Portal).

Maverick does NOT have a local parallel filesystem or additional nodes to run GridFTP services as on Ranch, Stampede, or Lonestar. Maverick shares the TACC backbone Stockyard's large $WORK parallel file system (1TB quota) with Stampede. Since users' Stampede and Maverick $WORK filesystems are NOT in the same location on Stockyard, users may transfer files with globus-url-copy to Maverick's $WORK filesystem by using Stampede's GridFTP endpoint (gridftp.stampede.tacc.xsede.org) with their Maverick $WORK directory path. For a full list of XSEDE endpoints please see the GridFTP Endpoints table in XSEDE's Data Transfers & Management documentation.

The following Stampede session demonstrates Globus' globus-url-copy to copy "mybigfile" from PSC's Bridges to the user's Maverick $WORK directory:
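A sketch of such a transfer, using a hypothetical Bridges endpoint and account name (substitute the actual PSC Bridges GridFTP endpoint and your own directory paths):

```
login1$ globus-url-copy -vb \
    gsiftp://gridftp.bridges.psc.edu/home/janeuser/mybigfile \
    gsiftp://gridftp.stampede.tacc.xsede.org/work/01158/janeuser/maverick/mybigfile
```

The -vb option prints transfer performance statistics as the copy proceeds. Note the "/maverick" suffix on the destination path, which places the file in the Maverick $WORK directory rather than Stampede's.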

Users often wish to collaborate with fellow project members by sharing files and data with each other. Project managers or delegates can create shared workspaces, areas that are private and accessible only to other project members, using UNIX group permissions and commands. Shared workspaces may be created as read-only or read-write, functioning as data repositories and providing a common work area to all project members. Please see Sharing Project Files on TACC Systems for step-by-step instructions.

In general, application development on Maverick is identical to that on Stampede, including the availability and usage of compilers, the parallel development libraries (e.g. MPI and OpenMP), tuning and debugging.

Additional visualization-oriented libraries available on Maverick are made accessible through the modules system. Library and include-file search path environment variables are modified when modules are loaded. For detailed information on the effect of loading a module, use:
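For example, to inspect a module before or after loading it ("modulename" here is a placeholder for any installed module):

```
login1$ module help modulename
login1$ module show modulename
```

"module show" lists the environment variables, including library and include-file search paths, that the module modifies.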

Jobs are run on Maverick in one of two ways: as batch jobs submitted from the Maverick login node, maverick.tacc.utexas.edu, or interactively from a remotely accessed VNC desktop running on an allocated Maverick compute node.

The TACC Visualization Portal is available at https://vis.tacc.utexas.edu. It provides a very simple mechanism for running interactive sessions on Maverick. It presents two choices: creating a VNC desktop (essentially wrapping the batch VNC procedure in a much simplified manner, though at the cost of some flexibility), and running RStudio Server and iPython Notebook sessions. Please see the TACC Visualization page for more information.

Pulldowns on this page enable a user to choose to create a Maverick VNC desktop, an RStudio Server session, or an iPython Notebook session. When VNC is selected, the user is presented with pulldowns for setting the various parameters of a VNC session, including the wayness, number of nodes, and desktop dimensions. The portal then submits a VNC job to the Maverick vis queue. When the job starts, a VNC viewer is established in the portal; alternatively, the Jobs tab presents a URL and port number that can be used to connect an external VNC viewer. Note that the portal provides access to only some of the options available through the batch interface, and submitting a VNC session through sbatch will be necessary in some cases.

The TACC Visualization Portal jobs page also shows the current usage of Maverick; it is a very easy way to find the status of jobs. All jobs submitted to Maverick, whether via sbatch or via the portal, and whether running or waiting in queues, appear in the status information shown.

While batch visualization can be performed on any Maverick node, a set of nodes have been configured for hardware-accelerated rendering. The vis queue contains a subset of 132 compute nodes configured with one NVIDIA K40 GPU each.

Remote desktop access to Maverick is provided through a VNC connection to one or more visualization nodes.

You must have an account on Maverick in order to start a VNC session. University of Texas faculty, staff and affiliates may request an account by submitting a help desk ticket through the TACC User Portal. XSEDE users may submit a help desk ticket via the XSEDE User Portal (XUP) Help Desk.

Users must first connect to a Maverick login node (see System Access) and submit a special interactive batch job that:

allocates a set of Maverick visualization nodes

starts a vncserver process on the first allocated node

sets up a tunnel through the login node to the vncserver access port

Once the vncserver process is running on the visualization node and a tunnel through the login node is created, an output message identifies the access port for connecting a VNC viewer. A VNC viewer application is run on the user's remote system and presents the desktop to the user.

If this is your first time connecting to Maverick, you must run vncpasswd to create a password for your VNC servers. This should NOT be your XSEDE login or Maverick password! This mechanism only deters unauthorized connections; it is not fully secure, as only the first eight characters of the password are saved. All VNC connections are tunnelled through SSH for extra security, as described below.
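Setting the VNC password is a one-time step, run from a Maverick login node:

```
login1$ vncpasswd
```

You will be prompted to enter and verify the password, which is then stored (truncated to eight characters) under "~/.vnc".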

Follow the steps below to start an interactive session.

Start a Remote Desktop

TACC has provided a VNC job script (/share/doc/slurm/job.vnc) that requests one node in the vis queue for four hours, creating a VNC session.

login1$ sbatch /share/doc/slurm/job.vnc

You may modify or overwrite script defaults with sbatch command-line options:
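For example, to request a shorter run time and a custom job name (these are standard sbatch flags; adjust the values to your needs):

```
login1$ sbatch -t 02:00:00 -J myvncjob /share/doc/slurm/job.vnc
```

Options placed before the job script name are interpreted by sbatch itself and override the defaults set inside the script.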

All arguments after the job script name are sent to the vncserver command. For example, to set the desktop resolution to 1440x900, use:

login1$ sbatch /share/doc/slurm/job.vnc -geometry 1440x900

The job.vnc script starts a vncserver process and writes the connection port for the vncviewer to the output file vncserver.out in the job submission directory. Watch for the "To connect via VNC client" message at the end of the output file, or watch the output stream in a separate window with the commands:

login1$ touch vncserver.out ; tail -f vncserver.out

The lightweight window manager, xfce, is the default VNC desktop and is recommended for remote performance. Gnome is available; to use gnome, open the "~/.vnc/xstartup" file (created after your first VNC session) and replace "startxfce4" with "gnome-session". Note that gnome may lag over slow internet connections.
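The one-line edit can also be made with sed. The sketch below operates on a demo copy of the file; on Maverick you would apply the same substitution directly to "~/.vnc/xstartup":

```shell
# Create a demo copy of a typical xstartup file (on Maverick, skip this
# step and run sed directly on ~/.vnc/xstartup instead).
printf '#!/bin/sh\nstartxfce4\n' > xstartup.demo

# Replace the xfce startup command with gnome's session manager.
sed -i 's/startxfce4/gnome-session/' xstartup.demo

cat xstartup.demo
```

The change takes effect the next time a VNC session starts.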

Create an SSH Tunnel to Maverick

TACC requires users to create an SSH tunnel from the local system to the Maverick login node (maverick.tacc.utexas.edu) to assure that the connection is secure. On a Unix or Linux system, execute the following command once the port has been opened on the Maverick login node:
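A sketch of the tunnel command, where xxxx is a port on your local system, yyyy is the port reported by the VNC job, and "username" is a placeholder for your TACC login:

```
localhost$ ssh -f -N -L xxxx:maverick.tacc.utexas.edu:yyyy username@maverick.tacc.utexas.edu
```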

"xxxx" is a port on the remote system. Generally, the port number specified on the Maverick login node, yyyy, is a good choice to use on your local system as well

"-f" instructs SSH to only forward ports, not to execute a remote command

"-N" puts the ssh command into the background after connecting

"-L" forwards the port

On Windows systems, find the menu in your SSH client where tunnels can be specified, enter the local and remote ports as required, then ssh to Maverick.

Connecting vncviewer

Once the SSH tunnel has been established, use a VNC client to connect to the local port you created, which will then be tunneled to your VNC server on Maverick. Connect to localhost::xxxx, where xxxx is the local port you used for your tunnel; the double colon specifies a port number rather than a VNC display number (some VNC clients also accept localhost:xxxx).

Once the desktop has been established, two initial xterm windows are presented (which may be overlapping). One, which is white-on-black, manages the lifetime of the VNC server process. Killing this window (typically by typing "exit" or "ctrl-D" at the prompt) will cause the vncserver to terminate and the original batch job to end. Because of this, we recommend that this window not be used for other purposes; it is just too easy to accidentally kill it and terminate the session.

The other xterm window is black-on-white, and can be used to start serial programs running on the node hosting the vncserver process, as well as parallel jobs running across the set of cores associated with the original batch job. Additional xterm windows can be created using the window-manager left-button menu.

ibrun: Enables parallel MPI jobs to be started from the VNC desktop. ibrun uses information from the user's environment to start MPI jobs across the user's set of Maverick compute nodes. This information is determined by the initial SLURM job submission, and includes the location of the hostfile created by SLURM (found in the $PE_HOSTFILE environment variable).
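For example, to launch a hypothetical MPI application from the VNC desktop's black-on-white xterm:

```
c442-001$ ibrun ./my_mpi_app arg1 arg2
```

Because ibrun picks up the node list and task count from the SLURM job environment, no hostfile or task-count argument is needed on the command line.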

Running OpenGL/X applications on Maverick visualization nodes requires that the native X server be running on each participating visualization node. Like other TACC visualization servers, on Maverick the X servers are started automatically on each node.

Once native X servers are running, several scripts are provided to enable rendering in different scenarios.

vglrun: Because VNC does not support OpenGL applications, VirtualGL is used to intercept OpenGL/X commands issued by application code and redirect them to a local native X display for rendering; rendered results are then automatically read back and sent to VNC as pixel buffers. To run an OpenGL/X application from a VNC desktop command prompt:

c442-0011$ vglrun [vglrun options] application [application-args]

tacc_xrun: Some visualization applications present a client/server architecture, in which every process of a parallel server renders to local graphics resources, then returns rendered pixels to a separate, possibly remote client process for display. By wrapping server processes in the tacc_xrun wrapper, the $DISPLAY environment variable is manipulated to share the rendering load across the GPUs available on the allocated nodes. For example,

c442-001$ ibrun tacc_xrun application application-args

will cause the tasks to render using the GPU on each node, but will not render to any VNC desktop windows.

tacc_vglrun: Other visualization applications incorporate the final display function in the root process of the parallel application. This case is much like the one described above except for the root node, which must use vglrun to return rendered pixels to the VNC desktop. For example,

c442-001$ ibrun tacc_vglrun application application-args

will cause the tasks to utilize the GPU for rendering, but will transfer the root process' graphics results to the VNC desktop.

VisIt is a free interactive parallel visualization and graphical analysis tool for viewing scientific data on Unix and PC platforms. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images for presentations. VisIt contains a rich set of visualization features so that you can view your data in a variety of ways. It can be used to visualize scalar and vector fields defined on two- and three-dimensional (2D and 3D) structured and unstructured meshes. VisIt was designed to handle very large data set sizes in the terascale range and yet can also handle small data sets in the kilobyte range.

VisIt was compiled with the Intel compiler and the mvapich2 MPI stack.

After connecting to a VNC server on Maverick, as described above, load the VisIt module at the beginning of your interactive session before launching the VisIt application:

c221-102$ module load visit
c221-102$ vglrun visit

VisIt first loads a dataset and presents a dialog allowing for selecting either a serial or parallel engine. Select the parallel engine. Note that this dialog will also present options for the number of processes to start and the number of nodes to use; these options are actually ignored in favor of the options specified when the VNC server job was started.

In order to take advantage of parallel processing, VisIt input data must be partitioned and distributed across the cooperating processes. This requires that the input data be explicitly partitioned into independent subsets at the time it is input to VisIt. VisIt supports SILO data, which incorporates a parallel, partitioned representation. Otherwise, VisIt supports a metadata file (with a .visit extension) that lists multiple data files of any supported format that are to be associated into a single logical dataset. In addition, VisIt supports a "brick of values" format, also using the .visit metadata file, which enables single files containing data defined on rectilinear grids to be partitioned and imported in parallel. Note that VisIt does not support VTK parallel XML formats (.pvti, .pvtu, .pvtr, .pvtp, and .pvts). For more information on importing data into VisIt, see Getting Data Into VisIt; though this documentation refers to VisIt version 2.0, it appears to be the most current available.

IDL (Interactive Data Language) is also available on Maverick. Its features include:

The ability to add your own specialized routines to the library by writing procedures, often more quickly than in other languages

Simple syntax, dynamic data typing, and array-oriented operations

Built-in functionality suitable for many data trends, with tools for 2- and 3-dimensional gridding and interpolation, routines for curve and surface fitting, and the ability to perform multi-threaded computations

To run IDL interactively in a VNC session, connect to a VNC server on Maverick as described above, then do the following:

load the vis and idl modules:

c203-112$ module load vis idl

launch IDL

c203-112$ idl

or launch the IDL virtual machine:

c203-112$ idl -vm

If you are running IDL in scripted form, without interaction, simply submit a SLURM job that loads IDL and runs your script.

If you need to run IDL interactively from an xterm from your local machine outside of a VNC session, you will need to run an interactive SLURM job in the vis queue to allocate a Maverick compute node. To do this, use the SLURM command srun to allocate an interactive shell. This command uses the same arguments as sbatch:
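A sketch of such an interactive allocation (standard SLURM flags; adjust the time limit and task count to your needs):

```
login1$ srun -p vis -t 01:00:00 -n 20 --pty /bin/bash -l
```

When the job starts, the prompt changes to the allocated compute node, from which you can load the idl module and run IDL as above.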

NVIDIA's CUDA compiler and libraries are accessed by loading the CUDA module:

login1$ module load cuda

Use the nvcc compiler on the login node to compile code, and run executables on nodes with GPUs; there are no GPUs on the login nodes. Maverick's K40 GPUs are compute capability 3.5 devices. When compiling your code, make sure to specify this level of capability with:

nvcc -arch=compute_35 -code=sm_35 ...

GPU nodes are accessible through the gpu queue for production work and the devel-gpu queue for development work. Production job scripts should include the "module load cuda" command before executing CUDA code; likewise, load the cuda module before or after acquiring an interactive development GPU node with the "srun" command.

The NVIDIA CUDA debugger is cuda-gdb. Applications must be debugged through a VNC session or an interactive srun session. Please see the relevant srun and VNC sections for more details.

The NVIDIA Compute Visual Profiler, computeprof, can be used to profile both CUDA and OpenCL programs that have been developed in NVIDIA CUDA/OpenCL programming environment. Since the profiler is X based, it must be run either within a VNC session or by ssh-ing into an allocated compute node with X-forwarding enabled. The profiler command and library paths are included in the $PATH and $LD_LIBRARY_PATH variables by the CUDA module. The computeprof executable and libraries can be found in the following respective directories:

$TACC_CUDA_DIR/bin
$TACC_CUDA_DIR/lib

For further information on the CUDA compiler, programming, the API, and debugger, please see:

The OpenCL heterogeneous computing language is supported on all Maverick computing platforms. The Intel OpenCL environment supports the Xeon processors and Xeon Phi coprocessors, and the NVIDIA OpenCL environment supports the Tesla accelerators.

Use the g++ compiler to compile NVIDIA-based OpenCL. The include files are located in the $TACC_CUDA_DIR/include subdirectory. The OpenCL library is installed in the /usr/lib64 directory, which is on the default library path. Use this path and g++ options to compile OpenCL code:
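A sketch of the compile line ("mycode.cpp" is a placeholder for your source file):

```
login1$ g++ -I$TACC_CUDA_DIR/include mycode.cpp -o mycode -lOpenCL
```

The -I option adds the CUDA include directory so the OpenCL headers are found, and -lOpenCL links against the OpenCL library in /usr/lib64.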