NAMD simulates molecular motion, especially of large molecules, so it is often used for molecular docking problems. One particularly interesting class of docking problem is the interaction of protein molecules with other structures such as the cell membrane. The enormous number of atoms involved in these simulations limits what we can learn about how proteins interact with and shape their environments, because more atoms require more computing power. So we are investigating GPU-accelerated nodes in a shared memory cluster to reduce simulation time.

This post describes running NAMD in a multi-node configuration on NERSC's Dirac cluster, to determine whether we want to build out a Pegasus workflow executing in this mode through the OSG compute element. The process is, as usual with MPI codes running over cluster interconnects, highly cluster specific. The next step is to decide whether it is worth it and what our alternatives are.

Approach

If you’re having a hard time running NAMD in a PBS environment over an Infiniband interconnect, you are not alone. The NAMD release notes come right to the point:

“Writing batch job scripts to run charmrun in a queueing system can be challenging.”

These links, in addition to the release notes cited above, provide useful insights:

And without further delay, here’s the approach that worked on Dirac. Mileage on your cluster may vary.

#!/bin/bash
set -x
set -e
# build a node list file based on the PBS
# environment in a form suitable for NAMD/charmrun
nodefile=$TMPDIR/$PBS_JOBID.nodelist
echo group main > $nodefile
nodes=$( cat $PBS_NODEFILE )
for node in $nodes; do
    echo host $node >> $nodefile
done
# find the cluster's mpiexec
MPIEXEC=$(which mpiexec)
# Tell charmrun to start all 32 processes (4 nodes x 8 per node) and use the nodelist built above.
CHARMARGS="+p32 ++nodelist $nodefile"
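For reference, the generated nodelist file starts with a "group main" line followed by one "host" line per entry in $PBS_NODEFILE; the node names below are only illustrative:

group main
host dirac42
host dirac43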

As an additional wrinkle, we want to run the GPU-accelerated version. That's why we pass the +idlepoll argument to NAMD.
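The snippet above stops before the actual launch line. A minimal sketch of that last step, assuming a CUDA build of NAMD with namd2 on the PATH and a configuration file named sim.conf (both placeholders), might look like this:

# Launch NAMD through charmrun, reusing the variables built above.
# namd2 and sim.conf are placeholders; adjust for your installation.
charmrun $CHARMARGS ++mpiexec ++remote-shell $MPIEXEC namd2 +idlepoll sim.conf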

The -I parameter tells qsub to start an interactive job, which is handy for debugging. The walltime parameter overrides the very low default walltime. Finally, nodes tells PBS how many cluster nodes to use and ppn specifies the number of processes to start per node.
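For debugging, an interactive session with the same kind of resource request can be obtained along these lines (the walltime here is just an example):

qsub -I -q dirac_reg -l walltime=00:30:00 -l nodes=2:ppn=8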

After debugging, I ran the script like this:

qsub -q dirac_reg -l walltime=06:00:00 -l nodes=4:ppn=8 ./callnamd

Results

I did three runs with 2, 4, and 8 nodes. The interesting performance number for a NAMD run is days/ns: the days of computation time required per nanosecond of simulated time.
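NAMD reports this figure on the "Info: Benchmark time:" lines of its log, so it can be pulled out of a run's output with something like the following (the log file name is illustrative):

grep "Benchmark time" namd_run.log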