This visualization of a NEKTAR 3D arterial-tree
simulation shows several arterial branching sites with
velocity vectors (arrows) and isosurfaces indicating
pressure within the artery. (From a visualization by Joe
Insley, UC/ANL.)

The three groups, who worked on a shared Transatlantic
Federated Grid, reported on their successes and lessons learned
on Nov. 17 at the TeraGrid booth at the Seattle Convention
Center. All three projects - NEKTAR (led by George Karniadakis,
Brown University), VORTONICS (led by Bruce Boghosian, Tufts
University) and SPICE (led by Peter Coveney, University College
London) - grappled with challenging large-scale research
problems that can be solved only with grid computing.

“We believe we have shown that linked grids are a
benefit for these projects,” said Coveney, who coordinated
the effort from the U.K. side. “We made progress more
effectively by pooling resources and expertise, and we brought
some of the difficulties involved in grid computing more sharply
into focus.” SPICE (Simulated Pore Interactive Computing
Environment) won the HPC Analytics Challenge award, a first-time
SC award given for innovative techniques in rigorous data
analysis, advanced networks and high-end visualization to solve
a complex, real-world problem.

Using novel algorithms (based on a mathematical relation
known as Jarzynski’s identity), SPICE uses steered
molecular dynamics to pull a strand of DNA through the
nanometer-sized pore of a channel protein embedded in a bilayer
membrane. With a total size exceeding 250,000 atoms, the problem
would require 25 years of computation with “vanilla
molecular dynamics,” says Coveney. It becomes tractable only with
grid-enabled computational resources, which make possible many
interactive simulations, first to dynamically explore the
parameter space of the DNA-protein system and then, having
reduced the search space, to efficiently farm out around 100
large-scale non-equilibrium simulations across the federated
grid. The work identified a constriction in the pore structure,
says Coveney, that may have physical consequences.
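
Jarzynski’s identity relates an equilibrium free-energy difference to the work done in repeated, non-equilibrium pulling runs: the average of exp(-W/kT) over the ensemble equals exp(-ΔF/kT), where W is the work in each run, T the temperature and k Boltzmann’s constant. The short Python sketch below shows how such an estimate could be assembled from an ensemble of steered runs; the temperature, work values and function name are illustrative, not taken from SPICE.

```python
import numpy as np

KB = 0.0019872041   # Boltzmann constant, kcal/(mol K)
T = 300.0           # temperature in kelvin (illustrative)
beta = 1.0 / (KB * T)

def jarzynski_free_energy(work, beta):
    """Estimate dF from non-equilibrium work values using
    <exp(-beta*W)> = exp(-beta*dF)."""
    w = np.asarray(work, dtype=float)
    w_min = w.min()                              # shift for numerical stability
    avg = np.mean(np.exp(-beta * (w - w_min)))
    return w_min - np.log(avg) / beta

# Hypothetical work values (kcal/mol) from ~100 steered pulling runs.
rng = np.random.default_rng(42)
work_samples = rng.normal(loc=12.0, scale=2.0, size=100)
print(f"estimated dF ~ {jarzynski_free_energy(work_samples, beta):.2f} kcal/mol")
```

Because the exponential average is dominated by rare low-work trajectories, many independent pulling simulations are needed to get a converged estimate, which is one reason roughly 100 large-scale runs had to be farmed out across the federated grid.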

SPICE linked systems at three TeraGrid sites (NCSA, SDSC and
PSC) and UK sites at Daresbury and CSAR (Computer Services for
Academic Research, University of Manchester). The steering
infrastructure was managed by RealityGrid middleware, which can
fire up simulations and visualizations remotely from any linked
site. “The federated grid provides unparalleled
computational power,” said Coveney, “in a
coordinated and coherent fashion. This enables the heterogeneous
and geographically distributed resources to be marshaled in
service of a single scientific problem.”

The NEKTAR project (computational fluid dynamics) linked
supercomputing systems at four TeraGrid partner sites (NCSA,
SDSC, TACC and PSC) and a UK site (CSAR), with visualization at
the University of Chicago/Argonne National Laboratory (UC/ANL)
streamed in real time to the SC05 show floor in Seattle. With cross-site
runs on grid-linked resources, they simulated the flow dynamics
of the human arterial “tree” - the branched
structure of arteries in the human anatomy. In earlier work, the
NEKTAR team performed the first cross-site simulations on the
TeraGrid, and extended this work during the past year to include
a transatlantic component.

Atherosclerosis due to plaque formation is a major health
problem strongly related to blood-flow patterns. It occurs
preferentially at arterial branching sites, where blood flow can
circle back on itself, like eddies on the slow-flowing outer
bank of a stream where it bends. With detailed 3D simulations of
these flow patterns, researchers hope to facilitate better
decisions about diagnosis and surgical intervention. The human
arterial-tree model contains the 55 largest arteries in the
human body and 27 arterial bifurcations, at a resolution fine
enough to capture the flow. This requires a total memory of
three to seven terabytes for the finite-element model, beyond
the current capacity of any single supercomputing site.

“The challenge,” said Suchuan (Steve) Dong of
Brown, who ran the demonstration from Seattle, “was how to
adapt the application and devise algorithms to exploit ensembles
of supercomputers to achieve high performance.” The nature
of the simulation made it viable to divide the
“tree” among many processors at many sites.
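
One simple way to carve up such a tree is to give each site a contiguous run of arterial segments in proportion to the processors it contributes, so that most of the coupling at bifurcations stays within one machine and only a few interfaces cross the wide-area network. The toy Python partitioner below illustrates the idea; the site names, processor counts and function name are hypothetical, not the actual NEKTAR decomposition.

```python
# Toy partitioner: assign contiguous arterial segments to sites in proportion
# to the processors each site contributes (illustrative numbers only).
SITES = {"NCSA": 128, "SDSC": 128, "TACC": 64, "PSC": 128, "CSAR": 64}
N_SEGMENTS = 55  # the 55 largest arteries in the model

def partition_segments(sites, n_segments):
    total = sum(sites.values())
    assignment, start = {}, 0
    for i, (site, procs) in enumerate(sites.items()):
        # The last site absorbs any rounding remainder.
        if i == len(sites) - 1:
            count = n_segments - start
        else:
            count = round(n_segments * procs / total)
        assignment[site] = list(range(start, start + count))
        start += count
    return assignment

for site, segs in partition_segments(SITES, N_SEGMENTS).items():
    print(f"{site}: segments {segs[0]}-{segs[-1]} ({len(segs)} of {N_SEGMENTS})")
```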

“This was a success,” said Dong. “We have
gained significant experience in the sometimes arduous process
of cross-site debugging and in the co-scheduling of a large
Globus job with several subjobs on different machines. This is
important learning about the practical challenges of grid
computing.”

The VORTONICS project, led by Boghosian of Tufts, linked the
TeraGrid sites (UC/ANL, NCSA, SDSC, TACC and PSC) with CSAR
during SC05, and during their largest, most successful run (on
Nov. 17) they ran cross-site at UC/ANL, NCSA, SDSC, PSC and
CSAR. For this run, they linked 512 processors to carry out a
lattice-Boltzmann simulation on a 1,250^3 grid (nearly
two billion lattice points). VORTONICS provides direct numerical
simulation of 3D Navier-Stokes flows to address problems in
vortex interaction, in which the time and space scales of vortex
stretching and reconnection are not well understood.
“Our demand is geographically distributed
domain distribution,” said Boghosian. “We have an
enormous 3D lattice grid, and we want to be able to chop it up into
pieces that reside on different SC sites.”
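
The scale of that decomposition is easy to estimate with a back-of-the-envelope sketch. Assuming a D3Q19 lattice-Boltzmann model in double precision and a simple slab decomposition with one slab per site (assumptions of this sketch, not details reported by the project), the per-site memory and the halo data that must cross each wide-area boundary every time step work out as follows:

```python
N = 1250        # lattice points per dimension (1,250^3 total)
N_SITES = 5     # UC/ANL, NCSA, SDSC, PSC and CSAR in the Nov. 17 run
Q = 19          # discrete velocities per lattice node (D3Q19 assumed)
BYTES = 8       # double precision

slab = N // N_SITES                      # planes of the lattice per site
points_per_site = slab * N * N
mem_per_site_gb = points_per_site * Q * BYTES / 1e9

# Each inter-site boundary exchanges one plane of ghost cells per step.
halo_gb_per_step = N * N * Q * BYTES / 1e9

print(f"slab thickness: {slab} planes per site")
print(f"~{mem_per_site_gb:.0f} GB of distribution data per site")
print(f"~{halo_gb_per_step:.2f} GB exchanged per boundary per time step")
```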

In their largest run from Seattle, the VORTONICS simulation
injected more than nine gigabytes of data into the network.

“This joint U.S.-U.K. effort illustrates a key benefit
of integrating major resources in a service-oriented
architecture using grid computing capabilities,” said
Charlie Catlett, director of the TeraGrid, the NSF-sponsored
cyberinfrastructure program. “It shows that these tools
make it possible to solve problems that otherwise can’t be
solved.”

Coveney, Dong and Boghosian emphasized that the success
revealed the need for more sophisticated grid capabilities, such
as “co-scheduling” for simultaneous runs at multiple
sites. Lack of advanced scheduling capabilities led to an
inordinate amount of human intervention - to such a degree,
notes Coveney, that cross-site simulations would be a
“show stopper” for most computational scientists.
Scheduling tools to alleviate this, he also noted, are not a
daunting technical challenge.

The fundamental difficulty with cross-site scheduling on a
routine basis, said Sergiu Sanielevici of PSC, who leads user
services for the TeraGrid, and who helped to coordinate
scheduling for all three projects, is scarcity of resources.
“We need more nodes, so that we have a reasonable excess
capacity, and we need dedicated funding for grid-scheduling
algorithm development and implementation. Of course, we will be
analyzing the lessons of these experiments and should be able to
make some quick improvements.”

The project leaders also emphasized that this work is
ongoing and that the potential of grid computing to solve important
scientific problems has only begun to be realized. “There
is a synergy of collaboration,” said Boghosian,
“that is part of the energy of grid computing.”

For both NEKTAR and VORTONICS, cross-site message passing
(via MPI) was coordinated with MPICH-G2 middleware, developed by
Nicholas Karonis of Northern Illinois University and UC/ANL, who
collaborated with both the NEKTAR and VORTONICS projects.
“MPICH-G2 was essential to making this work,” said
Boghosian. “If you’ve written your code with MPI,
you don’t have to change your internal code to migrate to
the grid.”
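
Boghosian’s point is that the MPI standard is the whole programming interface: a code written against it, like the minimal mpi4py sketch below (purely illustrative, not the VORTONICS or NEKTAR source), can be launched on one cluster with mpiexec or across federated sites by a grid-enabled MPI implementation such as MPICH-G2 without changing its message-passing calls; only the job launch and the underlying transport differ.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Nearest-neighbour exchange around a 1D ring: the same MPI calls work
# whether the neighbouring rank sits on the same cluster or across the
# Atlantic; the MPI implementation decides how the message travels.
left, right = (rank - 1) % size, (rank + 1) % size
received = comm.sendrecv(float(rank), dest=right, source=left)
print(f"rank {rank}/{size} received {received} from rank {left}")
```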

Other collaborators in the NEKTAR group along with
Karniadakis, Dong and Karonis are Leopold Grinberg of Brown,
Michael E. Papka and Joseph A. Insley of UC/ANL, Alex Yakhot of
Ben Gurion University and Spencer Sherwin of Imperial
College.

Boghosian’s collaborators along with Karonis are Lucas
Finn and Christopher Kottke. Coveney collaborates with Shantenu
Jha of University College London and colleagues at the
University of Manchester, UK.

The TeraGrid, sponsored by the National Science Foundation,
is a partnership of people and resources that provides a
comprehensive cyberinfrastructure to enable discovery in U.S.
science and engineering research. Through high-performance
network connections, the TeraGrid integrates a distributed set
of very-high capability computational, data management and
visualization resources to make U.S. research more productive.
With Science Gateway collaborations and education and mentoring
programs, the TeraGrid also connects and broadens scientific
communities.

The National Grid Service aims to fulfil a similar role in
the United Kingdom.