Computing Support for ATLAS

As the sole Tier-1 computing facility for ATLAS in the United States
– and the largest ATLAS Tier-1 computing center worldwide –
Brookhaven's RHIC and ATLAS
Computing Facility provides a large portion of the overall computing
resources for U.S. collaborators and serves as the central hub for
storing, processing, and distributing ATLAS experimental data among
scientists across the country. Brookhaven also houses one of the three
U.S. ATLAS Analysis Support Centers, which organizes topical conferences
and periodic "Jamborees" that train researchers in numerous data
analysis techniques.

A Closer Look at the ATLAS Grid

The ATLAS grid computing system is a complex distributed structure,
analogous to the power grid, that allows researchers and students around
the world to analyze ATLAS data.

The beauty of the grid is that a wealth of computing resources is
available for scientists to accomplish an analysis, even if those
resources are not physically close by. The data, software, processing
power, and storage may be located hundreds or thousands of miles away,
but the grid makes this distance invisible to the researcher.

Organizationally, the grid is set up in a tier system, with the Large
Hadron Collider located at CERN, the Tier-0 center. CERN receives the
raw data from the ATLAS detector, performs a first-pass analysis, and
then distributes it among ten Tier-1 locations, also known as regional
centers, including Brookhaven. At the Tier-1 level, which is connected
to Tier-0 via a dedicated high-performance optical network path, a
fraction of the raw data is stored, processed, and analyzed. Each Tier-1
facility then distributes derived data to Tier-2 computing facilities
that provide data storage and processing capacity for more in-depth
user analysis and simulation.
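The tiered flow described above — raw data stored at Tier-0, distributed to Tier-1 regional centers, with derived data passed down to Tier-2 sites — can be sketched in a few lines of code. This is a minimal illustrative model only; the class and site names below are hypothetical and bear no relation to the actual ATLAS software.

```python
# Illustrative sketch of the Tier-0 -> Tier-1 -> Tier-2 distribution model.
# Class names, site names, and dataset labels are all hypothetical.

class Tier:
    def __init__(self, name, level):
        self.name = name
        self.level = level      # 0 = CERN, 1 = regional center, 2 = analysis site
        self.children = []      # downstream sites that receive data from this one
        self.datasets = []      # datasets stored locally

    def attach(self, child):
        self.children.append(child)

    def distribute(self, dataset):
        """Store a dataset locally, then pass a derived copy downstream."""
        self.datasets.append(dataset)
        derived = f"{dataset}->derived@{self.name}"
        for child in self.children:
            child.distribute(derived)

# Build a three-level chain like the one described in the text.
tier0 = Tier("CERN", 0)
bnl = Tier("Brookhaven", 1)
tier2 = Tier("US-Tier2", 2)
tier0.attach(bnl)
bnl.attach(tier2)

# Raw data enters at Tier-0 and derived copies cascade down two hops.
tier0.distribute("raw-run-001")
print(tier2.datasets)
```

Each level stores what it receives and forwards a derived product, mirroring how a Tier-1 center like Brookhaven both archives a share of the raw data and feeds Tier-2 facilities.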

The grid computing infrastructure is made up of several key
components. The "fabric" consists of the hardware elements – processor
"farms" comprising hundreds to thousands of compute nodes, disk and tape
storage, and networking. The "applications" are the software programs
that users employ, for example, to analyze data. Applications take the
raw data from ATLAS and reconstruct it into meaningful information that
scientists can interpret. Another type of software, called "middleware,"
links the fabric elements deployed within and across regions together so
that they form a unified system — the Grid. The development of the
middleware is a joint effort between physicists and computer scientists.

Outside of high-energy physics, grid computing is used on smaller
scales to manage data within other scientific areas such as astronomy,
biology, and geology. But the LHC grid is the largest of its kind.

Funding for middleware development is provided by the National
Science Foundation's (NSF) Information Technology Research program and
by the U.S. Department of Energy (DOE). DOE also funds the Tier-1 center
activities, while the Tier-2 centers are funded mostly by the NSF and
the DOE. DOE and NSF support the Open Science Grid, which provides the
middleware used by all of the U.S. Tier-1 and Tier-2 sites.

Upgrades to ATLAS Computing for Run 2

The keys to successfully managing ATLAS data to date have been highly
efficient distributed data handling over powerful networks, minimal disk
storage demands, minimal operational load, and constant innovation. The
scientists store the data they want to keep permanently on tape or disk
and use a workload distribution system known as PanDA to coherently
aggregate that data and make it available to thousands of scientists via
a globally distributed computing network. End users can access the files
they need, stored on servers in the cloud, by making service requests.
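At its core, the workload distribution idea described above is a matching problem: queued jobs are assigned to whichever site currently has capacity. The sketch below shows that idea in its simplest form; the function and data-structure names are hypothetical, and the real PanDA system handles far more (priorities, data locality, retries, brokerage).

```python
# Hedged sketch of a PanDA-style dispatch loop: assign queued jobs to the
# site with the most free capacity. Names are illustrative, not PanDA's API.
from collections import deque

def dispatch(jobs, sites):
    """Assign each queued job to the site with the most free slots."""
    queue = deque(jobs)
    assignments = {}
    while queue:
        job = queue.popleft()
        site = max(sites, key=lambda s: s["free_slots"])
        if site["free_slots"] == 0:
            # No capacity anywhere right now; in reality the job would wait.
            queue.appendleft(job)
            break
        site["free_slots"] -= 1
        assignments[job] = site["name"]
    return assignments

sites = [{"name": "BNL", "free_slots": 2}, {"name": "Tier2-A", "free_slots": 1}]
print(dispatch(["job1", "job2", "job3"], sites))
```

The point of the sketch is the separation of concerns: users submit jobs to a central queue, and the dispatcher decides where they run, so the physical location of the resources stays invisible to the researcher.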

The latest drive to accommodate the torrent of data expected as the
LHC begins collisions at higher energy is to move the tools of PanDA to
the realm of supercomputers. The challenge is that time on advanced
supercomputers is limited, and expensive. But just as there’s room for
sand in a ‘full’ jar of rocks, there’s room on supercomputers between
big jobs for fine-grained processing of high-energy physics data.

The new fine-grained data processing system, called Yoda, is a
specialization of an “event service” workflow engine designed for the
efficient exploitation of distributed and architecturally diverse
computing resources. To minimize the use of costly storage, data flows
would make use of cloud data repositories with no pre-staging
requirements. The supercomputer would send “event requests” to the cloud
for small-batch subsets of data required for a particular analysis every
few minutes. This pre-fetched data would then be available for analysis
on any unused supercomputing capacity—the grains of sand fitting in
between the larger computational problems being handled by the machine.
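The "sand in a jar of rocks" idea above amounts to a simple loop: each cycle, check how much idle capacity the machine has, fetch only as many small event batches as that capacity can absorb, and process them. The sketch below illustrates that backfill logic under stated assumptions; the function name and parameters are hypothetical and do not reflect the actual Yoda or event-service interfaces.

```python
# Hedged sketch of fine-grained "backfill" processing: consume small event
# batches only when idle capacity exists between larger jobs.
# All names and parameters here are illustrative assumptions.

def backfill(total_events, batch_size, idle_slots_per_cycle):
    """Process events in small batches, using only idle capacity each cycle.

    idle_slots_per_cycle: how many otherwise-unused slots are free in each
    scheduling cycle (the "gaps between the rocks").
    """
    processed = 0
    cycles = 0
    for idle in idle_slots_per_cycle:
        if processed >= total_events:
            break
        # Request at most one small batch per idle slot this cycle,
        # never fetching more events than remain.
        fetched = min(batch_size * idle, total_events - processed)
        processed += fetched
        cycles += 1
    return processed, cycles

# 100 events, batches of 10, varying idle capacity between big jobs.
print(backfill(100, 10, [3, 0, 5, 2, 4]))
```

Because each request is small and fetched on demand from a remote repository, no large pre-staged dataset sits idle on costly supercomputer storage — the design choice the text attributes to the event-service approach.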

This system was constructed by a broad collaboration of U.S.-based
ATLAS scientists at Brookhaven Lab, Lawrence Berkeley National
Laboratory, Argonne National Laboratory, University of Texas at
Arlington, and Oak Ridge National Laboratory, leveraging support from
the DOE Office of Science—including the Office of Advanced Scientific
Computing Research (ASCR) and the Office of High Energy Physics
(HEP)—and the powerful high-speed networks of DOE’s Energy Sciences
Network (ESnet).


Brookhaven and ATLAS

Brookhaven physicists and engineers are participating in
one of the most ambitious scientific projects in the world: constructing,
operating, and upgrading a 7-story-tall detector called ATLAS, and
analyzing the physics data it collects. ATLAS has opened up new frontiers
in the human pursuit of knowledge about elementary particles and their
interactions.

Brookhaven and the LHC

The world's most powerful particle accelerator, the Large Hadron
Collider (LHC) in Switzerland, opens new avenues to explore the
deepest mysteries of the Universe. In addition to serving as the
U.S. host laboratory for the ATLAS experiment at the LHC,
Brookhaven National Laboratory plays multiple roles in this
international undertaking, from construction and project management
to data storage and distribution.

One of ten national laboratories overseen and primarily funded by the Office of Science of the
U.S. Department of Energy (DOE), Brookhaven National Laboratory conducts research in the physical,
biomedical, and environmental sciences, as well as in energy technologies and national security.
Brookhaven Lab also builds and operates major scientific facilities available to university, industry
and government researchers. Brookhaven is operated and managed for DOE's Office of Science by Brookhaven
Science Associates, a limited-liability company founded by the Research Foundation for the State
University of New York on behalf of Stony Brook University, the largest academic user of Laboratory
facilities, and Battelle, a nonprofit applied science and technology organization.