NCSA Collaborations Awarded Over $6.1 Million to Accelerate Data-intensive Science

Software continues to eat the world, and machine learning is now among its most popular branches. Applying machine learning algorithms to datasets has exploded in popularity, driven mostly by the need to perform analysis and classification tasks quickly. At NCSA, researchers are working to apply machine learning across multiple disciplines, making research more efficient than ever before.

In some areas of science, however, bringing the data together quickly enough is an additional challenge, because datasets are often disparate and scattered across many sources. One such area is the growing field of multi-messenger astrophysics (MMA), where instruments that observe different types of messengers (gravitational waves, electromagnetic waves, and cosmic neutrinos) are located around the world and each individually detect interesting events.

Fast, integrated detection promises to uncover currently undetected events hidden in existing data streams and to improve our understanding of them by using multiple instruments together. To address these challenges, NCSA is participating in three new National Science Foundation (NSF) awards, two of which are led by NCSA teams, to advance multi-messenger astrophysics as a whole.

“The cyberinfrastructure needs for gravitational wave astrophysics, high energy physics, and large-scale electromagnetic surveys have rapidly evolved in recent years,” said Eliu Huerta, project PI, director of the Center for Artificial Intelligence Innovation, and lead of the NCSA Gravity Group. “The construction and upgrade of the facilities used to enable scientific discovery in these disparate fields of research have led to a common pair of computational grand challenges, namely, datasets with ever-increasing complexity and volume, and data mining analyses that must be performed in real-time with oversubscribed computational resources.”

“Solving the grand problems outlined in the grants will involve advanced computer systems based on the state-of-the-art IBM POWER9 processor, NVIDIA GPUs and Xilinx FPGAs, and will push the boundaries of how the science can be done using advanced machine learning techniques,” said Volodymyr Kindratenko, NCSA Senior Research Scientist and Research Associate Professor in the Computer Science Department at the University of Illinois at Urbana-Champaign.

“In this project, NCSA will help to derive a blueprint for a national project that satisfies the needs for multi-messenger astrophysics science. The project will support not only focused notification of events detected by NSF major instrumentation projects, such as LIGO, IceCube, and LSST, but also support efficient scheduling of follow-up observations by diverse instruments, and the gathering of all the diverse data for analysis,” said Don Petravick, Senior Project Manager at NCSA.

“The project complements astronomy projects at Illinois and NCSA in the areas of Astronomical Event Brokers, Cosmic Microwave Background experiments, and astronomical surveys such as LSST and the Dark Energy Survey,” said Margaret Johnson, Assistant Director for Astronomy at NCSA. “In addition, we will be incorporating leading-edge data science at Illinois, such as the recently funded machine learning projects at NCSA.”

Furthermore, NCSA will focus on the sustainability of the software that will be developed, a concern common across all three newly funded efforts.

“In addition to the overall science challenge, I'm interested in how we develop and maintain the software that enables the discoveries, particularly when funding is often tied to specific research goals while software is more general,” said Daniel S. Katz, NCSA Assistant Director for Scientific Software and Applications and leader of the NCSA part of the project.

“The software ideally can be developed and maintained collaboratively across multiple research projects and with contributions from both amateurs and experts who want to support its use,” concluded Katz.

The computing demands of MMA and other data-intensive signal detection projects, such as high-energy physics, are expected to outstrip the capabilities of existing computing infrastructure. Contending with this growth requires a radical rethinking of the cyberinfrastructure. With the rise of deep learning, parallelized processing architectures have emerged as a solution.

Combined with deep learning algorithms, parallelized processing architectures, in particular field-programmable gate arrays (FPGAs), have been shown to deliver large speedups over conventional CPUs. This project aims to bring machine-learning-based accelerated computing with FPGAs into the scientific community by targeting big-data physics experiments, in particular the Large Hadron Collider (LHC) and the Laser Interferometer Gravitational-Wave Observatory (LIGO).

“This research will lead to new techniques for accelerating machine learning that will address important challenges in high-energy particle physics and gravity-wave astrophysics,” said Illinois physics professor Mark Neubauer. “We believe that our trans-disciplinary team of scientists and computing experts, along with partners from industry and support from the NSF through this award, are positioned to lead a fundamental change in the way that data-intensive scientific problems are approached.”

“This collaborative research is a multi-university effort with transdisciplinary teams for building new computing infrastructure and machine learning algorithms to advance data-intensive scientific discoveries, especially in high energy physics and gravity-wave astrophysics,” said Zhizhen Zhao, professor of electrical and computer engineering at Illinois.

“These projects will push the frontiers of deep learning at scale, demonstrating the versatility and scalability of these methods to accelerate and enable new physics in the big data era,” said Eliu Huerta, project PI. “The computing paradigm to be spearheaded through these projects aims to significantly increase the processing capability at the LHC and LIGO, leading to an increased scientific output of these devices and, potentially, foundational discoveries. Because these methods are also applicable to many other parts of our national and global economy and society, this work will positively impact many fields.”

National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign