RENCI teams with Clemson and WSU on $2.95 million project to improve and simplify large-scale data analysis

Published: March 30, 2017

CHAPEL HILL, NC – RENCI researchers will work with scientists from Clemson University and Washington State University on a project funded by the National Science Foundation to develop cyberinfrastructure aimed at giving researchers around the nation and the world a more fluid and flexible system for analyzing large-scale data.

The NSF awarded $2.95 million for a collaborative project that will unite biologists, hydrologists, computer engineers and computer scientists to design a system called Scientific Data Analysis at Scale (SciDAS).

Claris Castillo, a senior systems researcher, will lead the SciDAS effort as RENCI principal investigator and will be assisted by co-PI Ray Idaszak, RENCI’s director of DevOps. Clemson scientist Alex Feltus is the lead PI on the project. Other co-PIs are Clemson’s Melissa Smith and Stephen Ficklin of Washington State.

SciDAS seeks to help current researchers and future innovators discover data, move it smoothly across advanced networks, and improve the flexibility and accessibility of national and global resources. It will enable a broad range of scientists not only to get information faster but also to use much larger data sets and tease out information that they might not even know exists.

“A key aspect of the SciDAS team is that we’ll be processing scientific data at the same time that we’re gluing together all the parts needed for a national cyberinfrastructure (CI) ecosystem,” said Feltus, associate professor of genetics and biochemistry in Clemson University’s College of Science. “We’re trying to avoid the problem of ‘if you build it they will come’ and instead enlist the input of a variety of scientists to join us on the ground floor and help us build it. Thus, our software will be refined by using real data by real users with real habits.”

RENCI will lead the effort to integrate existing cyber tools and technologies into the new SciDAS infrastructure that will be designed to support all aspects of distributed, data-driven research. Development of the SciDAS framework will involve integrating a number of NSF-funded CI systems into one package, including:

iRODS: the Integrated Rule-Oriented Data System, which federates distributed and heterogeneous data into a single virtual file system for easier, safer data sharing and data management.

NSF SSI HydroShare, an open-source collaborative system for sharing hydrologic data and models. HydroShare enables scientists to easily discover and access data and models in the cloud or retrieve them to their desktops. HydroShare will be made interoperable with SciDAS.

NSF CC-NIE ADAMANT (Adaptive Data-Aware Multi-Domain Application Network Topologies), which integrates the Pegasus workflow management system and the ORCA resource control framework. It leverages ExoGENI as well as national research and education networks to create elastic, isolated environments to execute complex distributed tasks.

NSF CICI SAFE, a project working to securely automate and monitor the creation of virtual super-facilities that link scientists to multiple resources. CICI SAFE automates the authorization and security monitoring needed to keep these very fast and dynamic network links safe.

“We will build on successful cyberinfrastructure projects developed here at RENCI, most of them with funding from the National Science Foundation,” said Castillo. “Through NSF support, RENCI has developed a number of cyberinfrastructure tools and environments that make science more productive. SciDAS will integrate those tools and work environments into a unified cyberinfrastructure tailored to support science applications at scale. It is a win for scientists and a way to extend the value of our funded projects.”

“The 21st century presents huge problems for scientists to solve and it also offers great opportunities to create a better quality of life,” Castillo added. “Our mission is to streamline the process of discovery and data analysis by bringing together domain scientists and cyberinfrastructure experts. We are not building one solution to fit all needs. Instead, we see SciDAS as a nationwide, and someday worldwide, CI ecosystem that is flexible and scalable to meet the evolving computing and data analysis needs of many scientific communities.”

Contact RENCI

919-445-9640

919-445-9669

media at renci.org

100 Europa Drive Suite 540
Chapel Hill, North Carolina 27517

About RENCI

RENCI (Renaissance Computing Institute) develops and deploys advanced technologies to enable research discoveries and practical innovations. RENCI partners with researchers, government, and industry to engage and solve the problems that affect North Carolina, our nation, and the world. An institute of the University of North Carolina at Chapel Hill, RENCI was launched in 2004 as a collaboration involving UNC Chapel Hill, Duke University, and North Carolina State University.