New software needed to HAGGLE with a new kind of processor

Northwest researchers are designing software for a new DARPA-funded processor, now in development, that can tackle data arranged in graphs, which are made up of nodes, or data points, connected by lines called edges.


RICHLAND, Wash. —
Analysis of big data that can reveal early signs of an Ebola outbreak or the first traces of a cyberattack requires a different kind of processor than has been developed for large-scale scientific studies. Since the data might come from disparate sources (say, medical records and GPS locations in the case of Ebola), they are organized in such a way that conventional computer processors handle them inefficiently.

Now, the military research organization DARPA has announced a new effort to build a processor for this kind of data — and the software to run on it. A group of computer scientists at the Department of Energy's Pacific Northwest National Laboratory will receive $7 million over five years to create a software development kit for big data analysis.

"Our software development kit will support a high-level, easy-to-use programming environment for both average and expert programmers," said computer scientist John Feo at PNNL. "We also expect it to achieve the program's goal of one thousand-fold improvement over current technology in data processing efficiency."

Conventional processors work best with structured data, such as that found in science or an online store, where items are arranged in tables of prices, descriptions and other categories. But for applications such as cybersecurity, tracking disease outbreaks or analyzing the power grid, data come from a variety of sources: emails, webpages or social media apps in the case of cybersecurity, or generating stations, transformers and homes in the case of the power grid.

This type of data, described as unstructured, is splayed out in nodes linked by edges, like stars in constellations. In this arrangement, the relationships among nodes (the computers in a network or power plants on the grid) are represented by the edges (the Wi-Fi links between computers or the power lines on the grid). The nodes and edges form an image called a graph, which the new hardware and software will be designed to process and analyze.
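
For readers who want to see the idea concretely, the nodes-and-edges arrangement described above can be sketched in a few lines of Python. This is only an illustration of a graph held as an adjacency list, not the software PNNL is developing. The device names, the `edges` list and the helper functions `build_graph` and `neighbors` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical example data: computers (nodes) and Wi-Fi links (edges).
edges = [
    ("laptop", "router"),
    ("phone", "router"),
    ("router", "server"),
]


def build_graph(edge_list):
    """Build an undirected adjacency list: node -> set of linked nodes."""
    graph = defaultdict(set)
    for a, b in edge_list:
        # Each edge is recorded in both directions because the link is two-way.
        graph[a].add(b)
        graph[b].add(a)
    return graph


def neighbors(graph, node):
    """Return the nodes directly linked to `node`, sorted for stable output."""
    return sorted(graph.get(node, ()))


graph = build_graph(edges)
print(neighbors(graph, "router"))  # ['laptop', 'phone', 'server']
```

Answering a question such as "which machines are linked to the router?" means following edges from node to node rather than scanning the rows of a table, which is the access pattern the article says conventional processors handle inefficiently.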

Pacific Northwest National Laboratory draws on signature capabilities in chemistry, earth sciences, and data analytics to advance scientific discovery and create solutions to the nation's toughest challenges in energy resiliency and national security. Founded in 1965, PNNL is operated by Battelle for the U.S. Department of Energy's Office of Science. DOE's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time.