Intel Open Sources Tool for Rapid Big Data App Development

Intel Corp. this week released an open source tool called GraphBuilder for rapid big data application development.

Currently in beta, GraphBuilder is designed "to help data scientists in industry and academia to rapidly develop new applications that draw insights from big data," said Connie Brown in an announcement on the Intel Web site. "Developed by Intel Labs, GraphBuilder is the first scalable open source library to take large data sets and construct them into 'Graphs,' web-like structures that outline relationships among data."

In a blog post, Ted Willke, principal scientist at Intel, explained that GraphBuilder was developed after the company noticed a lack of available tools for constructing such graphs from unstructured data sources. "So, we set out to develop a demo of a scalable graph construction library for Hadoop," Willke said. "Hadoop is not good for graph-based machine learning but graph construction is another story."

Willke said GraphBuilder rapidly constructs such large-scale graphs and also eases many of the complexities of manipulating graph data. "This makes it easy for just about anyone to build graphs for interesting research and commercial applications," Willke said. "In fact, GraphBuilder makes it possible for a Java programmer to build an Internet-scale graph for PageRank in about 100 lines of code and a Wikipedia-sized graph for [machine learning] in about 130."
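
As a rough illustration of what "graph construction" means in this context, the Java sketch below turns raw link records into a de-duplicated adjacency list. It does not use GraphBuilder's API and runs on a single machine rather than on Hadoop; the class name GraphSketch, the methods buildGraph and edgeCount, and the tab-separated "source, target" record format are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, single-machine sketch; not GraphBuilder's API.
public class GraphSketch {

    // Parse raw "source<TAB>target" records into an adjacency list.
    // Duplicate edges collapse because each vertex keeps a Set of targets.
    static Map<String, Set<String>> buildGraph(List<String> records) {
        Map<String, Set<String>> adjacency = new LinkedHashMap<>();
        for (String record : records) {
            String[] parts = record.split("\t");
            if (parts.length != 2) {
                continue; // skip records that are not a source/target pair
            }
            String source = parts[0].trim();
            String target = parts[1].trim();
            if (source.isEmpty() || target.isEmpty()) {
                continue; // skip records with a missing endpoint
            }
            adjacency.computeIfAbsent(source, k -> new LinkedHashSet<>()).add(target);
            // Register the target as a vertex even if it has no outgoing edges.
            adjacency.computeIfAbsent(target, k -> new LinkedHashSet<>());
        }
        return adjacency;
    }

    // Count directed edges by summing the out-degree of every vertex.
    static int edgeCount(Map<String, Set<String>> adjacency) {
        int edges = 0;
        for (Set<String> targets : adjacency.values()) {
            edges += targets.size();
        }
        return edges;
    }

    public static void main(String[] args) {
        // Assumed sample input: one duplicate edge and one malformed record.
        List<String> records = List.of("A\tB", "A\tC", "B\tC", "A\tB", "garbled line");
        Map<String, Set<String>> graph = buildGraph(records);
        System.out.println(graph);
        System.out.println("vertices=" + graph.size() + " edges=" + edgeCount(graph));
    }
}
```

A library aimed at Hadoop would spread the parsing and de-duplication steps across many machines; the sketch only shows the shape of the result, a set of vertices with the relationships among them.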

He explained that GraphBuilder was first demonstrated at a July workshop and was released this week as open source software under the Apache 2.0 license. More information is available in a whitepaper about GraphBuilder, which is now ready for download.