Graph Analytics: The Other Big Data

Most big data solutions today focus on "the search problem." That's according to Arvind Parthasarathi, president of YarcData, a Cray subsidiary based in Pleasanton, Calif. "The issue is finding the right answer to a question. It's what we call 'search,' and that's what Hadoop to everything else does very effectively," said Parthasarathi in a phone interview with InformationWeek.

But there's another side to big data, one that Parthasarathi calls "discovery." "Discovery is not about finding the right answer to a question," he said. "It's about seeking the right question to ask."

Discovery problems lend themselves to graph analytics. Graphs, which model data as entities and the relationships between them, are ubiquitous in government, commerce and science. They're very useful in visualizing relationships and patterns that might otherwise remain hidden in massive amounts of data. YarcData sells an enterprise-ready big data appliance called Ureka (as in, "Eureka, I've found it!"), a standalone box built for enterprises that use graph analytics to tackle discovery-related issues. The company sees a market here, as graph analysis of big data can bring commodity, x86-based hardware to its knees -- a problem that Ureka is designed to solve.

The technical problem involves locality of reference, the tendency of a program to access data that sits close together in memory. "Graphs have no locality of reference," Parthasarathi pointed out. "There's almost no relation between proximity and relation," he added. "Think of data as being spread out across a football field. You could be on the 50-yard line, and suddenly go all the way to the goal line to get one piece of data."
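The football-field analogy can be made concrete with a small sketch. This is an illustration only, not anything from YarcData: the graph layout, the function names (`build_scattered_graph`, `bfs_order`, `mean_jump`) and the use of node numbers as a stand-in for memory positions are all assumptions made for the example. It compares how far each step "travels" when scanning data in order versus following the edges of a graph.

```python
from collections import deque


def build_scattered_graph(n, stride):
    """Hypothetical graph: node i links to two nodes that are usually far away.

    Node numbers stand in for memory positions (an assumption of this sketch).
    """
    return [[(i + stride) % n, (i * 3 + 1) % n] for i in range(n)]


def bfs_order(adj, start=0):
    """Return the order in which a breadth-first traversal visits nodes."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                order.append(nbr)
                queue.append(nbr)
    return order


def mean_jump(order):
    """Average distance between consecutively accessed positions."""
    jumps = [abs(b - a) for a, b in zip(order, order[1:])]
    return sum(jumps) / len(jumps)


n = 1000
sequential = list(range(n))                           # scan the data in order
traversal = bfs_order(build_scattered_graph(n, 499))  # follow graph edges

print("sequential scan, mean jump:", mean_jump(sequential))
print("graph traversal, mean jump:", mean_jump(traversal))
```

The sequential scan moves one position at a time, while the traversal repeatedly leaps across the array, which is the access pattern Parthasarathi describes as going from the 50-yard line to the goal line for a single piece of data.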

This poses a problem for today's x86 systems, whose processors are significantly faster than memory and depend on caches that pay off only when nearby data is accessed together. Ureka, by comparison, has a graph accelerator processor and a shared memory model. Its XMT technology runs 128 hardware threads on a chip. "This allows us to deal with this locality of reference issue on the graph," Parthasarathi explained. "Imagine ingesting the entire football field into memory, and then being able to travel anywhere to anywhere without a performance penalty. That's our secret sauce."

To a computer network, the Ureka appliance appears as just another Linux server, albeit one with up to 512 terabytes of memory onboard. "The good news for enterprises is that we're not trying to replace anything. Every one of our customers has Hadoop, relational data warehouses and data appliances," said Parthasarathi. "[Those traditional big data tools] are great for search," he added. "We're augmenting them for discovery."

Parthasarathi acknowledged that Ureka isn't the only solution for graph analytics. "Graphs have been around for ages, and there are lots of software approaches to graphs," he said. "But the main problem with them is that they don't scale. I can show you a really nice demo about graph analytics. But the moment I start putting in any real data -- like a normal enterprise's data -- I'm suddenly going from a response time in minutes to days."

YarcData recently announced winners of its $100,000 challenge, in which contestants demonstrated real-world uses of graph analytics to solve big data problems.

First prize ($70,000) went to Dr. Brady Bernard, Andrea Eakin, and Dr. Ilya Shmulevich at the Seattle-based Institute for Systems Biology. Their entry analyzed more than 25 types of cancer across thousands of patients to gain insight into the biological networks that are disrupted or altered within a given cancer type, according to YarcData. The entry also identified potential drugs that could be repurposed to treat that cancer.

YarcData is optimistic that Ureka will help solve other big data discovery problems as well. "We're seeing use cases in everything from cybersecurity to finding new cures for cancer, to patient treatment and personalized medicine, to new trading strategies for financial firms," said Parthasarathi.
