“The slice of the business world captured by BI analytics is extraordinarily narrow,” says Steve Swoyer. Sure, BI can show us things such as the total sales of Product A in Region A over a certain period of time. You may also be able to enhance BI results by adding demographic data and/or GIS data. Your BI applications may even have limited predictive power. But something important is missing. That something, says Swoyer, is “the human behavioral backdrop to the facts it discloses. It tells us next to nothing about the conditions or events of the world in which these facts are situated.” Also limiting is the fact that “it isn’t easy to programmatically establish relationships across different data models.” As an example, he notes that the data in HBase can’t easily be connected to the data in Hive, which can’t easily be connected to text documents. The solution, he says, is found in graph database technologies.

Guy Harrison agrees. In his article The State of Big Data Management: Dispatches from the Database Revolution for Big Data Quarterly/Winter 2015 Industry Updates, he says, “Graph databases are an important alternative when the analysis of relationships between objects is of as much significance as the objects themselves.” Harrison notes that while a relational database can model relationships between objects through foreign keys and self-joins, an RDBMS generally hits performance issues when working with very large graphs, and SQL lacks an expressive syntax for working with graph data. NoSQL solutions are no better: a graph structure can be stored in a single document or object, but relationships between objects are not inherently supported because there are no joins.
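To make Harrison's point about foreign keys and self-joins concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` and `knows` tables and the sample names are hypothetical, invented purely for illustration. The point to notice is that every extra hop in the relationship chain needs another self-join on the edge table.

```python
import sqlite3

# Hypothetical schema: people, plus a "knows" table whose two foreign keys
# both point back at person. This is the relational way to store a graph.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE knows (
        src INTEGER NOT NULL REFERENCES person(id),
        dst INTEGER NOT NULL REFERENCES person(id)
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cal'), (4, 'Dee');
    INSERT INTO knows VALUES (1, 2), (2, 3), (3, 4);
""")

# Two hops from Ann: the edge table is joined to itself once.
two_hops = conn.execute("""
    SELECT p.name
    FROM knows AS k1
    JOIN knows AS k2 ON k2.src = k1.dst
    JOIN person AS p ON p.id = k2.dst
    WHERE k1.src = 1
    ORDER BY p.name
""").fetchall()

# Three hops from Ann: one more self-join, and so on for each further hop.
three_hops = conn.execute("""
    SELECT p.name
    FROM knows AS k1
    JOIN knows AS k2 ON k2.src = k1.dst
    JOIN knows AS k3 ON k3.src = k2.dst
    JOIN person AS p ON p.id = k3.dst
    WHERE k1.src = 1
    ORDER BY p.name
""").fetchall()

print(two_hops, three_hops)
```

On four rows this is instant. The sketch only illustrates how the query text grows with path length, not the performance behavior Harrison describes on very large graphs.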

What is a graph database?

A graph database is a collection of nodes and edges. Each node represents an entity (such as a person or thing) and each edge represents a connection or relationship between two nodes. For example, say we start off with two nodes, labeled Elizabeth and Andrew. Elizabeth is Andrew’s mother. Thus the “edge” is defined by an arrow pointing from Elizabeth to Andrew, labeled “Mother of”.

Thus, every node in a graph database is defined by a unique identifier and a set of outgoing and/or incoming edges. Each node is also defined by a set of properties expressed as key/value pairs. For example, for the node “Elizabeth” we can add the property Age: 35, and for the node “Andrew” we can add the property Age: 7. Likewise, each edge is defined by a unique identifier, a starting-place and/or ending-place node, and a set of properties. In our example, we can add an edge property to show that Andrew “lives with” Elizabeth. For a visual representation of this concept, watch Neo4j’s video, Intro to Graph Databases Series, Episode 3: Graph Modeling.
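The node-and-edge model described above can be sketched in a few lines of plain Python. This is an in-memory illustration only, not how Neo4j or any other graph database stores data. The class names (`Node`, `Edge`, `PropertyGraph`), the numeric identifiers and the `lives_with` property key are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int                                     # unique identifier
    properties: dict = field(default_factory=dict)   # key/value pairs


@dataclass
class Edge:
    edge_id: int                                     # unique identifier
    start: int                                       # id of the starting node
    end: int                                         # id of the ending node
    label: str
    properties: dict = field(default_factory=dict)   # key/value pairs


class PropertyGraph:
    """Toy property graph: dictionaries of nodes and edges keyed by id."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)
        return self.nodes[node_id]

    def add_edge(self, edge_id, start, end, label, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges[edge_id] = Edge(edge_id, start, end, label, properties)
        return self.edges[edge_id]

    def outgoing(self, node_id):
        return [e for e in self.edges.values() if e.start == node_id]

    def incoming(self, node_id):
        return [e for e in self.edges.values() if e.end == node_id]


graph = PropertyGraph()
graph.add_node(1, name="Elizabeth", age=35)
graph.add_node(2, name="Andrew", age=7)
# The "Mother of" arrow from Elizabeth to Andrew, carrying one edge property.
graph.add_edge(10, 1, 2, "MOTHER_OF", lives_with=True)

for edge in graph.outgoing(1):
    print(graph.nodes[edge.start].properties["name"], edge.label,
          graph.nodes[edge.end].properties["name"], edge.properties)
```

A real graph database would typically index edges by their endpoints so that `outgoing` and `incoming` do not scan every edge. The linear scan here just keeps the sketch short.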

Some use cases where graph databases work best

Graph databases work best when you need to analyze interconnections among data, such as when you are mining data from social media. They are also suited to analytic applications that involve complex relationships and dynamic schemas, such as supply chain management or recommending products to customers based on what others who made similar purchases bought.
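As a rough illustration of the recommendation use case, the sketch below treats customers and products as nodes and purchases as edges, then walks customer, product, other customer, other product. The customer names, products and scoring rule (weighting each suggestion by the number of shared purchases) are assumptions chosen for the example, not a description of any particular product's algorithm.

```python
from collections import Counter

# Hypothetical purchase edges: customer -> set of products bought.
purchases = {
    "alice": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "carol": {"tent", "lantern", "boots"},
    "dave": {"guitar"},
}


def recommend(customer, purchases):
    """Suggest products bought by customers who share purchases with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        shared = owned & items          # products both customers bought
        if not shared:
            continue                    # no connecting path in the graph
        for item in items - owned:      # products only the other customer has
            scores[item] += len(shared)
    # Highest score first; ties broken alphabetically for a stable order.
    return [item for item, _ in sorted(scores.items(),
                                       key=lambda kv: (-kv[1], kv[0]))]


print(recommend("alice", purchases))
```

In a graph database the same question would usually be expressed as a path pattern in the query language instead of nested loops, but the traversal it performs has the same shape.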

Swoyer notes that because graph databases can discover relationships that span disparate data models, they provide a way “to link transactional data stored in an operational system or data warehouse with a prediction of impending equipment failure, which could be derived from a Spark streaming analytics of telemetry data from multiple signalers. The graph database drives the relationships that link together the faulty part, that part’s inventory status, the locations of both faulty and replacement part, and the logistics of getting that replacement (and, if necessary, a technician) to the location.”

Michael Hunger points out that “instead of de-normalizing for performance, [graph databases let you] normalize interesting attributes into their own nodes, making it much easier to move, filter and aggregate along these lines. Content and asset management, job-finding, recommendations based on weighted relationships to relevant attribute-nodes are some use cases that fit this model very well.” He notes that another use for graph databases is to take advantage of their high-performance online query capabilities. The idea is to “process large amounts or high volumes of raw data with Map/Reduce in Hadoop or Event-Processing (like Storm, Esper, etc.) and project the computation results into a graph. We’ve seen examples of this from many domains from financial (fraud detection in money flow graphs), biotech (protein analysis on genome sequencing data) to telco (mobile network optimizations on signal-strength-measurements).”
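Hunger's point about turning attributes into their own nodes can be sketched with a small job-finding example. Instead of keeping a skills list as a property on each candidate, each skill becomes a node and a `HAS_SKILL` edge links candidates to it. The candidate names, skills and edge label are hypothetical, and the plain tuples stand in for whatever storage a real graph database would use.

```python
from collections import defaultdict

# Hypothetical edges: (candidate node, label, skill node).
edges = [
    ("ana", "HAS_SKILL", "python"),
    ("ana", "HAS_SKILL", "sql"),
    ("ben", "HAS_SKILL", "python"),
    ("cho", "HAS_SKILL", "sql"),
    ("cho", "HAS_SKILL", "spark"),
]

# Index incoming edges per skill node so filtering starts at the attribute.
incoming = defaultdict(set)
for start, label, end in edges:
    if label == "HAS_SKILL":
        incoming[end].add(start)


def candidates_with(skill):
    """Filter: follow the incoming HAS_SKILL edges of one skill node."""
    return sorted(incoming.get(skill, ()))


# Aggregate: count the edges arriving at each skill node.
skill_counts = {skill: len(people) for skill, people in incoming.items()}

print(candidates_with("python"))
print(skill_counts)
```

Because the skill is a node and not a value buried in each candidate record, filtering and aggregating both start from that node and follow its edges, which is the kind of movement along attributes that Hunger describes.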