G-Store: A Storage Manager for Graph Data

Many modern applications are based on graph
data. Social networks, for instance, are based on graphs that describe
relationships among people. Paths of disease outbreaks form a graph, as do
airline routes and citations among academic papers. These graphs often contain
a massive amount of data.

Relational databases are today’s system of
choice for storing and querying large amounts of data. Their use has been
backed up by decades of research and commercially successful database
management systems such as Oracle, IBM DB2, and PostgreSQL. Relational
databases are extremely fast at finding, filtering, inserting, deleting, and
updating information in a database table. Joining information from several
tables, on the other hand, often takes orders of magnitude longer.

From a theoretical point of view, the
relational database model is not a good fit for representing highly
interconnected data. Regardless, as businesses and processes around the world
become more interconnected, relational databases are increasingly used for storing
such data. Perhaps one reason is the lack of a suitable, stable, and
well-supported alternative.

G-Store is a prototype of a storage manager
for large vertex-labeled graphs. G-Store exploits the structure of the
graph to derive a data placement on disk that is optimized for access patterns found
in graph queries. The placement strategy is based on a multilevel algorithm that
partitions the graph into pages and arranges these pages on disk to
minimize the distance on disk between adjacent vertices. G-Store has a built-in query
engine that supports depth-first traversal, reachability testing, shortest path search,
and shortest path tree search.
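
To make the placement goal concrete, the sketch below scores a vertex-to-page assignment by summing the page-number distance between the endpoints of every edge. This is a simplified illustration only: the function name placement_cost, the toy graph, and the cost formula itself are assumptions made for this example and are not taken from G-Store's source code, whose actual objective and multilevel algorithm may differ.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def placement_cost(edges: Iterable[Edge], page_of: Dict[str, int]) -> int:
    """Sum of page-number distances between the endpoints of every edge.

    Hypothetical cost function for illustration; it is an assumption of
    this sketch, not G-Store's actual objective.
    """
    return sum(abs(page_of[u] - page_of[v]) for u, v in edges)


# Toy graph with four vertices and four edges (made-up data).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

# Placement that ignores graph structure: neighbors land on distant pages.
scattered = {"a": 0, "b": 7, "c": 1, "d": 9}

# Placement that keeps adjacent vertices on the same or neighboring pages.
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}

print(placement_cost(edges, scattered))
print(placement_cost(edges, clustered))
```

Under this toy measure, the clustered assignment scores far lower than the scattered one, which is the kind of improvement a structure-aware placement aims for: a traversal that follows edges touches pages that lie close together on disk.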

System Prototype

G-Store consists of approximately 12,000 lines of C/C++ code and runs on Windows.
Visual C++ 2010 or this redistributable package must be installed. The latest version of the G-Store executable (version 0.11c) can be downloaded here. An experimental 64-bit version is available here.
G-Store is alpha software and currently requires administrator privileges to run. Source code can be downloaded here. It would be great if you could let us know how you plan to use G-Store.

You can download this dataset to try out G-Store on the Simple English Wikipedia graph. The graph has 0.1 million vertices (articles including redirects)
and 1.3 million edges (links between articles). Each vertex is labeled with the article’s title, first sentence,
creation date, creating user, and several IDs. To store the
graph with G-Store, extract the archive into the folder that contains the G-Store executable and enter RUN "schema.g"; at G-Store’s command line interface.
G-Store is now ready to answer queries. The following query finds eight paths with a length of four or less from article “Adam Smith” to either article “Bread” or article “Butter”: