Rewriting Queries over Summaries of Big Data Graphs

The availability of huge amounts of graph-like data poses several data management challenges related to the representation, storage and querying of such data. On the one hand, there are standards such as the Resource Description Framework (RDF) and database solutions optimised for graph-like data. On the other hand, there are graph query languages offering different trade-offs between expressiveness and complexity of query evaluation. However, even languages with a polynomial evaluation bound may prove prohibitively expensive in practice. In such cases, where the data is intrinsically big, a promising approach is to apply optimisation techniques to the representation of the data.

The objective of this talk is twofold. The first aim is to introduce extended property paths, a significant extension of property paths, the navigational core of SPARQL. The second is to discuss the benefits that navigational queries over existing graph stores (such as RDF databases) can gain from a class of optimisations based on summaries. To make both extended property paths and summaries readily available in RDF stores, the talk introduces a translation from extended property paths to SPARQL queries that can be executed over summaries represented directly in RDF.
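
As a rough illustration of why summaries can help navigational queries, the following Python sketch builds one simple kind of summary, a quotient graph that groups nodes sharing the same set of outgoing predicates, and evaluates a fixed-length path over both the data and the summary. This is an assumption made for illustration only and is not necessarily the summary construction or the translation presented in the talk; the function names `summarise` and `eval_path` and the sample triples are hypothetical.

```python
from collections import defaultdict

def summarise(triples):
    """Quotient summary (illustrative assumption): nodes with the same
    set of outgoing predicates are merged into one block."""
    out_preds = defaultdict(set)
    nodes = set()
    for s, p, o in triples:
        out_preds[s].add(p)
        nodes.update((s, o))
    # Each block is identified by the frozenset of outgoing predicates.
    block_of = {n: frozenset(out_preds[n]) for n in nodes}
    # A summary edge exists whenever some data edge connects the two blocks.
    summary = {(block_of[s], p, block_of[o]) for s, p, o in triples}
    return block_of, summary

def eval_path(triples, start_nodes, predicates):
    """Follow a sequence of predicates from the start nodes and
    return the set of nodes reached at the end."""
    frontier = set(start_nodes)
    for pred in predicates:
        frontier = {o for s, p, o in triples if s in frontier and p == pred}
    return frontier

# Hypothetical sample data: (subject, predicate, object) triples.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("dave", "knows", "erin"),
    ("erin", "worksAt", "initech"),
]

block_of, summary = summarise(triples)

# The summary has fewer edges than the data (3 versus 5 here).
print(len(summary), len(triples))

# If a path has no match in the summary, it has no match in the data,
# so the (larger) data graph need not be consulted at all.
query = ["worksAt", "knows"]
if not eval_path(summary, {block_of["alice"]}, query):
    print("pruned: no answers possible")
else:
    print(eval_path(triples, {"alice"}, query))
```

The design choice here is that a quotient summary over-approximates the data: every path in the data has a corresponding path in the summary, so an empty result on the summary guarantees an empty result on the data, whereas a non-empty result on the summary still has to be checked against the data.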