Team presentation

The appearance of linked data on the web calls for novel database management technologies for linked data collections. The classical challenges of database research now need to be raised for linked data: how to define exact logical queries, how to manage dynamic updates, and how to automate the search for appropriate queries. In contrast to mainstream linked open data, the LINKS project will focus on linked data collections in various formats, under the assumption that the data is correct in most dimensions. The challenges remain difficult due to incomplete data, uninformative or heterogeneous schemas, and the remaining data errors and ambiguities. We will develop algorithms for evaluating and optimizing logical queries on linked data collections, incremental algorithms that can monitor streams of linked data and manage dynamic updates of linked data collections, and symbolic learning algorithms that can infer appropriate queries for linked data collections from examples.

Research themes

We will develop algorithms for answering logical queries on heterogeneous linked data collections in hybrid formats, distributed programming languages for managing dynamic linked data collections and workflows based on queries and mappings, and symbolic machine learning algorithms that can link datasets by inferring appropriate queries and mappings. Our main objectives are structured as follows:

Querying heterogeneous linked data. We will develop new kinds of schema mappings for semi-structured datasets in hybrid formats including graph databases, RDF collections, and relational databases. These induce recursive queries on linked data collections for which we will investigate evaluation algorithms, static analysis problems, and concrete applications.

Managing dynamic linked data. In order to manage dynamic linked data collections and workflows, we will develop distributed data-centric programming languages with streams and parallelism, based on novel algorithms for incremental query answering. We will also study the propagation of updates of dynamic data through schema mappings, and investigate static analysis methods for linked data workflows.

Linking graphs. Finally, we will develop symbolic machine learning algorithms for inferring queries and mappings between linked data collections in various graph formats from annotated examples.