Sources of semi-structured, overlapping, and semantically related data on the Web are currently proliferating at a phenomenal
rate, which has created a demand for more powerful and flexible information systems (ISs). This new generation of ISs will
need to integrate incomplete and semi-structured information from heterogeneous sources, employ rich and flexible schemas,
and answer queries by taking into account both knowledge and data.

Ontology-based data access has recently
been proposed as an architectural principle for such systems. The main idea is to develop a unified view of the data by describing
the relevant domain in an ontology, which then provides the vocabulary used to ask queries. The IS can use ontological statements,
such as the concept hierarchy, to derive new facts and thus enrich query answers with implicit knowledge. This idea has been
incorporated into systems such as QuOnto, Owlgres, ROWLKit, and REQUIEM, and ontology reasoners such as RACER, FaCT++, Pellet,
and HermiT.
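The enrichment of query answers via the concept hierarchy can be illustrated with a toy sketch. The individuals, concepts, and the statement "Professor is a kind of AcademicStaff" below are hypothetical examples, not taken from any of the systems named above:

```python
# Explicit facts asserted in the data: (individual, concept) pairs.
facts = {("mary", "Professor"), ("john", "Lecturer")}

# Concept hierarchy from the ontology: subconcept -> direct superconcept.
hierarchy = {"Professor": "AcademicStaff", "Lecturer": "AcademicStaff"}

def answer(concept):
    """Return all individuals in `concept`, including those entailed
    by walking up the concept hierarchy."""
    result = set()
    for ind, c in facts:
        # Follow the subsumption chain upwards from the asserted concept.
        while c is not None:
            if c == concept:
                result.add(ind)
                break
            c = hierarchy.get(c)
    return result

# Neither individual is asserted to be AcademicStaff, yet both are answers:
print(answer("AcademicStaff"))  # {'mary', 'john'} (in some order)
```

A real ontology reasoner handles far richer statements than a single subsumption chain, but the principle, deriving implicit facts from the hierarchy before answering, is the same.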

Such systems suffer from two main problems. First, the modelling capabilities of ontology languages
are often insufficient for practical use cases. In order to achieve favourable computational properties, ontology languages
are usually capable of describing only "tree-shaped" relationships; furthermore, (with some notable exceptions)
they usually support only unary and binary predicates. Finally, ontology languages typically employ the open world assumption;
however, when answering queries over large amounts of data, the closed world assumption (CWA) is often more appropriate.
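The contrast between the two assumptions can be made concrete with a minimal sketch; the individuals and the `Student` predicate are invented for the example:

```python
# A single asserted fact; nothing is said about bob.
facts = {("alice", "Student")}

def holds_cwa(ind, concept):
    # Closed world: anything not derivable from the data is false.
    return (ind, concept) in facts

def holds_owa(ind, concept):
    # Open world: absence of a fact means "unknown", not "false".
    return True if (ind, concept) in facts else None  # None = unknown

print(holds_cwa("bob", "Student"))  # False: bob is assumed not to be a student
print(holds_owa("bob", "Student"))  # None: the data may simply be incomplete
```

Under the CWA a query for non-students would return bob; under the OWA it would not, since the data does not entail that bob is not a student.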

Second, query answering facilities in existing ontology-based ISs typically do not scale to data sets commonly encountered
in practice. Up to now, approaches to addressing this problem have focused on reducing the expressivity of the ontology language
even further in order to obtain formal tractability guarantees. This obviously exacerbates the first problem (restricted modelling
capabilities), while not necessarily delivering robust scalability in practice.

Database theory and practice
can provide partial solutions to these problems. In databases, complex domains can be described using dependencies. Dependencies
play two distinct roles: they often serve as integrity constraints, that is, checks that verify whether a database
instance contains all the data required by the domain description; however, dependencies can also be used, much like ontologies,
to derive implicit knowledge. Treating dependencies as integrity constraints and answering queries under the CWA has allowed practical
relational database management systems (RDBMSs) to scale to very large data sets.
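The two readings of a dependency can be sketched for a single tuple-generating dependency, `Professor(x) -> exists y. Teaches(x, y)`; the relation names and the `_null` labelling scheme are illustrative assumptions, not a specific system's notation:

```python
# Database instance: mary is a professor, but no Teaches facts exist yet.
professors = {"mary"}
teaches = set()  # (professor, course) pairs

def check_constraint():
    """Integrity-constraint reading: report every professor for whom
    the dependency is violated (no course taught)."""
    return {p for p in professors if not any(t[0] == p for t in teaches)}

def chase_step():
    """Deductive reading (one chase step): satisfy the dependency by
    inventing a fresh labelled null for the missing course."""
    for i, p in enumerate(sorted(check_constraint())):
        teaches.add((p, f"_null{i}"))

print(check_constraint())  # {'mary'}: as a constraint, the instance is rejected
chase_step()
print(check_constraint())  # set(): as a rule, the missing fact was derived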

Database techniques alone
do not, however, satisfy all the requirements for an ontology-based IS. In particular, dependencies often cannot model arbitrarily
large structures and thus do not cover all practical modelling use cases. Furthermore, generalising the query answering techniques
used in practical RDBMSs to the case where information-deriving dependencies must be taken into account remains an open problem.

We therefore believe that the next generation of ontology-based ISs should be based on a synthesis and an extension of ontology
and database systems and techniques, providing data handling capabilities similar to current RDBMSs, but with schemas that
are rich, flexible, and tightly integrated with the data. In order to achieve this ambitious goal, however, a number of challenging
fundamental problems must be solved. First, ontology and dependency languages need to be unified in a coherent theoretical
framework. Second, it will be necessary to identify fragments of the framework that are likely to exhibit robust scalability
but can still support realistic use cases. Third, it will be necessary to devise effective algorithmic techniques that can
form the basis of practical ISs.