This release took us longer than expected. We had to deal with multiple issues and included new data. Most notable is the addition of the NIF annotation datasets for each language, recording the whole wiki text, its basic structure (sections, titles, paragraphs, etc.) and the included text links. We hope that researchers and developers, working on NLP-related tasks, will find this addition most rewarding. The DBpedia Open Text Extraction Challenge (next deadline Mon 17 July for SEMANTiCS 2017) was introduced to instigate new fact extraction based on these datasets.

The English Wikipedia has more than a hundred edits per minute. A large part of the knowledge in Wikipedia is not static, but frequently updated, e.g., new movies or sports and political events. This makes Wikipedia an extremely rich, crowdsourced information hub for events. We have created a dat

The DBpedia data set uses a large multi-domain ontology which has been derived from Wikipedia. The English version of the DBpedia 2014 data set currently describes 4.58 million "things" with 583 million "facts".