We are happy to announce the public beta of the lobid API. After ten months of work by the lobid team (especially Fabian and Pascal, who are doing the software development), the lobid API beta was deployed at the end of August. Our technology stack consists of Metafacture for transforming the raw data to N-Triples, Hadoop for enrichment and conversion to JSON-LD, Elasticsearch for indexing and data delivery, and the Play framework for the HTTP API. The software is developed on GitHub. For usage instructions and API details, see the API documentation on the API homepage.

The lobid API technology stack and data flow

Providing easy access to authority data

Besides the data from the whole hbz union catalog now being available through the API [1], we think that the lobid API will be an especially useful tool for interacting with the authority data provided by lobid. Continuing the original idea of lobid, the lobid API provides access to the data from the German ISIL registry (the MARC Organization Codes Database will be added to the API index later, see below). We decided to also add functionality to search for persons from the German Integrated Authority file (GND) and will add other GND data in the future.

In effect, libraries have been creating linked data for decades now, with subject, person, and title records referencing each other. Authority data has always constituted a hub in the library data world that many local datasets point to. Library authority data is also relevant beyond library catalogs (e.g. for open access repositories, which often do not make use of authority data yet, or for a website collecting job offers by libraries). To be attractive linking points, however, authority files should be easy to use. We hope to help library authority files move in this direction by enabling easy integration of the German ISIL registry as well as GND data into web forms via the lobid API. The API homepage provides JavaScript examples of how to use the API in web forms.

Screenshot of lobid API auto suggest use case
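
To make the auto-suggest use case a bit more concrete, here is a minimal Python sketch of what a web form backend might do: build a query URL from the text a user has typed and pull display labels out of the JSON records that come back. The base URL, the endpoint names, the `name` query parameter and the `label` key are all assumptions made for illustration; the real values are in the API documentation.

```python
from urllib.parse import urlencode

# Placeholder base URL, NOT the real API address (assumption for this sketch).
API_BASE = "http://api.example.org"


def build_suggest_url(endpoint, term, base=API_BASE):
    """Build a query URL for an auto-suggest lookup.

    `endpoint` (e.g. "person" or "organisation") and the `name`
    parameter are hypothetical names used only in this sketch.
    """
    # urlencode escapes spaces and special characters in the search term.
    query = urlencode({"name": term.strip()})
    return f"{base}/{endpoint}?{query}"


def labels_from_response(records, key="label"):
    """Collect display labels from a list of JSON records.

    The `label` key is an assumption about the response shape;
    records without that key are skipped.
    """
    return [record[key] for record in records if key in record]
```

A form handler would call `build_suggest_url("person", user_input)`, fetch the result, parse it as JSON and pass the records to `labels_from_response` to fill the suggestion list.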

Some historical background

There had already been numerous ways of accessing lobid data before we developed the API: you can get the data in different RDF serializations via content negotiation [2], you can get a full dump of the bibliographic data, you can query the SPARQL endpoint at http://lobid.org/sparql, and the HTML pages are enriched with RDFa. Why add another data access mechanism? Besides the additional possibilities an API provides (e.g. the auto-suggest functionality described above) and making the data more accessible to web developers by serving JSON-LD, there were also performance-related reasons that made us switch to providing LOD via Elasticsearch. In November 2012, Pascal Christoph presented our plans at SWIB12 (slides, video) to move to a JSON-LD- and Elasticsearch-based approach for publishing LOD, as the triple-store-based approach had performance issues. Soon after that, we broadened the approach to not only provide content negotiation and RDFa but to also offer an easy-to-use LOD-based web API using JSON-LD, which we now make available for public testing.
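
For readers unfamiliar with content negotiation: the client asks for a particular serialization by sending an HTTP `Accept` header with the matching MIME type. The following sketch prepares such a request with Python's standard library. The resource URL is a placeholder, and which of the listed serializations a given server actually honors is an assumption, not something this post specifies.

```python
from urllib.request import Request

# Registered MIME types for common RDF serializations.
RDF_MIME_TYPES = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "ntriples": "application/n-triples",
    "jsonld": "application/ld+json",
}


def negotiated_request(resource_url, serialization):
    """Prepare a GET request that asks for one RDF serialization.

    Raises KeyError if `serialization` is not a key of RDF_MIME_TYPES.
    The request is only constructed here, not sent.
    """
    mime_type = RDF_MIME_TYPES[serialization]
    return Request(resource_url, headers={"Accept": mime_type})
```

Passing the returned object to `urllib.request.urlopen` would send the request; the server then decides, based on the `Accept` header, which representation of the resource to return.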

Prospects

We're planning to replace the current lobid.org site with the new implementation. Since February 2012, lobid.org has been based on the data in our triple store (4store), rendered by Phresnel. We're also going to add more data: we want to make all GND data available (not only persons but also subjects etc.) and integrate the open data from the Cultural Heritage Organizations vocabulary, which is essentially built from the MARC Organization Codes Database. For details and more plans, see our open issues and milestones on GitHub.

Footnotes

[1] As of September 2013, the last two libraries from the hbz library network, ULB Düsseldorf and UB Paderborn, have decided to join the open data initiative that started in March 2010. Thus, after three and a half years of arguing for open data, the entire hbz catalog is now openly licensed under CC0.