Discoverable WW1 Content Providers

The Discovery programme Phase 1 report identified a number of key WW1 content providers whose metadata we should consider including in our aggregation API. On the flip side, as discussed in my previous post on preliminary strategy, we were keen to demonstrate the value of existing APIs. And so began points one and two of our strategy: the hunt for which of the providers already had APIs, and what those APIs were. We were hoping to find two or three existing APIs, and perhaps to work with other providers towards getting their metadata into an API, which could be either hosted by us or handed back to them.

Another important issue is licensing. The Phase 1 report gave useful indications that not all of the recommended providers carry an open license, which raises questions about our API offering their content up for reuse.

Our preliminary ideas for provider inclusion, as set out in http://ww1.discovery.ac.uk/prioritisation-matrix-draft-baseline-criteria-for-identifying-and-prioritising-discovery-datasets/, covered a number of different aspects of the data. From a licensing standpoint, we wanted some sort of open licensing statement and clear and reasonable terms and conditions. From an API perspective, we were looking for a documented access point which was well maintained and returned a widely understood data format. In terms of the underlying data, our target was documented information standards with persistent identifiers. Finally, almost as an afterthought, we were interested in whether providers were collecting data to measure use, and we began to think about what our aggregation API should do to honour that data, or indeed whether we wanted to capture our own.
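As a rough illustration, the criteria above could be recorded as a simple checklist per provider. This is only a sketch: the class name, field names and the `unmet_criteria` helper are my own hypothetical labels for the criteria described, not part of the prioritisation matrix itself.

```python
from dataclasses import dataclass, fields


@dataclass
class ProviderChecklist:
    """One yes/no answer per criterion (field names are illustrative only)."""
    open_licence: bool            # some sort of open licensing statement
    clear_terms: bool             # clear and reasonable terms and conditions
    documented_api: bool          # documented access point
    well_maintained: bool         # API is well maintained
    common_data_format: bool      # returns a widely understood data format
    documented_standards: bool    # documented information standards
    persistent_identifiers: bool  # records carry persistent identifiers


def unmet_criteria(checklist):
    """Return the names of the criteria a provider does not yet meet."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]
```

A list like this makes it easy to see at a glance where a provider might need help before their data can join the aggregation.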

So, with all that in mind, let’s take a look at the contenders! Simply browsing the websites of the providers on the Phase 1 list revealed two existing Solr APIs, an OpenSearch/Solr API, and talk of API developments with two more providers.

Europeana have an OpenSearch API which supports Solr fielded search. Access is limited to Europeana partners for now, but changes are coming: Europeana are working on moving their dataset to a CC0 license and opening up their API to everyone.
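To make "Solr fielded search" a little more concrete, here is a minimal sketch of how a fielded query might be assembled into a request URL. The base URL, the `searchTerms` and `startPage` parameter names, and the field names are all assumptions for illustration, not Europeana's documented interface.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.org/opensearch"


def fielded_query(field_values):
    """Join field/value pairs into a Solr-style fielded query string."""
    parts = []
    for name, value in field_values.items():
        # Quote values containing spaces so they are treated as a phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " AND ".join(parts)


def search_url(field_values, start_page=1):
    """Build a search URL carrying the fielded query (parameter names assumed)."""
    params = {"searchTerms": fielded_query(field_values), "startPage": start_page}
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, `fielded_query({"title": "great war", "date": "1916"})` produces a query restricted to those two fields, which `search_url` then URL-encodes.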

Closing off the preliminary findings, in-house at MIMAS we also host the Archives Hub and Copac, which both have SRU APIs and so could be included, although they don’t hold digital content.
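For readers new to SRU, a search is just an HTTP GET with a handful of standard parameters and a CQL query. The sketch below builds such a URL; the base URL is a placeholder, and both the protocol version and the `dc.title` index are assumptions that would need checking against what a given server supports.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, max_records=10, start_record=1):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",  # assumed; servers may expect a different version
        "query": cql_query,
        "maximumRecords": max_records,
        "startRecord": start_record,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL, not a real service address.
example = sru_search_url("https://sru.example.org/search", 'dc.title = "great war"')
```

The response comes back as XML, so the same small function could in principle be pointed at any SRU service.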

At this point we felt there was no reason not to also kick off point five: contacting other organisations from the Phase 1 list to ask about any work they may already have been doing to get an API going, or whether they would like our help in getting one off the ground. Welsh Voices of the Great War were happy to provide us with their metadata so that we could help with an API, although all their content is held at People’s Collection (mentioned above). We’ve also been in touch with Manchester City Archives and hope to assist them in a similar manner. Happily, the simple act of making contact with providers and introducing what we are doing also led to some more hits:

Finally (for now), the big name which was conspicuous by its absence from the above list, the Imperial War Museum, were also happy to get involved with what we are doing and gave us access to a preliminary Solr API which they have in production.

Contacting other providers is still in progress, so this post is by no means a final list of our findings! However, with points one, two and five from our preliminary strategy well under way, in the next post I’ll walk through my thought process, as a relative newcomer to the world of metadata APIs, while investigating the APIs mentioned here. I’ll also talk about some of our ideas for helping providers onto the API ladder.