Open Repositories 2007 Plenary Session 2: Preservation

January 25, 2007

First this morning, MacKenzie Smith described the PLEDGE project, a collaboration between MIT Libraries and the San Diego Supercomputer Center (SDSC). PLEDGE encodes preservation policy in a machine-readable form (RDF against a couple of defined ontologies) and uses it to manage preservation by replicating content across a grid of computers (hence the SDSC tie-in). I find the idea really appealing, and I hope it could become the basis for describing preservation services between IRs and preservation service providers, for example.

Next up, Joan Smith of Old Dominion University described mod_oai, an Apache module that generates an OAI feed of the web content on an Apache server, with the aim of improving crawling (and hence archiving). I had two thoughts about this. The first was that much of this is overkill for the purposes of archiving: if the main problem is uncrawled content, then the Sitemaps approach is a more appropriate technology. The second was a hope that it would be straightforward to reuse the code or the approach to write a mod_sitemap, and how useful that would be!
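
To show how little a Sitemap asks of a server, here is a minimal sketch in Python of generating one from a list of URLs. It is not mod_oai and not the hypothetical mod_sitemap; the function name `build_sitemap` and the `repo.example.org` URLs are my own placeholders, and only the `urlset` / `url` / `loc` / `lastmod` elements and the namespace come from the Sitemaps 0.9 schema.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap document as a string.

    `entries` is an iterable of (url, lastmod) pairs, where lastmod is a
    W3C datetime string such as "2007-01-25", or None if unknown.
    (Hypothetical helper, for illustration only.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod is optional; crawlers can use it to skip unchanged pages.
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://repo.example.org/item/1", "2007-01-25"),
        ("http://repo.example.org/item/2", None),
    ]))
```

A real module would walk the document root (or the repository's item list) to produce the entries, but the output format would be no more complicated than this.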

The last presentation in the first session is from Miguel Ferreira of the University of Minho in Portugal, who is presenting CRiB, a distributed framework and architecture for preservation services. The framework analyses repository contents by format and offers a service that advises on migration strategies. Good stuff, as ever, from Miguel and the rest of the team at Minho. I’m going to have to collar him at coffee to ask how this might fit in with the Migration on Ingest work in the DSpace@Cambridge project (dev notes).

[…] I’m also unsure about syndication – I have a feeling that the resource representations in Atom / RSS feeds are unlikely to satisfy most repository clients’ needs. Isn’t a more resource-oriented approach to simply link to the resource and let the client negotiate with the resource for an appropriate representation? If so, Sitemaps fit the bill perfectly. […]
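
The "let the client negotiate" idea above is ordinary HTTP content negotiation: the client sends an Accept header and the resource answers with the best representation it has. A rough sketch of the server-side choice, in Python, might look like the following. The function names are hypothetical, and the matching is deliberately simplified: it takes the highest q-value among matching ranges and ignores the HTTP rule that a more specific media range overrides a less specific one.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0]
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((media_range, q))
    return prefs


def matches(media_range, media_type):
    """True if media_range (e.g. 'text/*') covers media_type (e.g. 'text/html')."""
    if media_range in ("*/*", media_type):
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return False


def choose_representation(accept_header, available):
    """Pick the available media type the client prefers, or None.

    Ties go to whichever type appears first in `available`.
    """
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for offered in available:
        q = max((q for rng, q in prefs if matches(rng, offered)), default=0.0)
        if q > best_q:
            best, best_q = offered, q
    return best
```

With this in place, a Sitemap entry only has to carry the resource's URL; a preservation client could ask for `application/pdf` and a browser for `text/html` at the same address.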