A round-up of a few pieces of digital goodness to cheer up a damp and dark start to October:

What looks like a bumper new issue of the Journal of the Society of Archivists (shouldn’t it be getting a new name?) is published today. It has an oral history theme, but actually it was the two articles that don’t fit the theme which caught my eye for this blog. Firstly, Viv Cothey’s final report on the Digital Curation project, GAip and SCAT, at Gloucestershire Archives, with which I had a minor involvement as part of the steering group for the Society of Archivists’-funded part of the work. The demonstration software developed by the project is now available for download via the project website. Secondly, Candida Fenton’s dissertation research on the Use of Controlled Vocabulary and Thesauri in UK Online Finding Aids will be of interest to my colleagues in the UKAD network. The issue also carries a review, by Alan Bell, of Philip Bantin’s book Understanding Data and Information Systems for Recordkeeping, which I’ve also found a helpful way in to some of the more technical electronic records issues. If you do not have access via the authentication delights of Shibboleth, no doubt the paper copies will be plopping through ARA members’ letterboxes shortly.

Last night, by way of supporting the UCL home team (read: total failure to achieve self-imposed writing targets), I had my first go at transcribing a page of Jeremy Bentham’s scrawled notes on Transcribe Bentham. I found it surprisingly difficult, even on the ‘easy’ pages! Admittedly, my paleographical skills are probably a bit rusty, and Bentham’s handwriting and neatness leave a little to be desired – he seems to have been a man in a hurry – but what I found most tricky was not being able to glance at the page as a whole and get the gist of the sentence ahead at the same time as attempting to decipher particular words. In particular, I could not scan down the whole page looking for similar letter shapes. The navigation tools do allow you to pan and scroll, and zoom in and out, but when you’ve got the editing page up on the screen as well as the document, you’re a bit squished for space. Perhaps it would be easier if I had a larger monitor. Anyway, it struck me that this type of transcription task is definitely a challenge for people who want to get their teeth into something, not the type of thing you might dip in and out of in a spare moment (like indicommons on iPhone and iPad, for instance).

I’m interested in reward and recognition systems at the moment, and how crowdsourcing projects seek to motivate participants to contribute. Actually, it’s surprising how many projects seem not to think about this at all – the ‘build it and wait for them to come’ attitude. Quite often, it seems, the result is that ‘they’ don’t come, so it’s interesting to see Transcribe Bentham experiment with a number of tricks for monitoring progress and encouraging people to keep on transcribing. So, there’s the Benthamometer for checking on overall progress; you can set up a watchlist to keep an eye on pages you’ve contributed to; individual registered contributors can set up a user profile to state their credentials and chat to fellow transcribers on the discussion forum; and there’s a points system, depending on how active you are on the site, with a leader board of top transcribers. The leader board seems to be fuelling a bit of healthy transatlantic competition right at the moment, but given the ‘expert’ wanting-to-crack-a-puzzle nature of the task here, I wonder whether the more social / community-building facilities might prove more effective over the longer term than the quantitative approaches. One to watch.

Finally, anyone with the techie skills to mash up data ought to be welcoming The National Archives’ work on designing the Open Government Licence (OGL) for public sector information in the U.K. I haven’t (got the technical skills), but I’m welcoming it anyway, in case anyone who has those skills hasn’t yet seen the publicity about it, and because I am keen to be associated with angels.

Last Thursday I was delighted to attend the culminating workshop for the Society of Archivists’ (SoA) funded digital curation project at Gloucestershire Archives. As Viv Cothey, the developer employed by Gloucestershire Archives, has noted, “Local authority archivists may well be fully aware of the very many exhortations to do digital curation and to get involved but are frustrated by not knowing where to start”. Building upon previous work on a prototype desktop ingest packager (GAip), the SoA project set out to create a proof of concept demonstration of a ‘trusted digital store’ suitable for use by a local government record office. The workshop was an important outreach element of the project, aiming to build up understanding and experience of digital curation principles and workflow amongst archivists in the UK. I have been involved with the management board for the SoA project, so I was eager to see how the demonstration tools which have been developed would be received by the wider digital preservation and archivist professional communities.

Others are much better qualified than me to evaluate the technical approach that the project has taken, and indeed Susan Thomas has already blogged her impressions over at futureArch. For me, what was especially pleasing was to see a good crowd of ‘ordinary’ archivists getting stuck in with the demonstration tools – despite the unfamiliarity of the Linux operating system – and teasing out the purpose and process of each of the digital curation tools provided. I hope that nobody objects to my calling them ‘ordinary’ – I think they will know what I mean, and it is how I would describe myself in this digital preservation context.

Digital preservation research has hitherto clustered around opposite ends of a spectrum. At one end are the high level conceptual frameworks: OAIS and the like. At the other end are the practical developments in repository and curation workflow tools in the higher education, national repository, and scientific research communities. The problem here is the technological jargon which is frankly incomprehensible to your average archivist. Gloucestershire’s project therefore attempts to fill an important gap in current provision, by providing a set of training tools to promote experimentation and discourse at practitioner level.

I’ll be interested to see the feedback from the workshop, and it’d be good to see some attendee comments here…

Chris Prom’s talk yesterday at the Society of Archivists’ Data Standards Group meeting, on his Fulbright research ‘Tools for implementing Digital Preservation Standards’ for the ‘under resourced’ archive (presentation slides should be available here shortly), has finally spurred me into posting a roundup of projects which I’ve encountered over the last couple of months and which are specifically relevant to digital preservation in a small archives repository.

When I embarked upon my Churchill Fellowship in 2008, practical implementations of digital preservation research were only occurring in large repositories, usually at a national or sometimes state level. With the notable exception of the Paradigm project and related work at Oxford University, there had been few attempts to scale down the large programmes, or to package up the various tools available with the products of the digital library/repository world, as envisaged by the 2007 UNESCO report Towards an Open Source Archival Repository and Preservation System. The smaller programmes I did visit were generally concentrating on a niche subset of digital archives (for example, email or web archives).

Dedicated followers of digital preservation issues are probably already aware of the RODA repository created on a Fedora base by the Portuguese National Archives, and may have read this review of the demo site from another UK local archivist. Chris Prom is now embarking on a more formal assessment, and his blog postings on RODA (and the evaluation criteria he is using) make for worthwhile reading. RODA is likely to be of particular interest to UK-based archivists who use the collections management software package, CALM, since this is also in use at the Portuguese National Archives, although there doesn’t seem to have been any attempt to date to link the two together. ‘What happens with a hybrid accession?’ is the obvious question.

Chris also introduced yesterday’s audience to a new project, Archivematica, which is packaging already available open source preservation tools into a Linux Ubuntu-based virtual appliance. As the project’s wiki explains, ‘This means an entire suite of digital preservation tools is now available to the average archivist from one simple installation’. This is a really exciting development and I am looking forward to seeing the results of Chris’s evaluation. Archivematica is developed by the same Canadian team, Artefactual Systems, who are behind the ICA-AtoM archival description software commissioned by the International Council on Archives.

Closer to home, since I am involved on the board for one of the projects, it is remiss of me not to have mentioned before on this blog the digital curation work going on at Gloucestershire Archives, although the website itself has only been made available relatively recently. This work is the first real attempt to develop a practical digital curation architecture in a UK local authority archives setting (as opposed to simple re-use of existing tools, piecemeal). Plenty to explore here.

And finally, on a less technical level, but nevertheless, I think, an important development. At the sixth of the Society of Archivists’ roadshows in December 2009, I was delighted to hear of Kevin Bolton’s work in drawing up simple accessioning checklists for digital archives at Manchester Archives and Local Studies, and – most importantly – how these are being developed regionally for the North West, in conjunction with Cheshire Archives and Local Studies. Particularly at this time of economic recession (or are we supposed to be out of that now?) I believe it is vital that smaller archives pool their resources and work in partnership to find solutions to digital archives issues, and it is good to see a framework for the future being mapped out here in the North West.

Presentations from the successful open consultation day held at TNA on 12 November on digital preservation for local authority archivists are now available on the DPC website – including my report on my Churchill Fellowship research in the US and Australia. Also featured were colleagues from other local authority services already active in practical digital preservation initiatives – Heather Needham on ingest work at Hampshire, Viv Cothey reporting on his GAip tool developed for Gloucestershire Archives, and Kevin Bolton on web archiving work at Manchester City.