Libraries and archives carry the responsibility of capturing and preserving 'representative samples of society', covering both cultural and scientific production. In the digital world this obligation has extended to include not only diverse physical outputs (books, journals, music records, newspapers, etc.), but also the digital preservation of both analogue and digitally born materials. Here we describe a subproject of a larger user study aimed at identifying user requirements for the digital preservation of documents, records and data sets. The central message of this study is that requirements reflect usage type rather than user type.

A preliminary model of general user requirements for preservation planning was developed based on qualitative studies, including cultural data probes and contextual design adjusted for this specific model (Snow et al., D-Lib Magazine, 14 (5/6), 2008). These initial studies focused on users of archives (in the Netherlands), users of data centres (in Scotland), and users of libraries (in Denmark), and assembled the identified general requirements into a first-round preservation planning model (see link below). To refine and improve this model, and to identify more specific requirements, a final qualitative approach was chosen that explored differences among user groups and collections rather than common themes.

It might be anticipated that users of archives, data centres, and libraries would produce very different requirements for a preservation planning tool. However, our study suggests that this is not the case: it is the type of usage, rather than the type of user, that determines the specific requirements. For example, a scientist using a data centre of natural science information may have little concern for appearance, since the data may just be tables of numbers. In contrast, an archival user is often dealing with scanned or digitized information and may be very concerned with the completeness or resolution of archival documents. An archivist or preservation officer at either of these organizations would take these intended uses into consideration when selecting requirements for preservation planning. Based on our findings of different types of usage, we propose a series of questions to be asked during the preservation planning process; depending on the answers, the priorities of the requirements change. These questions are not focused directly on libraries, archives and data centres, as those boundaries are somewhat artificial; any type of usage can occur in any type of institutional setting. We have found that the following six central questions capture the different usage needs and scenarios:

Is the content digital-born?

Is this content likely to be represented in paper/analogue format?

Is the appearance of this content relevant?

Should the content be searchable?

Should it be possible to alter or edit a personal copy of the content?

Should it be possible to verify the provenance of the content?

By combining these simple questions with a tree of abstract requirements, the requirements can be weighted to indicate their relevance, on the basis of which a decision support tool can be designed.
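
As a rough illustration of how yes/no answers to the six questions might re-weight a requirements tree, consider the minimal sketch below. It is not the PLANETS model or tool: the requirement names, the tree shape, the question identifiers and the multipliers are all hypothetical and chosen only to show the mechanism.

```python
from dataclasses import dataclass, field

# Short identifiers for the six questions (identifiers are hypothetical).
QUESTIONS = {
    "digital_born": "Is the content digital-born?",
    "paper_likely": "Is this content likely to be represented in paper/analogue format?",
    "appearance_relevant": "Is the appearance of this content relevant?",
    "searchable": "Should the content be searchable?",
    "editable_copy": "Should it be possible to alter or edit a personal copy of the content?",
    "provenance": "Should it be possible to verify the provenance of the content?",
}


@dataclass
class Requirement:
    """A node in an illustrative requirements tree."""
    name: str
    base_weight: float = 1.0
    # Question id -> multiplier applied when that question is answered "yes".
    boosts: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def weigh(node, answers, path=""):
    """Return {leaf path: weight} for every leaf requirement under node."""
    full = f"{path}/{node.name}" if path else node.name
    if node.children:
        weights = {}
        for child in node.children:
            weights.update(weigh(child, answers, full))
        return weights
    weight = node.base_weight
    for question, factor in node.boosts.items():
        if answers.get(question, False):
            weight *= factor
    return {full: weight}


# Made-up tree and multipliers; not every question has to affect every leaf.
TREE = Requirement("content", children=[
    Requirement("appearance", children=[
        Requirement("layout", boosts={"appearance_relevant": 3.0, "paper_likely": 2.0}),
        Requirement("image_resolution", boosts={"appearance_relevant": 3.0}),
    ]),
    Requirement("access", children=[
        Requirement("full_text_search", boosts={"searchable": 3.0}),
        Requirement("editable_format", boosts={"editable_copy": 3.0}),
    ]),
    Requirement("authenticity", children=[
        Requirement("provenance_metadata", boosts={"provenance": 3.0}),
    ]),
])

# Two usage scenarios loosely modelled on the examples in the text.
numeric_data_use = {"digital_born": True, "searchable": True}
scanned_record_use = {"appearance_relevant": True, "provenance": True}

for label, answers in [("numeric data", numeric_data_use),
                       ("scanned records", scanned_record_use)]:
    print(label)
    for leaf, weight in sorted(weigh(TREE, answers).items()):
        print(f"  {leaf}: {weight}")
```

In this sketch the answers describe a usage scenario rather than an institution, so the same tree yields different priorities for the same collection when it is used in different ways.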

Archive users were presented with experimental sets of records, including three representations of an original WordPerfect 5.1 document and two representations of an e-mail, along with different representation methods for metadata. The users were asked to reflect and comment upon the representations during the 1½-hour session and to indicate, among other things, their preference for a specific representation, the reasons for this, and their preferences for availability of metadata.

In the initial phase of the study, users of data centres provided actual sample documents representing a cross-section of their daily work. The samples were subsequently manipulated to create a series of variations emulating migrations that may occur during preservation actions. These variations were presented to the participants at the second interview, where the alterations were discussed and commented on.

Users of libraries and their collections, represented by a group of active researchers, were invited to a three-hour workshop on the theme ‘original content and potentially migrated or altered content’. They were asked to comment on and discuss cases from three different collections using their individual professional experiences as the basis for their reflections. The collections targeted were: the Danish national collection of newspapers, represented by different digital and paper versions of an article in Politiken (a national newspaper), examples from the digitization of the collected works of Søren Kierkegaard, and examples from the Web Archive (a comprehensive archive of Danish web pages).

Figure 1: Preservation planning model.

The three user groups were thus approached in slightly different ways. In previous studies, affinities between user groups were important; in this study, however, the focus was on clarifying existing requirements, discovering new requirements, and identifying differences in usage across the three target areas.

This article draws upon the ‘Report on usage models for libraries, archives and data centres: final results’ by John W. Pattenden-Fail (HATII, University of Glasgow, UK), Bart Ballaux (The Nationaal Archief of the Netherlands, The Hague, NL), Laura Molloy (HATII, University of Glasgow, UK), Jørn Thøgersen (State and University Library, Aarhus, DK), Filip Kruse (State and University Library, Aarhus, DK), and Annette Balle Sørensen (State and University Library, Aarhus, DK) [unpublished] under the auspices of PLANETS (Preservation and Long-term Access via NETworked Services).