Technology tips, tricks, and tidbits from a Systems Librarian


I was just browsing through the June/July 2013 issue of ALIA’s “Incite” when I chanced across an article titled “Digital Heritage Collections”, which discusses the ongoing efforts to collect and preserve digital materials at National & State Libraries Australasia (NSLA).

Apparently, the Australian government is going to change legal deposit legislation to include digital materials, which is both exciting and challenging. It seems the NSLA has had, and will continue to have, a lot of work to do. You can read more about their ongoing efforts here: http://www.nsla.org.au/projects/digital-preservation

I find digital preservation to be a very intriguing aspect of modern libraries and archives. Professionally, I have very little hands-on experience with digital preservation, although I know people involved in teaching preservation, practising conservation, helping with digital preservation in universities, and working on digital preservation software (primarily Archivematica https://www.archivematica.org/).

I have taken a full-length course in print and digital preservation, and I have done some research into digital preservation, including writing a paper on the best file formats to choose for born-digital materials, but that’s about as far as I’ve gone at this point.

On one hand, everyone should be thinking about digital preservation: not only library and archives professionals, but anyone with an interest in the short- to long-term preservation of their digital materials. How many of us can still access text documents that we wrote 10-20 years ago? I probably have book reports from primary school on a floppy disk somewhere, but floppy disk drives are becoming harder to find (although where there is a will there is a way; I’m confident of that), and how many programs can still read those formats? I don’t even know what format or program I used back in the day (I think it was called WPP, but that isn’t a very useful search term). Sure, you might think, those old book reports aren’t important. But how many people wrote their memoirs, or other creative and intellectual works, using that same program at that time? Probably quite a few. What has been lost already, and what could still be lost?

On the other hand, who has the time or energy for digital preservation? As mentioned in the “Digital Heritage Collections” article, “the preservation of digital assets is an active process”. There is a reason why people are employed full-time to do this work: it is ongoing and never-ending. There are many questions to ask:

1. Assessment: What is preserved and what is not preserved?

2. Format: Do you continually reformat (i.e. migrate file formats), or do you use software emulation to reproduce the material in its original environment and original format? Where do you draw the line with emulation? Do you emulate the application? The operating system? The hardware? The bugs?

3. Integrity: Are the materials retaining their integrity over time? Are they slowly corrupting or losing data? Did a format migration completely or partially corrupt the binary data? Who (or what) is going to check the integrity of all the materials, especially when a collection can run to thousands or millions of items? Is the format of the material considered stable and long-lasting?

4. Authenticity: Are the preserved materials true and authentic representations of the original? Are they experienced in the same way as the original? (Both questions are very important in archives.)

5. Storage: Where are these digital materials stored? In a database of their own? On a file system? Physical servers and storage media are also subject to chemical degradation and mechanical failure over time. Do you back up your database and/or file system? Do you replicate your storage across multiple systems to ensure data redundancy?

6. Access: Once you’ve stored your files, how do you make them accessible? Do you make access copies (often PDFs in the case of text documents)? In the case of emulation, how do you expose the emulated system to end users in a way that ensures authenticity?
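To make the integrity and storage questions a little more concrete, here is a minimal sketch of a fixity check in Python. It assumes the preserved materials are ordinary files under one directory, and the function names (`sha256_of`, `build_manifest`, `audit`) are my own hypothetical choices rather than anything from a real preservation system. The idea is simply to record a checksum for every file once, then compare against that record later.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously recorded manifest."""
    current = build_manifest(root)
    return {
        # Files recorded earlier that can no longer be found.
        "missing": sorted(set(manifest) - set(current)),
        # Files present now that were never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
        # Files whose bytes no longer match the recorded checksum.
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
    }
```

Running `audit` on a schedule is one way for a "what" rather than a "who" to check thousands or millions of items. Pointing the same manifest at a second copy of the collection would show whether the replica still matches the original. A changed checksum only says that the bytes differ; it cannot say why, and it says nothing about whether the format is still readable.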

For more information on these questions and how to answer them, research digital preservation, ask digital preservation experts, and maybe take a look at some of these links:

The OAIS (Open Archival Information System) reference model, and in particular its functional model, is a useful way of thinking about the process of digital preservation.

There’s also much more to digital preservation than what I’ve written here. I haven’t talked at all about metadata attachment or encapsulation. Review those later links (and the Archivematica website) for more information on the rabbit hole that is digital preservation.

It’s a fascinating area of library and archives work, and while it might not be very practical for the everyday person, it’s an absolute necessity at the institutional and national level.

In OAIS terms, material moves through three kinds of information package:
The Submission Information Package (SIP) is what people give to the librarians and archivists.
The Archival Information Package (AIP) is what is preserved and stored.
The Dissemination Information Package (DIP) is what is shown to end users.
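
The three package types above can be sketched as a toy data model in Python. This is only an illustration under heavy assumptions: real OAIS packages carry far more (descriptive and preservation metadata, packaging information and so on), and every class, field, and function name here is hypothetical rather than taken from the OAIS standard or from Archivematica.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package: what a depositor hands over."""
    depositor: str
    files: dict[str, bytes]


@dataclass
class AIP:
    """Archival Information Package: what is preserved and stored."""
    identifier: str
    files: dict[str, bytes]
    checksums: dict[str, str]
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class DIP:
    """Dissemination Information Package: what end users are shown."""
    identifier: str
    access_files: dict[str, bytes]


def ingest(sip: SIP, identifier: str) -> AIP:
    """Turn a submission into an archival package, recording checksums."""
    checksums = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sip.files.items()
    }
    return AIP(
        identifier=identifier,
        files=dict(sip.files),
        checksums=checksums,
        metadata={"depositor": sip.depositor},
    )


def disseminate(aip: AIP, wanted: list[str]) -> DIP:
    """Build an access package, refusing files that fail a fixity check."""
    access = {}
    for name in wanted:
        data = aip.files[name]
        if hashlib.sha256(data).hexdigest() != aip.checksums[name]:
            raise ValueError(f"fixity check failed for {name}")
        access[name] = data
    return DIP(identifier=aip.identifier, access_files=access)
```

For example, `ingest(SIP("Jane", {"memoir.txt": b"chapter one"}), "aip-0001")` produces an AIP whose checksums can later be used by `disseminate` to confirm that what the end user receives is still what was deposited. In practice the DIP would often hold derived access copies (such as PDFs) rather than the preserved originals; this sketch skips that step.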