One of the greatest challenges for any archive is the multiplicity of file formats. For the United States National Archives and Records Administration (NARA), with several decades of history accessioning and managing electronic records, this is compounded. We received our first transfer of electronic records in 1970!

How do you plan a preservation strategy to account for decades of electronic files? I started by drawing a picture. Not a literal picture, of course, but I wanted to find a way to analyze and visualize what NARA has in its holdings.

NARA has recently completed a file format profile of its electronic records files. Why did we do this? Because we could not plan without first getting a better idea of what we really have. NARA operates under several different regulatory mandates, each with different restrictions on collection schedules and scope, as well as access controls. This led to the implementation of multiple systems--developed over more than 20 years with different technologies--which meant a real challenge in understanding the scope of the holdings.

I worked with the system owners and our IT operations to get the most granular reporting possible on each set: federal, legislative, and individual presidential administrations. The reporting didn’t always match in terms of granularity, given different tooling for the format analysis and report generation, but in the end I was able to compile a record of what we have, what formats we have, and counts. Could we identify every file format with complete certainty? No. Were there decisions in the past about format normalization that I had to take into account? Yes. Will it help me plan preservation program and technology priorities? Absolutely.
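To give a feel for the compilation step, here is a minimal sketch of merging per-system format reports into one profile. It is not NARA's actual tooling: the function name, the report layout (a mapping of format label to file count), and all the numbers are hypothetical placeholders. It assumes that format labels may differ in case or spacing between systems and that blank labels mean the format could not be identified.

```python
from collections import Counter


def merge_format_reports(reports):
    """Combine per-system {format label: file count} reports into one profile.

    Labels are normalised (trimmed, lower-cased) so that "PDF" and "pdf "
    are counted together; blank labels are counted as "unidentified".
    """
    profile = Counter()
    for counts in reports.values():
        for label, count in counts.items():
            key = label.strip().lower() or "unidentified"
            profile[key] += count
    return profile


# Hypothetical placeholder figures, for illustration only.
reports = {
    "federal": {"PDF": 120, "TIFF": 80, "": 5},
    "legislative": {"pdf": 30, "XML": 12},
    "presidential": {"TIFF ": 40, "unidentified": 3},
}

profile = merge_format_reports(reports)
for fmt, count in profile.most_common():
    print(fmt, count)
```

Keeping an explicit "unidentified" bucket, rather than dropping those files, keeps the uncertainty visible in the final counts.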

My name is Hannah Smith and I work for Historic Environment Scotland, based in the digital archive team within the Heritage Directorate. For the first ever Digital Preservation Day I thought I would share some of the progress we have been making in terms of digital preservation at HES, as well as some of the more day-to-day work in the digital archive. We have been actively collecting digital archives since 2003, receiving both internally and externally generated material. Historic Environment Scotland currently holds more than 437,000 catalogued digital items, which equates to around 32TB of archived data. Over the last 2 years, the digital archive has been making huge strides in renewing the technical infrastructure that underpins our work and in ensuring the long-term preservation of our digital records. Our goal is to provide the best possible care for our digital archives and we are looking to benchmark our services within the European accreditation framework. In 2015 HES invested in new trusted digital repository software, and work has focused on integrating this preservation system with our own repository. We have made huge advances in the standard of care we provide to our digital archive: 617,338 individual digital files have been audited and processed to ensure they conform to appropriate standards.

A first glimpse at the DPC ‘Save the Bits’ announcement on the compilation of a list of Digitally Endangered Species confused me when it passed my screen. Further scanning the text only increased this feeling as I encountered more ‘species’ related references, but it soon turned out I was misled by my own biologically biased search image.

It was especially the ‘IUCN Red List of Threatened Species’ that was steering me wrong, a list very familiar to me as a former coordinator of several large Biodiversity data programmes. But the DPC-suggested list had nothing to do with plants, animals and microbes soon to disappear from our planet’s surface. It was all about their digital equivalents occupying binary niches and threatened by the lack of proper digital archives, outdated software formats, or insufficient human efforts to safeguard their existence.

(Episode 1) When unsuccessful digital preservation can be convenient

The year 1998 was special. In May, the Lisbon World Exposition opened! In June, the “Sixth DELOS Workshop on Preservation of Digital Information” was held in beautiful Tomar. Finally, in October, I became CIO of the National Library of Portugal.

In retrospect, 1998 marked my definitive commitment to this great world of digital libraries and archives. A seed had been planted in 1996, when I got involved in the new DELOS Working Group on Digital Libraries, and it blossomed in 2000, when I had the privilege of organizing the 4th ECDL conference in Lisbon. DELOS was a community that still brings back special memories (“saudade” as we say in Portuguese - https://en.wikipedia.org/wiki/Saudade).

Today is the first International Digital Preservation Day. The aim of the day is to create greater awareness of digital preservation and the issues associated with preserving and providing access to digital material. There are particular challenges associated with the preservation of digital material, notably the fast pace of software and hardware developments, the increasing complexity of digital resources and the resulting impact on the stability of such media. If digital material is to remain accessible, both in the short-term for business continuity, research, economic and legal requirements and for preserving the historic record in the longer-term, measures have to be taken to ensure that this information is accessible.

The International Digital Preservation Day has been co-ordinated by the Digital Preservation Coalition (http://www.dpconline.org/). The NLW is a long-term member of the DPC, the aim of which is to support its members to make digital information available in the future. It has published a 'Bit List' of the World's Endangered Digital Species (http://dpconline.org/our-work/bit-list) which has been unveiled today as part of this campaign to raise awareness of the need to preserve digital materials.

Traditionally the major challenge in digital preservation has been seen to be technology obsolescence. However, arguably the organisational challenges, particularly funding (and advocacy for funding), have proved to be much more significant over time.

The first International Digital Preservation Day allows us an opportunity to reflect on some of the milestones and significant events so far in the implementation of the University of Sheffield’s Digital Preservation programme. The Library became one of the early adopters of Rosetta, a Digital Preservation solution provided by ExLibris, in 2015. Following installation, Rosetta was given the Sheffield brand name ArchiveUS, and the initial priority was developing ingest routes for our valuable digital material: born-digital and digitised collections from Special Collections and the National Fairground and Circus Archive.

In September 2016 the University's Festival of the Mind event gave the Library the opportunity to highlight the thinking behind Digital Preservation through a collaboration with local artist Paul Carruthers. ‘Memories in the digital age’ is a triptych film that features difficult-to-access footage from the library’s collections. The piece, which was exhibited at Sheffield’s Millennium Gallery, explored some of the ideas underpinning Digital Preservation, such as the generation and use of digital information and its relationship to memory.

Prologue

October 2006. A workshop discussing digital information management. A known and respected IT visionary comes up and delivers a statement about file format obsolescence: “It is really not an issue to worry about. In ten years we will certainly have artificial intelligence which is able to render any bitstream there is”.

The digital society and the digital archivist

“Dear community! My name is Kuldar and I’m a digital archivist in Estonia.”

The particular thing to note about this confession is the country as such – just Google for it and you are guaranteed to get a fair number of hits which describe how in e-Estonia you can set up companies in 18 minutes, declare taxes in 3 minutes, or tell you that 99% of public services are available online. Digging a bit further you will find out how Estonia has implemented nice things like ‘once-only’, ‘digital by default’, and ‘no legacy’ – principles which, when spoken aloud, will lead any reasonable archivist straight to a mental institution along with inflicting a serious heart condition.

Thursday was yet another busy and versatile day here at the section of Digital Preservation of the Danish National Archives. As usual there were the daily audit reports and the results of the quality assurance of the ingested SIPs which once again were spit out (pun intended) by our QA system. The producers of the SIPs were notified and given a new deadline for resubmitting SIPs which we can ingest and digest. Almost all of the rejected SIPs were produced by national authorities, but one or two were actually produced by a colleague. A taste of one's own medicine can be bitter and hard to stomach. The errors were the typical ones: lack of context documentation, missing explanation of code values, broken referential integrity and poor conversions to TIFF.

Having dealt with the ingest problems we turned our focus to the next item in the process, the packaging and storage of the AIPs. We are in the process of storing five AIPs from five similar authorities ranging in size from two to eight TB. At first we could not understand the huge size of these AIPs produced from ordinary digital case and document management systems. It seems that many incoming documents are an order of magnitude larger than the outgoing ones. Apparently, quite a few citizens reply to these authorities by printing out the documents they receive, adding handwritten comments to them, taking pictures of all the pages with their smartphones, and emailing them to the authorities. That is how an outgoing black and white document is transformed into an incoming document in full colour - and full size.