The ADS Investigates: #MythBustingMay

Throughout the month of May, the ADS has been investigating and debunking some of the myths and misconceptions that surround archives, digital preservation and the Archaeology Data Service.

You may have seen us using the Twitter hashtag #MythBustingMay to highlight some of these common misunderstandings, signpost useful resources and provoke the occasional PDF-related public outcry. The project has been well received and, we hope, has provided a useful insight into digital preservation best practice and the services the ADS provides.

As the month draws to a close and we hang up our deerstalkers, we’ve decided to free ourselves of the shackles of 140 characters and compile a blog post to discuss some of the key issues and ideas the project has highlighted.

Just a lot of Doom and Gloom?

Evaluating the current state of digital preservation and acknowledging the data that has already been lost due to poor or non-existent data management strategies is grim work. Furthermore, the sheer volume of digital data produced, even just in the discipline of archaeology, can make digital preservation seem a monumental task.

The leading causes of data loss

We tried not to simply lapse into scaremongering about the necessity of digital preservation, but the realities of data loss where there is no data management plan are clear, and we made use of several case studies throughout May to illustrate them. Particularly dismal is the report that 80% of scientific data from the 1990s may be lost.

Considering these cautionary tales, how can we challenge the misconceptions that surround digital preservation in a way that is productive and encourages the implementation of data management strategies?

The Myths, The Legends

A standout theme of #MythBustingMay, and by far the most common feedback we received, related to the idea that simply transferring data to a digital medium equates to its digital preservation. There appears to be a common misconception that digital preservation is simply digitisation, and that by extension born-digital data does not need to be archived. The case studies above clearly indicate that having data in a digital format or uploading it online is not enough to safeguard it for the future, and we have hopefully gone some way towards highlighting this throughout May.

One of the myths we tried to debunk this month is that metadata is unimportant. The Newham Archive is an unfortunate example of data loss due to a lack of metadata. Although time-consuming to create, metadata is one of the most important factors in effective digital preservation. We also highlighted issues caused by proprietary storage formats and digital obsolescence, which require proactive measures (e.g. file format migrations) to prevent data loss.
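
For readers who like to see things concretely, here is a minimal Python sketch of what recording basic metadata alongside a file might look like. It is an illustration only and not a description of the ADS's own workflow or requirements: the field names, the `.metadata.json` sidecar convention and the list of "at risk" extensions are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical list of proprietary extensions, for illustration only.
# It is not ADS guidance on which formats need migrating.
AT_RISK_EXTENSIONS = {".doc", ".xls", ".mdb", ".dwg"}


def describe_file(path: Path, title: str, creator: str) -> dict:
    """Build a minimal metadata record for a single file."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "title": title,
        "creator": creator,
        "size_bytes": len(data),
        # A checksum lets a later reader confirm the file is unchanged.
        "sha256": hashlib.sha256(data).hexdigest(),
        # Flag files that may need migrating to an open format.
        "needs_migration": path.suffix.lower() in AT_RISK_EXTENSIONS,
    }


def write_sidecar(path: Path, record: dict) -> Path:
    """Save the record next to the file as '<name>.metadata.json'."""
    sidecar = path.with_name(path.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling `describe_file(Path("context_sheet.xls"), "Context sheet", "A. Digger")` and passing the result to `write_sidecar` would leave a small, human-readable record beside the spreadsheet. Real archival metadata is considerably richer than this, but even a handful of fields makes a file far easier to interpret years later.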

Somewhat controversial were our cautions against the use of PDFs as a preservation format, and the PDF found some ardent defenders on Twitter. Although PDF is an excellent dissemination format, there are various reasons why it (and even PDF/A) is not ideal for data storage and preservation; these have been discussed in more detail previously on the ADS blog. One of the most significant problems is how content (e.g. images) is embedded in the PDF format: if that content is not also preserved in its original form, there may be a significant loss of data quality when it is retrieved at a later date. It is considered best practice to instead preserve data in its original format (e.g. as a text file and separate original image files) to ensure its safeguarding in the long term.

Mysteries of the ADS

Another aim of #MythBustingMay was to provide some clarity on the services the Archaeology Data Service provides. We have extensive guidelines for depositors, and our FAQs provide support on navigating the ADS archives and using the ADS library. We highlighted the fact, for example, that the ADS does not specify site recording techniques. Depositors are free to collect their archaeological data however they choose, as long as the data and metadata they eventually submit comply with our requirements. Previous blog posts also shed some light on questions people may have about the ADS, for example why reports take so long to appear in OASIS.

Additionally, #MythBustingMay was an excellent opportunity to dispel some of the misconceptions that may discourage researchers from archiving their data. Our recently re-released ADS-Easy submission system is designed to reduce costs for depositors, meaning that archiving may not necessarily be as prohibitively expensive as people expect.

For more information on the ADS’s services, see our FAQs pages and the detailed FAQs for OASIS, which provide information on submitting OASIS forms. There are also several external resources aimed at community archaeologists and non-specialists, such as Jigsaw Cambridgeshire’s newly published guide to completing OASIS forms.

Thank you to everyone who engaged with and contributed to #MythBustingMay this month. Looking outward and considering issues that concern the wider digital preservation community is always an enlightening endeavour and we hope you’ve enjoyed the project as much as we have, and perhaps even learned something in the process!