The 'Capstone' email appraisal approach

Alexander Roberts is Digital Humanities Manager/Research Data Manager at Swansea University and attended iPRES2018 with support from the DPC's Leadership Programme, which is generously funded by DPC Supporters.

Welcome to my second blog post relating to themes and projects which sparked my imagination as a result of attending iPres2018, the international digital preservation conference, last September in Boston, USA. As I mentioned in my previous blog post discussing Denmark’s national digital preservation legislation, I am looking backwards in reflecting on takeaways from iPres2018, whilst very much looking forward to what iPres2019 has to offer.

In this two-part post, I will briefly discuss the history that led to the creation of the 'Capstone' email appraisal approach, which I first learned about during a session entitled, 'Archiving Email: Strategies, Tools, Techniques. A tutorial by[:] Christopher John Prom, Patricia Patterson, Wendy Gogel, William Kilbride, Ricardo Ferrante, Glynn Edwards, and Camile Tyndall Watson' on day one of the conference. I will also discuss what the approach means from a practical point of view and what its potential application within a University environment might look like. In part two I will discuss some of the tools available to accomplish this type of preservation and touch on wider questions concerning corporate communications and recent changes in relation to the tools used in complex environments.

Background

The 'Capstone' email appraisal approach was developed by the US National Archives and Records Administration (NARA) to assist US federal agencies in complying with a Presidential Memorandum (Managing Government Records Directive, M-12-18), signed on November 28, 2011, by President Obama, which came into effect in 2016.

'This memorandum marked the beginning of an Executive Branch-wide effort to reform records management policies and practices and to develop a 21st-century framework for the management of Government records. The expected benefits of this effort include:

• improved performance and promotion of openness and accountability by better documenting agency actions and decisions;

• further identification and transfer to the National Archives and Records Administration (NARA) of the permanently valuable historical records through which future generations will understand and learn from our actions and decisions; and [...]'

The rationale behind the creation of the memorandum was to create an environment where '[w]ell-managed records can be used to assess the impact of programs, to improve business processes, and to share knowledge across the Government. Records protect the rights and interests of people, and hold officials accountable for their actions.' [https://www.archives.gov/files/records-mgmt/m-12-18.pdf]

In complying with the memorandum, all US government agencies were required to work towards two central goals:

Email records are specifically mentioned in the memorandum and within section 1.2 it states:

'[...] Email records must be retained in an appropriate electronic system that supports records management and litigation requirements (which may include preservation-in-place models), including the capability to identify, retrieve, and retain the records for as long as they are needed. [...]'

Such a statement must have been truly frightening to records managers reading this guidance for the first time. Just trying to imagine the scope of this activity, and the many millions of emails that might be included, is enough to strike terror into any rational person! To try to alleviate some of the hard work that this would naturally involve, there is further guidance in part II, section A3:

'Investigate and stimulate applied research in automated technologies to reduce the burden of records management responsibilities

A3.1 NARA, the Federal Chief Information Officers Council and the Federal Records Council will work with private industry and other stakeholders to produce economically viable automated records management solutions. By December 31, 2013, NARA will produce a comprehensive plan in collaboration with its stakeholders to describe suitable approaches for the automated management of email, social media, and other types of digital record content, including advanced search techniques. The plan will detail expected outcomes and outline potential associated risks.'

Phew! So, according to the memorandum it was expected that some of the 'heavy lifting' would be done by automated tools [this didn't quite materialize, but more about this later...]. It was the prospect of using such tools that interested me, in particular, as a digital preservationist with an IT background working at Swansea University in South Wales, UK.

In a white paper on the approach, NARA states that the benefits of Capstone are:

• Increasing the amount of email of permanent value transferred to NARA,

• Reducing the burden on individual end-users within agencies,

• Reducing reliance on print-and-file practices, and

• Allowing for [the] systematic destruction of temporary email based on an approved NARA disposition authority, reducing the amount of email that has no further value being stored by agencies.

Interestingly, early on, the white paper refers to the challenge that archiving email presents to all agencies [four years on from the original memorandum in 2011].

'Management of email [...] has remained a challenge to most, if not all, federal agencies. In what is often referred to as “traditional records management,” the email management burden is typically placed on each end-user to make a record or nonrecord decision and to determine retention and final disposition [...]. The end-user must also manage nonrecord email, including those of a personal nature.'

Thus far we have learnt about the reasons why we might choose to archive certain email accounts and the challenges involved in archiving such accounts at scale. The Capstone approach is designed to make the process of archiving email practical, with the key points in the process being decision-based. NARA Bulletin 2013-02 describes the approach as:

'Simple' and 'automated' may sound both good and bad depending on one's perspective... The idea behind the approach is that one can categorize and schedule email based on the work and/or position of the email account owner. In practical terms, this means that only email accounts from officials/directors/senior managers, etc. at or near the top of an agency or organizational unit should be considered for preservation.

There are exceptions to this rule, in that an organization may also wish to identify an employee as 'Capstone' if they are in a position likely to create or receive permanent email records of note. Quite what this means in practice I am not too sure, but I guess, as an example, if you have an employee who regularly provides information connected with open government, such as responses to Freedom of Information (FOI) requests, then this may also be an email account that gets preserved (in addition to the information provided under FOI).
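To make the role-based idea and its exceptions a little more concrete, here is a minimal Python sketch of how accounts might be sorted into 'Capstone' (permanent) and temporary categories. Everything in it is my own hypothetical illustration: the role titles, the `Account` class, the `designated` flag and the 'permanent'/'temporary' labels are assumptions, not anything prescribed by NARA, and a real schedule would be driven by an approved disposition authority.

```python
from dataclasses import dataclass

# Hypothetical list of senior roles; a real list would be agreed locally.
CAPSTONE_ROLES = {
    "vice-chancellor",
    "pro-vice-chancellor",
    "chief information officer",
}


@dataclass
class Account:
    owner: str
    role: str
    # Hypothetical flag for staff outside the senior roles whose post is
    # nonetheless judged likely to create or receive permanent records.
    designated: bool = False


def classify(account: Account) -> str:
    """Return 'permanent' for Capstone accounts, otherwise 'temporary'."""
    role = account.role.strip().lower()
    if role in CAPSTONE_ROLES or account.designated:
        return "permanent"
    return "temporary"


accounts = [
    Account("A. Example", "Vice-Chancellor"),
    Account("B. Example", "FOI Officer", designated=True),
    Account("C. Example", "Lecturer"),
]
for account in accounts:
    print(account.owner, "->", classify(account))
```

The point of the sketch is that the decision is made once, per role or per designated post, rather than message by message by each end-user.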

One of the obvious challenges with adopting the approach in the first place is deciding who counts as a 'Capstone' user: whose correspondence is 'worthy' of preservation, and whose is not? Such decisions can be quite political, giving rise to strong emotions and indicating implied positions of influence and power which may not be officially recognised. In a University considering email preservation, it might be easy to identify the Vice-Chancellor as a Capstone account. It might equally be agreed that the accounts of senior directors or Pro-Vice-Chancellors should also be 'Capstoned'. Should the email accounts of members of the University Senate, or even the Chancellor themselves (often an honorary role), also be subject to the same archival decisions?

The other challenge is how to mitigate the risk of collecting personal email, along with the distinct possibility of collecting and having to store non-record email. In the first instance, European data protection laws, including GDPR, are somewhat stronger than those in the US, and the right of an individual to be 'forgotten' presents an even greater challenge to the digital preservationist. Individuals can typically make an informed choice when completing a survey, or purchasing items online, as to what happens with their personal information, indicating their preference by ticking a box. No such mechanism exists (yet?) to indicate the preferences of a correspondent on one half of an email exchange.

Leaving it up to end users to decide what gets included in their email archive before it gets deleted can also present a skewed, partial and sanitised record of their activities. They will undoubtedly feel the pressure to ensure any scrutiny of their activities is viewed only in the most positive light for posterity. An example of this is the now famous and controversial situation regarding former Secretary of State Hillary Clinton and her use of a private server for email purposes during her tenure as Secretary of State. It was reported that her staff deleted about 30,000 emails that were not work-related based on their own understanding of the guidance from NARA. An automated approach would potentially have reduced the risk of such unauthorized email destruction - although without necessarily reducing the number of deletions!

Of course, how email accounts are identified and prepared for archiving will depend on the technology available and policy requirements within any organisation and the approach can only inform such local decisions.

Perhaps the 'best' approach (from a digital preservation perspective), and one where the ideal tools exist to support the work, would be to identify key organisational functions (individuals like the Chief Information Officer, for example) and, as a matter of course, automatically archive their email in real time. Brushing aside the issue of who actually owns the content of an email (yes, this issue is not resolved!), such automated archiving would require sophisticated software to identify transitory, personal and non-record email without user intervention, leaving only a 'pared down' email record (is 'carcass' a better word?) for posterity.

So, what tools should I use?

As one would expect, NARA and the rest of the digital preservation community have conducted a huge amount of research and consultation in this area to date. NARA has developed a toolkit and guidance for those engaged in preserving emails at http://www.archives.gov/records-mgmt/email-mgmt.html and http://www.archives.gov/records-mgmt/toolkit/#list. Such mechanisms, and their application to email archives in the context of other competing tools enabling more sophisticated group messaging and search, are the subject of part two of this post.

Comments

We should remember that GDPR also gives a strong defence for archiving in the public interest, which qualifies the right of erasure. See http://www.nationalarchives.gov.uk/documents/information-management/guide-to-archiving-personal-data.pdf.

Though this doesn't mean that dealing with personal information is easy!