Meeting Agendas and Minutes

The Task Force will meet at 1 PM Eastern (US) on the first and third Thursdays of February, March, April, and May. Agendas (and all other TF discussions) should be posted to the IG email list; the subject line should be prefixed with '[dpub-arch]'.

Goals

The Portable Web Publications for the Open Web Platform (PWP) document includes a limited description of how the PWP needs to support archiving. This task force will review that text, reach out to the library and archive community for more detailed requirements and use cases, and offer feedback to the Digital Publishing Interest Group.

Information of interest includes whether archival institutions have run into technical issues (e.g., missing information) when archiving large numbers of EPUB documents. What information, either in metadata or in the document format, should be provided to make archiving easier? What information is missing, and what is necessary for a PWP to make the job of archivists easier? What prototypical archiving service use cases drive these requirements?

Some of the answers will point to areas entirely outside the scope of the PWP; capturing that information will be useful as well.

Scope

The TF will focus on formal archiving service and content preservation workflow use cases and requirements that potentially impact and/or inform practices for publishing on the Open Web Platform. This encompasses archiving done at the time of original publication as well as archiving done later in a publication's life cycle (e.g., by harvesting content available on the Web), including:

Digital Preservation Handbook, UK Digital Preservation Coalition. From their Introduction: "This Handbook aims to identify good practice in creating, managing and preserving digital materials and also to provide a range of practical tools to help with that process. ... the Handbook aims to provide guidance for institutions and individuals and a range of tools to help them identify and take appropriate actions."

Glossary

Archival Quality: "adj. ~ 1. Records media · Resistant to deterioration or loss of quality, allowing for a long life expectancy when kept in controlled conditions. - 2. Records storage conditions · Not causing harm or reduced life expectancy. Notes: ANSI/AIIM deprecates the use of 'archival' because it is a highly subjective term. Rather, they suggest using measures of 'life expectancy', which are based on empirical tests. While no materials meet the ideal definition of 'archival', many archivists use the term informally to refer to media that can preserve information, when properly stored, for more than a century." [1]

Archivist: "n. ~ 1. An individual responsible for appraising, acquiring, arranging, describing, preserving, and providing access to records of enduring value, according to the principles of provenance, original order, and collective control to protect the materials' authenticity and context. - 2. An individual with responsibility for management and oversight of an archival repository or of records of enduring value." [2]

Digital preservation: "Digital preservation is the active management of digital content over time to ensure ongoing access." [3]

Immutability: "The quality of being unchanging. Notes: The content of a document is fixed in that it is stable and resists change, but it may not be immutable. Words may be erased or added. 'Immutability' connotes a significantly greater resistance to change, such that any change is clearly evident. In information technology, immutability is accomplished by creating a process to demonstrate that the record has not been altered." [4]
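One common way to "demonstrate that the record has not been altered" is a fixity check: record a cryptographic digest when the item is ingested and recompute it later. The following is a minimal sketch only, not a method prescribed by the quoted glossary. The function names, the sample record, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a record's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded at ingest."""
    return fixity_digest(data) == recorded_digest


# Hypothetical record content, used only for illustration.
record = b"Chapter 1. Sample publication content."

# Digest stored alongside the record when it enters the archive.
digest_at_ingest = fixity_digest(record)

# A later audit: the untouched record passes, a modified copy does not.
print(is_unaltered(record, digest_at_ingest))
print(is_unaltered(record + b" (edited)", digest_at_ingest))
```

A matching digest shows only that the bytes are the same as at ingest. It says nothing about whether the content is still renderable, which is a separate preservation concern.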

Life expectancy:

"n. ~ The length of time that an item is expected to remain intact and useful" [5]

"Life Expectancy (LE) is a term that describes the stability of imaging materials. The standard has always been 'archival.' But when computer folks say archival, they are talking about something that is usable in 2 months. When librarians say archival, they mean forever. Life expectancy is a new term that accommodates both ends of this continuum. The definition of Life Expectancy is the length of time that information is predicted to be [stable]" [6]

Persistent object preservation: "A technique to ensure electronic records remain accessible by making them self-describing in a way that is independent of specific hardware and software." [7]