Friday, October 12, 2012

Hello everyone! Since I
am new here, I had better introduce myself, but briefly. My name is Sarah Kim. I
am a PhD student at the School of Information, the University of Texas at Austin. For several years, I
have been exploring people’s everyday digital record-keeping practices. Personal
digital archiving as a form of long-term digital record-keeping and value-determination practice is the phenomenon that I particularly focus on in my research. I have been asking
people how they live with their digital documents. Today, I would like to share
one of the questions that I have been thinking about based on my (personal) digital
archiving research: Cloud storage and digital archives.

During the interview, I
asked participants what they would
pick first to rescue if there were a fire at their home, besides living things. Many
participants mentioned an external hard drive or a personal computer. Although
their answers may have been influenced by other record-keeping-related interview questions, if someone asked me the same question, I would think of taking my external hard drive, which functions as my own digital
archives (not as mere back-up storage). Digital documents (including pictures) stored
on that device are vital for me to rebuild and continue my life.

This makes me curious about
how my participants’ answers will
change once they actively start using cloud storage to keep and preserve
their personal digital documents and, furthermore, how our digital archiving
practices will change with the new technology.

(Well-known New Yorker Cartoon by Mick Stevens, Published November 21, 2011)

It is highly likely that
more people (and memory institutions) will be interested in using cloud storage
as their digital archives, considering the benefits associated with it and the overall
trend in the IT industry. Cloud storage, which is expected to be maintained and monitored
by IT experts, could be a relatively more secure place, considering the technical vulnerability of the more conventional digital storage media that
many people are using, such as external hard drives. Cloud storage services offer
other useful functions such as sharing documents with others, synchronizing digital
materials between different devices, tagging, and so
forth. Also, cloud computing is still
in the early stages of development in general.

There are, however, many questions
to ask to clear the cloud of cloud computing. For example, concerns about privacy
(we are very familiar with horror stories about data hacking, the selling of personal information, identity theft, and so forth),
the feeling of losing
control over personal documents (Who owns what on the Web?), and the building of
trust between users and service providers (How much can we trust the work ethics
and long-term sustainability of these commercial services?) remain vital issues that we need to think about.

From an archival perspective, I think, memory
institutions’ (especially archives) working experiences with cloud storage
service providers can offer a great insight into how we can inject archival
thinking (e.g., what archives mean, values of documents, and so forth) and practices into the design and development of these services. Thank you for reading.

Sarah Kim (Personal digital archives research blog: http://personaldigitalarchives.blogspot.com/)

Well, this year's Day of Digital Archives has been much more successful for me than last year's. (I broke my elbow in a cycling accident that day and spent much of it loopy because of the pain pills. I did type out a one-handed blog post, but I don't think it ended up being coherent.) This year I want to talk about an artist's collection we've been working on for a bit at UO.

The Tee A. Corinne papers are one of the many hybrid collections we have in Special Collections and University Archives at the University of Oregon. Tee Corinne was a lesbian visual artist, writer, and activist who explored female sexuality in her visual and written works. Upon her death in 2006 she left her entire estate, including the rights to her literary and artistic works, to the University of Oregon Libraries. Owning the rights is nice because, once we've done our initial processing and preservation work on the files, we don't have to worry about any rights issues when providing access to the digital objects.

However, before we can even start worrying about access to the materials we've had to devise a plan for working with the digital records. When UO received Tee's collection in 2006, it included a laptop and a desktop computer as well as removable media containing various works and papers. At that time, the UO did not have well-developed procedures or workflows in place for ingesting or otherwise processing digital objects. The files were pulled off Tee's computers and the various media and moved over to library servers, but nothing else happened to them for a number of years. In the meantime, there was a gap of more than a year between the time my predecessor (the first e-records archivist at UO) left and the time I was hired. The Tee Corinne e-records were left on the servers and until now I haven't been able to work with them at all.

When I started my initial assessment of the digital portion of Tee's papers, my first task was to try to gather all the digital objects from the collection into one place on the server. Because of the lack of workflows when the collection was taken in, the digital objects ended up in a number of different places on the server. Although I think I've managed to round up most of them now, I still run across stray files that have to be added in with the others. When we started the project this summer, we identified 65,328 digital files we knew came from Tee's computers or from the removable media in her collection. Although I would love to be able to declare that all those files in fact belong in Tee's collection, she shared her computers with Beverly Brown, her lover, whose collection the UO also owns. In addition, Bev Brown was the founder of and was heavily involved with the Jefferson Center, an organization whose records the UO holds as well. Once we started looking at the files from Tee's computers, we realized that her files, Bev's files, and files from the Jefferson Center were all mixed together. The organic file structure the women were using did not clearly distinguish among these three separate groups. Often a single directory will contain files from all three collections. This has slowed down our processing: we're trying to develop some content-based filters so we can do some batch sorting of the files. Most of the textual documents were created in versions of WordPerfect, so we're also working on batch converting those files. In addition, of course, we're having to do a lot of renaming so that the file names of the preservation copies don't have any of the potential trip-ups you see in organically-named files.
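As a rough illustration of the renaming step, here is a minimal Python sketch. It is not the procedure used at UO; the function names `safe_name` and `plan_renames` and the specific rules (lowercase, ASCII only, underscores in place of spaces and punctuation) are assumptions chosen for the example. It builds a rename plan without touching any files, so the mapping from original to preservation name can be reviewed and kept as documentation.

```python
import re
import unicodedata
from pathlib import PurePath


def safe_name(original: str) -> str:
    """Return a conservative file name for a preservation copy (hypothetical rules)."""
    path = PurePath(original)
    stem, suffix = path.stem, path.suffix.lower()
    # Fold accented characters to plain ASCII where possible; drop the rest.
    ascii_stem = (
        unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    )
    # Replace runs of anything other than letters, digits, hyphens,
    # and underscores with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", ascii_stem).strip("_").lower()
    return (cleaned or "unnamed") + suffix


def plan_renames(names):
    """Map each original name to its safe name, refusing silent collisions."""
    plan = {}
    taken = {}
    for name in names:
        new_name = safe_name(name)
        if new_name in taken:
            raise ValueError(
                f"{name!r} and {taken[new_name]!r} both map to {new_name!r}"
            )
        taken[new_name] = name
        plan[name] = new_name
    return plan
```

Raising an error on a collision, rather than appending a counter automatically, is a deliberate choice in this sketch: two organically-named files that collapse to the same name are worth a human look before either is renamed.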

The most interesting challenge in dealing with this collection, however, has been the photographs. Photography was one of the many media in which Tee worked, and she made extensive use of Photoshop. Sometimes she created prints of several digitally-altered versions of a single photograph; we are often able to match physical prints with digital files, but in some cases we have digital photographs for which no physical print exists or vice versa. Tee also tended to revise her photographic series depending on the context in which she was exhibiting or publishing them. This means we sometimes have several different series of a single image or group of images. The series may or may not be consistent; that is, sometimes a series of images was published in one form in one place and in a different form somewhere else. In the digital files, this means that in some cases we have many duplicate copies of a single image (if Tee organized the files based on the various publications) as well as multiple different versions of an image. We would prefer not to transfer multiple copies of a single image onto our preservation servers, but we do want to preserve the different versions of the images because we feel these are an important artistic statement. Sorting out the files themselves has proved to be an enormous challenge, however. Luckily I have a team of graduate students and volunteers who are working hard on this project (as well as others).
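One way to separate byte-for-byte duplicate copies from genuinely different versions is to group files by checksum. The sketch below is only an illustration under that assumption, not a description of the team's actual method; `sha256_of` and `group_exact_duplicates` are hypothetical helper names. Files with identical digests are exact copies, while an edited version of an image will hash differently and so stays out of the duplicate group.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 digest, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_exact_duplicates(root):
    """Return {digest: [paths]} for digests shared by more than one file."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A checksum match only identifies identical bytes. Deciding which visually similar but non-identical files count as distinct versions still needs a person's judgment.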

What have I learned from my work with this collection so far? Obviously, documentation is a hugely important factor when you're talking about a born-digital collection. One of my main problems right now is the lack of documentation from previous work that occurred with this collection (however cursory that work might have been). I'm trying to document every step I take with these records so that my successors have a clear picture of what has and hasn't been done with the materials. It's also important for the digital archivist to be involved in the donation process if at all possible; this helps lessen the amount of triage work you have to do when the born-digital records arrive on your doorstep.

Today was my 29th day of work as the first
Records Management Archivist at Johns Hopkins University. My job encompasses two areas that overlap
frequently but not perfectly: management of university records and management
of born-digital archival materials, regardless of whether they originate within
the university or with external donors. This combination of roles is relatively
common in our profession, but I haven’t personally experienced it long enough
to evaluate it critically; perhaps that will be a topic for next year’s Day of
Digital Archives post.

My first 6 weeks have coincided with the processes of annual
reviews and setting individual goals in our library. Although I was initially
wary of having to set annual goals so early in my tenure, the timing has been
fortuitous because I have been planning my activities for the next 12 months –
which I would be doing at this point in a new job anyhow – at a time when my
colleagues are all thinking similarly. And if there’s one thing that I can say
with certainty about the next year, it’s that it will involve a lot of
collaboration: with other archivists, with curators, with developers, with metadata
specialists and with project managers, just to name a few.

For the next few months, I will be assessing the current
state of our institutional climate, our capacities and our collections as they
relate to acquiring, preserving and providing access to born-digital archival materials.
Next, I will be working with my
colleagues to determine what capacities we want
to develop as an organization. Do we want to do forensic captures of
media-based accessions? What kind of preservation activities do we want to undertake?
What types of functionalities do we want to build into our digital repository? How
do we want to provide access to our materials? Although there will be many
details left unanswered at this stage, I hope to be able to address these and
similar questions at a very high level within the next six months.

Finally, I will spend the rest of the year developing a
three-year road map for how we can get from where we are to where we want to be –
or at least, from where we are to moving purposefully and surely toward where
we want to be. This will involve identifying gaps in our current technological
and human capital, and proposing ways to bridge them.

Of course, the day-to-day activities of the archives will
not stop for a year while I figure all this out. Prior to my arrival, no one in
our department was charged with focusing to this degree on all the issues
surrounding born-digital materials. However, like many institutions, we had
still been acquiring them for some time. So while I am doing high-level analysis
and planning, I will also be carrying out the day-to-day activities of
accessioning and caring for our materials as best I can with the resources
currently available.

I have already made a few changes that bring our activities more
in line with, for example, the minimal levels of digital preservation outlined in a recent proposal from NDSA. Specifically, I have
instituted the use of LOC’s Bagger tool to generate file manifests and fixity
information according to the BagIt specification at the time of acquisition,
and I am working with library systems to transfer our current holdings to a storage
space where they can be more appropriately managed.
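To make the idea of a file manifest with fixity information concrete, here is a simplified Python sketch. It does not use Bagger and does not produce a complete or valid bag (a real bag has further required files and places its payload under a data directory); the function names are hypothetical. It records a SHA-256 checksum per file in a checksum-then-path line format similar to a BagIt payload manifest, and later reports any file whose checksum no longer matches.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 digest, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(payload_dir):
    """Map each file's path (relative, forward slashes) to its digest."""
    payload = Path(payload_dir)
    return {
        path.relative_to(payload).as_posix(): file_sha256(path)
        for path in sorted(payload.rglob("*"))
        if path.is_file()
    }


def manifest_lines(manifest):
    """Render the manifest as 'checksum  path' lines, sorted by path."""
    return [f"{digest}  {rel}" for rel, digest in sorted(manifest.items())]


def changed_files(payload_dir, manifest):
    """List paths that were added, removed, or altered since the manifest."""
    current = build_manifest(payload_dir)
    return sorted(
        rel
        for rel in set(manifest) | set(current)
        if manifest.get(rel) != current.get(rel)
    )
```

Capturing the manifest at the time of acquisition is what makes the later comparison meaningful: an empty result from `changed_files` means every file still matches the checksum recorded on arrival.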

However, I don’t anticipate
many other changes in our procedures in the next year. This means that for the
next 12 months, we will undoubtedly continue to do some things in ways that I know
could be improved. However, when I do begin
to make radical changes in our procedures, they will be guided both by best
practices and by our own organizational needs and goals.

I’m the Digital Collections Archivist at Kennesaw State University in Kennesaw, Georgia. Kennesaw State is the third largest public university in the University System of Georgia, with a current enrollment of 23,103 for the Spring 2012 semester. Founded in 2004, the Archives consists of one full-time Archivist (me), a part-time Archivist, who also works half-time in the Bentley Rare Book Gallery, and an Associate Director. From 2004 to 2008, when I was hired, the Archives was staffed solely by the Associate Director. For the 2011 Day of Digital Archives, I created a photo essay to illustrate the different roles and responsibilities in my position. It was appropriate for the time, because our department was growing and expanding. We merged with the art and history museums on campus to form a super-department: Museums, Archives & Rare Books. This year, though, feels like one of retrenchment. We lost a long-time member of staff at the beginning of the year. The redistribution of her workload among the remaining staff brought fresh eyes and energy to some long-standing issues. We were able to use it as an opportunity to make significant progress on projects that had been stalled. In light of the difficult economy and constant budget cuts, I think similar organizations will find our actions of interest.

Writing it down

At the beginning of the year, we hired a Records Manager, the first in the history of the university. She’s been working to understand and document the workflow of records creation and disposition across departments in the university. We don’t have an enterprise document management solution, so it’s been quite an undertaking. As part of her duties, the Records Manager has also inventoried records at our off-site storage vendor, identifying and transferring materials with historical significance to the Archives. Although the mission of the Archives is to collect and maintain university records that document its activities and history, we found that we had no recognized authority to transfer records without the consent of the department or division head. Trying to find someone who was willing to accept responsibility or to grant permission was an exercise in futility. After trying to track down one division head over the summer, it was decided that we needed to seek the authority to transfer records deemed to fall within our collecting policies. The problem was we had few written policies.

The department formed a policy committee with representatives from the museums, Rare Book Gallery, and the Archives to develop a unified collection management policy. We found that we were able to use the same language and concepts, adding specific examples or language for situations unique to each unit. After several iterations, the committee was able to create a collections management policy over the summer. It’s currently awaiting final approval by the Chief Information Officer before being implemented. Once this is in place, we will be ready to submit the transfer authorization proposal to the President’s Committee for approval. The completion of the collection management policy spurred interest and development in additional policies and procedures, including reproduction, access and use, and registration, as well as related forms. We’re currently working on creating copyright policies. My particular focus is developing guidelines to help users understand copyright restrictions and make responsible reproduction decisions.

Clearing it out

Looking at ongoing problems with storage space, both physical and digital, we created an ad hoc committee to review materials and make decisions regarding disposition. One of the first problems that we identified was a large amount of supplies and resources that had been amassed “just in case.” These included outdated or broken equipment, unnecessary or unusable supplies, and donations that did not meet our collecting areas or interests. The process of clearing out the space allowed us to reorganize supplies, to order new equipment, and to relish the sense of accomplishment. We used this momentum to tackle the shared drives and digital repository, both of which had become dumping grounds. The same committee developed policies to govern the shared drive, as well as file naming conventions. Using these new documents, we began a clean-up of the shared drive, which amounted to removing approximately 60 MB of duplicate or unnecessary files. The process turned out to be so easy that the committee offered the services to one of the museums. We were also able to incorporate elements into an outreach session on email best practices and plan to offer the service to other departments on campus.

Building it up

The university is coming up on its 50th anniversary in 2013 and the Archives was approached to digitize historic images and to make them available for users. Currently, we rely on Archon to provide public access to our records and small files, such as oral history transcripts and low-resolution images. It was decided that Archon is inadequate for providing access to the high-resolution images required for the anniversary. I was tasked with comparing systems and making a recommendation. After much research, I determined that DSpace would best meet our requirements. We’re currently working with campus IT to implement a DSpace instance. As part of the DSpace project, I mapped the workflows of the Archives and identified current and future technology needs. This plan can now be used to ensure that we make strategic decisions based on demonstrated needs.

In addition to implementing new systems, we’re also focused on improving our current products and services. Archon was originally populated by importing data from our old CMS. It contains many records with minimal information. As part of the general commitment to bring consistency to our records, I’ve initiated a project to enhance the catalog on a record-by-record basis. This also allows me to check new accessions and add them to existing collections when appropriate, as well as to verify location and beef up the MARC record in the library’s OPAC. The project has already revealed some MARC mistakes and location errors.

By mapping the Archives’ core functions and relating them to technology needs, we are able to offer products and services of higher quality and with greater efficiency. We were also able to use our clean-up as a template for new services. While retrenchment may not seem as exciting as rapid expansion, it can still be an opportunity for growth and improvement. Please feel free to contact me if you'd like to ask any questions or follow up. You can reach me at agraha31 (at) kennesaw (dot) edu.

In my forty years (1964-2004) teaching art history at Carleton College I took thousands of slides of architecture for use in my classes. At the time I never thought they would have a life beyond their physical existence as filmstrips in a plastic mount. Then the Society of Architectural Historians established SAHARA, a digital image archive. I contributed 4869 images. Since then, it has been gratifying when other scholars have mentioned seeing or using some of those images. One episode stands out.

SAHARA forwarded to me a request from an American (I believe) scholar in Beijing asking for permission to reproduce in a Chinese-language journal a slide I had taken of a detail of the Allen Art Museum in Oberlin, Ohio. Apparently, I had taken the slide at just the right angle to support her argument (for architectural historians, it is Robert Venturi's "ironic ionic" column). It turned out that the SAHARA image was not the right size, but the Carleton slide curator, Heidi Eyestone, was able to adjust it from the original slide.

Perhaps this sort of thing is an everyday occurrence to most scholars now. But to this professor, still living in an analog world, it was just amazing that an image taken in Ohio by a professor from Minnesota could, 30 years later, come to the attention of a scholar in China through accessing a digital archive, and that image, corrected in Minnesota, could be transmitted to Beijing and eventually end up in a Chinese-language journal published in China. LS

Last year I chose to use my DoDA post to broadcast my former
institution’s ideas about communicating what digital archives are to the
public. This year, I have a new job at Penn State as Digital Records Archivist.
This job didn’t exist before, and we’re still trying to figure out what it’s
going to be. But we have a lot of irons in the fire here, so I’ll stick to the
theme of talking about what my ‘day of digital archives’ looks like.

I think it surprises some people to hear that I find my average
day to be distinctly non-technical, but it shouldn’t. And I’m okay with that. I’ve
found that what interests me most about the work is the challenge of figuring
out how it all fits together, how it blends with the other work of archives, and
how the overall work of the institution can be reimagined and modernized.

A lot of what I am doing at my current institution is
capacity building: forming policy, trying to locate and implement best
practices (which can and should be institution-specific), researching and
reading work being done by others, experimenting with free tools, and just
dialoguing with other staff about challenges and issues. A significant and
persistent challenge is trying to align a developing electronic/born-digital
records practice with our institution’s established practices, which includes
trying to both mold digital practice to legacy practice and recommending ways
in which legacy practice might be modified to suit new digital realities.

For example, this morning I have spent a little time working
on my long-fermenting ingest workflow for electronic records. This workflow
needs to address not only the particulars of what we might call ‘digital
processing’—how we transfer material from some kind of external digital media
to network ‘dark archive’ storage, how we store that transfer (discrete files
or disk images or both), what metadata/manifest information we attempt to
extract, and how we document this activity—but also how the media flows to the
digital archivist in the first place, and what becomes of it afterward. We have
to set up policies and practices that govern the separation of media from
collections, how the media and the work we do with it is documented in
Archivists’ Toolkit (as well as how it is documented there after ingest), and ultimately how it is incorporated into
arrangement and description activities. Perhaps the biggest challenge I face
right now is figuring out just how to provide access to the material should a
researcher stumble across some record of it in the library catalog or finding
aid platform. Actually, this thing is pretty much done, but I keep tinkering.

A lot of what I am doing is just traditional archival work
cast in a different light. The next task on my list this morning is, well,
appraisal. Penn State recently contracted the services of Archive-It, and we’ll
soon be using their crawling services to strategically capture university websites.
Despite being published and disseminated through web technologies and
platforms, Penn State University websites are subject to the same
considerations as other university records. They exist within record groups and
are potentially subject to retention schedules.

In preparation for this project, we secured a list of
sub-domains on PSU.edu from central IT, and have been visiting the websites on
this list to determine an originating department (provenance), look for sites
from departments that fit the collecting priorities of the university
archivist, try to determine how frequently the site updates (an ongoing
process), and record some descriptive information about the sites in advance.
Our initial collecting priorities will focus on sites related to the
administrative units, colleges, and commonwealth campuses, but future phases of
collecting will seek broader documentation of university work, culture, and
life. As Mike Shallcross stated in his excellent case study on archiving the
University of Michigan’s websites:

While reviewing
Michigan’s online resources, archivists were keenly aware of the extent to
which websites help confer credentials (from the recruitment of students
through their graduation), convey knowledge, foster socialization, conduct
research, sustain the institution, provide public services, and promote a
distinctive culture.

Increasingly, the kind of content Mike refers to above is
being delivered through multimedia (primarily video), social websites, and
cloud-based services (universities are increasingly using YouTube, Flickr, etc.
to host content). Future appraisal and planning won’t necessarily be more
complex; we’ll just have a larger landscape of material to examine. It won’t
necessarily require special training or technical skills; it just requires an
awareness of institutional uses of technology and methods for delivering
content that should be collected by the repository.

Finally, I’ll take a little time later today to start
preparing for a talk I’ll be giving at this year’s Digital Library Federation
Forum in Denver. My position as Digital Records Archivist was written into a
Mellon personal scholarly archiving grant that was awarded before I started at
Penn State this past May. The project is
“an ethnographic study of faculty behaviors and articulated needs central to
robust scholarly creation and successful navigation of the personal archiving
and information management process.” From an archival point of view, the study
should provide some insight into the personal digital habits of faculty and the
technologies used to support and share their research. We’ll be collecting data
through surveys, interviews, and on-site observation, and hopefully, for my
part, I’ll be able to identify patterns that can help inform acquisition
and management approaches to born-digital material (see the post earlier today, "What's in a File Name?"). This has been an interesting
collaboration between various information professionals—I will be presenting
with the lead investigator, an Educational and Behavioral Services Librarian, as
well as an ethnographic researcher—and I think it speaks to the ways in which
archival work in the digital age is inevitably going to be cross-disciplinary.