Reflections on “Archiving Social Media”

Last Friday I attended the “Archiving Social Media” meeting, organized by the University of Mary Washington (UMW) and the Center for History and New Media (CHNM) at George Mason University. I won’t attempt to summarize the content of the discussions–you can read the notes from the breakout sessions for yourself, linked in the comments section on the page for each theme. You can also read over the tweets–#asome. A white paper summarizing the results of the meeting is also in the works. (Note that the meeting was kept deliberately small to encourage conversations and that until the last day or two was an “invitation only” affair.)

Rather, I’d like to share some observations and questions for my audience of archivist readers. First, I was quite disconcerted to find that I was one of a very few archivists present. The attendees were almost all historians or other academics working in the digital humanities, although there was also a large contingent from the Library of Congress. I don’t mean this as a criticism of the organizers, who I’m sure did their best to invite more archivists, but rather as something to keep in mind when reading the notes. My first recommendation for following up on this meeting would be to plan a similar one with enough lead time and funding so that the archivists who are most actively engaged in research on preservation of social media can attend. (More about that later.)

I think the general trend of the conversation would come as no surprise to most archivists who’ve been engaged in discussions about the preservation of electronic records over the years. Many of the same issues surfaced–the argument to “save everything” because “we can and storage is cheap,” as well as the recognition of the need for preservation considerations to be addressed early in the life cycle. I was in the group discussing “institutions,” specifically exploring who is responsible for preserving social media products. Many in the group were optimistic about the role of individual scholars in preserving the social media records that are relevant to their research, which would then be donated to an appropriate institution for permanent preservation. I’m somewhat skeptical about the viability of this option, but I’d be happy to be proved wrong. I still think the majority of preservation work in this area will come from the traditional custodians of cultural memory–archives, libraries and museums, who will collect and preserve the materials that fall within their missions. This will leave the same kind of gaps it always has. There will be materials that are only preserved outside our formal collecting institutions or not preserved at all, just as there have been in the past. (The model of LOC taking all of Twitter notwithstanding.) Are my archivist roots showing? Not everything is going to survive. And for those materials that do survive we will rarely be able to place those materials completely within their original context. Maybe it’s tempting to think that because of improvements in technology we should be able to do a better job of preserving the records created by every person, organization or government, but call me cynical, I don’t think the social structures that support preservation have changed that much.

What I took away from the meeting was the need for archivists to communicate with interested scholars, like those at the meeting, about our body of professional knowledge, which I believe gives the perspective necessary to approach this topic. To me, social media products look like just another category of electronic records. We need to define what characteristics of each different type of record need to be preserved for the record to be considered authentic and reliable. Then we need to produce, publicize and update best practices for how to preserve those records. We need forums for sharing information on tools and techniques. And we need a sense of urgency about the need to begin collecting social media records right away. There were many voices at this meeting pointing out the fragility of these kinds of materials. But how many archives are collecting them?

Preservation of social media is not my specialization. So, my question for you, readers, is how much of this is already being done? Who is working in this area? What reports and standards are already available? And, if not much is yet being done, what’s the proper forum for initiating such work? Where should this discussion take place in our profession?

10 Responses to Reflections on “Archiving Social Media”

I’m afraid I can’t enlighten you on who is doing this – I don’t really know of anyone in the UK working in this area. But I certainly agree with you about ‘improvements in technology’ not being the key, but the funding, the political will and the collaborative effort.

Context appears to me to be a huge issue: conversations that I have on Twitter can quickly become disjoined – I lose the context myself, never mind the idea of someone coming to it in years to come.

Thanks for the really useful summary. I agree with most, if not all of what you have to say about the role of cultural institutions in preserving social media. I am also somewhat skeptical about the idea that individuals will manage this stuff appropriately, then donate to a repository at the end–although I am working with a donor now who actually did take good care of his stuff. But, for each time there is a case like that, there are probably five on the other side.

Part of the issue here is that, the way technologies are currently implemented, it is very easy to lose control of your own records. The TOS for social media, and also the ways that systems for electronic communication are configured, make it very easy to do. For example, I discovered a while back that if you try to connect Outlook to your Gmail account, it will autoconfigure as a POP connection, download all your email to your local PST file, and delete it from Google’s server. While we can speculate as to the reasons Microsoft set it up that way (and it is not too hard to come up with explanations that emphasize nefarious purposes), the fact of the matter is that most people are not really paying attention to what software and hardware are doing with their stuff.

Two questions–were you involved in any of the discussions about lifestreaming, and what was the reaction to Cliff Lynch’s suggestion about IRBs making research impossible? To me, that sounded like the type of hypothetical sound bite that is easy to think up over coffee, but what real evidence would there be to back it up? After all, no one has ever objected to people doing ‘human subjects research’ using newspapers or other published materials, which are somewhat analogous to a blog or Twitter feed.

As a corporate archivist, I don’t know too many companies actively preserving their social media. However, I do know of one example that is really interesting.

At one of the Business Archives Section meetings during SAA, there was a fascinating presentation from Ted Ryan (Coca-Cola Archivist) about their efforts in capturing social media. They are working with a company out of the UK called Hanzo Archives that is preserving some of Coca-Cola’s websites and Facebook pages. Ted did a demonstration during the presentation and the sites are fully functional. All the Flash, YouTube videos, links, etc. worked just like they would if you had visited that site 6 months ago. If I hadn’t known that this was the site as it appeared 6 months ago, I would have thought he had just pulled it up on his computer that day. It was amazing and far beyond anything I had ever seen before. I’m probably not doing the best job explaining it, but you can read about the company and watch video demonstrations on their website (http://www.hanzoarchives.com/). This option probably won’t work for most archives because of budget constraints, but it’s super neat to look at.

Thanks for this summary — and especially for nudging toward the need for some more conversations between archivists and the historians etc. in the group. In the “technologies” group, I think that would have been especially helpful.

I’m optimistic about the idea of researchers archiving the material they work with and donating it, but am similarly torn. I certainly don’t think that that will capture everything, but am hoping that it will help. One missing component, I think, is technologies that build in good archiving practices along with research tools on social media. I’m imagining an Archiving plugin to Zotero, and a “Shoebox-under-the-bed” plugin for WordPress, that use the same set of standards to record the kind of data that archivists want and export to a common format. If those plugins also include tools to help research, I’m hopeful that that would help archival data creation without requiring people to think too much like archivists themselves.

And reinforcing your point — creating tools like that would require some close collaboration between archivists and technologists.

Great post, Kate. For all the reasons that Chris so nicely summarizes, I also don’t hold out a lot of promise that individuals will be able to ensure long-term survival of their born-digital social media related research materials without some intensive help from someone who understands what’s actually happening at the technology end of the corporate entities who are responsible for the servers and software on which this social media operates.

Kevin Glick, at Yale, did some interesting (and difficult) work last year with a CT-based not-for-profit org whose records were being brought into Manuscripts and Archives. I think that work provides some insight into what archivists need to be doing. That org, which was folding as the work to bring their records into MSSA proceeded, did a lot of their outreach via a Facebook page, and also had outsourced all of their technology support (including storage) to a vendor (the upshot: once the org stopped paying the bill, the vendor was planning to get rid of the org’s records, so the time for getting them to a repository was quite limited). The work that Kevin did in getting to the point where the electronic records could be accessioned was pretty intensive and involved a lot of third-party negotiations and hacking around in Facebook. I think we’re at a point in the archival profession with this where we need some case studies (like the one to which Chris alludes and the work that Kevin has done) either in sessions at conferences or in archival publications.

Good to see others thinking about this — hope to see discussions/presentations at the SAA annual meeting and some good, detailed case studies in some of our professional pubs!

I’m not convinced that Twitter should be archived, to tell you the truth. I’ve actually been thinking quite a bit about this lately. At what point is something a record and not an ephemeral conversation? Is it a record just because it’s fixed in a tangible form? What is the value of tweets?

One type of social media that I think we must focus on is the blog. For example, if I were thinking about an archive on Feminism that covered the current era, how could I not include relevant blogs in this archive? They are such an integral part of the movement. Obviously there are countless other examples of this.

The State Library and State Archives of North Carolina include social networking sites as part of their web harvesting program (with Archive-It). It’s a challenging process that still undergoes a lot of refinement. You can see the product at http://webarchives.dcr.gov, as well as associated guidelines and policies here: http://webarchives.ncdcr.gov/aboutwap.html.

Disclaimer

Please note that in all posts on this blog, replies to comments, tweets, FB status updates, and in any other communication, all the views represented are strictly my own and nothing I say should be interpreted in any way as representing the views of SAA or the SAA Council unless I explicitly state that it is.