Social media plays an increasingly important role as we embrace networked platforms and applications in our everyday lives. The interactions of users on these web-based platforms leave valuable traces of human communication and behaviour, which ever more sophisticated computational analytics can reveal. These traces – the data generated by social media users – are a valuable resource for researchers and an important cultural record of life in the 21st century. As the programming and infrastructures of social media, or Web 2.0, mature and grow, researchers and collecting institutions need new techniques for capturing this web-based content. This report provides an overview of strategies for archiving social media for long-term access, covering both policy and implementation. Specifically, it addresses social networking platforms and platforms with significant amounts of user-generated content, excluding blogs, trading, and marketing sites, which are covered in other Technology Watch Reports. Alongside potential archiving strategies, it explores the challenges that non-commercial institutions face in preserving social media. In the absence of established standards and best practice, this report draws on recent initiatives undertaken by research, heritage, and government archives in the UK, Ireland, and Germany. Based on the current conditions surrounding social media data and the lessons demonstrated by a range of case studies, it makes recommendations for the future development of social media preservation for research and heritage collections.

This report is intended for any institution with an interest in preserving social media to support research, public records, or cultural heritage. Good practice in managing and archiving social media data applies to researchers, archivists, and librarians alike. As social media continues to grow as a source of official government and corporate communications, the importance of effective preservation will increase. Similarly, as more people across the globe replace traditional forms of communication and expression with those available through social media platforms, the record of histories and cultures will increasingly rely on the ability of researchers and archivists to document this fast-paced and dynamic form of data.

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.