BackStory

BackStory is a visualisation tool that exposes the historical layers of a text, revealing which parts have been added, removed or changed over time. It uses the revision history of Wikipedia articles. With BackStory, I try to enable a close reading of these edits, revealing not only that data changes but also the nature of these changes. We see articles being updated to reflect current socio-political events; we see disagreement over information sources and the spelling of words (is "data" singular or plural?); and, last but not least, we see how users mess with the data and how the history of data can help to make sense of it.

When we access a database, an online article or a file stored on our computer, what we are presented with is only the latest version of a document. When digital storage was scarce and expensive, old data was constantly overwritten whenever changes were made. Our current ways of accessing data are still built around this model, in which there is only ever one version of a file. In reality, cloud storage and backup solutions have made version control ubiquitous, although the technique is still primarily associated with software developers tracking changes in code. For a cloud service such as Dropbox to know what data to synchronise across devices, it needs to keep track of what changed; it needs to record the history of data. Similarly, in recent versions of OS X, documents no longer need to be saved explicitly, because all changes are stored continuously. Collections management software may or may not record such revision data, but as more institutions follow Cooper Hewitt and the Tate in publishing collections data on GitHub, the history of digital collections data will become more relevant: GitHub is built not only for data exchange but, most importantly, for keeping track of changing data. The question is, how can we access this history? What kinds of insights can we get?

Process

I developed BackStory during the Beautiful Data workshop, which was organised by Harvard metaLab in 2014 and funded by the Getty Foundation. I worked with article revisions from Wikipedia because this data is easily accessible, but the approach could work just as well with any other textual data for which different versions are available.

BackStory is far from the first project to expose Wikipedia revisions. Most notably, Viégas and Wattenberg’s History Flow visualises the complete edit history of individual articles and reveals when and where disputes took place.
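
To make the idea of comparing versions of a text concrete, here is a minimal sketch of how two revisions might be compared word by word. It is not the code behind the tool described here: it assumes Python's standard difflib module as a stand-in for whatever comparison is actually used, and the function name word_changes and the sample sentences are hypothetical.

```python
import difflib


def word_changes(old: str, new: str) -> list[tuple[str, str]]:
    """Label runs of words as kept, removed or added between two versions.

    Illustrative only: a word-level diff built on difflib, not the
    method used by the tool described in this text.
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            changes.append(("kept", " ".join(a[i1:i2])))
        if tag in ("delete", "replace"):
            # Words present only in the older version.
            changes.append(("removed", " ".join(a[i1:i2])))
        if tag in ("insert", "replace"):
            # Words present only in the newer version.
            changes.append(("added", " ".join(b[j1:j2])))
    return changes


if __name__ == "__main__":
    # Hypothetical pair of revisions of the same sentence.
    older = "The data is stored online"
    newer = "The data are stored online today"
    for label, words in word_changes(older, newer):
        print(f"{label:>8}: {words}")
```

Applying the same comparison to each consecutive pair of revisions would give one layer of changes per edit, which is the kind of material a visualisation of a text's history can draw on.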

Result

BackStory is available for public use, and the code can be accessed on GitHub. For World AIDS Day, I developed a visualisation based on BackStory that traces the history of the Wikipedia article on HIV/AIDS. This piece was commissioned by the ICA Philadelphia and funded by Visual AIDS.