Shawn
Walker

My research focuses on two complementary areas: 1) new forms of political participation emerging on social media platforms and 2) the related challenges of collecting, analyzing, and preserving data from those platforms. I examine emerging forms of political participation through the analysis of social media posts surrounding social movements, protests, and elections. My work on social media methods also addresses gaps in our understanding of social media data, collection methods, and the implications (ethics, representation, etc.) of using those methods. I received my PhD in Information Science from the University of Washington Information School. I am a founding member of the Social Media (SoMe) Lab @ UW and a member of the DataLab. I also earned degrees in International Studies and Liberal Studies, with a focus on public policy and technology, from Northern Kentucky University.

Research

Informbot

Increasing transparency of social media research

In this project, we are developing an automated tool, “Inform Bot,” that researchers can use to create a public notice and record of their data collection and research using Twitter data. The goal of the bot is twofold: first, to pilot a mechanism that notifies users whose data are being collected, in order to better understand the challenges of developing and implementing a transparency bot; second, to better understand how users and researchers interact with such a mechanism when given the choice. For example, to what extent do users actually want to learn more about the research studies, or to opt out of such collection? And what is the experience of researchers who use this type of tool to be more transparent, and what impact does it have on their research?
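The notice-and-trail mechanism described above can be sketched in a few lines. This is an illustrative sketch only, not the project's implementation: the notice template, study name, and URL are placeholders, and the audit trail is a plain in-memory list standing in for whatever public log the bot would actually publish.

```python
from datetime import datetime, timezone

# Placeholder notice text; the real bot's wording and opt-out link would differ.
NOTICE_TEMPLATE = (
    "@{handle} A tweet of yours was collected for the research study "
    "'{study}'. Learn more or opt out: {info_url}"
)

def build_notice(handle: str, study: str, info_url: str) -> str:
    """Compose the public notice sent to a user whose data was collected."""
    return NOTICE_TEMPLATE.format(handle=handle, study=study, info_url=info_url)

def log_collection(log: list, tweet_id: str, handle: str) -> dict:
    """Append an entry to a public audit trail of collected tweets."""
    entry = {
        "tweet_id": tweet_id,
        "handle": handle,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }
    log.append(entry)
    return entry
```

Separating the notice (sent to the user) from the log (kept public) mirrors the two goals above: users get a point of contact, and anyone can audit what was collected.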

The Lifespan of Hyperpartisan Content

This project aims to monitor the lifespan of hyperpartisan content circulating on Twitter in the period leading up to national elections and other democratic consultations. We plan to have the project's infrastructure in place prior to the national and regional elections to be held in 2018, including the United States gubernatorial, House of Representatives, and Senate elections, along with the Irish presidential election, the Italian general election, the United Kingdom local elections, the Brazilian general election, and the Russian presidential election. In our past research, we found that user-generated, hyperpartisan news content has a remarkably short shelf life (Bastos & Mercea, 2017), a marker of the perishable nature of digital content at the center of political debates in liberal democracies (Walker, 2015; 2017).

We will use the public Twitter Streaming API to track, in real time, content tweeted by users in association with the six electoral events listed above. After collection, tweets will be parsed for real-time archiving of embedded content, including images and URLs, thereby identifying URLs tweeted in the context of electoral politics and archiving their content. Each URL will be archived daily until it is no longer accessible (URL decay). At the end of each electoral period, we will analyze the types of content that disappeared using topic models (Grün & Hornik, 2011; Zhiqiang et al., 2013) and contrast them with the larger population of URLs tweeted in the period leading up to the vote. In analyzing deleted content, we will also estimate the size of the retweet cascades that disappeared and whether there is a relationship between content type (e.g., hyperpartisan and fake news) and content shelf life.
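The parsing and decay-checking steps above can be sketched as follows. This is a minimal sketch under two assumptions: tweets arrive as dicts shaped like Twitter's v1.1 JSON payloads (`entities` → `urls` → `expanded_url`), and the accessibility probe (e.g., an HTTP HEAD request run on the daily schedule) is passed in by the caller rather than hard-coded.

```python
def extract_urls(tweet: dict) -> list:
    """Pull the expanded URLs embedded in a tweet's entities block."""
    entities = tweet.get("entities", {})
    return [u["expanded_url"] for u in entities.get("urls", []) if "expanded_url" in u]

def check_decay(urls, is_accessible) -> dict:
    """Mark each URL as live or decayed using a caller-supplied probe.

    Injecting the probe keeps the sketch testable; in production it would
    issue a network request and record the date a URL first failed.
    """
    return {url: ("live" if is_accessible(url) else "decayed") for url in urls}
```

Running `check_decay` once per day over the accumulated URL set, and archiving each page while it is still live, yields the per-URL lifespan data the project's shelf-life estimates depend on.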

We seek to establish metrics for the lifespan of fake news and user-generated, hyperpartisan news articles. The project has the following objectives:

Estimate the lifespan of fake news and hyperpartisan news items

Establish reliable indicators for the type of content prone to URL change, decay, or deletion in the context of electoral politics

Identify platforms at the center of user-generated content which are likely to disappear shortly after the ballot

Develop tools for at-scale, real-time collection and monitoring of linked content embedded in social media datasets

The Ephemerality of Social Media Data

How Social Media Data Changes Over Time

Relatively little is known about how social media datasets change when observed at different points in time, or how the choice of collection method may affect the data at the core of our research projects and, in turn, our findings. For example: Will results measuring the prevalence of rumors over time differ if social media data are collected as they are produced in real time, a few minutes after production, or hours, days, or weeks later? What happens over time to the metadata embedded in and documenting this content, such as links to web pages, photos, and videos? If data collection methods do not preserve and archive social media posts, metadata, and linked content, are researchers venturing into a different dataset each time they engage with it?
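One simple way to make this kind of drift measurable is to compare two snapshots of the same query collected at different times. The sketch below (an illustration, not a method from the project) computes the Jaccard similarity of the post IDs in two snapshots, along with which IDs disappeared and which appeared.

```python
def snapshot_overlap(snapshot_a, snapshot_b):
    """Compare two snapshots of post IDs taken at different times.

    Returns (jaccard, missing, new): the Jaccard similarity of the two
    snapshots, the IDs present in the first but gone from the second
    (deletions, suspensions, API changes), and the IDs new to the second.
    """
    a, b = set(snapshot_a), set(snapshot_b)
    union = a | b
    jaccard = len(a & b) / len(union) if union else 1.0
    return jaccard, a - b, b - a
```

A Jaccard score well below 1.0 between a real-time collection and a later re-collection of the "same" data would be direct evidence that the two are, in effect, different datasets.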

Archiving Social Media Data

What descriptive metadata, documentation, and statistics do archives need to provide researchers so that social media datasets can be preserved for reuse? These questions are especially pressing as data archives such as the UK Data Archive and GESIS have already begun archiving and documenting social media datasets.
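As a concrete starting point for that question, one possible shape for such a descriptive record is sketched below. The field names are assumptions chosen for illustration, not an archival standard or a schema used by the UK Data Archive or GESIS.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class DatasetRecord:
    """Hypothetical descriptive metadata deposited alongside a dataset."""
    title: str
    platform: str
    collection_method: str          # e.g. "Streaming API", "Search API"
    query_terms: list               # keywords/hashtags used to collect
    date_range: tuple               # (start, end) of the collection window
    record_count: int
    known_gaps: list = field(default_factory=list)  # outages, rate-limit gaps

def describe(record: DatasetRecord) -> dict:
    """Serialize the record for deposit with the dataset."""
    return asdict(record)
```

Fields like `collection_method` and `known_gaps` matter most for reuse: they are exactly the provenance details a later researcher cannot reconstruct from the data alone.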