
Using Visibrain Focus to analyse the unrest in Ferguson

In my previous blog post I outlined a number of free tools that can be used to capture and analyse data from Twitter. In the next series of posts I will look at more powerful commercial tools. Over the past few weeks I have had the opportunity to use Visibrain Focus (commercial), a Twitter monitoring platform for digital marketing professionals. Although it is aimed at marketers, it has several features that are useful for research purposes.

This blog post has two aims: firstly, to show the potential of Visibrain Focus, and secondly, to provide some Twitter insight related to the Ferguson unrest (using the ‘#Ferguson’ hashtag and the ‘Ferguson protests’ keyword). As I have the unique opportunity to access tweets from the Firehose API (i.e., all of the tweets), I hope it can also help those who are currently conducting research around these themes.

Over the last 30 days (i.e., 30 days going back from 22nd August 2015, 3.07PM, GMT), there are 1,715,534 tweets in total by 500,252 users. There are 13,337,415,455 impressions (that is to say, the number of times the tweets have been seen by users). Of the tweets, 36% are original (n=618,772), 64% are retweets (n=1,096,762), and 74% contain a link (n=1,269,006). The retweet percentage is of interest here: almost two thirds of the tweets related to the Ferguson unrest are retweets.
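The percentages above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts reported in this post; the helper name `share` is my own and is not part of Visibrain.

```python
# Counts as reported in this post for the 30-day window.
total_tweets = 1_715_534
original = 618_772
retweets = 1_096_762
with_link = 1_269_006


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Original tweets and retweets together should account for every tweet.
assert original + retweets == total_tweets

print(f"original:  {share(original, total_tweets)}%")
print(f"retweets:  {share(retweets, total_tweets)}%")
print(f"with link: {share(with_link, total_tweets)}%")
print(f"retweets per original tweet: {retweets / original:.2f}")
```

The last line expresses the same information as a ratio, which can be easier to compare across datasets of different sizes.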

As shown in the figure above, tweets start to increase on August 9th, which corresponds to the one-year anniversary of the fatal shooting of Michael Brown by a white police officer. The largest peak occurs on the 10th of August, when a total of 550,928 tweets are posted. The sharp increase coincides with police in Ferguson, Missouri, shooting and critically injuring an African-American teenager.

Figure 2 – Most frequently occurring hashtags used in tweets related to the unrest in Ferguson

With regard to the top three hashtags, #Ferguson is used 803,860 times, #blacklivesmatter is used 70,393 times, and #mikebrown is used 52,823 times. However, it is important to note that the hashtag #Ferguson and the keyword ‘Ferguson Protests’ were used to retrieve this dataset, so #Ferguson is bound to dominate. It may be better to state that the word cloud above represents the most frequently occurring co-hashtags.
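To make the idea of a co-hashtag concrete, here is a small sketch in Python that counts the hashtags appearing alongside a query hashtag. It is an illustration only and not how Visibrain computes its word cloud; the function name `co_hashtags` and the sample tweets are invented for the example.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def co_hashtags(tweets, query_tag):
    """Count hashtags that appear alongside query_tag, ignoring case."""
    query = query_tag.lstrip("#").lower()
    counts = Counter()
    for text in tweets:
        # Use a set so a hashtag repeated within one tweet is counted once.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if query in tags:
            counts.update(tags - {query})
    return counts


# Made-up tweets for illustration only; they are not taken from the dataset.
sample = [
    "Thinking of everyone in #Ferguson tonight #BlackLivesMatter",
    "One year on. #Ferguson #MikeBrown #blacklivesmatter",
    "Live updates from #Ferguson",
]
print(co_hashtags(sample, "#Ferguson").most_common())
```

Excluding the query hashtag from the counts removes the term that was guaranteed to appear, so what remains reflects how people chose to frame the topic.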

Figure 3 – Most commonly used expressions in tweets related to the unrest in Ferguson

The above word cloud is generated from the most commonly recurring terms found in tweet content. In addition to the hashtags in the previous word cloud (such as blacklivesmatter), other interesting expressions include ‘state of emergency’, ‘police’, ‘shots’ and ‘last year’. Also interesting here is the expression ‘Sir Alex Ferguson’, which is the ‘noise’ in our dataset, i.e., tweets that match the search terms but are unrelated to the unrest.

Figure 4 – World map of tweets related to the unrest in Ferguson

The figure above is a map of where users are tweeting from, based on the location provided within a user’s biography. The majority of tweets derive from the U.S. (69.3%, n=531,654), followed by the U.K. (5.3%, n=40,303) and Canada (2.7%, n=21,093). However, this is a distribution I have observed across topics on Twitter, and it may have more to do with overall use of Twitter, as well as access to the Internet and mobile devices.

With regard to language, the majority of tweets are in English (84.2%, n=1,445,680), followed by Spanish (4.2%, n=72,854) and German (4%, n=68,061). Taken with figure 4 above, this is not surprising, as the majority of tweets derive from English-speaking countries.

Visibrain can also infer gender. In this instance, 22.2% of tweets derive from males (n=381,061) and 17.2% derive from females (n=296,120), with 60.6% (n=1,039,824) classified as other, i.e., it is not possible to infer gender. This may be because the name provided by a Twitter user is not a real name, or because it is in a format that cannot be processed by Visibrain’s algorithm.

Figure 5 – Audience and following numbers of tweets related to the unrest in Ferguson

The figure above shows audience and following numbers of users that have tweeted about the unrest. The most interesting aspect is that users have an average of 7,617 followers, and 158,815 users have a following of over 158 thousand, i.e., a large audience.

In terms of devices, 56.7% of users use a mobile device (n=973,988), 2.6% (n=45,311) use a desktop, 1.8% (n=30,870) use a web-related client, and 8.3% (n=141,812) use an automated method, with 30.6% (n=525,024) classified as other.

The top 5 domains are twitter.com 54.6% (n=693,775), youtube.com 2.8% (n=35,523), nytimes.com 1.7% (n=21,705), theguardian.com 1.5% (n=19,267), and cnn.com 1.22% (n=815,694). Many videos are shared on Twitter, so it is not surprising to see YouTube as the second most popular domain. However, it is interesting to see The New York Times, The Guardian, and CNN as popular domains.

The top content types are text 62.5% (n=793,983), photo 46.1% (n=584,916), video 10% (n=126,789), and audio 0.2% (n=2,806). Image and video sharing are quite high; however, text-based tweets outnumber both photo and video sharing. Also of interest is that 1,273,716 tweets contain a link.

Visibrain allows end-users to export mention data in GEXF format, and the files can then be imported into Gephi to create network graphs. I extracted a mention graph from 12AM to 1AM on August 9th (i.e., one hour’s worth of tweets) in order to create a network graph, shown below.

Figure 6 – Network graph of 1 hour of tweets related to the unrest in Ferguson on August 9th 2015 created in Gephi
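For readers without access to Visibrain, a mention graph of this kind can be approximated from raw tweet text. The sketch below, in Python, extracts directed edges from an author to each user they mention and writes them as CSV with Source, Target and Weight columns, a layout that Gephi's spreadsheet import accepts as an edge list. This is an assumption-laden illustration rather than Visibrain's export: the function names, the (author, text) input format and the usernames are all invented.

```python
import csv
import io
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (author, mentioned_user) pairs across tweets."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():  # skip self-mentions
                edges[(author.lower(), mentioned.lower())] += 1
    return edges


def to_edge_csv(edges):
    """Serialise the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical (author, text) pairs; the usernames are invented.
sample = [
    ("alice", "RT @bob: streets are closed tonight"),
    ("carol", "@bob @alice stay safe"),
    ("alice", "thanks @bob"),
]
edges = mention_edges(sample)
print(to_edge_csv(edges))
```

Retweets are treated here as mentions of the retweeted user, which is a simplification; a fuller analysis might keep retweet and reply edges separate.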

Visibrain has many other features. For instance, it is also possible to look at the most frequently occurring tweets and the most retweeted users, and to apply various filters to sort through users and tweets. I hope to tweet out the different features and types of analysis that are possible using Visibrain over the coming weeks.

Below is a more recent network graph, covering tweets posted between 4PM and 5PM on the 22nd of August.

Figure 7 – Network graph of 1 hour of tweets related to the unrest in Ferguson on August 22nd 2015 created in Gephi