A group of sociologists at Indiana University recently claimed to have shown that “tweets predict elections”. The research takes all tweets from the 3 months preceding the 2010 election that mentioned either the Democratic or the Republican candidate in a House race, computes the proportion that mentioned the Republican candidate, and uses that ratio to predict the election outcome. In an op-ed published in The Washington Post claiming to describe the research, Fabio Rojas, one of the authors, claimed that “In the 2010 data, our Twitter data predicted the winner in 404 out of 406 competitive races.” Really?

Below is Figure 1 from their paper (available on SSRN). I don’t know where Rojas was looking, but I see a lot of points on the right half of the graph—where Republican tweet share was higher than 50%—that are BELOW 0, meaning the Republican candidate LOST the election. Similarly, there are plenty of points in the left half of the graph—where the Republican tweet share was less than 50%—where the Republican candidate won the election.
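To make that reading of the figure concrete, here is a minimal sketch of the naive rule the op-ed seems to imply: predict a Republican win whenever the Republican tweet share is above 50%, then check the prediction against the sign of the vote margin. The five races below are made-up numbers for illustration only (they are not the Indiana data), and the function name is my own.

```python
def naive_tweet_share_accuracy(races):
    """Share of races where 'tweet share > 50%' matches the actual winner.

    races: list of (republican_tweet_share, republican_vote_margin) pairs,
    with tweet share in [0, 1] and margin in percentage points.
    """
    correct = 0
    for share, margin in races:
        predicted_gop_win = share > 0.5   # right half of the graph
        actual_gop_win = margin > 0       # point above the zero line
        if predicted_gop_win == actual_gop_win:
            correct += 1
    return correct / len(races)


# HYPOTHETICAL races, invented for illustration; not taken from the paper.
example_races = [
    (0.70, 12.0),   # high GOP tweet share, GOP wins  -> correct
    (0.62, -8.0),   # high GOP tweet share, GOP loses -> wrong
    (0.40, -15.0),  # low GOP tweet share, GOP loses  -> correct
    (0.35, 4.0),    # low GOP tweet share, GOP wins   -> wrong
    (0.55, 3.0),    # high GOP tweet share, GOP wins  -> correct
]

accuracy = naive_tweet_share_accuracy(example_races)
print(f"Naive rule accuracy: {accuracy:.0%}")
```

Every point in the lower-right or upper-left quadrant of the figure is a miss for this rule, which is why a scatter with many such points is hard to square with 404 out of 406.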

The nice thing about the authors publishing this graph is that it gives a perfectly accurate description of the relationship between tweet-share and margin of victory. They are related, but the relationship seems to be fairly weak. So if we wanted to predict election outcomes, would it make any sense to use the tweet-share? We could probably do a lot better looking at who the incumbent is, the share of the vote won in the district by the party’s last presidential candidate, or any of a host of other variables. So where does the 404 out of 406 number come from? I can only guess that Rojas was making a claim about the in-sample predictions of the full model reported in Table 1: a model that includes such important variables as whether or not there was a Republican incumbent, the proportion of votes John McCain got in the district, and the proportion of the district that is white. But do we really think the tweet-share is accounting for many of those 404 correct predictions? And without having the data in hand, it’s hard to believe that they got 404 correct predictions even with the full model.

What we can see from the model they report in Table 1 is that the share of mentions in Tweets seems to have some predictive power for the Republican vote margin beyond the other variables in the model. That’s interesting. But it’s a lot different than saying that the tweets predict the outcome.

And more importantly, does the tweet share actually influence anything? It might come as no surprise that the tweet share is correlated with the winner of the election: that is pretty much what we would expect. The winner will generally have more name-recognition, spend more money and have a more active campaign. All those things should generate more twitter chatter. What we want to know is: does the chatter on twitter about a candidate affect what people think of the candidate? Does it make them more, or less, likely to vote for the candidate?

These are interesting questions that my colleagues at the SMaPP lab at NYU and I are trying to answer. The basic data and analysis presented in the Indiana paper is interesting and informative. But it doesn’t help inform anyone to then make overstated claims that are so obviously contradicted by the data.

For today’s celebrity baby, baby photos and sonograms are passé. Instead, the baby needs his or her own “retweet network” picture, and we at the NYU Social Media and Political Participation laboratory are happy to provide this indispensable gift for the new #royalbaby Windsor (for higher resolution versions of the figure, click here or here.):

What can we learn about Twitter and the #royalbaby from this figure? When faced with a high profile but essentially low consequence event (unless you were betting on the baby’s arrival), Twitter was drawn upon to do what it seems to do best: provide information and entertain people. Our best guess at the various networks in this figure (and this is a very dense set of networks due to the huge volume of tweets – there are half a million Twitter users displayed on this plot!) are the following:

GREEN: news media, official accounts of crown, and celebrities
YELLOW: British comedians and journalists
RED: American comedians and journalists
BLUE: general parody accounts tweeting about #royalbaby
PURPLE: parody accounts related to British Crown

The most popular retweets also confirm this rough dichotomy of news and humor. The most popular tweet of all was from none other than Harry Potter potions professor Severus Snape:

And by our count, of the top 10 most popular retweets, seven were humor or satire related and the remaining three were news. The BBC was the highest ranked news source, with 10,300+ retweets, and came in at the fifth spot, beating out the official announcement from Clarence House, which placed sixth with 6,600+ retweets. The remaining news source in the top 10 retweets: E! Online (an American entertainment website), which placed tenth with 5,800+ retweets. And for those Harry Potter fans out there, Snape also grabbed two more spots in the top 10 (second and eighth); Lord Voldemort himself finished just outside the top 10.

Was this event truly a global social media phenomenon? With the caveat that we only have access to Twitter data and thus cannot comment on activity on other platforms such as Facebook, Weibo, or Google+, there are a couple of different ways we can address this question. First, we can examine the sheer quantity of tweets containing the hashtag #royalbaby, which of course underestimates the total number of tweets discussing the royal birth because not everyone was using this hashtag. For comparison’s sake, the following figure compares tweets with the hashtag #royalbaby with tweets sent on June 1st that featured the hashtag #direngeziparki, the main hashtag in the recent Turkish protests.

Three observations can be made from this figure. First, from the moment the news first broke that the Duchess was heading to the hospital, there was a very steady stream of Twitter usage of #royalbaby – something on the order of 1000 tweets per minute; using a larger number of keywords, we were collecting about 1500 tweets per minute at this time. Second, there was – as would be expected – an enormous spike in tweets immediately following the birth.* This demonstrates yet again the power of Twitter to convey information to vast numbers of people in an almost instantaneous fashion. While in this case it was all basically in fun, the political consequences of such a powerful distribution network for information that can be accessed by the masses and elites alike should not be underestimated. Finally, for all the hype surrounding the royal birth, the sheer number of tweets for most of the day featuring the key hashtag for royal baby watching does not really seem to exceed by much – if at all – the number featuring the key hashtag used in a single day of protesting in Turkey. Whether this says more about the Turkish protests or royal baby-watching remains to be seen.

The other way to use Twitter to assess the degree to which the royal birth was truly a global event is to look at the extent to which the hashtag #royalbaby was used around the world, which we do in the following figure. Again caution is in order, as one needs to interpret this figure through the lens of where Twitter is used globally:

With these caveats in mind, two conclusions seem apparent. First, this was a global social media event. People did tweet about the birth all over the world, although clearly there was more interest in some countries (especially in Europe) than others. Second, Americans – despite their decision to fight a war to free themselves from the control of the British crown – still remain quite interested in the royal family!

******

* For the Twitter geeks out there, you will notice that our chart peaks at somewhere around 10,500 tweets per minute, while Twitter itself reported a top rate of 23,500 tweets per minute. You will also notice that at the very peak, our line appears to go flat briefly. We think both are caused by the same fact, which is that there is a limit to the number of tweets you can collect from Twitter’s API when you search by particular keywords, which is what we were doing. This limit is 1% of all tweets. If your keyword search is returning less than 1% of all tweets, you get all of them. But if it returns more than 1%, you max out at 1%. This is why we think our top-line number is significantly less than Twitter’s number: for a brief period of time, tweets related to the royal birth seem to have been exceeding that 1% limit.
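For readers who want the mechanics spelled out, here is a minimal sketch of the capping behavior as described in this footnote: below the 1% threshold a keyword search returns everything, and above it the collection flattens at the cap. The total stream volume used here (1,000,000 tweets per minute) is a round placeholder chosen for easy arithmetic, not a figure from Twitter, and the function name is hypothetical.

```python
def collected_per_minute(matching_per_minute, total_per_minute, cap_percent=1):
    """Tweets a keyword collector would receive under a percentage cap.

    Models only the rule described in the footnote: you get every matching
    tweet until matches exceed cap_percent of the whole stream, then you
    are held at the cap.
    """
    cap = total_per_minute * cap_percent // 100
    return min(matching_per_minute, cap)


# ASSUMPTION: placeholder total volume, picked for round numbers only.
TOTAL_PER_MINUTE = 1_000_000

# Quiet period: about 1,500 matching tweets per minute, well under the cap.
quiet = collected_per_minute(1_500, TOTAL_PER_MINUTE)

# Peak: Twitter reported 23,500 per minute, which would exceed this cap.
peak = collected_per_minute(23_500, TOTAL_PER_MINUTE)

print(quiet, peak)
```

Under this rule the collected series tracks the true series until the cap is hit and then goes flat, which matches the brief plateau at the top of our chart.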

The news media today have been atwitter about the ouster of Australian Prime Minister Julia Gillard in an intra-party leadership vote, 57-45 (see here, here, and here for example). So the press has clearly taken notice.

But what about the people? For this question, we can turn to social media. By a fortunate coincidence, the NYU Social Media and Political Participation (SMaPP) lab has been collecting Twitter data for the last two weeks on Australian politics in anticipation of this September’s parliamentary election. Thus we are able to look at the amount of activity on Twitter in the lead up to the vote as well as in the aftermath of the vote. The following figure details the number of tweets mentioning Gillard and Kevin Rudd, the ex-Labor PM – himself ousted from power by Gillard in a similar manner three years ago – who will now be replacing her.

The figure clearly demonstrates an enormous upsurge in Twitter activity surrounding the two principal figures in the drama in the hours following the vote. Indeed, tweets mentioning the two candidates grow from under 1,000 per hour to over 20,000 per hour for Rudd and as high as 25,000 per hour for Gillard (although, for comparison’s sake, this is still nothing compared to the 3,000 tweets per minute recorded during recent Turkish protests).
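Because the two rates are quoted in different units, the gap is easy to underestimate. A quick conversion, using only the numbers quoted above:

```python
MINUTES_PER_HOUR = 60

turkey_per_minute = 3_000        # rate quoted for the Turkish protests
gillard_peak_per_hour = 25_000   # peak hourly rate quoted for Gillard

# Put both rates on an hourly basis before comparing them.
turkey_per_hour = turkey_per_minute * MINUTES_PER_HOUR
ratio = turkey_per_hour / gillard_peak_per_hour

print(f"Turkey: {turkey_per_hour:,} tweets/hour, about {ratio:.1f}x the Gillard peak")
```

That is 180,000 tweets per hour for the Turkish protests, roughly seven times the Gillard peak.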

Another thing we can do with social media data is to look at the networks of who seems to be retweeting whom. The following figure shows these networks for the past 48 hours in Australia, with the colors representing four different components of the network.

[Figure by Pablo Barbera; Data from NYU SMaPP laboratory. To view more clearly, click on the figure for the larger version and then zoom in, which will allow you to see the labels of accounts that were retweeted more than 200 times in the past 48 hours.]
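For anyone curious how a retweet network of this kind can be assembled, here is a minimal sketch, not the code used for the figure. It assumes the data have already been reduced to (retweeter, original author) pairs; the account names are invented, and the threshold is lowered from 200 to 2 so the toy example has something to label.

```python
from collections import Counter


def build_retweet_network(retweets, label_threshold=200):
    """Summarize (retweeter, original_author) pairs as a weighted network.

    Returns the weight of each directed edge (how often one account
    retweeted another) and the set of accounts retweeted more than
    label_threshold times in total, i.e. the ones that would get a label.
    """
    edge_weights = Counter(retweets)
    times_retweeted = Counter()
    for (_retweeter, author), count in edge_weights.items():
        times_retweeted[author] += count
    labeled = {a for a, n in times_retweeted.items() if n > label_threshold}
    return edge_weights, labeled


# HYPOTHETICAL toy data; these account names are made up for illustration.
toy_retweets = [
    ("user_a", "news_account"),
    ("user_b", "news_account"),
    ("user_a", "news_account"),
    ("user_c", "parody_account"),
    ("user_b", "parody_account"),
]

edges, labeled_accounts = build_retweet_network(toy_retweets, label_threshold=2)
print(edges[("user_a", "news_account")], sorted(labeled_accounts))
```

The edge weights are what a layout algorithm would then use to position and color the nodes; that step is omitted here.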

What’s interesting about the figure is how it doesn’t seem to resemble an “echo chamber” – with people only talking to similar other people – as we might expect in a normal political discussion. Instead, we find that the internal networks are quite connected to each other, making the whole figure look more like an amorphous blob than a highly segmented set of networks. Moreover, rather than represent ideological factions, our best guess at these networks is that they represent different sources of information: the green nodes on the north-west part of the figure are foreign news sources, the red in the middle seem to be Australian news sources, and the green in the south east corner are politicians.

Interestingly, other highly retweeted sources of information include parody accounts (KevinRuddExPM, GI_Gillard, Queen_UK) and, in one case, a jobs website (seekjobs) that sent out the following popular tweet:

What to make of all this? First, it seems that intra-party disputes might engender a different sort of response on Twitter than traditional inter-party disputes. In particular, we might be seeing more radio silence among politicians in the Twitter-sphere than publicity hungry politicians usually exhibit. For example, even though the ex-Prime Minister Gillard can be seen in the retweet network, here is her most recent tweet, which clearly predates the vote:

Second, the network illustrates that even with events that are highly covered by the main stream media, Twitter also serves as a way to disseminate information, both for media and non-media sources alike.

Finally, there are apparently quite a few Australians enjoying a laugh at the expense of the Labor Party today. As @Queen_UK put it:

Over the past several years the role of social media in promoting, organizing, and responding to protest and revolution has been a hot topic of conversation. From Occupy Wall Street to the Arab Spring Revolutions, social media has been at the center of many of the largest, most popular demonstrations of political involvement. The protests taking place in Turkey add to this growing trend, and are already beginning to add new layers to our understanding of how social media can contribute to public participation.

The social media response to and the role of social media in the protests has been phenomenal. Since 4pm local time yesterday, at least 2 million tweets mentioning hashtags related to the protest, such as #direngeziparkı (950,000 tweets), #occupygezi (170,000 tweets) or #geziparki (50,000 tweets) have been sent. As we show in the plot below, the activity on Twitter was constant throughout the day (Friday, May 31). Even after midnight local time last night more than 3,000 tweets about the protest were published every minute.

What is unique about this particular case is how Twitter is being used to spread information about the demonstrations from the ground. Unlike some other recent uprisings, around 90% of all geolocated tweets are coming from within Turkey, and 50% from within Istanbul (see map below). In comparison, Starbird (2012) estimated that only 30% of those tweeting during the Egyptian revolution were actually in the country. Additionally, approximately 88% of the tweets are in Turkish, which suggests the audience of the tweets is other Turkish citizens and not so much the international community.

These numbers are in spite of the fact that there are reports that the 3G network is down in much of the area that is affected. Some local shops have removed security from their WiFi networks to allow internet access, but almost certainly the reduced signal will have impacted the tweeting behavior of those on the ground.

Part of the reason for the extraordinary number of tweets is related to a phenomenon that is emerging in response to a perceived lack of media coverage in the Turkish media. Dissatisfied with the mainstream media’s coverage of the event, which has been almost non-existent within Turkey, Turkish protesters have begun live-tweeting the protests as well as using smart-phones to live stream video of the protests. This, along with recent articles in the Western news media, has become a major source of information about this week’s events. Protesters have encouraged Turks to turn off their televisions today in protest over the lack of coverage by the mainstream media by promoting the hashtag #BugünTelevizyonlarıKapat (literally, “turn off the TVs today”), which has been used in more than 50,000 tweets so far.

What this trend suggests is that Turkish protesters are replacing the traditional reporting with crowd-sourced accounts of the protest expressed through social media. Where traditional forms of news have failed to fully capture the intensity of the protests, or to elucidate the grievances that protesters are expressing, social media has provided those participating with a mechanism through which not only to communicate and exchange information with each other, but essentially to take the place of more traditional forms of media. Further, this documentation through multiple sources in public forums serves to provide a more accurate description of events as they unfold. The coming days in Turkey will give us more insight into the processes by which this takes place, but it is certainly an impressive realization of the potential for social media to be used in overcoming barriers to diffusion of information regarding and motivation for protests.

******

Update: we also wish to acknowledge the contributions of NYU Politics Ph.D. candidates Batuhan Gorgulu and Emine Deniz.

A few weeks ago I ran a conference in Florence, Italy on Social Media and Political Participation. We had 12 people present papers and probably somewhere between 60 and 80 people who attended presentations. However, in addition to simply holding the conference, we also live-streamed it over the internet and gave the conference a hashtag: #SMaPP_LPD. The figure above shows all the Twitter accounts that tweeted using that hashtag (weighted by the number of tweets they issued) and their connections to other people who tweeted about the conference.

The benefits of live-streaming a conference in order to increase access to the conference are obvious: people who cannot attend the conference can still watch the presentations. Our best guess is that we somewhere between doubled and tripled the number of people who could “attend” the conference, with over 130 people logging on to the feed on Friday and over 60 people doing so on Saturday. But the other great thing about live streaming is that not only can it extend your reach geographically, it can also do so temporally. So we’re in the process now of preparing a video archive of all the presentations (which will be up on the conference website shortly). We’ve already posted everyone’s presentation slides, so what this means is that shortly anyone will be able to download the slides for a presentation and then watch the video of the presentation whenever they like.

But the other point I really want to make is how much I felt the use of the conference hashtag improved my own experience attending the conference. Throughout the two days of presentations, I was able to communicate with other people – both in Florence and watching over the live-stream – about the papers as they were being presented. I got to see what other people found interesting about presentations and you could communicate in real time about issues being raised by presenters. Moreover, I personally found that – far from being distracting – the fact that I was looking at the Twitter feed and tweeting kept me more engaged with the presentation. You all know the feeling: no matter how interesting a conference, by the 6th paper of the day everyone (especially if they are jet-lagged!) starts to zone out a bit and get sleepy. I found the hash-tag conversation to be an antidote to this common feeling; it kept me alert and more engaged with the paper presentation. Furthermore, by “summarizing” what I was thinking about papers in 140 characters, I think I was actually more quickly processing what I was learning than simply by listening. As the conference moderator, I was also able to take questions over Twitter, thus allowing people who weren’t in Florence to participate in the question and answer session in real time, which is kind of amazing if you stop to think about it.

At a time when political science is increasingly coming under attack for not having enough to offer those outside academia, live-streaming and hash-tagging conferences seems a relatively simple way to make our research more accessible to a wider audience. And my experience is that this is a win-win situation: the same thing that allows more people access to our research can enhance our own conference experience, in addition to making it possible for us to “attend” more conferences beyond what our normal travel schedules/budgets would allow.

So consider this post a plea to conference organizers everywhere: please think about live-streaming and setting up a hash-tag as a part of your conference in the future! Adding a hash-tag is costless. Yes, live-streaming costs money, but so do a lot of other things associated with conferences, and my sense is that the cost of live-streaming is falling and will continue to do so. In the long run, if we can make live-streaming a regular part of conferences (much the same way “conference dinners” are usually automatically included in any conference budget) I think the payoff will be more than worth it.

Foreign policy pundits have been bullish about the ability of social media to bring democratic change in authoritarian regimes. Observers have argued that social media can literally “make history” by helping topple regimes, and democracy promoters are sinking big money into a variety of trainings with this very goal. But in countries such as China, Russia, and Iran, where users of local social networks still far outnumber users of Facebook and Twitter, authoritarian governments have used their leverage over domestic networks to contain online opposition to the regime.

The story of Russia’s most popular social networking platform, VKontakte, illustrates this point well. In March 2013, reports (ru) surfaced about how VKontakte collaborates with Kremlin officials to gather intelligence on opposition groups that use the site. The most damning of the reports claimed that the site shut down opposition “groups” and misdirected message traffic between opposition figures.

Indeed, in the aftermath of the December 2011 parliamentary elections, when allegations of massive electoral fraud brought tens of thousands of Russians onto the streets in the largest anti-regime protests since the fall of the Soviet Union, the relationship between VKontakte and the Kremlin even became coercive. Four days after the election, the company’s founder Pavel Durov reported that he had been summoned by the FSB (Russia’s internal security service) to answer questions about opposition activity on VKontakte. Durov’s hesitation to cooperate fully appears to have landed him in hot water, as investors with ties to the Kremlin recently purchased a 48% share in VKontakte and Durov may have fled the country after his home was searched in connection with an alleged traffic violation.

In our research on social media, we have found that the ownership structure of social media matters greatly for politics. When nondemocratic governments have leverage over the content and structure of social networks, users lose the ability to access independent points of view and learn about government malfeasance. Not only is information sharing monitored and potentially blocked, but democracy activists avoid networks connected with government authorities for fear of reprisals.

Though scholars have long warned about the attempts of authoritarian leaders to influence the internet, little empirical evidence has been brought forth about the effects of these efforts on politics at the micro-level. In a forthcoming article, we used survey data from the 2011 parliamentary elections in Russia to examine how usage of different social networks affected users’ awareness of electoral fraud. Our results indicate that users of Western networks like Facebook and Twitter are about five percentage points more likely to believe that there was significant electoral fraud during the elections. Usage of Russian networks, VKontakte and Odnoklassniki, meanwhile had no effect on awareness of electoral fraud.

We argue that the reason for this discrepancy lies in the type of information being spread on these networks. During the election season, local networks’ vulnerability to state pressure seems to have led many opposition activists to focus their social media strategy on Western social networks, such as Facebook and Twitter, which are much harder to monitor and pressure. Alexei Navalny, Russia’s most popular political blogger, maintained an active public Facebook page and Twitter account, which he used to spread hundreds of YouTube videos, photographs, and anecdotes documenting electoral fraud, and yet Navalny maintained only a token presence on VKontakte and no presence on Odnoklassniki. This strategy is at odds with the goal of reaching a mass audience since Odnoklassniki and VKontakte each have five times as many users as Facebook (only 5% of Russian internet users are on Facebook).

Caption: Figure 1 shows the week on week change in activity on social networking sites in the weeks surrounding the elections. There were large spikes in activity on Facebook and Twitter, but no such spikes in VKontakte and Odnoklassniki usage.

Of course, it’s possible that individuals with preformed opinions about electoral violations select into usage of Facebook and Twitter and eschew usage of native social media platforms. It’s hard to dismiss this possibility, but our findings do indicate that Facebook/Twitter users are remarkably similar to VKontakte users across a range of factors that might be correlated with perceptions of electoral fraud (sex, income, education, place of residence, support for Putin, levels of political participation, and support for the opposition).

Our presumption was that Facebook and Twitter usage would also increase levels of protest participation, as the emerging narrative suggests. This should certainly be true if the self-selection process described above was at work (users with preconceived notions about rampant fraud should be especially likely to join protests against electoral fraud). But surprisingly, we found no relationship between usage of Facebook/Twitter and participation in post-election protests.

Thus, users of Western social networks were not more politically active than either their counterparts on Russian social networks or even non-users of social networks. Yet they were more informed about the wrongdoings of the government.

I have been helping to organize a conference in Florence this Friday and Saturday (May 10th and 11th) on the effects of Social Media on Political Participation. We’ve got a great lineup of papers (including one from the Monkey Cage’s own Henry Farrell). We’ve managed to secure funding to live-stream the conference (it will be available here), and anyone watching will be able to participate through the hashtag #SMaPP_LPD. We’re hoping to convince people that live-streaming is an important use of university resources, so if you are interested (or you know anyone who might be interested) please join us! Most of the papers are now available on the conference webpage.

This is the fake tweet (the AP twitter account was hacked) that yesterday caused the US Stock market to briefly lose $200 billion worth of value. While the market recovered afterwards, I take this as yet another reminder that social media are not merely mimicking functions played by traditional forms of media. This incident points to two crucial aspects of divergence. First, the speed at which information can spread across social media networks is truly stunning. Note the time on the tweet: 1:07 PM. Now note the time of the market drop:

Second, this event took place because someone hacked the AP Twitter feed (according to the Wall Street Journal, “a group of enthusiastic Syrian youths” have claimed responsibility). It’s difficult to imagine anyone hacking the front page of the NY Times or brainwashing Walter Cronkite. Moreover, while hacking the AP account certainly increased the number of followers who saw the tweet and enhanced its credibility, it is not difficult to imagine this sort of “hoax” spreading quickly from a home grown Twitter account either, especially given the fact that plenty of people access tweets via #hashtags (keywords) in addition to simply following others. At its core, social media continue to be differentiated from traditional media by the fact that anyone can send a message. How many people will listen to that message will of course vary, but send the right message in the right circumstances that looks the right way, and you can knock $200 billion out of the US economy in a heartbeat.

Apropos of John’s post from earlier in the week, here’s some additional evidence that people are still paying attention to the issue of gun control a month out from the Newtown tragedy:

The data are from a collection of 5 million+ tweets that we’ve collected at the NYU Social Media and Political Participation (SMaPP) lab since the shooting that contain a number of related keywords, including the six on the graph. While the data are still in a crude format (e.g., nothing in the figure shows whether tweets are supportive of the NRA or opposed to the NRA when they include “NRA” in their tweet), one pattern is quite clear. Even as tweets directly related to the Newtown shooting have tailed off (i.e., those containing the hashtags #ctshooting or #PrayForNewtown) people are still talking about the political/policy implications (i.e., gun control, NRA, 2nd Amendment). This therefore extends the initial observation we made about this pattern after three days of tweets (here and here) to more than a month’s worth of tweets.

What’s interesting about this is that it provides at least some rudimentary evidence that it is not just those in the media that are continuing to talk about topics such as gun control; it is the mass public as well. That being said, the biggest boost in the discussion of the issue by the mass public (the second set of peaks on the right part of the figures) came following President Obama’s gun control speech on January 16th, suggesting that while the public remains interested, elites (and especially the president) can play an important role in sustaining that interest. Of course, the data (which show tweets on “gun control” trending up before the President’s speech) are also consistent with a world in which public opinion may have encouraged him to act as well.

*******

Update: Here is the additional figure requested by Danny Hayes in his comment below:

Danny does seem to be right that discussion picked up on the 9th, but there seems to have been almost as much chatter the previous two days as well. The biggest mini-spike actually comes a couple days after the Biden announcement. But overall, I think the pattern of tweets clearly is consistent with Danny’s claim that the tweeting is being driven by the White House as opposed to vice versa. And yet, it is still interesting to see that it is not just journalists “covering the story”, but indeed tens if not hundreds of thousands of individuals (with the caveat again that these are just counts of tweets) that are doing so as well. The ability to see citizens in action this way is a new opportunity for social scientists (as indeed is the ability for citizens to “speak” publicly in this way!) and one which I think will prove very interesting to follow.