The race between Democrat Martha Coakley and Republican Charlie Baker appears all but tied going into the final two weeks of the 2014 gubernatorial campaign.

Of course, nearly all of these measures of public opinion come from polls released by WBUR and other Massachusetts media outlets, which for years have sampled “likely voters” who are demographically and statistically representative of the larger population. Without diminishing in any way the information that polls extract from citizens, whether through online or phone panels, recent developments in social data offer another lens through which public opinion can be captured.

Twitter, as some recent academic studies have shown, can be a consistent predictor of election outcomes. Though Twitter can now claim only approximately 15 to 20 percent of the U.S. population as so-called "active users," citizens talking about politics on Twitter have accurately indicated winning candidates, even when controlling for other factors.

In other words, as one study bluntly put it: More tweets equates to more votes. Applied to the Massachusetts governor's race, this means that in just the last week Coakley has held a clear advantage in the real-time poll of public opinion that is Twitter. As of midnight Wednesday, and coming off Tuesday night's televised debate, Baker had been mentioned on Twitter 6,727 times since 7 a.m. on Oct. 14. By comparison, Coakley was mentioned 9,924 times. Put differently, Coakley was mentioned in 3,197 more tweets than Baker, or 47.52 percent more than her competitor during the same time frame. The numbers of mentions per candidate per hour are graphed in Figure 1:

Figure 1: Mentions per candidate per hour. (Courtesy of Jacob Groshek)
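The comparison of mention counts is simple arithmetic on the two totals reported in the text. As a minimal sketch (the function and variable names here are chosen for illustration and are not part of the original analysis), the lead can be reproduced like this:

```python
def mention_lead(leader: int, trailer: int) -> tuple[int, float]:
    """Return the absolute lead and the lead as a percentage of the trailing count."""
    difference = leader - trailer
    percent = difference / trailer * 100
    return difference, percent


# Totals reported in the article for Oct. 14 (7 a.m.) through midnight Wednesday.
coakley_mentions = 9924
baker_mentions = 6727

difference, percent = mention_lead(coakley_mentions, baker_mentions)
print(f"{difference} more mentions, {percent:.2f} percent above Baker")
```

Note that the percentage is computed relative to Baker's total, which is why it reads as Coakley's margin "over her competitor."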

A couple of things are worth noting here, the first being “mentions” themselves. On Twitter, a mention is when one user specifically identifies another user by placing the username (@jgroshek, for example) in a tweet. Importantly, these mentions are not necessarily reciprocal, meaning that @jgroshek may or may not choose to mention the sender of the tweet in return, and in this sense mentions are somewhat more organic measures of presence and influence on Twitter. That is, mentioning activity potentially comes from the whole universe of users on Twitter, and not just those actively working for one campaign or another, which makes mentions a more difficult metric to artificially inflate (though that is still a possibility).

When looking at Coakley’s decided advantage on Twitter in terms of mentions (by tracking keywords 'coakley' and 'marthacoakley'), it is not only that she is leading in terms of overall volume. The trend lines also indicate that since Oct. 16 she has rarely fallen behind on Twitter in mentions, and that her lead seems to be growing while Baker’s presence (keywords 'charlie baker' and 'CharlieForGov') is shrinking by comparison.

During the most-tweeted hour of Tuesday's debate, Coakley was mentioned 1,198 times compared to Baker’s 803. Importantly, this finding seems to be more than anecdotal and instead reflects an ongoing trend. To elaborate, in the 144 hours since Oct. 16, when Coakley started to lead Baker consistently on Twitter, Baker has out-mentioned Coakley on an hourly basis in just 29 instances (or 20.14 percent of the time).
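The hourly comparison amounts to counting the hours in which one candidate's mention count exceeds the other's. The sketch below illustrates that tally; the per-hour numbers in it are hypothetical toy values (the article does not publish the hourly series), and only the 29-of-144 figure comes from the text.

```python
def hours_ahead(a_counts, b_counts):
    """Count the hours in which series a has strictly more mentions than series b."""
    if len(a_counts) != len(b_counts):
        raise ValueError("both series must cover the same hours")
    return sum(1 for a, b in zip(a_counts, b_counts) if a > b)


# Hypothetical hourly mention counts, for illustration only.
baker_hourly = [12, 30, 8, 15]
coakley_hourly = [20, 25, 8, 40]
toy_hours = hours_ahead(baker_hourly, coakley_hourly)  # Baker leads in one hour (30 > 25)

# Figures reported in the article: Baker led in 29 of 144 hours.
share = 29 / 144 * 100
print(f"Baker led in {share:.2f} percent of hours")
```

Ties count for neither candidate in this sketch; whether the original tally treated ties the same way is not stated.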

In addition, when applying network analysis and algorithmic sorting to the 15,849 tweets that used this year's popular political Twitter hashtag, #mapoli, between Oct. 14 and Oct. 21, it is the Coakley account (@marthacoakley) that appears as the most influential user account on Twitter in that specific user group of (presumably) politically mobilized citizens.

This sorting of influence is based on the betweenness centrality algorithm, which locates nodes (i.e., user accounts) that appear most often on the shortest paths between other nodes in the network. Simply stated, the betweenness centrality algorithm finds gatekeepers on Twitter that are mentioning and being mentioned by diverse user groups, with those groups indicated by color through another algorithm: modularity.
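To make the idea of a gatekeeper concrete, here is a minimal sketch of betweenness centrality using Brandes' algorithm on a small, made-up network. It assumes an unweighted, undirected graph for simplicity (real mention networks are directed), and the account names are hypothetical. It is not the tooling used for the #mapoli analysis.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph {node: set(neighbors)}."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths to every node.
        stack = []
        predecessors = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}  # number of shortest paths from source
        sigma[source] = 1
        distance = {node: -1 for node in graph}
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {node: 0.0 for node in graph}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected pair was counted from both endpoints, so halve the totals.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical network: two tight groups that only connect through "hub".
toy_network = {
    "a1": {"a2", "hub"},
    "a2": {"a1", "hub"},
    "hub": {"a1", "a2", "b1", "b2"},
    "b1": {"b2", "hub"},
    "b2": {"b1", "hub"},
}
scores = betweenness_centrality(toy_network)
top_account = max(scores, key=scores.get)
print(top_account, scores[top_account])
```

In this toy network every shortest path between the two groups runs through "hub," so it receives the highest score, which is the sense in which a high-betweenness account acts as a gatekeeper between user groups.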

While most political contests are set as red versus blue, algorithmic sorting does not pre-define user groups or colors, and so the modularity algorithm detects communities of frequent interactions within networks. It is also worth noting that in the #mapoli network there is not a strong visual separation of Democrats and Republicans, which is to say they are mentioning one another, but also that independent groups are relatively isolated to this point.
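As a rough illustration of what modularity measures, the sketch below scores a given grouping of nodes using Newman's modularity formula: the higher the score, the more interactions fall inside groups compared with what would be expected by chance. It only evaluates a partition supplied by hand; community detection algorithms search for the grouping that maximizes this score, and the article does not say which implementation was used. The network and names are hypothetical.

```python
def modularity(graph, communities):
    """Newman's modularity Q for an unweighted, undirected graph {node: set(neighbors)}."""
    edge_count = sum(len(neighbors) for neighbors in graph.values()) / 2
    q = 0.0
    for community in communities:
        members = set(community)
        # Edges with both endpoints inside the community (each seen twice, so halve).
        internal = sum(1 for v in members for w in graph[v] if w in members) / 2
        # Total degree of the community's members.
        degree = sum(len(graph[v]) for v in members)
        q += internal / edge_count - (degree / (2 * edge_count)) ** 2
    return q


# Hypothetical network: two triangles joined by a single edge (a3 to b1).
toy_network = {
    "a1": {"a2", "a3"},
    "a2": {"a1", "a3"},
    "a3": {"a1", "a2", "b1"},
    "b1": {"b2", "b3", "a3"},
    "b2": {"b1", "b3"},
    "b3": {"b1", "b2"},
}
split_score = modularity(toy_network, [["a1", "a2", "a3"], ["b1", "b2", "b3"]])
merged_score = modularity(toy_network, [list(toy_network)])
print(f"two groups: {split_score:.3f}, one group: {merged_score:.3f}")
```

Splitting the toy network into its two triangles scores higher than lumping everyone together, which matches the intuition of communities as clusters of frequent interaction.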

These data-based approaches to social media, with their expanded research parameters, provide additional insight into the overall picture of the Massachusetts gubernatorial race. While all estimates at this point have a 50/50 chance of success, based on Twitter data collected through the Twitter Collection and Analysis Toolkit (TCAT) at Boston University, the volume and user indicators reported here suggest that if the election were held today, Coakley would be a clear winner.

Of course, not unlike polls, social data requires updates, and trends can change between now and Election Night. In future analyses and columns, we will examine more of what is being said rather than just who is active in mentions. But for now and for the Baker campaign on Twitter, as Oscar Wilde wrote in "The Picture of Dorian Gray," “There is only one thing in the world worse than being talked about, and that is not being talked about.”

Jacob Groshek, PhD, is an assistant professor in the Emerging Media Studies Division at Boston University's College of Communication. He also directs Betweetness Labs, a platform that makes the TCAT system accessible to off-campus users.