Twitter doesn’t represent everyone, so what can we learn from it?

A recent report from Pew Research indicated that sentiment on Twitter (what people are saying about certain issues) does not necessarily reflect broader public opinion.

“At times the Twitter conversation is more liberal than survey responses, while at other times it is more conservative…. The lack of consistent correspondence between Twitter reaction and public opinion is partly a reflection of the fact that those who get news on Twitter — and particularly those who tweet news — are very different demographically from the public,” offers the report.

Using traditional content-analysis methods alongside text-analysis software, the study compared national polls with tweets that appeared to be in response to eight big events, including the first presidential debate, major speeches by Barack Obama and his re-election, and analyzed whether people reacted positively or negatively.

While Pew’s conclusions confirm what may seem obvious, they also point to the idea that Twitter is primarily an information dissemination tool, not necessarily a tool for gathering information about a population. Businesses, often captivated by the allure of access to millions of consumers, have seen social media as a barometer of their markets. From using hashtags, which connect tweets to a single conversation, to relying on expensive software, brands and media outlets increasingly turn to social media not only to get a sense of customers, but to make strategy decisions based on the analysis.

Yet Paul Hitlin, a Pew senior journalism researcher and co-author of the study, acknowledges that most valuable information seems to flow one way: from news outlets, businesses and celebrities to users, not the other way around. While exact numbers are extremely difficult to pin down, studies, including one by Cornell University, show that the majority of tweets are generated by a small percentage of users. The rest use Twitter primarily as a consumption tool.

Crimson Hexagon, maker of Pew’s analysis software, says it’s “very accurate” and explains that to use it, Pew builds a baseline of tweets (about 250, according to Pew) that are categorized and labeled manually, so the computer has a reference point. Pew essentially trains the software to recognize different themes in any online “conversation”; the software then consumes the Twitter firehose of content to evaluate overall sentiment. Pew says that when tested, Crimson Hexagon’s software got the right answer about 90 percent of the time, a high number, but not entirely reassuring for a measurement tool.
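
The train-on-a-baseline workflow described above can be sketched in a few lines. This is a generic illustration, not Crimson Hexagon’s proprietary method: the example tweets and labels are invented, and a standard scikit-learn Naive Bayes classifier stands in for whatever model the real software uses.

```python
# Hypothetical sketch: a small hand-labeled "baseline" of tweets teaches a
# classifier, which then labels the incoming stream. (Pew's baseline was
# roughly 250 tweets per event; the model and data here are stand-ins.)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hand-labeled baseline tweets serve as the reference point.
baseline_tweets = [
    "great speech tonight, he nailed it",
    "strong performance in the debate",
    "what a disaster of a speech",
    "worst debate answer I have ever heard",
]
baseline_labels = ["positive", "positive", "negative", "negative"]

# Turn the text into word-count vectors and fit the classifier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(baseline_tweets)
model = MultinomialNB().fit(X, baseline_labels)

# Classify new tweets from the stream with the trained model.
stream = ["he nailed the debate", "a disaster of an answer"]
predictions = model.predict(vectorizer.transform(stream))
print(list(predictions))  # → ['positive', 'negative']
```

With only a handful of labeled examples, the model leans heavily on individual words like “nailed” or “disaster,” which is one reason small training baselines leave room for the kinds of errors discussed below.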

Essentially, there is a lot of noise, much of it due to geography. The Pew study had trouble figuring out which tweeters are part of the American electorate. Twitter users can register a location, but it does not necessarily indicate citizenship: a German citizen could tweet disparagingly from Topeka, and a Kansan could tweet positively from Berlin. And IP data, which indicates where someone is located, is not always accurate, since people use proxies for work and other reasons.

Or maybe someone isn’t tweeting in response to the event at hand, and the subject of their tweet is just a coincidence. Nor could Pew’s analysis access the tweets of anyone with a locked account. Hitlin acknowledges that all these cases are tough to handle and hurt accuracy.

And as any Twitter user knows, a retweet is not always an endorsement.

This doesn’t even get into the difficulty software has in assessing language. Hitlin acknowledges that a tweet of “Great idea, Obama” could come from a genuine supporter or from a detractor known for dry humor. Additional context, as in “Great idea, Obama. This is the same thing you proposed last year” (a negative tweet), tips off the algorithm to associate negativity with phrasing like “same.” But something like “Great idea, Obama. We need more of these same policies” could then be misread as negative even though the author means it sincerely.

Crimson Hexagon’s software slices all the text into subsections, treating the “assertion” (their term) as the unit of measurement rather than categorizing each tweet, paragraph, sentence or word. The results are reported not as a percentage of tweets, per se, but as the percentage of assertions in the entire body of text that express a certain sentiment about the issue at hand.
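
To see how counting assertions differs from counting tweets, consider a minimal sketch. The splitting rule (sentence boundaries) and the hard-coded labels are assumptions for illustration; the real software’s segmentation and classification are more sophisticated.

```python
# Hypothetical illustration: sentiment measured as a share of "assertions"
# (sentence-like units) rather than a share of whole tweets.
import re

tweets = [
    "Great idea, Obama. This is the same thing you proposed last year.",
    "Strong debate performance.",
]

# Split each tweet into rough assertion units at sentence boundaries.
assertions = [
    unit.strip()
    for tweet in tweets
    for unit in re.split(r"[.!?]+", tweet)
    if unit.strip()
]

# Hand-assigned labels standing in for the classifier's output.
labels = ["positive", "negative", "positive"]

share_positive = 100 * labels.count("positive") / len(labels)
print(f"{len(assertions)} assertions, {share_positive:.0f}% positive")
# → 3 assertions, 67% positive
```

Note the difference this makes: the first tweet is mixed, so a per-tweet tally would have to force it into one bucket, while the per-assertion tally credits its positive opening and negative follow-up separately.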

Hitlin maintains that Twitter does allow for insights into certain populations. “[Twitter] may be self-selected, but it’s very passionate,” he says. “It gives you a sense of what groups are activated.” In some ways, he says, it’s like quickly interviewing 10 people on the street after an event. The results won’t be representative, but they might point you in the right direction.