Gnip Provides Twitter's Full Public Archive

Want a heaping helping of public opinion? Gnip has a deal for you: Every public tweet since Twitter launched in March 2006.

The company's new Historical PowerTrack for Twitter delivers all publicly available tweets from the past six years. It excludes direct (private) messages and deleted tweets, however.

Gnip, a social data provider based in Boulder, Colo., has a distribution license agreement with Twitter that allows it to provide historical tweets to its customers. It also delivers the full corpus of Twitter users' observations to the U.S. Library of Congress, which archives the tweets to preserve the public's real-time take on historical events.

But beyond the historical value of the Twitter fire hose, how can businesses benefit from this massive volume of data?

"We're going to see an impact on all industries. Obviously, a lot of business intelligence has focused on internal sales data, inventory--those kinds of things," Gnip president and COO Chris Moody said in a phone interview with InformationWeek.

Marketers, for instance, can use the Twitter archive to refine their predictive analyses.

"You're doing a product launch, and 25% of the conversation is negative. You'll now be able to decide if 25% [is] good or bad. You can go back and look at your last five product launches and see how they compare," Moody said.

The full Twitter historical corpus allows businesses to study people's opinions on multiple generations of their product lines.
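
The comparison Moody describes can be sketched in a few lines of Python. This is only an illustration, not Gnip's method: it assumes each tweet has already been given a sentiment label by some other tool, the function names are hypothetical, and the figures for past launches are invented for the example.

```python
from statistics import mean, pstdev


def negative_share(labels):
    """Return the fraction of sentiment labels equal to 'negative'."""
    if not labels:
        raise ValueError("no tweets to score")
    return sum(1 for label in labels if label == "negative") / len(labels)


def compare_to_baseline(current_share, past_shares):
    """Compare one launch's negative share with earlier launches.

    The "worse than usual" rule (more than one standard deviation above
    the historical average) is an assumption chosen for this sketch.
    """
    baseline = mean(past_shares)
    spread = pstdev(past_shares)
    delta = current_share - baseline
    return {
        "baseline": baseline,
        "delta": delta,
        "spread": spread,
        "worse_than_usual": delta > spread,
    }


# Invented negative-conversation shares for the last five launches.
past_launches = [0.30, 0.22, 0.28, 0.35, 0.25]

# The 25% figure from the example above.
result = compare_to_baseline(0.25, past_launches)
print(result)
```

With these made-up numbers, a 25% negative share sits slightly below the historical average, so it would not be flagged as unusually bad. Real data could easily tell a different story.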

Emergency services organizations could also benefit from historical tweets by studying human behavior during natural disasters. For instance, emergency planners could examine tweets from a previous hurricane.

"They can look at conversations geographically and understand what evacuation routes people took," Moody said. "And they can plan more effectively for the next disaster."

Historical tweets can benefit Wall Streeters as well. "Hedge funds have been trading on social data for a while, at least as an input to their high-frequency trading algorithms," said Moody. Having access to the full Twitter archive may help them develop better algorithms that more accurately determine the financial markets' next moves.

As Moody sees it, social data is "a new lens" to the world. "You've got this incredible view that was not available before," he said. "Understanding potential demand--and the world view--is what's interesting to me."

Other companies, some of which are Gnip's customers, also collect Twitter data. But Moody says Gnip's full historical corpus is an industry first. "We have every tweet, which, depending on your use case, can be very important," said Moody. "And we have the rights to re-syndicate that data."

Gnip uses a cloud-based architecture to deliver data to its customers in JSON format. Its standard service uses a real-time streaming interface, but the historical Twitter data is delivered via download.
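
For a sense of what working with a downloaded JSON file might look like, here is a minimal Python sketch. It rests on assumptions the article does not confirm: that the download holds one JSON record per line, and that each record carries a timestamp in a field called `postedTime` and the tweet text in `body`. The sample records are invented.

```python
import io
import json
from collections import Counter


def tweets_per_month(lines):
    """Count records per calendar month from line-delimited JSON.

    Assumes (hypothetically) that each record has an ISO 8601 timestamp
    in a 'postedTime' field; records without one are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        record = json.loads(line)
        posted = record.get("postedTime")
        if posted is None:
            continue
        counts[posted[:7]] += 1  # "YYYY-MM" prefix of the timestamp
    return counts


# Invented sample standing in for a downloaded file.
sample = io.StringIO(
    '{"postedTime": "2012-08-27T14:03:00Z", "body": "evacuating now"}\n'
    '{"postedTime": "2012-08-28T09:15:00Z", "body": "roads are flooded"}\n'
    '\n'
    '{"postedTime": "2012-09-01T18:40:00Z", "body": "power is back"}\n'
)

monthly = tweets_per_month(sample)
print(dict(monthly))
```

Reading the file line by line keeps memory use flat, which matters when a download spans years of tweets.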

"Streaming has bandwidth limitations that could cause customers to wait a long time to receive all the data. By doing a download, we can get the data to them much faster," said Moody.

Will humankind benefit from the release of billions of 140-character observations?

"We don't know what the world's going to do. We don't have any idea of all of the amazing innovation that will come from this data," said Moody.

He added: "We hold this incredible data set, which really represents the world's thoughts at any one moment in time. If you have access to that data, what would you do with it? We're about to find out. That's why we think this is incredibly exciting."
