DataSift’s service is similar to those offered by Gnip and Topsy, although there are some notable differences around delivery models, historical ranges (DataSift’s historical data only goes back three years) and the number of platforms analyzed. Gnip and DataSift, for example, cover all sorts of social media, blog and comment platforms, whereas Topsy is focused just on Twitter.

Between news of Apple’s $200 million acquisition of Topsy on Monday and DataSift’s funding news today, it should be clear just how much value companies are putting on being able to make sense of the conversations taking place on social media. There’s a lot of talk happening in a lot of places, and capturing, filtering and analyzing it all is a tall technological order for companies that don’t specialize in that workflow.

A sample diagram of what DataSift can extract from a tweet.

Given the value of what they’re doing, it’s anyone’s guess how long DataSift and Gnip will remain independent companies, but it seems they’ll command a higher price than Topsy did.

For more information on DataSift’s business and the value of — and difficulty of processing — social data, check out this interview with DataSift founder Nick Halstead from our Structure Europe conference in 2012.

DataSift, one of the two companies (along with Gnip) granted real-time access to the Twitter firehose, now offers real-time and historical analysis of Tumblr data. While it’s best known for Twitter, DataSift actually analyzes dozens of social media and commenting platforms, which is pretty handy if you want to compare sentiment, engagement or whatever else across platforms where people behave quite differently.

Even though visualization and analytics player Tableau Software just went public, it’s getting another boost on Tuesday. DataSift is poised to add the Twitter firehose and other social media sources to Tableau’s service through a partnership between the two companies.

The deal gives Tableau users the ability to visualize and analyze connections between their existing internal data and social data. The new functionality strengthens Tableau’s already sturdy position as a tool for helping less technically savvy people get a picture of operations.

The partnership makes sense for DataSift, which has been looking to get its data closer to the products in which companies analyze their other data, particularly business-intelligence (BI) software. In February, the company came out with an open-source tool for querying DataSift’s many sources that could tie in with BI programs.

And the data from DataSift is only getting bigger, by the way. In addition to the Tableau connection, DataSift announced the addition of three new inputs of its own: Facebook Pages, Google+ and Instagram.

The premise of getting a quick handle on social sentiment through visualizations has been gaining in popularity. Wal-Mart Stores and Blab have come up with visualizations of social data recently, for example.

Just a couple of months ago, DataSift inked a similar partnership with Splunk. Plugging DataSift’s social data into Splunk provides visibility into multiple real-time communication streams, which can assist in efforts to spot infrastructure problems and show how social promotions impact infrastructure and how infrastructure issues can make social news.

While this partnership points to greater adoption for DataSift, the more significant development is that it moves the needle a little closer to a data democracy, in which everyone, not just data scientists, is empowered and equipped to analyze and learn from data. Indeed, my colleague Derrick Harris has wondered if Tableau is akin to data’s George Washington, and more data for Tableau could only mean the opportunity for more understanding.

DataSift is releasing an open-source version of its Query Builder service to work alongside enterprises’ existing business-intelligence software, allowing more employees to gain more insight from social media mentions.

The open-source presentation of Query Builder, which permits existing DataSift customers’ developers to simplify the tool’s appearance and functionality, might seem like a matter of crossing big data with even more data. But it’s an important step in trying to prompt business decisions based on what companies can learn about users of Twitter, Facebook and other outlets, not just see what people are saying. Social media analysis becomes more actionable and worthwhile with this sort of functionality.

Plenty of companies ask Twitter to filter out certain parts of its enormous data set. But DataSift is one of just two companies licensed to syndicate the firehose of all Twitter feeds. (The other is Gnip.) Its internet-based Query Builder service also allows customers to run natural-language processing off the entire Twitter firehose and adjust it on the fly in several ways. The processing requires a massive amount of storage, to the tune of 1.3 petabytes, said Nick Halstead, DataSift’s founder and chief technology officer. With the open-source versions, developers can add the Query Builder’s streams and processing to business-intelligence platforms, and users won’t even be able to tell it’s running in the background, Halstead said.

With Query Builder, which was announced in August, users can also pull in Amazon forum messages, YouTube comments, bitly links, Topix posts, Facebook status updates and other social statements, in addition to tweets. The data streams and insight on them all cost subscribing customers $3,000 or more per month. Those users will be able to use open-source versions and get more employees on board.

Twitter is currently being sued by PeopleBrowsr, a company that has had full access to Twitter’s firehose, after Twitter said it would restrict PeopleBrowsr’s access to the firehose following the expiration of a previous contract. From Twitter’s perspective, it makes sense to consolidate who is accessing and re-distributing that data due to the scope of the data and work involved in transmitting those tweets. But it hasn’t always been this way, and PeopleBrowsr is challenging Twitter’s earlier commitment to openness when it comes to that firehose of data.

With access to the full Firehose of data, it is possible to move far beyond the Twitter experiences we know today. In fact, we’re pretty sure that some amazing innovation is possible. Today, we’re happily turning the Firehose on for some new partners focused mainly on exploring the incredibly rich field of real-time search and discovery. We are thrilled to announce that Ellerdale, Collecta, Kosmix, Scoopler, twazzup, CrowdEye, and Chainn Search join us as partners. These companies range from funded startups to part-time, one-person operations so we came up with a fair way to license access that scales with their business. If you think there may be a potential partnership involving access to the Firehose, let’s start a conversation.

So who still has access and distribution rights to the firehose? The full list of companies with commercial partnerships isn’t publicly available, but there are a few like Bing and Salesforce that have publicized this relationship, and several companies listed as part of Twitter’s certified program do have firehose access and can provide enterprise access to that data for other companies who are interested. Duncan Greatwood, CEO of Topsy, which is one of the companies licensed to re-distribute the data, said that including Topsy’s archives, there are now more than 250 billion tweets. That makes it unrealistic, in his opinion, that many companies would be prepared to access the full stream; most companies only access smaller percentages.

But PeopleBrowsr CEO Jodee Rich said his company relies on full access to the firehose to serve its customers, and an agreement with one of the other providers wouldn’t make sense.

“The nature of those agreements are so short-term that no one could possibly build a viable business on those agreements,” he said in an interview. In Rich’s legal statement, which is available online here, he notes that PeopleBrowsr was paying Twitter more than $1 million per year for firehose access.

Topsy, Gnip and DataSift are three providers who license the full firehose of data from Twitter and then re-sell portions of it to either developers building products with the data (showing your influence on Twitter, for instance), or to marketers and brands using sections of tweets to track conversations and trends on the network. PeopleBrowsr is arguing in the court documents that being forced to get data from one of these companies would compromise its ability to serve its clients, since it previously had full access. Twitter counters that it’s totally within its rights to change the terms given that its initial contract with PeopleBrowsr has run its course:

But as Twitter has grown, its contracting practices have matured. Where it once contracted directly with just a handful of data customers like PeopleBrowsr, it now has hundreds of data customers. In order to handle that broad commercial demand in a consistent and transparent manner Twitter has created channel resyndication partnerships with Gnip, DataSift, and Topsy. PeopleBrowsr is free to contract with any of them, just as its competitors do. What it is not free to do is insist that Twitter preserve forever its earlier business model, or continue to be bound by a contract that expired more than a year ago.

A Twitter spokesperson replied, “We believe the case is without merit and will vigorously defend against it.”

Social data platform DataSift, one of two companies with access to Twitter’s firehose, is teaming with link shortener Bitly to help its customers understand what content is actually clicked on and read, not just shared and retweeted.

Marketers spend a good chunk of their money on content marketing but when it comes to measuring how well that content is doing, it often comes down to counting tweets, retweets, posts, comments and likes. By tabulating friends and followers, marketers can estimate how much reach a message had. But it’s still harder to know what content is really getting clicked on and read the most.

By partnering with Bitly, which is used to share 80 million new links a day, DataSift can help its clients look at social conversations and click data together. That allows them to see what content actually drives the most interaction. This gives brands a way to create content that’s really engaging for users and also have a better way to measure their “return on influence.”

“Now, with Bitly, you don’t just look at links but you look at content. Now you can see who is engaging with stories and you can see which bits of news are resonating. Just because something was retweeted doesn’t mean someone clicked on it,” DataSift’s founder and CTO Nick Halstead told me in an interview. “This will change the perception of success.”

DataSift customers can now pull up live data on what stories are being shared and how many clicks they’re actually getting. They can filter by geography and see where the referrals are coming from and what keywords and titles are working. A company can now apply some SEO techniques to come up with new content that matches what people are clicking on or rework existing content and headlines to maximize engagement.

Halstead says the partnership makes sense for both companies. DataSift gains a critical way to provide link analytics on social content, while Bitly gets more use out of its data and can sell DataSift customers on its data stream. DataSift now has 300 big enterprise clients and raised $15 million earlier this week in a round led by Scale Ventures.

DataSift, best known as one of the two companies with full access to Twitter’s firehose of streaming data, has raised $15 million in Series B funding. The new money, which comes from Scale Ventures along with GRP Partners, IA Ventures, Northgate Capital and Daher Capital, adds to the $14.7 million DataSift has previously raised, for a total of $29.7 million.

DataSift is more than just a collector of Twitter data, however. It also takes in data streams from dozens of other web and social media sources, and can handle corporate data, as well. And the company’s real value comes in the analytics it applies to all that data, letting users filter and correlate across myriad different factors.

Here’s an interview I did with DataSift Founder and CTO Nick Halstead during our Structure: Europe conference in Amsterdam last month, in which he describes how DataSift built an infrastructure capable of handling so much real-time data and how companies can use such data effectively.

DataSift, the UK company that has access to the Twitter firehose, analyzes a petabyte of tweets and ships terabytes of insights around the world. And the infrastructure needs to keep up. The company has replaced its older networking gear with Arista switches to support its analytics operation, DataSift CEO and founder Nick Halstead said in a conversation with Derrick Harris at our Structure: Europe 2012 conference.

Check out the rest of our Structure Europe 2012 live coverage here, and a video recording of the session follows below.

Cloud computing tends to be a very North America-centric topic, if only because so many of the biggest providers of cloud resources and services are based in the United States. That’s fair enough (the business side of things is very important), but other continents, particularly Europe, have a lot more to bring to the table than just seemingly restrictive data privacy laws.

We’ll discuss many of the finer points of European cloud computing and web infrastructure, both business and technological, at our Structure: Europe event Oct. 16 and 17 in Amsterdam. To whet your appetite, though, here are seven reasons why Europe is a lot more important than many people might think.

5. OpenNebula

Led by Spanish computer scientist Ignacio Llorente, OpenNebula is a fairly popular open source cloud platform that rivals the work being done, largely in the United States, by the OpenStack and CloudStack projects. The project has been around since 2005, and claims a handful of large companies and European research institutions as users. Although it doesn’t have the big-name backers of the other two, it should remain viable for a long time because of its rather large user base.

6. One-third of Twitter’s firehose

DataSift, which is headquartered in Reading, England, is one of three companies (along with Gnip and Topsy) certified to resell all the billions of data points streaming from Twitter every day. Social media, and Twitter especially, are a huge focus of corporate analytics efforts, and anyone that can capture and analyze all the world’s tweets is kind of a big deal.

Depending on what research you believe, chief marketing officers are either more powerful than they’ve ever been, or they’re on their way out.

Early this year, a Gartner analyst predicted that CMOs will have bigger IT budget power than CIOs by 2017. Naturally, that analysis was quickly bandied about by CMOs.

But last week, in a blog post unambiguously titled “Marketing is Dead,” Bill Lee wrote that CMOs, as a species, are under fire.

Lee, president of the Lee Consulting Group, which focuses on “customer engagement,” cited data from a 2011 Fournaise Marketing Group study suggesting that CEOs don’t see ROI on marketing efforts and are sick of being asked for marketing money with no discernible payoff. On top of that, Lee posits that shoppers don’t pay attention to traditional marketing anymore. Ouch.

Cloud and big data reshape the marketing role

Underlying this seeming contradiction is the fact that marketing is being redefined in the era of cloud-delivered, self-service applications and services and web-connected consumers. Several CIOs and CTOs have told me they agree that CMOs are gaining clout in their businesses, but that the most successful CMOs are those who “get” that effective marketing is both broader and more focused than it’s been in the past.

“Broader” here means that the channels are no longer limited to radio, TV, print and online publications but include social networks as well. Successful CMOs understand that the data flowing in via Twitter and Facebook is an important source of market intelligence, a big data feed that must be monitored and tapped.

Filtering the social networks

The explosion of social networking use means that “multi-channel” marketing is more multi-channel than ever. You don’t have to just track newspaper, TV and radio “thought leaders,” you need to watch for your company’s own thought leaders — your best customers and what they’re saying. That means narrower, less scattershot messaging — why hit up people who are not even remotely interested in your product or service? The idea is that your thought leaders will convince others that your offering is worth a look.

Many firms spend lots of resources pursuing outside influencers who’ve gained following on the Web and through social media. A better approach is to find and cultivate customer influencers and give them something great to talk about.

In order to become customer-centric and deliver a consistent message to each individual regardless of the communication channel, companies must first integrate all their customer- and prospect-related data. Up to now, organizations would silo the various types of customer-related data.

So whether a given CMO has clout or is about to get pink-slipped depends a lot on her ability to understand the importance of this data trove and capture and make use of that big data resource.