At Mashable’s panel “Measuring social media – let’s get serious,” I had the opportunity to question Kevin Weil, Twitter’s product lead for revenue, about an issue with Twitter’s data that has been bothering me for a long time. Christina Warren, the Mashable writer who moderated the panel, went on to write a post about the questions I put to Kevin.

First, a bit of background. There are two ways to get data from Twitter: via the application programming interface (API), or via the “Firehose” of all tweets (or some percentage thereof). The API is free but capped, so you can only get a portion of what is actually there. The Firehose is supposed to contain all tweets for a given time period, and it is quite expensive to access, whether from Twitter directly or through Gnip, an aggregator that helps other providers get access. Twitter has not disclosed which tools have full Firehose access.

The major problem that has arisen is the crop of slick-interface social monitoring and analytics tools that use the API instead of the Firehose, yet present themselves as appropriate for analyzing significant volumes of conversation. That may be okay for small businesses that don’t have much volume. For brands with medium to large amounts of conversation, however, the data is incomplete, because the API caps how much it will give away for free.
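To make the effect of a cap concrete, here is a minimal sketch in Python. The cap value and the brand volumes are hypothetical numbers chosen purely for illustration; they are not Twitter's actual limits, and `coverage` is a helper name invented for this example. It only shows how a fixed ceiling leaves a small account untouched while hiding most of a large brand's conversation.

```python
def coverage(actual_tweets: int, cap: int) -> float:
    """Fraction of the real conversation that a capped source can return."""
    if actual_tweets <= 0:
        return 1.0  # nothing to miss
    return min(actual_tweets, cap) / actual_tweets


# Hypothetical cap per time window (an assumption, not a real Twitter limit).
HYPOTHETICAL_CAP = 1_000

# Hypothetical tweet volumes for the same time window.
brands = [
    ("small business", 400),
    ("mid-size brand", 5_000),
    ("large brand", 100_000),
]

for name, volume in brands:
    share = coverage(volume, HYPOTHETICAL_CAP)
    print(f"{name}: {share:.0%} of tweets visible")
```

Under these assumed numbers, the small business sees everything, the mid-size brand sees a fifth of its conversation, and the large brand sees about one percent. Any chart built on that last sample describes a sliver of the conversation, not the whole.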

This is a major problem. Bad data = bad research = bad decisions = bad results and damaged relationships with stakeholders. That, in turn, damages your ability to use social media to grow and prosper, whether you are at a brand or an agency. I believe the amount of money being spent on the basis of bad research done with API data is at least in the tens of millions of dollars and possibly in the hundreds of millions, though there is simply no way of knowing the true magnitude of the damage this issue may have caused.

These tools routinely present themselves as competitors to monitoring and analysis tools that do have access to all the Twitter data, which is what you would really need to back up the pretty graphs they offer. From the outside, it’s impossible to know which of them is telling the truth. Twitter has not taken action to shut these tools down, to change the nature of the data it provides via the API, or to disclose which providers have full Firehose access and publish guidelines as to when the others might be appropriate.

I spoke with Kevin briefly after the session and I believe that Twitter is going to take appropriate action. While this may (and probably should) be painful for a large number of tool providers, ultimately the industry as a whole will benefit greatly from increased consistency and measurement accuracy.