A 1 means the account is Twitter-verified; a 0 means it isn’t. You can sort by that column to find the most notable accounts following someone. For example, ScraperWiki has 44 followers who are verified users. Here are a few; as you can see, they’re mainly journalists!
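Because ScraperWiki keeps scraped data in SQLite, pulling out the verified accounts is a one-line query. Here is a minimal sketch against a stand-in table (the screen names and counts are invented for illustration):

```python
import sqlite3

# Build a tiny stand-in for a scraped followers table.
# The rows are made up; a real table comes from the Get Twitter followers tool.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE followers "
    "(screen_name TEXT, verified INTEGER, followers_count INTEGER)"
)
conn.executemany("INSERT INTO followers VALUES (?, ?, ?)", [
    ("alice", 1, 120000),
    ("bob",   0,    350),
    ("carol", 1,  80000),
    ("dave",  0,     90),
])

# Sort on the verified column to surface the notable accounts,
# most-followed first.
rows = conn.execute(
    "SELECT screen_name FROM followers WHERE verified = 1 "
    "ORDER BY followers_count DESC"
).fetchall()
print([r[0] for r in rows])  # ['alice', 'carol']
```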

You can see how they compare to our other users in this chart, which plots the number of followers an account has along the horizontal axis against the number it follows on the vertical axis. Verified accounts are shown as orange dots. The plot shows that, on average, verified accounts have more followers than unverified ones. There’s nothing you need to do to turn this on for new users you scrape. If you’ve previously scraped a user, you’ll have to clear them and start again to add the verified column.
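A chart like the one described takes only a few lines of matplotlib; the sample accounts below are invented for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Made-up sample accounts: (followers, following, verified flag).
accounts = [(120000, 500, 1), (350, 400, 0), (80000, 1200, 1), (90, 150, 0)]
followers = [a[0] for a in accounts]
following = [a[1] for a in accounts]
# Orange dots for verified accounts, as in the chart described above.
colours = ["orange" if a[2] else "steelblue" for a in accounts]

plt.scatter(followers, following, c=colours)
plt.xscale("log")  # follower counts span several orders of magnitude
plt.xlabel("Followers")
plt.ylabel("Following")
plt.savefig("followers_scatter.png")
```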

The Search for Tweets and Get Twitter followers tools are the most popular on our platform.

Why is this?

In part this is because we’re sociable creatures: platforms like Twitter get a lot of interaction time from a lot of people. A certain section of the population has a data-packrat mentality; for them, ScraperWiki is an easy way to collect, keep and download Twitter data they feel must surely be useful. But more than this, there is real value in social data.

Why should you be interested?

This is data about your customers, or at least about people who have made some small effort to interact with you. What do these people have in common? Where might you find more people like them who would be interested in your products? What can you offer that would make them part with their money? You may have internal data of your own to mine for insights, but the main alternative for this kind of customer data is commissioning market research, which would likely provide richer data at a much higher cost. Mining your social data helps answer these questions relatively economically.

What can we do?

At ScraperWiki we’ve just scratched the surface of what’s possible in collecting and analysing data from social platforms. Alongside the public Twitter tools we have a LinkedIn tool, and experimental tools for extracting data from Plurk, Flickr, Instagram and Facebook. Adding more tools is a matter of time and inclination rather than any great technical challenge. So far we’ve used only the free, public APIs for these services, and this works pretty well. Twitter’s public API only returns search results going back seven days, but if you’ve started a collector on ScraperWiki we’ll keep all the tweets in your search for as long as it’s running. Twitter’s API is also “rate limited”: it will only return a fixed number of results in a given time period. The threshold is pretty high, though – in theory you can get 18,000 tweets every 15 minutes. Using our follower tool we’ve collected millions of followers for high-profile accounts.
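That 18,000-tweets-per-15-minutes figure follows directly from the search API’s limits at the time: 180 requests per 15-minute window, with up to 100 tweets per request. The arithmetic is easy to check:

```python
# Twitter search API limits (v1.1, at the time of writing):
# 180 requests per 15-minute window, up to 100 tweets per request.
requests_per_window = 180
tweets_per_request = 100
window_minutes = 15

tweets_per_window = requests_per_window * tweets_per_request
print(tweets_per_window)  # 18000, matching the figure above

# In theory that is 72,000 tweets per hour, if the search is busy enough.
print(tweets_per_window * 60 // window_minutes)  # 72000
```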

If more data is required, bigger data feeds are available from DataSift and Gnip, although these are pretty pricey – thousands of dollars per month. Access over and above the public APIs is only available for some services.

What is possible?

So that’s the data collection side of things. Once we have the data, what you can do with it is limited only by your imagination and programming skills. For example, we’ve looked at the time course of the #InspiringWomen hashtag here. Andy Cotgreave at Tableau shows how easy this type of analysis is to do with pre-packaged tools here. You could use this sort of analysis to track a product launch or a media campaign.
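As a sketch of that kind of time-course analysis, here is how you might bucket a hashtag’s tweets by hour; the timestamps below are made up for illustration, but a real run would read them from a Search for Tweets dataset:

```python
from collections import Counter
from datetime import datetime

# Hypothetical tweet timestamps for a hashtag.
timestamps = [
    "2013-08-04 09:12:00", "2013-08-04 09:45:00",
    "2013-08-04 10:05:00", "2013-08-04 11:30:00",
    "2013-08-04 11:31:00", "2013-08-04 11:59:00",
]

# Bucket tweets by hour to see the shape of the campaign over time.
per_hour = Counter(
    datetime.strptime(t, "%Y-%m-%d %H:%M:%S").strftime("%H:00")
    for t in timestamps
)
print(sorted(per_hour.items()))  # [('09:00', 2), ('10:00', 1), ('11:00', 3)]
```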

You can look at the characteristics of your followers and, incidentally, discover some of Twitter’s following rules – as I showed here, at the end of my review of R in Action. You can use machine learning to find out which of your followers are company accounts (see here). I’ve even written code to find out which followers have faces in their profile pictures. This sort of analysis tells you more about your followers and, hopefully, your customers.
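The machine-learning classifier and face-detection code mentioned above are beyond a short snippet, but a crude keyword heuristic over account bios – a stand-in for illustration, not the actual classifier – shows the flavour of how such features get built:

```python
# Crude stand-in for a company-account classifier: flag accounts whose
# bio uses typically corporate vocabulary. A real classifier would learn
# its features from labelled data rather than use a fixed word list.
COMPANY_WORDS = {"official", "ltd", "inc", "we", "our", "company"}

def looks_like_company(bio):
    """Return True if the bio contains two or more corporate words."""
    words = set(bio.lower().replace(",", " ").split())
    return len(words & COMPANY_WORDS) >= 2

print(looks_like_company("The official account of Acme Ltd"))  # True
print(looks_like_company("Dad, runner, occasional blogger"))   # False
```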

These are just examples. I’m interested in finding out about the people who have liked or commented on the ScraperWiki Facebook page. I’ve discovered that I find the responses from Facebook’s API easier to understand than Facebook’s web interface!

Finding out more?

If you want to learn more about exploring social media data for yourself, then Matthew A. Russell’s book is an excellent introduction; I’ve reviewed it here.