On the latest Talk Show, Gruber wondered what the spread of Twitter clients really is. It made me realise that no one has really explored how many people use Twitter’s own apps. I thought I’d (try to) find out.

Firstly, Twitter doesn’t report the information itself, so the only way to find out is by brute force. Thanks to the Twitter Streaming APIs, this is relatively straightforward. I ran a script that scanned each tweet for its source link and counted the tweets from each unique source.
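For anyone wanting to try something similar, here is a minimal sketch of the counting step in Python. It is not the script used for this post. It assumes the tweets have already been saved one JSON object per line, and that each tweet carries a `source` field holding either a plain name (such as "web") or an HTML link wrapped around the client name. The helper names `source_name` and `count_sources` are made up for illustration.

```python
import json
import re
from collections import Counter

# Matches the anchor wrapper around a source name, e.g.
# <a href="http://example.com" rel="nofollow">Some App</a>
ANCHOR_RE = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def source_name(raw_source):
    """Return the visible client name from a tweet's source value."""
    match = ANCHOR_RE.search(raw_source)
    name = match.group(1) if match else raw_source
    return name.strip()


def count_sources(lines):
    """Tally tweets per source from an iterable of JSON-encoded tweets."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumed: saved streams may contain blank lines
        tweet = json.loads(line)
        if "source" not in tweet:
            continue  # skip anything that is not a tweet, e.g. delete notices
        counts[source_name(tweet["source"])] += 1
    return counts
```

Calling `count_sources(open("tweets.jsonl"))` (a hypothetical file name) and then `.most_common()` on the result would give the sources sorted by tweet count.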

Due to the time sensitivity of Twitter, finding one exact data set is impossible as client usage varies constantly. For instance, a big global event may encourage people to tweet on a day they otherwise wouldn’t. On a normal day, though, this method should be a good indicator of average user behaviour and app usage. For reference, the results that I collected are from a random sampling of tweets across a 9-hour period, approximately 9am to 5.30pm on the 18th July.

In total, I collected and analysed one million tweets. Whilst this seems like a lot, it still pales in comparison to daily tweet volume.¹ As always, I would have loved to collect more data, but there are, naturally, physical limitations to doing so. However, I still feel the sample is large enough to portray the trends to a good degree of accuracy and warrants publishing.

Below is a sorted list of the findings. Any app that created more than 0.02% of tweets² in the period observed is listed.

The raw data, by itself, is useless. In order to determine how many people would be affected by a ban on third-party clients, the first step is to identify anyone using Twitter’s own clients. There are a handful of them: Twitter.com, m.twitter.com, Twitter for iPhone, Android, BlackBerry, Windows Phone, iPad and Mac, as well as TweetDeck, Keitai Web (Twitter’s Japanese client) and SMS (identified as “txt”). Twitter recently announced a client for Nokia S40, but I don’t think it has caught on: none of the million tweets originated from this client.

The total of these rows is 708,101 out of 1,000,000.

This means that at least 70.8% of the tweets originated from first-party clients, and at most 29.2% came from third-party Twitter clients.
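The arithmetic behind those two percentages is simple enough to check by hand, but here it is as a short Python sketch. The two input figures come from the totals above; the variable names are only illustrative.

```python
TOTAL_TWEETS = 1_000_000
FIRST_PARTY_TWEETS = 708_101  # sum of the first-party rows

# Everything that did not come from a Twitter-owned source.
other_tweets = TOTAL_TWEETS - FIRST_PARTY_TWEETS

first_party_share = FIRST_PARTY_TWEETS / TOTAL_TWEETS
other_share = other_tweets / TOTAL_TWEETS

print(f"first-party: {first_party_share:.1%}")     # first-party: 70.8%
print(f"everything else: {other_share:.1%}")       # everything else: 29.2%
```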

Already, first-party apps clearly have a monopoly. The actual share of first-party usage is even higher, however.

This is because not all of the observed apps are actually “clients”. Many are simply apps which post to Twitter. Instagram is an example of this; it just posts links back to its own service. The “Tweet Button” is a first-party example: it allows the user to tweet, but it isn’t a client. Therefore, it is necessary to filter out these “non-clients” to show a true representation of first-party-client dominance.

In an ideal scenario, a human would individually assess the ‘clientness’ of each app that created a tweet in the period. However, this is too time-consuming to be feasible. Instead, I have hand-checked only the 118 apps listed above, and extrapolated the proportion of non-clients among them across the 291,899 remaining tweets to get a reasonably accurate estimate.

My assessment of each app is given in the third column of the table. I found that 36 of the 118 apps were not clients. Applying that ratio⁴ (36 out of 118, or 30.5%) to the 291,899 tweets gives an estimated 89,053 tweets that were not from clients. Discounting these ‘invalid’ tweets shrinks the overall pool to 910,947 tweets, which raises the first-party share to 708,101 out of 910,947, equivalent to a percentage share of over 77%.
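The extrapolation can be reproduced with the short Python sketch below. All four inputs are the figures quoted in this post. The variable names are illustrative, and rounding the non-client estimate down to a whole tweet is an assumption made here so that the result matches the 89,053 figure.

```python
TOTAL_TWEETS = 1_000_000
FIRST_PARTY_TWEETS = 708_101
APPS_CHECKED = 118      # apps assessed by hand
NON_CLIENT_APPS = 36    # of those, apps judged not to be clients

# Tweets that did not come from a first-party source.
remaining = TOTAL_TWEETS - FIRST_PARTY_TWEETS

# Share of hand-checked apps that are not clients (about 30.5%).
non_client_ratio = NON_CLIENT_APPS / APPS_CHECKED

# Apply that ratio to the remaining tweets, rounding down (assumption).
non_client_tweets = remaining * NON_CLIENT_APPS // APPS_CHECKED

# Remove the estimated non-client tweets from the overall pool.
client_tweets = TOTAL_TWEETS - non_client_tweets
first_party_share = FIRST_PARTY_TWEETS / client_tweets

print(f"non-client ratio: {non_client_ratio:.1%}")
print(f"estimated non-client tweets: {non_client_tweets:,}")
print(f"client tweets: {client_tweets:,}")
print(f"first-party share of client tweets: {first_party_share:.1%}")
```

Note that the ratio is a proportion of apps, not of tweets, so applying it to a tweet count treats every app as if it tweeted equally often.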

For people who think Twitter will never ban third-party clients because there would be too much backlash, I think this 77% figure shows that Twitter could do it with ease. A large portion of the 23% would be happily herded to a first-party client, as they don’t really care what app they use; it just turned out that the client they first downloaded wasn’t a Twitter-owned app. The only people who would care would be the geeks, like me and anyone else who could be bothered to read this post, who actually care about the client they are using. And let’s face it, Twitter doesn’t care about geeks.

2 This accounts for the 46,628 tweets that are missing from that list; any client that produced fewer than 200 tweets has been omitted.

3 The full name of this app is “Twil2 (Tweet Anytime, Anywhere by Mail)”. I shortened it in the table as it was unnecessarily wide.

4 In fact, I think that this ratio should be even higher, as I would expect a lot of the smaller sources to be obscure apps that simply post to Twitter. Clients tend to produce proportionally more tweets than non-clients, so they are more likely to appear in the list I checked. This skews the ratio and is not factored into the calculations above.