The Anatomy of a Forgotten Social Network

March 31, 2014

While network scientists have been poring over data from Twitter and Facebook, they’ve forgotten about Tumblr. Now they’ve begun to ask how this network differs from the rest.

The study of social networks has gripped computer scientists in recent years. In particular, researchers have focused on a few of the biggest networks that have made their data available, such as some mobile phone networks, Wikipedia and Twitter.

But in the rush, one network has been more or less ignored by researchers: Tumblr, a microblogging platform similar to Twitter. So an interesting question is how the network associated with Tumblr is different from the Twitter network.

Today we get an answer thanks to the work of Yi Chang and pals at Yahoo Labs in Sunnyvale. These guys point out that relatively little is known about Tumblr compared to other networks like Twitter and set out to change this.

The basic statistics are straightforward. Tumblr is a microblogging service with about 160 million users who together have published over 70 billion posts.

The most significant difference between Tumblr and its bigger cousin, Twitter, is that there is no limit to the size of the posts that users can create. By contrast, Twitter imposes the famous 140-character limit on all of its posts. Tumblr also supports multimedia posts, such as images, audio, and video.

Another important difference is that Tumblr does not require users to fill in basic profile information, such as gender or location. So this makes the analysis a little trickier than it is with other networks that do collect this information. Nevertheless, Chang and co say that Tumblr users tend to be much younger than people on other networks, with the majority of users being under the age of 25.

Chang and co study the nature of Tumblr using a subset of almost 600 million posts published on the network between August and September last year. They say that over 90 percent of these posts involve photos or text. Although Tumblr supports other types of media, these have clearly not yet become popular on the network.

One interesting question is whether Tumblr more closely resembles a blogosphere network than a microblogging network like that of Twitter.

There are significant differences between these types of network. A key characteristic of Twitter is that there is a good deal of reciprocity between users. Reciprocity is the likelihood that if user a follows user b, then b also follows a.
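To make the definition concrete, here is a minimal sketch of how reciprocity could be computed from a list of follow links. It assumes the edge-level reading given above (the share of follow links that are returned); the function name and the toy data are hypothetical, and the Yahoo team may well have measured it differently.

```python
def reciprocity(follows):
    """Share of directed follow links (a, b) whose reverse (b, a) also exists.

    `follows` is an iterable of (follower, followed) pairs. This is an
    illustrative edge-level measure, not necessarily the paper's method.
    """
    # Deduplicate links and ignore self-follows.
    links = {(a, b) for a, b in follows if a != b}
    if not links:
        return 0.0
    # A link counts as reciprocated when the reverse link is also present.
    returned = sum(1 for a, b in links if (b, a) in links)
    return returned / len(links)


# Hypothetical toy network: ann and bob follow each other,
# while the other two links are one-way.
toy_follows = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("cat", "dan")]
toy_score = reciprocity(toy_follows)
```

In this toy network two of the four links are returned, so the score is 0.5.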

In the blogosphere, reciprocity is almost non-existent. Only 3 percent of bloggers have this kind of reciprocal link. On Twitter, however, the ratio is much higher: some 22 percent of tweeters have reciprocal links.

In this respect, Tumblr outdoes even Twitter, with almost 30 percent of connections being reciprocated. What’s more, the average distance between two users in Tumblr is 4.7; in other words, one user can reach another in an average of 4.7 steps along follow links. That’s half the distance of the blogosphere and about the same as the distances in Facebook and Twitter.
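The average distance can be illustrated in the same spirit. The sketch below runs a breadth-first search from every user and averages the number of steps over all pairs where one user can reach the other. The function name and toy data are hypothetical, and it assumes links are followed in their stated direction. On a network with well over 100 million users an exact all-pairs computation like this would be impractical, so some form of sampling or approximation would be needed.

```python
from collections import deque


def average_distance(follows):
    """Mean shortest-path length over all reachable ordered pairs of users.

    `follows` is an iterable of (follower, followed) pairs, treated as
    directed links. Illustrative only; exact all-pairs search does not
    scale to a real social network.
    """
    # Build an adjacency map that includes users who follow nobody.
    graph = {}
    for a, b in follows:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())

    total_steps = 0
    reachable_pairs = 0
    for source in graph:
        # Breadth-first search gives the fewest steps to each reachable user.
        steps = {source: 0}
        queue = deque([source])
        while queue:
            user = queue.popleft()
            for neighbour in graph[user]:
                if neighbour not in steps:
                    steps[neighbour] = steps[user] + 1
                    queue.append(neighbour)
        total_steps += sum(steps.values())
        reachable_pairs += len(steps) - 1  # exclude the source itself

    return total_steps / reachable_pairs if reachable_pairs else 0.0


# Hypothetical toy network: three users following each other in a ring.
ring = [("ann", "bob"), ("bob", "cat"), ("cat", "ann")]
ring_distance = average_distance(ring)
```

In the ring, each user is one step from one neighbour and two steps from the other, so the average works out to 1.5.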

How long are posts on Tumblr, given that there is no length limit? The average post is 427 characters long and a quarter of them are longer than Twitter’s 140-character limit. By contrast, the average length of a tweet is just 68 characters.

Finally, Chang and co say that content tends to be reposted more quickly on Tumblr. “Approximately 3/4 of the first reblogs occur within the first hour and 95.84 percent appear within one day,” they say. By contrast, on Twitter about half of retweeting occurs within an hour and 75 percent within a day, they say. “Tumblr is more vibrant and faster,” say the Yahoo researchers.

This work provides a useful snapshot of Tumblr as it was in late 2013. As such, it will allow researchers to understand how the network evolves in future.

That will be important for Yahoo. It’s worth bearing in mind that in May 2013, the company paid over $1 billion for Tumblr. So it’s not at all surprising that it wants to understand what it has bought.

What is a little more puzzling, though, is that it has waited until now to find out.