The Facebook decline paper is a disgrace to Princeton’s name

The obvious answer to the question “why won’t Facebook decline by 80% by the end of December this year” is “because obviously it won’t, what kind of idiot would even claim it would?”. It’s the leading social network in all age groups, and between July and December 2013 total user numbers only fell by 3%.

However, if you’re reading the papers today, you might be forgiven for thinking otherwise. The Daily Mail is the worst offender, because obviously the Daily Mail is the worst offender, but plenty of derp is being thrown left, right and centre. I’m quoting the Mail piece, because hell, why not:

Faebook is heading for a catastrophic decline and could lose 80% of its users by 2015, researchers warned today.

(yes, Faebook in the lede is the Daily Mail’s typo. QUALITY JERNALISMS!)

The researchers in question are proper academics, more or less: they’re two PhD candidates at Princeton, John Cannarella and Joshua A Spechler. They’ve written a paper which takes a standard epidemiology model, the SIR (susceptible, infectious and removed) model, and tries to apply it to the spread of social networks. It’s not a bad choice in theory: it’s generally accepted that social networks spread virally; and the SIR model applies to diseases which are fatal or immunising (so once you’ve got over it, you can’t get it again, like measles [*]) – most people who give up on a network don’t come back, so fair play.
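For anyone who hasn’t met the SIR model, here is a minimal sketch in Python. The function name `simulate_sir`, the parameter values and the crude Euler stepping are all my own assumptions for illustration; none of them come from the paper.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, dt=0.1, steps=2000):
    """Integrate the classic SIR equations with forward Euler steps.

    beta  -- transmission rate (illustrative value, not from the paper)
    gamma -- removal rate (illustrative value, not from the paper)
    Returns a list of (s, i, r) tuples, as fractions of a fixed population.
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt  # susceptibles meeting infectious
        new_removals = gamma * i * dt       # infectious leaving for good
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


history = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01)
peak_infectious = max(i for _, i, _ in history)
final_s, final_i, final_r = history[-1]
```

In social network terms, "infectious" stands for active users and "removed" for people who have quit for good. The removed group can only ever grow, so the active group rises to a peak and then falls away.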

There are a couple of obvious [**] early alarm bells: the paper is not peer-reviewed, and Cannarella and Spechler are studying for PhDs neither in the epidemiology department nor the digital cultures department. They are mechanical and aeronautical engineers. Working entirely outside your discipline doesn’t necessarily disqualify you from doing good work… but it makes review by someone who does know the discipline even more important than usual.

The headline figure does come from the paper itself, which says:

Extrapolating the best fit into the future shows that Facebook is expected to undergo rapid decline in the upcoming years, shrinking to 20% of its maximum size by December 2014.

Unfortunately, this claim is solely due to the paper not undergoing peer review, or apparently proof-reading, before being made publicly available. Page 7 says:

Extrapolating the best fit model into the future suggests that Facebook will undergo a rapid decline in the coming years, losing 80% of its peak user base between 2015 and 2017.

This second conclusion fits with the charts and data presented in the paper. So nobody at all is actually predicting the 80% decline by December 2014; the journalists reporting on it are gibbering halfwits, and the writers are monumentally half-arsed for failing to spot such a basic and disastrous mistake in such a short piece of work.

But also, the premise of what we’re doing is stupid

What about the “losing 80% of peak user base by 2017” conclusion, then? This is indeed what the authors’ model predicts.

Unfortunately, the authors’ model is not entirely robust.

My TL;DR summary of the paper’s methodology is “we modelled MySpace’s growth and decline against the number of Google searches for MySpace, and then applied the same model to the number of Google searches for Facebook”.

If you think this is a ridiculous way of doing things, given the niche, geographically and age-group limited status of MySpace versus the universality of Facebook, and given the different corporate natures of the two organisations, you are correct.

There is an excellent piece in The Week which covers these flaws in the paper’s central conceit very well (keywords: no Murdoch; profitable; less spam; universal; vast corporate cash war chest).

But also also, we’ve completely juked the stats

However, if the models line up, then – subject to critiquing the assumptions – there might be something of value in the paper, right? Well, no. This is where things move from “hmm, I’m not sure this fits with existing research on epidemiology or social networking” to “oh, go and stick your heads in a fire”.

The model used is not actually the SIR model. It is a model called irSIR, which the authors have invented (page 3). They have used this because the SIR model doesn’t work. They don’t cite any epidemiology research when justifying their irSIR model, just a “common-sense” theory about how social network users behave, coupled with a couple of descriptive papers about online network usage.

They don’t use any of the work on social ties that digital cultures theorists have spent the last 20 years developing. Nor do they use any of the work on epidemiology beyond the SIR model as detailed in first-year undergraduate classes. Because hell, where would be the fun in that?

Strangely enough, the model they have custom-built to fit their data on MySpace’s decline fits their data on MySpace’s decline almost perfectly.

However, there’s a new problem. The decline thesis doesn’t really fit the data on Google searches for ‘Facebook’, which remain at 2011 levels and don’t show much of a declining trend at all (the dotted bit is Google’s projection; feel free to ignore everything after January 2014 if you’re sceptical):

The authors get past this problem in a way that is truly ingenious: despite not having any evidence that the increase in October 2012 is fake, they multiply all post-October 2012 data by 0.8. As a result, they end up with this beautiful chart, which not only matches the shape of the MySpace curve, but does so over a similar time period and is even steeper:

Strangely enough, following the modification to make their data on Facebook line up almost exactly with the data on MySpace, the projected decline for Facebook lines up almost exactly with the recorded decline for MySpace.
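To see how much work a 0.8x multiplier does on its own, here is a toy sketch in Python. The numbers are made up (a perfectly flat index of 100), not the real Google Trends data. The helper name `scale_after` is hypothetical, and so is my assumption that the cut-off includes October 2012 itself.

```python
def scale_after(series, cutoff, factor):
    """Multiply every value dated on or after `cutoff` by `factor`.

    Months are "YYYY-MM" strings, so plain string comparison orders them.
    """
    return [(month, value * factor if month >= cutoff else value)
            for month, value in series]


# Made-up, perfectly flat search-interest index: NOT real data.
flat = [("2012-07", 100.0), ("2012-08", 100.0), ("2012-09", 100.0),
        ("2012-10", 100.0), ("2012-11", 100.0), ("2012-12", 100.0)]

adjusted = scale_after(flat, cutoff="2012-10", factor=0.8)

# Apparent fall from the first month to the last, created by the adjustment.
drop = 1 - adjusted[-1][1] / adjusted[0][1]
```

A series that never moved now shows a 20% fall, and any curve-fitting routine handed the adjusted numbers will duly find a decline to extrapolate.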

In short, this paper is incredibly sloppy, is based on a flawed premise, and only works because the data has been tortured until it confessed.

If the authors apply the same principles to mechanical and aeronautical engineering that they apply to social media uptake, then I’d be fucking reluctant to get in a plane that either of them had had anything to do with.

[*] A small proportion of people who get diseases like measles are at risk of getting them again, which more complicated models have been built by actual epidemiologists to allow for.
[**] If you are used to reading academic papers. Not, apparently, if you are a journalist.

7 thoughts on “The Facebook decline paper is a disgrace to Princeton’s name”

Excellent article with some first rate critical analysis – sadly lacking in many media outlets these days. What is interesting is that two engineers feel they can use mathematical principles to predict the future of a group of people. One thing we do know from social psychology is that whilst some behaviour is predictable, some is not. Even though social network theory does apply in many instances of all animal social groups, there are cases where it doesn't always fit because it fails to take into account behaviour enough. Throughout the Internet we see engineers and mathematicians trying to come up with answers without considering psychological factors enough. The engineers who came up with this paper not only needed to consider input from people who understand the online world better, they also ought to have included input from psychologists. And what we might say is that the 80% drop is perfectly possible. But so too would be an 80% increase. That's the beauty of human behaviour – it is variable and not easily open to mathematical prediction.

Agreed. Also, a decline in Google searches for Facebook is surely a consequence of the continued increase in the use of mobiles/tablets. Surely most moderate to heavy users enter through the fb app or at least have the site bookmarked?

Good critique; as expected, journos are illiterate and engineers use fudge factors.
Having said that, I reckon they're right and that Facebook (as we know it) will be largely extinct by 2017. Any punters?

The SIR model (and their variant of it) assumes *from the outset* that the outbreak will have a short life-cycle i.e. that Facebook will grow and die quickly. It assumes a static population – i.e. no-one is born and no-one dies – which is a reasonable approximation if you're working on a disease epidemic which grows rapidly over days/weeks/months or (maybe) a small number of years. But it is obviously not a sensible assumption otherwise. Also in the SIR model extinction is guaranteed – things can only go one way – the population of people who will never again log into their FB accounts can only grow and the pool of potential users can only decline. So they are assuming a scenario, from the very start, where FB will die and do so quickly!

In epidemiological terms the interesting question is whether Facebook, in the medium term, might become endemic – which intuitively seems rather likely – but is ruled out by the SIR model (and their variant).
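A rough sketch of the point about endemic states, in Python. It uses the textbook SIR model with population turnover, which is not the paper's model. The function name `run_sir_with_turnover`, the turnover rate `mu` and all the parameter values are assumptions chosen purely for illustration.

```python
def run_sir_with_turnover(beta, gamma, mu, s0=0.99, i0=0.01,
                          dt=0.1, steps=20000):
    """Euler-integrate SIR with people entering and leaving at rate mu.

    mu = 0 gives the static-population SIR model described above.
    Returns the final (susceptible, infectious) fractions.
    """
    s, i = s0, i0
    for _ in range(steps):
        ds = (mu - beta * s * i - mu * s) * dt      # newcomers arrive susceptible
        di = (beta * s * i - (gamma + mu) * i) * dt
        s += ds
        i += di
    return s, i


# Static population: the "infection" burns out.
_, i_static = run_sir_with_turnover(beta=0.5, gamma=0.1, mu=0.0)

# With turnover: the infectious share settles at a non-zero level.
s_endemic, i_endemic = run_sir_with_turnover(beta=0.5, gamma=0.1, mu=0.02)

# Textbook endemic equilibrium for comparison.
s_star = (0.1 + 0.02) / 0.5
i_star = 0.02 * (1 - s_star) / (0.1 + 0.02)
```

With no newcomers the active share decays to nothing, which is the assumption baked into the paper's approach. With even a modest supply of new potential users it settles at a steady level instead of dying out.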