Crowdsourcing predictions

If I mention the name Nate Silver, or FiveThirtyEight, chances are that most of you know what the next sentence is going to be about. Silver is an American statistician who gained widespread global fame earlier this month by correctly predicting not only the outcome of the US presidential election, but also the winner in every single US state. And it wasn't a one-off - he'd done the same thing in the 2008 election, getting only Indiana wrong. But he wasn't the only one predicting the election outcome with uncanny accuracy; online trading websites, such as InTrade, were also calling the election results with an extremely high degree of accuracy. These websites use a forecasting tool called a prediction market.

Aside from elections, prediction markets have been used since the 1940s to predict anything from sports outcomes to Oscar winners and lifetime video game sales. Prediction markets work through the iterative buying and selling of 'shares' in a particular topic or event; the market prices therefore act as a representation of the probability of a particular outcome happening. The key is that the market prices aren't driven by any single person's opinion on what the outcome will be; over time, the market aggregates opinions and predictions from a large, diverse group of people.

Prediction markets are of interest to economists because they provide an attractive alternative to other means of assessing opinion in a particular area. Opinion polls, for example, are prone to a number of problems; in particular, there is always the possibility of a mismatch between what people tell you and what they actually do. This is partly because opinion polls represent a single snapshot of the zeitgeist in a fixed, narrow timeframe; any future events that might affect a given individual's opinion on a topic won't have any effect on the poll outcome. In contrast, prediction markets operate over much longer periods of time, and so can capture changes of opinion that happen in response to newly available information. Moreover, prediction markets allow for questions with an objective and verifiable outcome, which can be quantified much more easily than an opinion poll. So it's not just economists who are interested in them - scientists are now starting to see whether prediction markets offer any potential benefits to research.
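To make the "prices as probabilities" idea concrete, here is a minimal sketch of one standard automated mechanism, Hanson's logarithmic market scoring rule (LMSR). The class and its parameters are illustrative assumptions, not a model of how InTrade itself matched trades, but they show how individual purchases nudge a price that can be read directly as the market's implied probability.

```python
import math

class PredictionMarket:
    """Toy two-outcome market maker using Hanson's logarithmic
    market scoring rule (LMSR). Prices always sum to 1, so they
    can be read as the market's implied outcome probabilities."""

    def __init__(self, liquidity=100.0):
        self.b = liquidity        # higher b -> prices move more slowly
        self.q = [0.0, 0.0]       # outstanding shares for [YES, NO]

    def _cost(self, q):
        # LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * math.log(sum(math.exp(x / self.b) for x in q))

    def price(self, outcome):
        # Instantaneous price: softmax over outstanding shares
        exps = [math.exp(x / self.b) for x in self.q]
        return exps[outcome] / sum(exps)

    def buy(self, outcome, shares):
        """Buy `shares` of an outcome; returns the cost the trader pays."""
        before = self._cost(self.q)
        self.q[outcome] += shares
        return self._cost(self.q) - before

market = PredictionMarket()
print(market.price(0))            # 0.5: no information in the market yet
market.buy(0, 50)                 # optimists buy YES shares...
print(round(market.price(0), 2))  # ...and the price rises to 0.62
```

Note that no single trade sets the price: each purchase shifts it a little, so the final price reflects the aggregate of everyone's bets, weighted by how much they were willing to stake.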

A new initiative called the Science Prediction Market Project has been set up to look at how effective it is to implement forecasting tools in a research setting, and it isn't pulling any punches; the first market is looking at whether or not researchers can predict the outcome of the Open Science Reproducibility Project. The issues that prediction markets address in science are slightly different to those in economics. In science, the counterpart to opinion polls is the published literature, which has passed through mechanisms such as peer review to get out into public forums. What these processes don't necessarily capture, though, is the underlying beliefs of scientists; generally, these only tend to surface in anecdotal situations, when people start to speak 'off the record', say at conferences. Potentially, then, the outcome of prediction markets may be even more interesting in science than in economics, because there might be an even greater discrepancy between what is publicly and privately known.

Things get really interesting when you consider the Reproducibility Project. This is a large-scale attempt to see the extent to which studies published in a single year, across three major psychology journals, can be replicated. There is a growing concern that many reported findings in psychology (and more generally across the sciences) are false, and replicating studies is a great way to gauge the extent of the problem. The trouble is (certainly in the psychological sciences) that it's not happening enough, it's not being incentivised appropriately, and the lack of replication is seen by many as a contributing factor to the growing number of accusations of scientific fraud and misconduct that we are seeing. The Reproducibility Project provides an independent benchmark of what proportion of psychological science actually replicates. Importantly, it can act as a sort of independent 'arbiter' for private beliefs about whether a given study was ever showing a true effect or not. The likelihood that a study will replicate depends in part on statistical power - essentially, the probability of correctly rejecting the null hypothesis when the null hypothesis is in fact false. For example, if a replication attempt is powered at 50%, then even a genuinely true effect has only a 50% chance of being detected, so we need to take care in announcing failures to replicate. But given that the replication attempts in the Reproducibility Project are powered at 80% or more, we can still make informed guesses about whether the original effects will be reproduced. On this basis, the Science Prediction Market Project is going to look at the extent to which the prediction market accurately predicts which studies will replicate and which won't, by attempting to tap into those private beliefs that researchers hold.
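The power arithmetic above can be illustrated with a quick simulation. The `replication_rate` function below is a hypothetical sketch (a crude two-group z-test on simulated data), not the Reproducibility Project's actual analysis; it just shows how power, and therefore the chance of a successful replication of a true effect, depends on sample size.

```python
import random
import statistics

def replication_rate(effect, n, trials=5000):
    """Monte Carlo estimate of how often a two-group study with a
    genuine standardised effect size `effect` and `n` subjects per
    group reaches p < .05, i.e. an estimate of its statistical power.
    (A crude normal-approximation z-test, purely for illustration.)"""
    hits = 0
    for _ in range(trials):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(effect, 1) for _ in range(n)]
        se = ((statistics.pvariance(a) + statistics.pvariance(b)) / n) ** 0.5
        z = (statistics.mean(b) - statistics.mean(a)) / se
        hits += abs(z) > 1.96     # two-tailed test at alpha = .05
    return hits / trials

# A medium effect (d = 0.5) with 32 subjects per group gives roughly
# 50% power: the true effect "fails to replicate" about half the time.
print(replication_rate(0.5, 32))
# Doubling the sample to 64 per group lifts power to roughly 80%.
print(replication_rate(0.5, 64))
```

The point of the sketch is that a failed replication of an underpowered study is only weak evidence the original effect was false, which is exactly why the Reproducibility Project's high-powered replications make for a more trustworthy benchmark.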

Clearly, the public information available for these papers is identical, in that they're all published studies with some sort of associated effect. However, the simple fact that the Reproducibility Project exists at all means we are not expecting all of the studies in its line of sight to replicate. The importance of the Science Prediction Market Project, then, lies in the extent to which the prediction markets accurately forecast which studies will and won't replicate. If the market is highly accurate, it comes with an unsettling implication that could potentially undermine the whole publication process: for papers that don't replicate, an accurate market would suggest that we collectively knew all along that the papers were incorrect, but no one was prepared to publicly say anything to that effect. Moreover, the nature of the market means that you get convergence on an opinion about a given paper from a crowd, rather than from any one specific individual. Of course, there will always be people who say "I knew this study was wrong all along", but that isn't particularly informative, as individuals might have an agenda, or dislike something esoteric about the paper, or simply dislike the authors - scientists are only human, after all. But if the prediction is averaged across a number of experts in the field, the hope is that that aggregated opinion is a much more informative one.

Time will tell whether the results from the prediction markets match the actual outcome of the Reproducibility Project. If it turns out that we are, in fact, completely rubbish at predicting outcomes of projects like this, we can perhaps breathe a sigh of relief that we weren't all collectively sticking our heads in the sand. On the other hand, if the prediction markets end up showing we're actually quite accurate, we might have to start rethinking whether the peer review process as it currently stands is doing the job we intend it to. More generally though, the whole concept of using prediction markets in scientific research opens up all sorts of new and interesting avenues for tapping into those private thoughts and beliefs we hold that, until now, we've not had an objective way of accessing.

Special thanks go to Professor Marcus Munafò for his advice and explanations of prediction markets. You can follow him on Twitter.