Economists betting on replication

A bunch of folks are collaborating on a project to replicate 18 experimental studies published in prominent Econ journals (mostly American Economic Review, a few Quarterly Journal of Economics). This is already pretty exciting, but the really cool bit is they’re opening a market (with real money) to predict which studies will replicate. Unfortunately, participation is restricted, but the market activity will be public. The market opens tomorrow, so it should be fun to watch.

There was some discussion about doing this with psychology papers, but the sense was that some people were so upset with the replication movement already that there would be backlash against the whole betting thing. I’m curious how the econ project goes.

14 Comments

This was done with the psychology papers replicated in the OSF Reproducibility Project Psychology by economists at hhs.se (Anna Dreber Almenberg).
Betting against counter-intuitiveness, social psychology and small samples was an okay strategy for me (I was a student then and had almost no inside knowledge). I think I tripled my initial seed fund; some even quadrupled theirs. One problem they had is that the studies took so long to conclude that they settled 7 of the markets at the last observed market prices.
Participants have been paid, but I don’t know how good the predictions were. I assume this will be in the big reproducibility project paper.

They repeated this with the ManyLabs 2 project, though I think that’s not that far along yet.

We could just do this with all PubMed/SSRN-indexed papers, with some big federal or philanthropic subsidy to achieve volume. Volume over a certain threshold would trigger replication by some third party (a lot of details to work out here), with payouts to one side or the other accordingly. Volume below a certain threshold would just be evidence that it probably will or probably won’t replicate.

"Of course, prediction market design requires that event outcomes can be clearly said to have either occurred or not. Therefore, we define a result as being replicated if the statistical method used in the original paper yields a p-value < 0.05.

How were sample sizes chosen?

All planned sample sizes were chosen so that if the replication produces an effect of the same strength as in the original study, the probability of p<0.05 in the replication is 90% (the tests have 90% power). In some cases, larger samples than those with 90% power were chosen because the lowest multiple of the typical original session size was somewhat above the 90% power sample." http://sciencepredictionmarkets.com/repdetails.html
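
To make the sample-size rule quoted above concrete, here is a rough sketch of how a 90% power sample could be computed and then rounded up to whole sessions. It assumes a two-group comparison of means with a standardized effect size and a normal approximation; the function names, the effect size of 0.5 and the session size of 24 are made up for illustration and are not taken from the project, whose studies use their own original tests.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, power=0.90, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-group
    comparison of means (normal approximation, hypothetical setup)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def round_up_to_sessions(n, session_size):
    """Smallest multiple of the session size that is at least n."""
    return math.ceil(n / session_size) * session_size


# Illustrative numbers only: standardized effect 0.5, sessions of 24 people.
needed = n_per_group(0.5)
planned = round_up_to_sessions(needed, 24)
print(needed, planned)
```

With these made-up inputs the rounded-up sample is larger than the 90% power sample, which is the situation the quoted passage describes.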

I suspect this is unclear and they mean "significant in the same direction". So if the null is always false there is a 50% chance of winning with a very large sample size. Looking at http://sciencepredictionmarkets.com/studies.html it appears they have usually doubled the sample size of the original study. So 0.9*0.5=0.45, or 45% would replicate if there was nothing to the theories at all?
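
The "significant in the same direction" reading in this comment can be explored numerically. The sketch below is only an illustration under assumptions that are not from the project: a two-group comparison of means, a normal approximation, and the original effect taken as positive. The function names and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def prob_significant_same_direction(true_effect, n_per_group, alpha=0.05):
    """P(two-sided p < alpha and the estimate is positive), normal
    approximation, where 'positive' stands for the original direction."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = true_effect * math.sqrt(n_per_group / 2)  # expected z statistic
    return 1 - z.cdf(z_crit - shift)


def prob_with_random_sign(abs_effect, n_per_group, alpha=0.05):
    """Same probability when the true effect is equally likely to point
    in either direction (the 'null is always false' scenario)."""
    same = prob_significant_same_direction(abs_effect, n_per_group, alpha)
    opposite = prob_significant_same_direction(-abs_effect, n_per_group, alpha)
    return 0.5 * (same + opposite)


# Exactly zero effect: only the upper tail counts.
print(prob_significant_same_direction(0.0, 85))
# True effect equal to an assumed original effect of 0.5, 85 per group.
print(prob_significant_same_direction(0.5, 85))
# Tiny effect of random sign with an enormous sample.
print(prob_with_random_sign(0.01, 10_000_000))
```

Under these assumptions, an exactly zero effect gives about a 2.5% chance of "replicating", while a tiny effect of random sign approaches 50% only as the sample becomes very large. At roughly double a typical original sample the figure would sit somewhere between those two, depending on how small the true effect is.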

If you’re an economist whose work is being bet on and you think you’ve done high-quality research, is it better to bet that your work will replicate (presumably you assess this bet as positive expectation), or is it better to hedge the risk to your reputation by betting *against* your own work?

“This project will provide evidence of how accurately peer prediction markets can forecast replication of scientific experiments in economics.”

While the efforts to replicate high profile research should definitely be applauded, I don’t really see the point of using prediction markets to forecast the results. Other than generating interest in the replication efforts, what difference does it make whether ‘peers’ can accurately forecast the results? (And yes, I understand that it may provide some information about the credibility of specific studies.)

I think it is interesting to see how well researchers can tell apart credible and incredible claims. The impact of junk science might be overestimated, as no one believes most of it anyway. And there might be some widely believed stuff that turns out to be untrue, which would be invaluable insight about the types of questions where human intuition fails.

Most bullshit in science (or even “science”) is well-suspected in the particular field. It’s just that no one has unequivocal proof, so people only talk about it during a drinking rage in a bar. Under this scenario, a prediction market is expected to reveal the true crowd beliefs.

Less than a month from A to Z? That’s suspicious on its own. Particularly when there is ZERO output reported 2+ months later – even though it’s just 18 papers. (Smarter people would have realized that 18 papers is too small a number, IMO).