We are PhD candidates in political science at MIT, and our paper, “Questioning the Effect of Nuclear Weapons on Conflict,” is now forthcoming in the Journal of Conflict Resolution (JCR). The paper began as a replication of Robert Rauchhaus’ 2009 article “Evaluating the Nuclear Peace Hypothesis: A Quantitative Approach,” which we wrote up for Gary King’s Gov 2001 class at Harvard in the spring of 2011. Here are a few lessons we learnt along the way that may be useful to others thinking of trying to publish replication papers (for a more comprehensive set of pointers, we recommend Gary King’s “Publication, Publication”).

Software is not interchangeable

We were easily able to replicate Rauchhaus’ key findings in Stata, but couldn’t get it to work in R. It took us a long while to work out why, but the reason turned out to be an error in Stata: Stata was finding a solution when it shouldn’t have (because of separation in the data). This solution, as we show in the paper, was wrong – and led Rauchhaus’ paper to overestimate the effect of nuclear weapons on conflict by a factor of several million.
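A quick way to see what separation does to a logistic model: the sketch below (a toy numpy illustration, not the paper’s data or code) builds a dataset in which a binary predictor perfectly separates the outcome. The log-likelihood then has no maximum, so a plain maximum-likelihood fit never converges – the slope estimate just keeps growing with more iterations, which is exactly the situation in which software should refuse to report a finite coefficient.

```python
import numpy as np

# Toy data with complete separation: x = 1 always goes with y = 1 and
# x = 0 with y = 0, so the ML estimate of the slope is infinite.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])  # intercept + predictor

def fit_logit(X, y, n_iter, lr=0.5):
    """Plain gradient ascent on the logistic log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # fitted probabilities
        beta += lr * X.T @ (y - p)           # score (gradient) step
    return beta

# Under separation the slope never settles down; it grows with every
# additional iteration instead of converging to a finite value.
for n in (100, 1000, 10000):
    print(n, fit_logit(X, y, n)[1])
```

Running this shows the slope still climbing after ten thousand iterations; a finite number printed at any cutoff is an artifact of stopping early, not a real estimate.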

When you’re working through someone else’s code, it’s easy to be satisfied once you’ve got your computer to produce the same numbers you see in the paper. But replicating a published result in one software package does not mean that you understand what that package is doing, that the software is doing it correctly, or that doing it at all is the appropriate thing to do – in our case, none of those were true, and working out why was key to writing our paper.

Go beyond pointing out errors

When you replicate, don’t stop once you’ve found mistakes in the original paper. It’s much easier to tear down the work of others than it is to build your own, but the latter is much more likely to be published. There are mistakes or questionable methodological choices in almost all published work, and unless the original paper is very influential (e.g., Reinhart and Rogoff), the audience for pure critiques is likely to be fairly narrow.

We initially wrote a very short critique of the original Rauchhaus paper (around three pages) and submitted it to JCR, only to have it desk rejected for being too short. However, once we refocused the paper away from pure criticism and produced new substantive findings on nuclear weapons and conflict, it was ultimately accepted.

Don’t worry if your paper isn’t going to revolutionize the field

Although replication papers should go beyond criticism, they don’t have to represent revolutionary contributions to the field to get published. Incremental improvements upon previous work are still an improvement, are beneficial to the accumulation of knowledge, and can still be published. “A marginal, but nevertheless important, empirical contribution” (as one of our reviewers described our paper) can still be worthy of publication.

Replications are part of your methods training

As you get into the process of replication, don’t just use the methods you have been taught as part of your methods sequence – treat replication as an opportunity to learn about methods you haven’t been taught. Neither of us would have thought a great deal about the issue of separation if we hadn’t come across it in the Rauchhaus paper, but since we did, we’re now both aware of the seriousness of the problem and of potential ways to deal with it.

Learning new methods as you conduct the replication ensures that even if you don’t end up publishing your replication paper, you’ll have learnt new skills that will be valuable down the road.

Mark Bell and Nicholas Miller

About Mark Bell

Mark Bell is a PhD student in Political Science at MIT with interests in international relations, political methodology and security studies, with a focus on issues relating to nuclear proliferation. He holds an MPP from the Harvard Kennedy School, where he was a Frank Knox Fellow, and a BA in Politics, Philosophy and Economics from St Anne’s College, Oxford University. He previously worked for the British think-tank CentreForum and for Senator John Kerry. [website]

About Nicholas Miller

Nicholas Miller is a Ph.D. candidate at MIT in international relations. His interests include nuclear proliferation, foreign military intervention, and civil war. His research is forthcoming in International Organization, the Journal of Conflict Resolution, and Security Studies. Prior to MIT, he graduated Phi Beta Kappa with a degree in Government from Wesleyan University. [website]

About their paper

Abstract: We examine the effect of nuclear weapons on interstate conflict. Using more appropriate methodologies than have previously been used, we find that dyads in which both states possess nuclear weapons are not significantly less likely to fight wars, nor are they significantly more or less belligerent at low levels of conflict. This stands in contrast to previous work, which suggests nuclear dyads are some 2.7 million times less likely to fight wars. We additionally find that dyads in which one state possesses nuclear weapons are more prone to low-level conflict (but not more prone to war). This appears to be because nuclear-armed states expand their interests after nuclear acquisition rather than because nuclear weapons provide a shield behind which states can aggress against more powerful conventional-armed states. This calls into question conventional wisdom on the impact of nuclear weapons and has policy implications for the impact of nuclear proliferation. [Paper | Analysis and Data]

Reblogged this on Cornelius Senf and commented:
Here’s a very interesting post from Mark Bell and Nicholas Miller, both PhD students at MIT, about publishing replication papers. They re-analyzed data from a paper by Rauchhaus (2009) and found that Rauchhaus’ results were reproducible in Stata but not in R. The results and conclusions thus rested on an error in Stata, and Rauchhaus seriously overestimated the effect he found. Have a read through the blog post and the paper.

Where is the error in Stata? The author’s so-called “Do-File for Analyses.txt” is actually not a Stata do-file, but it does refer to Stata’s user-written command -firthlogit- from SSC. Please provide a reproducible do-file in Stata. The claim that the results and conclusions were due to an error in Stata is not supported.

There are also a few issues at the beginning of the R replication script, which fails to run unless you adjust some file paths and add a beta.draw object for the first diff. Nothing serious AFAICT.

Make your claims replicable: post the Stata code and the R code. Keep in mind that Stata’s firthlogit is an old user-contributed package, just like nearly everything on CRAN is. Pointing the blame at Stata is like saying that Gary King made a mistake in his class when in fact it was a grad student in the class who had a typo in their program. It is of course unsettling to find that the two packages do not agree, but without a full review of the code in both, you can only say that at least one is in error; you cannot assume that R is infallible. For instance, there is an anecdote about Stata and S-Plus sharing the same mistake in a mixed-model (or GLM) routine because both faithfully implemented the same typo from a Biometrika paper: they matched each other fine, but the numbers did not make sense, because a plus had been switched for a minus in the paper that they both accurately implemented. Finally, the issue of separation (http://www.ats.ucla.edu/stat/mult_pkg/faq/general/complete_separation_logit_models.htm) is exactly what Firth’s penalty term (http://biomet.oxfordjournals.org/content/80/1/27.short) is supposed to address (among other issues), so producing an answer in the presence of separation is exactly the expected behavior.

Thanks for the comment – we never claimed (either in the post above or in the paper) that there was an error in the firthlogit package. As you note, firthlogit is a user-written package, and it offers a good way of dealing with separation appropriately. We use it in the paper as one way of dealing more appropriately with the data.

The problem we identified in the Rauchhaus paper was that Rauchhaus used a logit GEE (xtgee), which failed to recognize separation in the dataset and returned a coefficient on the variable causing separation when it should not have done (in general, Stata drops the variable causing separation, but it does not do so in this case). Our data and code will be published on the JCR website when the paper comes out, but if you are interested you can download Rauchhaus’ data and code from the original 2009 article on the JCR website.
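For readers wondering what the Firth penalty actually does: the sketch below is a toy numpy illustration (a simplified sketch, not the -firthlogit- implementation and not our replication code) of Firth’s modified-score iteration. The ordinary score X′(y − p) is replaced by X′(y − p + h(½ − p)), where h are the leverages from the weighted hat matrix; this penalty keeps the estimates finite even under complete separation.

```python
import numpy as np

def firth_logit(X, y, n_iter=50):
    """Modified-score (Firth-penalized) logistic regression, via
    repeated Newton-type steps with the weighted information matrix."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                         # IRLS weights
        XtWX = X.T @ (X * W[:, None])             # information matrix
        XtWX_inv = np.linalg.inv(XtWX)
        # leverage h_i = W_i * x_i' (X'WX)^{-1} x_i
        h = W * np.einsum("ij,jk,ik->i", X, XtWX_inv, X)
        score = X.T @ (y - p + h * (0.5 - p))     # Firth-adjusted score
        beta = beta + XtWX_inv @ score
    return beta

# Completely separated toy data: ordinary ML sends the slope to
# infinity, but the Firth estimate stays finite.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])
print(firth_logit(X, y))  # finite intercept and slope
```

On this balanced toy example the Firth fit matches the familiar trick of adding half an observation to each cell of the 2×2 table, which is one way to see why the estimate stays finite.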
