Here’s what happens when scientists evaluate research without knowing the results

Last week, we published a Q&A on an initiative to have scholars “pre-register” research designs for studies using the American National Election Study before the data was available to actually conduct the analysis. As detailed in that discussion, one of the problems this was designed to solve is the “file drawer” problem, whereby positive results are more likely to get published than negative results, thus skewing the accumulation of scientific knowledge.

Another way to try to accomplish the same goal is what’s known as “results-free” reviewing, whereby papers are reviewed without the results of the analysis — that is, with only the research question, design and planned method of analysis. As it turns out, the journal Comparative Political Studies will soon publish a special issue featuring a pilot test of results-free review of social science research.

Two years ago, the editors of the special issue described the logic behind results-free reviewing here at The Monkey Cage. With the special issue now about to be published, we thought it would be interesting to ask those involved with the process about their initial impressions. This included the special issue’s editors, authors of a results-free manuscript submission, a formerly anonymous referee who reviewed one of the results-free submissions, and the standing editors of Comparative Political Studies. Each responded to a different question based on their experience.

The Monkey Cage (TMC): Question for the special issue editors Mike Findley (University of Texas), Nathan Jensen (University of Texas), Edmund Malesky (Duke University) and Tom Pepinsky (Cornell University): Is results-free review a solution to the problem of publication bias?

It certainly helps us get closer to a solution to publication bias, whereby papers that find evidence for their theories are more likely to be published than those that don't. This is a problem across academic disciplines, because scientific progress works best when we have a record not only of the ideas that worked, but also of the theories that proved unsuccessful. Null results, the term for findings that do not support a researcher's theory, stand a greater chance of publication when reviewers are not aware of a study's empirical results.

Overseeing this review process showed us how reviewers and editors evaluate such research. Reviewers and editors want results, and they judge unpublished research with results in mind. Results-free review forces reviewers to engage the theory and research design without the distraction of "significance stars," the tiny asterisks that scholars put in their results tables to signal that a test is statistically significant. It also liberates authors to follow through on their planned research design without the pernicious incentive to reanalyze data until desired findings emerge. As long as authors do not deviate substantially from their design, they are guaranteed publication regardless of the findings. This is good for science.

However, we qualify this answer in two ways.

First, results-free review is only one way to tackle publication bias. Ideally, we could minimize publication bias by moving scholars away from arbitrary statistical thresholds of "significant versus insignificant" as the standard for what counts as support for a theory. Results-free review takes a different approach: it accepts that even conscientious reviewers and well-meaning authors face strong incentives to produce splashy findings, and it removes those findings from the publication decision. Of course, other practices, such as encouraging scholars to post their research designs, data and statistical analysis procedures for others to replicate and troubleshoot, can reduce publication bias as well.

Second, implementing results-free review would affect how social scientists produce knowledge. Some of these changes may be desirable, others not. Results-free review worked very well with experimental social science research that tests incremental improvements over existing theory. Research questions that are too far afield from current scholarship may especially struggle in this system because, to endorse publication, reviewers need to believe that a null result would be interesting. Results-free review is also an awkward fit for descriptive, exploratory or ethnographic research.

Different communities value different things in social science research. Our experiences do not tell us what we should value, only what happens under a different procedure. This gives us pause. We suspect that mandatory results-free review across the social sciences would lead to a homogenization of social scientific research, which we believe would be harmful. We do support the adoption of results-free review as one submission track, as the Journal of Experimental Political Science has recently done, in a wider set of academic journals.

We had an overwhelmingly positive experience with results-free review. It was liberating to conduct our field research knowing that the project would be published regardless of our findings. When it turned out that our project did have null findings, we were relieved – especially as junior faculty – that our study, including the time and resources we invested in it, would still contribute to public scientific knowledge.

Our research design was also considerably sounder than it would have been without results-free review because the review process preceded the research itself. In addition to pushing us to strengthen our theory and think harder about the implications of our case selection, the review process improved major elements of the proposed experiment. Moreover, it encouraged us to write about the potential for null findings. It is difficult to get the unvarnished feedback provided by high-quality anonymous reviewers in other settings.

We note, however, two limitations that we encountered. First, we felt that the pre-acceptance tied our hands, causing us to stick with our pre-accepted plan even as we started to think of potentially useful changes to the research design. As results-free review becomes more widespread, standards for handling useful revisions will need to be devised. Second, we encountered unanticipated delays in project implementation that pushed against the journal's deadline for the submission of our results. Although this issue was ultimately resolved, we can imagine similar scenarios playing out in other complicated field studies, which might make results-free review difficult to manage for journals trying to maintain a steady pipeline of articles.

Overall, we believe the benefits of results-free review outweighed these costs, and we hope that more journals in the social sciences will consider adopting a results-free review process, at least some of the time.

TMC: Question for reviewer Matt Winters (University of Illinois): As a reviewer, would you like more journals to adopt results-free review?

I think that results-free reviewing is valuable in two ways, both of which make me want to see more opportunities for results-free review.

First, there is the benefit to the scientific process. We can take steps to reduce the “file-drawer problem” in which null results go unpublished while false positives take up journal space. Authors also should be more receptive to reviewer suggestions for changes to research design when they receive those suggestions before they have done the research.

Second, there is a benefit to the reviewer. Reviewing a manuscript without results forced me to think about my standard operating procedures and about how I normally judge manuscripts. I found myself thinking about the extent to which I can be swayed as a reviewer by persuasive writing influenced by results. When authors have results to trumpet, the reading experience is more pleasurable. In reading through a manuscript without a set of results, I had to think more about what I should find compelling about a research question, and to ask myself what I would learn if the authors did not find evidence in favor of their hypothesis.

Not only might this shift the way I review manuscripts in the future, but it might also lead me to write more compellingly about my own research.

It is worth noting that we already do a lot of results-free reviewing. Anyone assessing grant proposals or sitting on a committee giving out fellowship money must take a stand on which research sounds more or less promising without knowing exactly what the results of the research will be. In advising students, we similarly must react to their initial ideas for dissertations or theses without knowing the results.

In this regard, results-free review is something with which we already have a lot of practice. Using it as a way of allocating journal space seems to be something to which reviewers should easily adapt.

TMC: Question for the journal editors Ben Ansell (Oxford University) and David Samuels (University of Minnesota): Do the gains of a results-free process outweigh the costs?

The other contributions to this Monkey Cage post are sanguine about a “results-free” submission process. We gave the go-ahead for this experiment, and in principle we agree that reviewing and making editorial decisions based on results-free research designs should reduce publication bias. However, we wish to draw attention to potential costs in such a departure from standard practice.

First, there is simply no way to fully insulate the publishing process from "data fishing" (searching for interesting correlations in the data that the author's theory did not anticipate) or "hypothesis trolling" (listing so many potential hypotheses that one is bound to be correct), and submission of a paper without results cannot expose research malpractice that occurs after the research design is complete. The purported gains of reviewing results-free submissions vanish if authors have (re)written their research designs in light of findings yet do not submit those findings to editors and reviewers for scrutiny. (Indeed, some reviewers for our special issue refused to believe that submissions were truly results-free, calling such a notion naïve.) "Preregistration" offers no solution, as authors could simply preregister multiple research designs but submit only the one that "works." Many problems and analytical decisions cannot be anticipated at the research design stage.

A second issue pertains to research transparency. Reviewers typically devote part of their review to asking authors to explain, clarify and/or extend their analyses. Yet with no data or results to interrogate, this aspect of the review process is eliminated before a publication decision is made. This might tip the scale away from an attitude of “trust but verify” toward one of “mistrust and verify,” with the costs of “interrogation” shifting onto editors.

Third, we remain unconvinced that journals should review research designs. The value of much high-impact research depends on the extent to which the results challenge existing findings. The fact that there is a lot of guesswork in reviewing and editing doesn't mean that we should dispense with evidence for or against a finding's substantive as well as statistical significance.

The issue of null results, statistical nonsupport for a hypothesis, hung over the entire process of organizing this special issue. Discussion of the substantive importance of null results is not central to our scholarly culture, partly because null findings raise potentially unanswerable questions. It is not clear how a journal’s commitment to review research designs that ultimately generate null results is an effective use of valuable scholarly real estate.

Finally, a results-free publication process creates powerful disincentives even to undertake certain kinds of research. Nearly all the research designs we received were experimental or quasi-experimental. Results-free submission is a nonstarter for many approaches to knowledge-generation in the social sciences. A results-free approach to reviewing and publishing may work for some journals, but we are not likely to repeat it at Comparative Political Studies.