Strolling through the Garden of Forking Paths

The other day I got into another Twitter argument – for which I owe Richard Morey another drink – about preregistration of experimental designs before data collection. Now, as you may know, I have in the past had long debates with proponents of preregistration. Not really because I was against it per se but because I am a natural skeptic. It is still far too early to tell if the evidence supports the claim that preregistration improves the replicability and validity of published research. I also have an innate tendency to view any revolutionary proposals with suspicion. However, these long discussions have eased my worries and led me to revise my views on this issue. As Russ Poldrack put it nicely, preregistration no longer makes me nervous. I believe the theoretical case for preregistration is compelling. While solid empirical evidence for the positive and negative consequences of preregistration will only emerge over the course of the coming decades, this is not actually all that important. I seriously doubt that preregistration actually hurts scientific progress. At worst it has not much of an effect at all – but I am fairly confident that it will prove to be a positive development.

Curiously, largely due to the heroic efforts of one Christopher Chambers, a Sith Lord at my alma mater Cardiff University, I am now strongly in favor of the more radical form of preregistration, registered reports (RRs), in which the hypothesis and design are first subjected to peer review, data collection only commences once the design has been accepted, and eventual publication is guaranteed provided the registered plan was followed. In departmental discussions, a colleague of mine repeatedly voiced his doubts that RRs could ever become mainstream because they are such a major effort. It is obvious that RRs are not ideal for all kinds of research, and to my knowledge nobody claims otherwise. RRs are a lot of work that I wouldn’t invest in something like a short student project, in particular a psychophysics experiment. But I do think they should become standard operating procedure for many larger, more expensive projects. We already have project presentations at our imaging facility where we discuss new projects and make suggestions on the proposed design. RRs are simply a way to take this concept into the 21st century and the age of transparent research. They can also improve the detail and quality of the feedback: most people at our project presentations will not be experts on the proposed research, while peer reviewers at least are supposed to be. And, perhaps most importantly, RRs ensure that someone actually compares the proposed design to what was eventually carried out.

When RRs are infeasible or impractical, there is always the option of a light preregistration, in which you simply state your hypothesis and experimental plans and upload them to the OSF or a similar repository. I have done so twice now (although one is still in the draft stage and therefore not yet public). I would strongly encourage people to at least give that a try. If a detailed preregistration document is too much effort (it can be a lot of work, although it should save you work when writing up your methods later on), there is even the option of a very basic registration. The best format invariably depends on your particular research question. Such basic preregistrations add transparency to the distinction between exploratory and confirmatory results because you have a public record of your prior predictions. Primarily, though, I think they are extremely useful to you, the researcher, because they allow you to check how directly you navigated the Garden of Forking Paths. Nobody stops you from taking a turn here or there. Maybe this is my OCD speaking, but I think you should always peek down at least some of the paths, simply as a robustness check. The preregistration just makes it less likely that you fool yourself. It is surprisingly easy to start believing that you took a straight path and to forget about all the dead ends along the way.

This for me is really the main point of preregistration and RRs. I think a lot of the early discussion of this concept, and a lot of the opposition to it, stems from the implicit or even explicit accusation that nobody can be trusted. I can totally understand why this fails to win the hearts and minds of many people. However, it is also clear that questionable research practices and deliberate p-hacking have been rampant. Moreover, unconscious p-hacking due to analytical flexibility almost certainly affects many findings. There are a lot of variables here, and so I’d wager that most of the scientific literature is actually only mildly skewed by this. But that is not the point. Rather, as scientists, especially ones who study cognitive and mental processes of all things, shouldn’t we want to minimize the cognitive biases and human errors that could lead us astray? Despite the rather negative “data police” narrative you often hear, this is exactly what preregistration is about. And so I think a basic preregistration is first and foremost for yourself.

When I say such a basic preregistration is for yourself, this does not necessarily mean it cannot also be interesting to others. But I do believe its usefulness to other people is limited and should not be overstated. As with many of the changes brought on by open science, we must remain skeptical of any unproven claims about their benefits and keep potential dangers in mind. The way I see it, most (all?) public proponents of either form of preregistration are fully aware of this. I think the danger really concerns the wider community. I occasionally see anonymous or sock-puppet accounts popping up in online comment sections espousing the very radical view that only preregistered research can be trusted. Here is why this disturbs me:

1. “I’ll just get some fresh air in the garden …”

A preregistration can only be as good as the detail it provides. It can be so vague that you cannot make heads or tails of it. The basic OSF-style registrations (e.g. the AsPredicted format) may be particularly prone to this problem, but it can happen even when you have written a long design document. In essence, this is like announcing you’ll take a stroll in the hedge maze without giving any indication whatsoever of which paths you will take.

2. “I don’t care if the exit is right there!”

Preregistration doesn’t mean that your predictions make any sense or that there isn’t a better way to answer the research question. Often such things are only revealed once the experiment is under way or completed, and I’d actually hazard a guess that this is usually the case. Part of the beauty of preregistration is that it demonstrates to everyone (including yourself!) how many things you probably didn’t think of before starting the study. But it should never be used as an excuse not to try something unregistered when there are good scientific reasons to do so. That would be the equivalent of taking one predetermined path through the maze and then getting stuck in a dead end – in plain sight of the exit.

3. “Since I didn’t watch you, you must have chosen forking paths!”

Just because someone didn’t preregister their experiment does not mean their experiment was not confirmatory. Exploratory research is actually undervalued in the current system. A lot of research is written up as if it were confirmatory even if it wasn’t. Ironically, critics of preregistration often suggest that it devalues exploratory research but it actually places greater value on it because you are no longer incentivized to hide it. But nevertheless, confirmatory research does happen even without preregistration. It doesn’t become any less confirmatory because the authors didn’t tell you about it. I’m all in favor of constructive skepticism. If a result seems so surprising or implausible that you find it hard to swallow, by all means scrutinize it closely and/or carry out an (ideally preregistered) attempt to replicate it. But astoundingly, even people who don’t believe in open science sometimes do good science. When a tree falls in the garden and nobody is there to hear it, it still makes a sound.

Late September when the forks are in bloom

Obviously, RRs are not completely immune to these problems either. Present-day peer review frequently fails to spot even glaring errors, so it is inevitable that mistakes will happen with RRs as well. There are also additional problems specific to RRs, such as the fact that they require an observant and dedicated editor. This may not be much of a problem while RR editors are strong proponents of the format, but if the concept becomes more widespread this will not always be the case. It remains to be seen how that works out. On the whole, however, I think the RR concept is a reasonably good guarantee that hypotheses and designs are scrutinized, and that results are published independent of the final outcome. The way I see it, both of these are fundamental improvements over the way we have been doing science so far.

But I’d definitely be very careful not to over-interpret the fact that a study is preregistered, especially when it isn’t an RR. The badges they put on Psych Science articles may be a good incentive for people to embrace open science practices, but I’m very skeptical of anyone who implies that a study is more trustworthy just because it was preregistered or because it shares data and materials. It simply isn’t. That belief lulls you into a false sense of security, and I thought the intention here was to stop fooling ourselves so much. A recent case of data being manipulated after it was uploaded demonstrates how misleading an open data badge can be. In the same vein, just because an experiment is preregistered does not mean the authors didn’t lead us (and themselves) down the garden path. There have also been cases of preregistered studies that then did not actually report the outcomes of their intended analyses.

So, preregistration only means that you can read what the authors said they would do and then check for yourself how this compares to what they did do. That’s great because it’s transparent. But unless you actually do this check, you should treat the findings with the same skepticism (and the authors with the same courtesy and respect) as you would those of any other, non-registered study.

Sometimes it is really not that hard to find your way through the garden…

2 thoughts on “Strolling through the Garden of Forking Paths”

“A recent case of data being manipulated after it was uploaded demonstrates how misleading an open data badge can be.”

But – correct me if I’m wrong as I’m not sure which case you’re referring to – isn’t the point that any post-sharing data manipulation would be possible to spot because ‘the internet never forgets’ the original posted data? (At any rate it’s much easier to spot in such cases than with closed data.)

No, and that’s part of what is so dodgy about this case (it was the “sadness makes you see blue” study). If I remember correctly, the original data were deleted, so if people hadn’t already downloaded them there would have been no way to trace this back. The paper has since been retracted and the authors have been responding in a reasonable manner from what I can tell, so I don’t want to blow this particular case out of proportion – but it shows there is scope for all sorts of dodginess here.