Wednesday, 31 July 2013

A lot has been written about the
new pre-registration format in Cortex. Although discussion of the topic has been
relatively drip-fed over the last few months since its inception [1], a
recent article in the Times Higher Education [2] voicing concerns over the
format caused a bit of a twitter storm (at least in my feed, which is largely
populated by science folk). Now that this controversy has calmed down relative to a
few days ago, I thought I would chip in with my two cents. I feel this exercise
is largely for my benefit, so I can formulate my own thoughts on the matter. I
hope it is also useful to others. I am mindful of a recent article by Charlie
Brooker on our internet word emissions [3], but given the number of words he
has constructively vomited onto the internet over the years I think I can
indulge just this once.

The problem

The current state of affairs in
psychology is far from perfect and the issues have been well documented, from
inflated false positive rates due to ‘p-hacking’ [4] to underpowered
studies [5]. In my office we often discuss the ‘post-hoc’ nature of hypotheses
in fMRI papers – writing a manuscript as if you designed the study to answer
the question you have unexpectedly answered, as opposed to the question you
wanted to ask originally. I don’t intend to deliberate on these issues. The
fact that they are issues is relatively uncontroversial. The debate largely revolves
around how much of a problem these issues actually are (e.g. how prevalent they
are). Some would argue that the flexibility of the current system is what makes
our field so dynamic and creative. Impose undue restrictions, such as
pre-registration, and we would be conducting experiments with one hand behind
our back. My personal take is that these issues are important and need to be
addressed. Whether pre-registration is the cure for what ails research is open
to debate, but I firmly believe more needs to be done in order to promote replicability
across the field.

Whether pre-registration is the cure is an empirical question

I saw this point originally made
by Rolf Zwaan on twitter: “Is pre-reg better? It's an empirical question so no
wholesale adoption: we need a control group”. This is an important point. At
present we know there is “a problem” (with the obvious caveat that the extent
of the problem is a matter of debate). Although we can debate ad nauseam about whether
pre-registration is the solution to the problem, it is clearly amenable to
empirical testing. We can ask questions such as: are pre-registered studies
more likely to replicate than studies that were not pre-registered? It seems to
me there is no a priori reason to believe that pre-registration will not work, so
why not wait and review the data when we have it? Of course, this means we
shouldn’t adopt the pre-registration model wholesale, but to my knowledge
no-one was proposing this in the first place. In sum, why not accept its
introduction, wait for the data to come in, and then judge?

‘On average’ versus ‘paper-by-paper’

Fair enough, you might say, but
Chris Chambers, the scientist behind the introduction of pre-registration in
Cortex, has argued that pre-registered studies have “a substantially higher
truth value than regular studies” [1]. This is certainly a provocative statement,
and I am uncomfortable with the use of the word ‘truth’, as it is asking for
philosophical types to hijack the debate and start arguing about whether
science can ever reveal ‘truth’. Instead let us discuss ‘reliability’, or even
better ‘replicability’. Replicability is perhaps a more useful word as it is
something we can measure (see above) and measurement is inherently ‘good’. So
does pre-registration increase replicability? The short answer is: probably. If issues such as p-hacking are real concerns,
pre-registration should at least decrease the probability that a paper is
p-hacked, decreasing false positive rates. This should mean that pre-registered
papers are ‘on average’ more likely to be replicable than non-preregistered
studies.
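
To put rough numbers on that ‘on average’ claim, here is a minimal sketch using the standard positive predictive value calculation: the share of significant findings that reflect a real effect. Every number in it (the prior, the power, and the inflated false positive rate standing in for p-hacking) is an illustrative assumption of mine, not an estimate from the literature, and holding power fixed across the two cases is a simplification.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of significant results that reflect a true effect.

    prior: assumed proportion of tested hypotheses that are actually true
    power: probability of detecting a true effect
    alpha: probability of a significant result when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Illustrative assumptions only (hypothetical values chosen for the sketch).
PRIOR = 0.25           # assumed share of hypotheses that are true
POWER = 0.5            # assumed power, held equal in both cases
NOMINAL_ALPHA = 0.05   # false positive rate when the analysis is fixed in advance
INFLATED_ALPHA = 0.20  # assumed effective rate with flexible analysis

ppv_preregistered = positive_predictive_value(PRIOR, POWER, NOMINAL_ALPHA)
ppv_flexible = positive_predictive_value(PRIOR, POWER, INFLATED_ALPHA)

print(f"PPV with nominal alpha:  {ppv_preregistered:.2f}")
print(f"PPV with inflated alpha: {ppv_flexible:.2f}")
```

Under these assumptions the pre-registered set has the higher proportion of true findings, but that is a statement about the set as a whole: it says nothing definite about any single paper in either group.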

This issue of ‘on average’ had
been nagging me for some time, but the issue was crystallised for me by James
Kilner [6]. Actually, I believe this issue might get to the heart of the
disagreement between the pro- and anti-preregistration camps.
Pro-preregistrationers (I’m not sure that is actually a word) are primarily
making an ‘on average’ argument. They argue that pre-registered articles ‘on
average’ will be more likely to replicate than non-preregistered articles. This,
of course, does not mean that, given one article that has since been
replicated and one that turned out to be a false positive, you can tell
which was pre-registered and which was not. The two ‘replicability’
distributions will undoubtedly overlap. Any individual paper must, as always,
be judged ultimately on its own merits. Just as I may be more inclined to ‘believe’ an fMRI study
where the results are significant following correction for multiple comparisons
than a study that reports uncorrected effects, I may be more inclined to
‘believe’ a pre-registered than a non-preregistered study. I might still
read the pre-registered study and decide it is terrible based on other criteria
though! Anti-preregistrationers (definitely not a word) are, perhaps
justifiably, worried that their non-preregistered studies will be automatically
dismissed as ‘non-truthy’. In reality, I don’t think this is likely to happen.
Just as the scientific method is messy and chaotic, the way we read and judge
published studies is messy and chaotic. We all have different criteria by which
we judge papers. The introduction of pre-registration seems unlikely to change
these habits, so why worry?

The problem is societal

Ultimately, the issue we
currently face within the fields of psychology and cognitive neuroscience is societal.
On this issue I am in agreement with Micah Allen: “My position is that the
"crisis" has more to do with our publish-or-perish culture”. The
dubious practices used by specific individuals are primarily a product of the
scientific society in which we find ourselves. Again, I do not want to go into
specifics, but the phrase “publish-or-perish” sums up the problem succinctly
enough. As a young researcher I have felt the undeniable pressure to publish in
‘high-impact’ journals and publish there often. This pressure has never come
directly from a supervisor or colleague, but simply from the ‘mood-music’ of
science – the constant conversations about who got what published where, who
got what fellowship to go where, who got what prize and why. I have been lucky
enough to have great supervisors from undergraduate to post-doc. Perhaps others
are not quite so lucky, but the pressure is there regardless of your
supervisor.

Will pre-registration address
this issue? In short, no. It is primarily a tool, which could increase the
replicability of a small subset of studies but is unlikely to be adopted across
the board (indeed, I don’t think anyone would argue it should be adopted across
the board). It may have a small effect on our scientific culture in that it
could change the ‘mood-music’, increasing awareness of these issues across labs
and departments. The more people give voice to the problems we face, the less
likely people are to use particular questionable practices. However, I
would argue that this effect is largely a by-product of the debate surrounding
pre-registration as opposed to pre-registration itself.

Final words

The format of Cortex’s
pre-registration model has a lot to like about it. In particular, the
requirement to formally state whether specific effects were predicted or came
from post hoc observational analyses is great. It is difficult to argue that such a distinction would stifle
scientific creativity, and it could readily be adopted in more journal formats.
Pre-registration would ensure people had actually predicted specific effects
prior to conducting the study, but let us be optimistic (just this once) and
trust that scientists, when asked to conform to such a format, would be
truthful (without the need to pre-register). In other words, this distinction
should probably already happen and we shouldn’t need a pre-registration model
to force our hand.

In relation to replication, I was
lucky enough to recently find an unexpected, and therefore exciting, result in
a behavioural experiment. I’m sure I could have published this result without
further investigation; however, I spent the next month replicating the effect
using the same analysis methods that I developed in the first experiment. This
isn’t to congratulate myself on being a good scientist (I’m sure I’ve committed
just as many sins as the next psychologist), but it is to say replication
should occur within labs. Across-lab replication is more reassuring, but if we
aren’t bothering to replicate, particularly when we find something unexpected,
then it is little wonder that false positives creep into the literature.

To summarise this overly long
first blog post, I thought I was going to take the middle ground in this
argument. In actual fact, I have persuaded myself that pre-registration is probably
a good thing. What I definitely believe is that it isn’t a bad thing, and we
shouldn’t fear it. I have been selective in what I have discussed, and there
are more arguments both for and against. These debates are healthy and
productive. That, in my mind, is perhaps the biggest boon of the introduction
of the pre-registration format – the open debate that could potentially nudge
our current scientific culture towards valuing reliability and replicability,
without an adverse effect on the creativity within our field.

Acknowledgements: Thanks to Micah Allen and James Bisby for commenting on a first draft and therefore giving me the guts to publish.

Tuesday, 30 July 2013

Well, I've finally set up my own blog. This first entry isn't a blog as such, more a test to see if I can actually post something. I hope to blog on a reasonably regular basis on matters psychological and neuroscientific. We'll see how it goes...