Saturday, July 27, 2013

There has been a lively debate recently about study pre-registration, a publishing model (or online repository) where detailed methodological and statistical plans for an experiment are registered in advance of data collection. The idea is to eliminate questionable research practices such as failing to report all of a study's dependent measures, deciding whether to collect more data after looking to see whether the results are significant, and selectively reporting studies that 'worked.'

Chris Chambers and Marcus Munafo wrote a widely discussed article that appeared in the Guardian:

Open letter [with over 80 signatories]: We must encourage scientific journals to accept studies before the results are in

. . .

[The current] publishing culture is toxic to science. Recent studies have shown how intense career pressures encourage life scientists to engage in a range of questionable practices to generate publications – behaviours such as cherry-picking data or analyses that allow clear narratives to be presented, reinventing the aims of a study after it has finished to "predict" unexpected findings, and failing to ensure adequate statistical power. These are not the actions of a small minority; they are common, and result from the environment and incentive structures that most scientists work within.

The Open Data badge is earned for making publicly available the digitally shareable data necessary to reproduce the reported results.

The Open Materials badge is earned by making publicly available the components of the research methodology needed to reproduce the reported procedure and analysis.

The Preregistered badge is earned for having a preregistered design and analysis plan for the reported research and reporting results according to that plan. An analysis plan includes specification of the variables and the analyses that will be conducted.

One could imagine the introduction of two new demerit badges for Questionable and Rejected work.1

Questionable badges are issued when the committee suspects that questionable research practices have been used, as outlined in the paper by John et al. (2012).

The Rejected badge is earned when there is a suspicion that outright fraud may have occurred. This will typically spur an inquiry.

While the goal is admirable, there may be aspects of this scheme that the proponents haven't fully considered.

The pre-registration of study designs must be resisted, says Sophie Scott

. . .

...there are numerous problems with the idea. Limiting more speculative aspects of data interpretation risks making papers more one-dimensional in perspective. And the commitment to publish with the journal concerned would curtail researchers’ freedom to choose the most appropriate forum for their work after they have considered the results.

. . .

Moreover, in my fields (cognitive neuroscience and psychology), a significant proportion of studies would simply be impossible to run on a pre-registration model because many are not designed simply to test hypotheses. Some, for instance, are observational, while many of the participant populations introduce significant sources of complexity and noise; as introductions to psychology often point out, humans are very dirty test tubes.

One possible outcome is that certain types of research are privileged over others.2 The badge manifesto states that...

Badges do not define good practice, they certify that a particular practice was followed.

I find this assertion to be kind of hollow in the absence of badges issued for these other types of research, considered unsuitable for Preregistration. Therefore, in the spirit of fair play, I hereby introduce three new badges!

The Exploratory badge is issued to meritorious research that is not hypothesis-driven. This could include characterization of disease states and vast swaths of the neuroimaging literature ("Human Brain Mapping"), particularly in the early days. Not to mention the entire Human Connectome Project...

The Fishing Expedition badge can be earned by imaging studies that use exciting new methods like multi-voxel pattern analysis in neural decoding ("mind reading") applications, machine learning approaches to classify patient vs. control groups, and the latest in data mining ("Big Data").

Sophie Scott has compiled the thoughts of researchers with varying degrees of opposition to pre-registration. Some are not totally opposed, but have questions on how it will be implemented and how it might be problematic for certain types of research. I fall into this latter camp.

The one current publication format for Registered Reports, in the journal Cortex, "guarantees publication of their future results providing that they adhere precisely to their registered protocol."

I'm not sure this would work in studies with children, patients, or other difficult populations, where everything is not always predictable in terms of task performance, nature of the brain response, etc. In my blurb on Sophie's blog, I said:

Another of your examples, neuropsychological case studies, is particularly difficult. Are you not supposed to test the rare individual with hemi-prosopagnosia or a unique form of synesthesia? Many aging and developmental studies could be problematic too. What if your elderly group is no better than chance in a memory test that undergrads could do at 80% accuracy? Maybe your small pilot sample of elderly were very high performers and not representative? Obviously, being locked into publishing such a study would set you back the time it would take to make the task easier and re-run the experiment. You could even say in the new paper that you ran the experiment with 500 items in the study list and the elderly were no better than chance. Who's to say that a reviewer would have caught that error in advance?
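The chance-performance worry can be made concrete. As a rough sketch (all the numbers here are hypothetical, not from any actual study), here is how one might run an exact one-sided binomial check that a group's accuracy on a two-alternative task actually exceeds chance — the kind of outcome-neutral floor check a registered protocol would need:

```python
from math import comb

def binomial_p_above_chance(n_correct, n_trials, p_chance=0.5):
    """One-sided exact binomial test: probability of observing
    n_correct or more successes in n_trials under pure chance."""
    return sum(
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Hypothetical numbers: undergrads at 80/100 trials vs. an elderly
# group at 52/100 on the same two-alternative memory task.
p_undergrad = binomial_p_above_chance(80, 100)  # tiny: clearly above chance
p_elderly = binomial_p_above_chance(52, 100)    # large: indistinguishable from chance
```

If the elderly group's p-value comes out this large, the study has hit a floor effect and, under the Cortex criteria, would fail its outcome-neutral check regardless of the hypotheses.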

At any rate, I think it's important to have these kinds of discussions. And to freely distribute new kinds of badges.

Footnotes

1 Just to be clear, I made these up.

2 I'm not at all opposed to pre-registration, and I think it'll be an interesting experiment to see whether research practices improve and "scientific quality," or replicability, increases. But I can see the danger in that being viewed as "saintly" research with the rest of it tainted.

10 Comments:

Let me first declare - I was one of the signatories to the Guardian letter. Despite signing, I have expressed some reservations about pre-registration - but signed up because I think we have little alternative and prereg 'may' offer a way forward for 'some' problems in psychology!

Contrary to the expressed views of some, I think that psychology *specifically* is in big trouble - and any denial of this makes things worse. Of course, other fields (eg genetics) have some specific problems, but psychology scores more poorly (than other sciences) on almost every variety of questionable research practice. I recently outlined many problems specific to psychology in my 'Negativland' overview paper (Open Access) http://www.biomedcentral.com/2050-7283/1/2

Psychology is in a mess - and part of the reason 'is' the dragging of our heels over any change! The problems are not new - they are as old as experimental psychology itself...and we do and have always led the field in questionable research practices!

I don't think anyone believes that preregistration fits everything, and the counter-examples provided are often compelling, but they are no reason to reject preregistration. Rather, the counter-examples should make us look more closely at those areas.

Take the stated example of single cases in cognitive neuropsychology - an area that I have published in extensively over 20 years. Single cases are accidents of nature - we cannot foretell when or how they will arise, they are exploratory - and often...fishing expeditions that evolve through time. So we can see why they may not be ripe for prereg. Serendipity is key to case studies, but we must also note - this means they are hardly ever replicable. Almost nobody tries to replicate in cog neuro and moreover, where examined, two labs may fail to replicate findings even when testing exactly the same patient. Reasons may include the fact that single case studies frequently have no controls, sometimes no stats (examples from a paper of mine here: http://uhra.herts.ac.uk/bitstream/handle/2299/1571/103138.pdf?sequence=1). So, while single cases may not fit prereg, they have a host of 'cultural' problems that have been ignored and need addressing. So, perhaps we need to ask not just why some areas don’t fit prereg, but whether each specific area of psychology has its own dedicated questionable research culture that needs addressing - for some, preregistration may be a solution; for others, different tactics may be required?

I enjoyed reading this post, coming as it did after some of the less-measured responses to Sophie Scott’s article. Like you, I am not opposed to pre-registration being hypothesised as a potential solution. I do have concerns about how it might be implemented. I do object to the premature conclusion that pre-registration is THE solution without evidence being presented to support that conclusion.

There is currently no evidence from our field to support claims about the success of pre-registration. With journals like Cortex now offering pre-registration, we can expect some evidence to become available about actual rates of publication of null findings and, more importantly, evidence for the presumed increased rate of reproducibility of positive findings from pre-registered studies. After all, if those findings prove to be no more reproducible than others from non-pre-registered studies, pre-registration will have been a costly exercise to no avail given the limited resources most psychological scientists work with.

The field of genetics did not adopt pre-registration as a solution to similar problems besetting it. Instead, it adopted large-scale replication samples involving considerable collaborative effort and data sharing. Neuroimaging genetics has adopted this approach through the ENIGMA consortium, in which I and my colleagues are actively involved. As this is already evidence that an alternative approach is viable, I find it odd that some bloggers are purposefully limiting the discussion to the implementation of their preferred method of pre-registration.

So, in my view, the debate should not be misrepresented as ‘pre-registration versus no pre-registration’ for dealing with the problems facing our field. Rather, the debate should canvass a range of options. There is no reason why pre-registration should be afforded a privileged position in this debate, particularly as there is currently no scientific evidence to support it.

Haha! Thanks for this great post. Nice to see some much-needed humour injected into this debate.

I'd like to pick up on one point you make about the particular participant sample (e.g. older adults) failing a basic task check (e.g. performing at chance).

At Cortex we're aware of this concern and have tried to address it by including the following criterion for publication at Stage 1 (prior to IPA):

• "Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls) for ensuring that the results obtained are able to test the stated hypotheses"

This criterion returns at Stage 2 (following study completion) and would lead to the manuscript being rejected if it fails:

• "Whether the data are able to test the authors’ proposed hypotheses by passing the approved outcome-neutral criteria (such as absence of floor and ceiling effects)"

This could of course happen through no fault of the authors - as you point out, just getting unlucky in some way. In this case the manuscript would be rejected, probably to the mutual satisfaction of everyone concerned (authors and journal).

Full details can be found here: http://www.elsevier.com/inca/publications/misc/PROMIS%20pub_idt_CORTEX%20Guidelines_RR_29_04_2013.pdf

I'm not really getting this scenario where non-prereg research is stigmatized out of existence. I think there is room for both, and the prereg movement is not in any danger of taking over all editorial decisions. My response to Scott's piece is here:

"I do object to the premature conclusion that pre-registration is THE solution..."

Can I just reiterate that, to my knowledge, nobody has suggested that pre-registration is THE solution.

This misconception keeps appearing, not helped by Sophie Scott stating (erroneously) that we tout pre-registration as a "panacea" for QRPs. We do no such thing.

Instead, we believe it is part of the solution because it neutralises a variety of practices which we know are harmful to science, such as p-hacking, HARKing, and low statistical power. Obviously there are other potential solutions to QRPs.

If we're going to have a constructive discussion about how to improve research practices, we need to move beyond straw man arguments and engage each other on the key issues at hand.

"If we're going to have a constructive discussion about how to improve research practices, we need to move beyond straw man arguments and engage each other on the key issues at hand."

Check.

So, here is an attempt to clarify something I have been wondering about. It relates to statements like "there is no evidence yet that pre-registration is better" or "it remains an empirical issue whether pre-registration will improve things".

Now, if these issues are seen as problematic, then I would think that you would already be able to state upfront that pre-registration improves things [technically, 1) would be improved only at the moment when (some of) the results of a Cortex-model pre-registered study turned out to be non-significant], or am I wrong?

If this is correct, then sentences like "there is no evidence yet that pre-registration is better" or "it remains an empirical issue whether pre-registration will improve things" could perhaps be stated more optimally/in more detail. For instance, do these statements mostly refer to whether pre-registered findings turn out to be more replicable or not?

(one thing -- to your list of three practices that pre-registration counteracts, I would also add a fourth: p-hacking)

It's worth taking a moment to consider (a) whether the argument that we need evidence that pre-registration 'works' is a sensible one; and (b) if so, what such evidence would look like.

In terms of the first question, most (if not all) of us would agree that p-hacking, HARKing, low statistical power, and publication bias all have a negative effect on science.

Therefore, one could argue from a purely logical point of view that instituting practices which counteract them must improve science. This argument assumes (a) that pre-registration effectively counteracts such practices; and (b) that pre-registration doesn't introduce other new problems.

I think both premises hold. For instance, unless one believes in precognition it isn't possible for the results to determine publication outcomes if editorial decisions are made prior to data collection. Therefore, pre-registration *must* help reduce publication bias. At the same time, I await any convincing arguments for what additional problems would be introduced by optional pre-registration (or which couldn't be solved by minor tweaks to existing protocols, such as the Cortex Registered Reports initiative).

That said, I think it is useful to consider how we would gauge the success of the pre-registration initiative (or indeed any scientific practice) in the years to come. There are many possible measures (e.g. PPV; pub bias), but the broadest gold standard is the extent to which our research produces consistent and convergent conclusions - i.e. how often our results are replicated and translated into useful outcomes.

Here we run into a wall. Psychology and cognitive neuroscience have a very poor track record of direct replication. Instead we've focused on weaker indirect replication (or worse, the flabby compromise of 'conceptual' replication).

This is where pre-registration again helps. Pre-registration will incentivise direct replication, thus helping to generate the very tools (metrics) we need in order to judge the long term success of science.

The ‘misconception’ that you are only proposing pre-registration as a solution might have arisen because you don’t mention any alternatives in your Guardian blog or in any of your replies on this blog or elsewhere. If you don’t mention other options, you are effectively presenting pre-registration as THE solution, so I see no straw man argument misrepresenting your approach.

Misrepresenting those who express reservations about the implementation of pre-registration as not being willing to address the problems facing the field is an unwelcome development in your argument, as is the suggestion that the reservations are not ‘sensible’ or ‘logical’ in your reply to the 'astute' @anonymous above. This sort of language introduces an element of sanctimony on your part that will not help the debate, and makes me think Neurocritic's ‘demerit badge’ might actually be on the cards. As I mentioned in my earlier post, neuroimaging genetics has endorsed an alternative solution through the ENIGMA project.

There are alternatives. As I wrote in response to Dorothy Bishop’s blog, a method of rating journal output could be effective, i.e., holding journals - and editors - directly accountable for the findings they publish. This could take the form of a metric that assesses journals' quality according to the ratio of positive vs null findings they publish and the number of replication studies they publish. If the former is too high, the science in the journal is questionable. If the latter is too low, the journal editorial policy does not reflect adherence to good science. Aside from providing an incentive to publish replication studies, a journal quality metric of this sort could prove quite useful for reducing questionable research practices and reduce the reliance on impact factors, but would require endorsement from the field before it could be used in assessment exercises or by funding agencies and promotion panels. Something similar has already been proposed in the peer reviewed literature: http://bit.ly/14ouwCf
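The metric described above can be sketched in a few lines. To be clear, this is my illustrative rendering of the commenter's idea, not any published proposal; the threshold values and field names are invented assumptions:

```python
def journal_quality_flags(n_positive, n_null, n_replications, n_total,
                          max_positive_ratio=0.8,    # illustrative threshold
                          min_replication_ratio=0.05):  # illustrative threshold
    """Flag a journal whose ratio of positive to null findings is
    suspiciously high, or whose rate of published replication studies
    is too low. All thresholds here are hypothetical."""
    positive_ratio = n_positive / (n_positive + n_null)
    replication_ratio = n_replications / n_total
    return {
        "positive_ratio": positive_ratio,
        "replication_ratio": replication_ratio,
        "questionable_science": positive_ratio > max_positive_ratio,
        "poor_editorial_policy": replication_ratio < min_replication_ratio,
    }

# Hypothetical journal: 90 positive findings, 10 null findings,
# 2 replication studies out of 120 papers published.
flags = journal_quality_flags(90, 10, 2, 120)
```

With these made-up numbers, the journal would be flagged on both counts - too many positive results (0.9) and too few replications (under 2%).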

The claim that pre-registration will guarantee publication also needs to be examined. Editors retain the right to determine whether an article is suitable for submission to the journal. Let me provide a concrete example from the journal Cortex of how this can be a problem. Volume 48, Issue 7 of that journal is a Special Issue on “Language and the Motor System”: http://tinyurl.com/lqm4kya

Yet, the entire issue is composed of articles written by proponents of language embodiment. Not one article from the alternative perspective.

There appears to be a misunderstanding here, which is probably my fault for not being clearer in the first place.

1. I did not say that reservations against pre-registration are not sensible or logical. What I tried to do was ask whether the argument that we *need* evidence to advocate pre-reg is a sensible one (no value judgement implied - simply that it makes sense) because there is also a logical point to be made. Do you disagree with the argument, stated above, that there is also a logical basis for advocating pre-reg?

2. I do not accept your argument that because we focus on pre-reg as *a* way to address several QRPs that you can legitimately represent our position as pre-reg = *the* solution. Especially when we state in our Guardian piece that "Study pre-registration doesn't fit all forms of science, and it isn't a cure-all for scientific publishing." We have never argued or implied that pre-reg should be mandatory or universal. To represent our position in this way, and then attack that representation, is the textbook definition of a straw man argument.

3. I don't think this needs to turn into an argument about whose solution to QRPs is better. I like the approach suggested by Hartshorne and Schachner. I don't think it addresses all concerns. Let's try all of them!

4. I am not some official spokesman for Cortex and can't comment on editorial decisions that I had no involvement in. I will note that none of the articles you refer to were Registered Reports, so I don't see how this is relevant, given the specific publication criteria we have set up for RRs: http://www.elsevier.com/inca/publications/misc/PROMIS%20pub_idt_CORTEX%20Guidelines_RR_29_04_2013.pdf

You state: "The claim that pre-registration will guarantee publication also needs to be examined."

No, what we have said is that Cortex virtually guarantees publication of studies that pass in-principle acceptance (IPA). You can see this at the above link (see criteria for manuscript acceptance at Stage 2).

Perhaps this is nothing more than a publicity stunt. Perhaps what really happened at Cortex is the following. Editors got together and wondered how to increase the impact factor and visibility of the journal. Inflated self-citations worked well a few years ago, it was about time for a new trick. Any ideas? How about this pre-registration business? That ought to attract some attention and generate controversy. I'm only half joking.

