Psychology and the one-hit wonder

Don’t miss an important article in this week’s Nature about how psychologists are facing up to the problem of unreplicated studies in the wake of several high-profile controversies.

Positive results in psychology can behave like rumours: easy to release but hard to dispel. They dominate most journals, which strive to present new, exciting research. Meanwhile, attempts to replicate those studies, especially when the findings are negative, go unpublished, languishing in personal file drawers or circulating in conversations around the water cooler…

One reason for the excess in positive results for psychology is an emphasis on “slightly freak-show-ish” results, says Chris Chambers, an experimental psychologist at Cardiff University, UK. “High-impact journals often regard psychology as a sort of parlour-trick area,” he says. Results need to be exciting, eye-catching, even implausible. Simmons says that the blame lies partly in the review process. “When we review papers, we’re often making authors prove that their findings are novel or interesting,” he says. “We’re not often making them prove that their findings are true.”

It’s perhaps worth noting that clinical psychology suffers somewhat less from this problem, as treatment studies tend to get replicated by competing groups and negative studies are valued just as highly.

However, it would be interesting to see whether the “freak-show-ish” performing pony studies are less likely to replicate than specialist and not very catchy cognitive science (dual-process theory of recognition, I’m looking at you).

As a great complement to the Nature article, this month’s The Psychologist has an extended look at the problem of replication [pdf] and talks to a whole range of people affected by the problem, from journalists to research experts.

But I honestly don’t know where this ‘conceptual replication’ thing came from – where you test the general conclusion of a study in another form – as this just seems to be testing the theory with a different study.

It’s like saying your kebab is a ‘conceptual replication’ of the pizza you made last night. Close, but no Napoletana.

6 Comments

Replication is how we build theory, so I don’t understand how researchers could get away without it (or get upset at replication attempts, as a previous post here mentioned). I am getting frustrated with “science” “journalists” who flatly make a claim in the headline and then don’t even bother to post a link to the actual study.

Perhaps I’m thinking of this too simply, but I think conceptual replications are a very good thing and don’t see why people make a fuss (here and on other blogs recently).

Perhaps I am mistaken, but as I’ve always seen it, the issue is this: Direct replications are desirable. Direct replications by the same experimenter are not. You will always have more power to test your hypothesis if you combine all your data. Splitting the data in order to show “replication” is really not smart, statistically speaking. A conceptual replication serves the same purpose while showing the effect generalizes beyond the parameters of the initial experiment. This is highly desirable!

For instance, if you show that memory is better for red cars – and replicate it 100 times – that’s all well and good, but if the theory was that the color red improves memory then you better test it with a different stimulus, different timing, hell! a different modality while you’re at it. People will (or should) be much more willing to accept that the effect is real and interesting if you show it is not dependent on the precise task (stimuli, trial sequence, software, etc.). If the effect is an artifact of the task, well… you can replicate an experimental artifact a thousand times if you don’t ever change anything.
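The power argument above can be sketched numerically. Here’s a minimal illustration (my own, not from the post or the Nature article) of why splitting your data to manufacture a “self-replication” costs power, assuming a one-sided z-test, an assumed true effect size of d = 0.5, and alpha = 0.05 — all hypothetical numbers chosen for the example:

```python
# Sketch: power of one combined test (n = 40) versus requiring two
# independent "replications" (n = 20 each) to both come out significant.
# All parameter values (d = 0.5, alpha = 0.05) are illustrative assumptions.
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power(d, n):
    """Power of a one-sided z-test at alpha = 0.05 for effect size d, n subjects."""
    z_crit = 1.6449  # one-sided 5% critical value
    return phi(d * sqrt(n) - z_crit)

d = 0.5
combined = power(d, 40)       # all 40 subjects in a single test
split = power(d, 20) ** 2     # both independent halves must be significant

print(f"one test, n=40:       power ~ {combined:.2f}")
print(f"two tests, n=20 each: power ~ {split:.2f}")
```

Under these assumptions the single combined test has noticeably higher power than demanding two separately significant halves — which is the commenter’s point that splitting data to show “replication” is statistically wasteful, whereas a conceptual replication adds new information instead.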

I think psypocalypse sums it up very well. Conceptual replication doesn’t replace direct replication, they both fulfill important roles. I wrote a short commentary in the Psychologist issue you link to and I’ve written another, hopefully helpful piece in the context of dissonance theory. Here’s a link to each:
Conceptual Replication: http://wp.me/p2kCK4-1h
Replicating Dissonance: http://wp.me/p2kCK4-2Y

The pizza-kebab example doesn’t begin to address the issue. Let’s say that you want to test the hypothesis that kids like fruit better than vegetables. So you run a study with apples and carrots and it turns out you’re right, kids prefer the apples to the carrots. To make the more general point, you’d still obviously want to test more fruits and vegetables. You might also want to cook them different ways or serve them at different meal-times, etc. Otherwise you’re making a point that’s restricted to apples and carrots.

Now, let’s get back to the problem I think you’re highlighting. In an effort to generalize the finding, people move too quickly away from the apple vs. the carrot and consider it settled because of later conceptual replications. I think that’s right — the original study should obviously be replicated, and if it can’t be, then it shouldn’t count as evidence in the broader fruits vs. vegetables context.

I agree with psypocalypse. Only a very few published studies cannot be replicated at all. A much more significant problem is that their effects are found only after extensive practice, or only with the original stimuli, or only with the exact instructions used. So the results may be correct but actually of little interest. A replication with different stimuli or a different subject pool gives stronger evidence of the robustness of the effect.

Of course, if enough conceptual replications fail, an investigator would likely then try a direct replication too.

Journals don’t want to publish studies that didn’t find significant results, even when those studies are built upon published studies that report significant findings. Massaging the data isn’t uncommon…