Friday, March 23, 2012

Tests of the supernatural fail again: new study can’t replicate findings of precognition
Who says that you can’t test the supernatural? Intercessory prayer, near-death experiences, and ESP—all have been tested (and refuted) using science; all are classical “supernatural” phenomena whose mechanisms, if they existed, would seem to defy the laws of physics (I’m not going to get into arguments about the definition of “supernatural” here). And now there’s a new paper in PLoS ONE by Ritchie et al. (free download at link, reference below) that refutes a recent paper presenting evidence for precognition: the idea that somehow one could have intimations in the present about stuff that hasn’t yet happened.
The original paper, published in 2011 in the Journal of Personality and Social Psychology by Daryl Bem of Cornell University (download the paper here, and see my post on it here), gave statistically significant evidence for precognition in several experiments. In brief, experimental subjects who were asked to memorize a list of words, and then type as many as they could remember into a computer, recalled more of the words to which they were subsequently exposed when shown a random selection of the initial word list along with irrelevant “control” words. This implied that seeing the words later increased one’s ability to remember them in the past.
The paper, appealing as it did to many people’s love of psychic stuff, got a lot of attention; it was, I believe, a subject on my radio interview with woo-meister Alex Tsakiris at Skeptiko. (Alex loved it, of course.) Bem’s experiment was criticized by other scientists, and I think there are still some attempts to replicate it in the works; my own judgment was that the results couldn’t be replicated by others. That seems to be the lesson of the paper by Ritchie et al., who took Bem’s most significant experiment and replicated it three times in three different laboratories: the University of London, the University of Edinburgh, and the University of Hertfordshire.
The results are simple: none of the three replications achieved anything near statistical significance. The respective probability values (the probability that results as extreme as those observed could arise solely by chance) were 46%, 94%, and 61%; the overall probability was 83%. For “one-tailed” tests like these, results are considered significant only if the probability of attaining them by chance is 5% or less; the replication results didn’t even come near that threshold. Conclusion: Bem’s results are severely in question.
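For readers curious how three separate p-values get rolled into one overall probability, here is a minimal sketch. I’m using Fisher’s method, one standard way of combining independent p-values; the paper may well have used a different combination rule, so the combined figure computed here (roughly 0.85 rather than the reported 83%) is illustrative only, and the variable names are my own.

```python
import math

def fisher_combined_p(pvalues):
    """Combine independent one-tailed p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the chi-square survival function has a closed form, so no
    stats library is needed:
        P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    """
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(k))

alpha = 0.05                          # one-tailed significance threshold
replication_ps = [0.46, 0.94, 0.61]   # p-values reported by Ritchie et al.

combined = fisher_combined_p(replication_ps)
print(all(p > alpha for p in replication_ps))  # no individual study is significant
print(round(combined, 2))                      # the combined p-value isn't either
```

The point of the exercise: not only does each study fail the 5% threshold on its own, but pooling the evidence from all three makes the null result stronger, not weaker.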
What happened in Bem’s study if his results really were wrong? Ritchie et al. have several theories:

There were statistical and methodological “artifacts,” outlined by several critics (see references 2, 3, 5, 6, and 7 in their paper).

Other variables, not recorded by Bem (subjects’ use of self-hypnosis or meditation, anxiety level, etc.), could have been responsible for the results. I don’t really understand this criticism because it seems that the “supernatural” character of precognition would be unaffected by those variables.

The effect might be genuine but hard to replicate. Ritchie et al. note that this is a common claim by psi advocates when results aren’t replicated. It’s like theologians who say, “God cannot be tested.”

The authors favor the hypothesis that Bem’s original result was due to “experimental artifacts.” They also note that there is at least one other published report of a failure to replicate Bem’s results: the paper by Robinson (2011) cited below. The PLoS paper ends with a cute conclusion:

At the end of his paper Bem urges psychologists to be more open towards the concept of psychic ability, noting how, in Alice in Wonderland, the White Queen famously stated, ‘Why, sometimes I’ve believed as many as six impossible things before breakfast’. We advise them to take a more levelheaded approach to the topic, and not to venture too far down the rabbit hole just yet.

Bem has published a response to Ritchie et al.’s piece: it’s basically a non-response, calling for more work and floating the possibility that the negative attitudes of Ritchie et al. could have affected their results (that, too, would be a paranormal result). As Bem said, “Ritchie, Wiseman, and French are well known as psi skeptics, whereas I and the investigators of the two successful replications are at least neutral with respect to the existence of psi.” That’s a pretty lame defense. Why would you re-test someone’s results if you weren’t a skeptic? On Thursday Ritchie et al. published a response to Bem’s critique.
An interesting side note: Chris French, one of the authors of the Ritchie et al. paper, wrote a piece in the Guardian, “Precognition study and the curse of the failed replications,” giving his take on Bem’s study and describing their difficulties in getting their failed replication published. It was rejected by three journals, including the original journal—the Journal of Personality and Social Psychology—before it was finally accepted at PLoS ONE! The unwillingness of the original journal’s editor to even send Ritchie et al.’s paper out for review is reprehensible, particularly in light of the splash made by Bem’s paper. Extraordinary results deserve extraordinary scrutiny. As French notes:

This whole saga raises important questions. Although we are always being told that “replication is the cornerstone of science”, the truth is that the “top” journals are simply not interested in straight replications – especially failed replications. They only want to report findings that are new and positive.
Most scientists are aware of this bias and will rarely bother with straight replications. But straight replication attempts are often exactly what is required, especially when dealing with controversial claims. For example, parapsychologists are typically happy to accept the findings of a new study if it replicates a previously reported paranormal effect. However, if it fails to do so, they are likely to blame any deviation from the original procedure, no matter how minor. It was for this reason that we chose to follow Bem’s procedure as closely as possible (apart from a minor methodological improvement).
Given the high cost of paper publications and the high submission rejection rate of “top” journals, it might be argued that rejecting replication studies was defensible in the pre-internet era. But what would prevent such journals from adopting a policy of sending reports of replications, failed or otherwise, for full peer review and, if accepted, publishing the abstract of the paper in the journal and the full version online? Otherwise, publication bias looks set to remain a major problem in psychology and science in general.

Doubt and replication are the sine qua non of science, and journals must always send out failed attempts to replicate for peer review, and find a way to publish them if they’re sound.