New Truths That Only One Can See

Since 1955, The Journal of Irreproducible Results has offered “spoofs, parodies, whimsies, burlesques, lampoons and satires” about life in the laboratory. Among its greatest hits: “Acoustic Oscillations in Jell-O, With and Without Fruit, Subjected to Varying Levels of Stress” and “Utilizing Infinite Loops to Compute an Approximate Value of Infinity.” The good-natured jibes are a backhanded celebration of science. What really goes on in the lab is, by implication, of a loftier, more serious nature.

It has been jarring to learn in recent years that a reproducible result may actually be the rarest of birds. Replication, the ability of another lab to reproduce a finding, is the gold standard of science, reassurance that you have discovered something true. But replication is getting harder all the time. With the most accessible truths already discovered, what remains are often subtle effects, some so delicate that they can be conjured up only under ideal circumstances, using highly specialized techniques.

In a 2005 paper titled “Why Most Published Research Findings Are False,” Dr. John P. A. Ioannidis reasoned that, given the desire of ambitious scientists to break from the pack with a striking new finding, many hypotheses already start with a high chance of being wrong. Otherwise proving them right would not be so difficult and surprising, and so supportive of a scientist’s career. Taking into account the human tendency to see what we want to see, unconscious bias is inevitable. Without any ill intent, a scientist may be nudged toward interpreting the data so it supports the hypothesis, even if just barely.

The effect is amplified by competition for a shrinking pool of grant money and also by the design of so many experiments — with small sample sizes (cells in a lab dish or people in an epidemiological pool) and weak standards for what passes as statistically significant. That makes it all the easier to fool oneself.

Paradoxically the hottest fields, with the most people pursuing the same questions, are most prone to error, Dr. Ioannidis argued. If one of five competing labs is alone in finding an effect, that result is the one likely to be published. But there is a four in five chance that it is wrong. Papers reporting negative conclusions are more easily ignored.
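
The arithmetic behind that worry can be sketched in a few lines of Python. This is only an illustration under assumed conditions, not a calculation taken from Dr. Ioannidis’s work: it supposes that the effect does not exist, that each lab runs its test independently, and that each uses the conventional 0.05 threshold for significance. The function name is hypothetical.

```python
def chance_of_a_false_alarm(n_labs: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_labs independent labs reports a
    'significant' result when the effect being tested does not exist.

    alpha is the assumed per-lab false-positive rate (an illustrative value).
    """
    if n_labs < 0:
        raise ValueError("n_labs must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Each lab avoids a false positive with probability (1 - alpha);
    # all of them avoid it together with probability (1 - alpha) ** n_labs.
    return 1.0 - (1.0 - alpha) ** n_labs


if __name__ == "__main__":
    for labs in (1, 5, 20):
        print(labs, "labs:", round(chance_of_a_false_alarm(labs), 3))
```

Under these assumptions, five labs chasing a nonexistent effect have a bit less than a one in four chance that at least one of them will find it, and the odds climb as more labs join the hunt.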

Putting all of this together, Dr. Ioannidis devised a mathematical model supporting the conclusion that most published findings are probably incorrect.
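
One common way to formalize an argument of this kind is the positive predictive value, the share of “significant” findings that reflect a real effect. The sketch below is a simplified illustration and not necessarily the full model described above: it leaves out bias and competing teams, and the function name and every input number are assumptions chosen for the example.

```python
def positive_predictive_value(prior_odds: float, power: float,
                              alpha: float = 0.05) -> float:
    """Share of statistically significant findings that are actually true.

    prior_odds: assumed ratio of true to false hypotheses being tested
    power:      assumed chance a study detects an effect that is real
    alpha:      assumed chance a study "detects" an effect that is not
    """
    if prior_odds < 0:
        raise ValueError("prior_odds must be non-negative")
    if not 0.0 <= power <= 1.0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("power and alpha must be between 0 and 1")
    true_positives = power * prior_odds   # per false hypothesis tested
    false_positives = alpha               # per false hypothesis tested
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no significant findings are possible with these inputs")
    return true_positives / total


if __name__ == "__main__":
    # Long-shot hypotheses tested with small, underpowered studies.
    print(round(positive_predictive_value(prior_odds=0.1, power=0.2), 2))
    # More plausible hypotheses tested with well-powered studies.
    print(round(positive_predictive_value(prior_odds=0.25, power=0.8), 2))
```

With long-shot hypotheses and weak studies (prior odds of 0.1, power of 0.2), only about 29 percent of the positive results in this sketch would be true. Better odds and stronger studies (0.25 and 0.8) raise that figure to 80 percent.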

Other scientists have questioned whether his methodology was skewed by his own biases. But the same year he published another blockbuster, examining more than a decade’s worth of highly regarded papers on questions like the effect of a daily aspirin on cardiac disease or the risks of hormone replacement therapy for older women. He found that a large proportion of the conclusions were undermined or contradicted by later studies.

His work was just the beginning. Concern about the problem has reached the point that the journal Nature has assembled an archive, filled with reports and analyses, called Challenges in Irreproducible Research.

Among them is a paper in which C. Glenn Begley, who is chief scientific officer at TetraLogic Pharmaceuticals, described an experience he had while at Amgen, another drug company. He and his colleagues could not replicate 47 of 53 landmark papers about cancer. Some of the results could not be reproduced even with the help of the original scientists working in their own labs.

Given what is at stake, it seems like a moral failing that the titles of the papers were not revealed. That was forbidden, we’re told, by confidentiality agreements imposed by the labs.

Maybe the researchers deeply believed that their findings were true. But that is the problem. The more passionate scientists are about their work, the more susceptible they are to bias.

The fear that much published research is tainted has led to proposals to make replication easier by providing more detailed documentation, including videos of difficult procedures. A call for the establishment of independent agencies to replicate experiments has led to a backlash, a fear that perfectly good results will be thrown out.

Scientists talk about “tacit knowledge,” the years of mastery it can take to perform a technique. The image they convey is of an experiment as unique as a Rembrandt.

“Many scientists use epithelial cell lines that are exquisitely sensitive,” Mina Bissell, a cancer researcher at Lawrence Berkeley National Laboratory, wrote in Nature in November. “The slightest shift in their microenvironment can alter the results — something a newcomer might not spot. It is common for even a seasoned scientist to struggle with cell lines and culture conditions, and unknowingly introduce changes that will make it seem that a study cannot be reproduced.”

But that can work both ways. Embedded in the tacit knowledge may be barely perceptible tweaks and jostles — ways of unknowingly smuggling one’s expectations into the results, like a message coaxed from a Ouija board.

The problem stands to get worse. It has been estimated that the corpus of scientific knowledge has doubled in size every 10 to 15 years since the days of Isaac Newton. The National Library of Medicine’s PubMed database alone contains 23 million citations.

Exciting new results will continue to appear. But as the quarry becomes more elusive, the trophies are bound to be fewer and fewer. If a result appears only under the full moon with Venus in retrograde, is it truly an advance in human knowledge?

Raw Data is a new column by George Johnson, a science writer for The New York Times and many other publications and the author, most recently, of “The Cancer Chronicles: Unlocking Medicine’s Deepest Mystery.” His website is talaya.net; Twitter: @byGeorgeJohnson