I am happy to have been granted permission by Russell Turpin to post this very useful article. Included are comments on the original article by Eric Pepke and Ken Arromdee, also posted with permission.

Listening to the frequent discussions over controversial empirical claims, an unsophisticated reader could easily walk away with the view that only tradition and prejudice separate the sparring factions. Such a reader might think that most scientists cast a skeptical eye on paranormal phenomena, the claims for homeopathic dilution, the idea that the earth is relatively young, etc., merely because these scientists were taught opposing claims. As one poster’s signature would have it, such critics merely engage in “school of thought bashing.”

I think this view is wrong. I think it stems, in part, from an inadequate understanding of how to evaluate evidence. The evidential claims for many of these controversial notions exhibit common flaws. They are the kinds of flaws that scientists recognize from many, many past failures. It is this history of dead ends, each of which seduced earlier researchers with flawed evidence, that informs the way scientists evaluate the evidential claims accompanying these controversial notions.

In this article, I will first list some of these evidential flaws and then discuss errors in relating evidence to theory. Of necessity, this is a short list that omits most such problems. It is largely biased by what I have seen in newsgroup discussions. (A true survey would require a book on the order of the one David Fischer wrote for historians.) Finally, I will discuss when mere mistakes (which plague every research direction) turn into quackery.

Evidential Flaws

In the foreground of such controversies are the various studies and experiments published in journals or elsewhere. Various professional posters in the science newsgroups often complain about readers who read all such studies and experiments as if they were the same. The problems listed below are a small sampling of the kinds of issues that the critical eye brings to the reading of these studies and experiments. (I purposely omit particular issues of experimental design and statistical analysis.)

SUBJECTIVE MEASUREMENT. There are unfortunately times when a study or experiment must rely on the measurement of very subjective experience: whether a patient feels better or worse, whether two drawings are similar, etc. This element of subjectivity is notorious for introducing unintended and subtle errors into the result. Studies that eliminate this element as much as possible put the result on firmer ground. Thus, it is better to measure the effect of a medicine through chemical or physical analysis or other objectively measured symptom than through patient report, it is better to compare discrete matches rather than drawings, and it is better to count light flashes with a photodetector than with one’s eyes.

SMALL DIFFERENCES. Studies and experiments that show a small difference between the test and the control when the test result falls within what well-established theory would predict are somewhat suspicious. This kind of result begs for different experimental design, tighter controls, or investigation of other possible causes.

TIGHTER CONTROLS TURN POSITIVE RESULTS NEGATIVE. If tightening the controls in an experiment turns a positive result into a negative one, this is virtually the death knell for the alleged phenomenon. Almost always, this shows that the positive results stemmed from a phenomenon other than the one the experiment is designed to detect. Future positive results are viewed suspiciously unless a good explanation for this history is forthcoming.

CONTINUING NEGATIVE RESULTS. Negative results count more against a claim than positive results count for it. This is especially true if negative results continue over time as the alleged phenomenon is studied, even if they are few in number compared to the positive results. The reason is simple. If the phenomenon is real, those studying it should eventually reach the point where they can reliably demonstrate it and where they can teach others how to reliably demonstrate it.

It often takes a knowledge of the field of concern to evaluate these issues. The history of forward steps, set-backs, or stagnation sets a context that underlies how a new study is received. This context usually is not explicit in the article or report on the study.

Theoretical Flaws

The flaws above concern a particular phenomenon that is alleged to occur and the experiments to evince it. The step from evinced phenomena to theory is also plagued by potential error.

NO DIRECT EVIDENCE. Perhaps the most severe flaw of an empirical theory is that all evidence for it is very indirect. Sometimes this cannot be helped. For example, all historical theories suffer this flaw, since the past can only be observed through its effects on the present. (This makes the study of history particularly challenging.) But theories of current phenomena should admit fairly direct testing. For example, if the flow of qi energy through the body and the existence of molecular patterns from homeopathic dilution are true theories, those who study these things should be able to find experiments that fairly directly measure qi and these molecular patterns.

NO DEEPENING EVIDENCE. Similarly, theoretical knowledge should grow and become more detailed as experience increases. In the 1960s, molecular biologists could only mouth vague claims about DNA guiding the development of organisms. Now they can tell how this happens in more detail, and back this discussion by (tens of?) thousands of experiments that evince these details. Two centuries ago, Lavoisier described how oxygen combines with other elements to release energy. Our knowledge of chemical reaction has increased tremendously since then. But what has happened to the theoretical underpinnings of homeopathic dilution in two centuries? Why does it remain vague mouthings about “molecular patterns”?

PREDICTED PHENOMENA REMAIN SLIPPERY. As experimental and theoretical work progresses, more evidence and more sound evidence for the related phenomena should appear. If the phenomena predicted by a theory remain plagued by evidential flaws as research progresses, then the theory itself becomes very suspect.

POOR INVESTIGATION OF ALTERNATIVE EXPLANATIONS. Often the results claimed for a novel theory are potentially explained by well-founded theories. These alternative explanations need to be investigated, and such paths barred by better controls in future experiments.

REVOLUTION WITHOUT SUPPORT. A theory becomes especially suspicious when, in addition to suffering the above flaws, it directly conflicts with a theory that measures well by the same criteria. Using again the homeopathic theory of dilution as an example, if it is true, it will cause a revolution in chemistry and biology that makes cold fusion look like small potatoes. But its evidence remains far too indirect, too shallow, and too slippery to succeed at such a revolution, despite two centuries of research in it.

Where the Ducks Are

All the problems above occur within conventional theoretical and experimental investigation. Whether and how they are resolved help determine which theories are accepted and which are rejected. Scientists live on the tension between two poles. Driving them to the exotic is their eagerness to discover new and revolutionary facts. Warning them away from quackery is a skeptical eye informed by knowledge of the myriad errors that have misled others in the past. Scientists looked at N-rays, slippery water, and cold fusion because of the exciting potential to discover something new. They turned away from these things because the evidence did not pan out. John A. Wheeler invited parapsychologists into the AAAS because he thought there was beginning to be some real science in what they did. Ten years later, he knew this had been a mistake.

The attraction of the new and exotic is very strong, and its lure is so bright that it sometimes causes people to lose their critical sense. And some people, unfortunately, never develop a critical sense. Those who have lost or never developed a critical sense create and join “schools” where quackery is born from weak theories and mistaken notions becoming institutionalized. These “schools” are full of the kinds of rationalizations that people use to justify their views when nothing else is available. There are far too many of these to list, but some of the more colorful signposts are listed below.

“PARADIGM” TALK. “Paradigm” is perhaps the most abused word in these discussions. Whenever a proponent of a controversial empirical claim counters criticisms of the evidence by reference to a “paradigm shift,” it is time to put on one’s hip-waders. To the extent that “paradigm” just means a new theoretical view, it prevails because of, not in spite of, sound evidence. The rise of quantum mechanics is frequently referenced as the paradigmatic example of paradigm shift. But the discoverers of quantum mechanics did not have to philosophically argue their opponents into making a paradigm shift before quantum phenomena were accepted. The proponents merely presented ever increasing amounts of solid evidence.

To the extent that “paradigm shift” is used to describe something about the social and historical process of how research is done, it has little legitimate role in discussions of evidential quality. Most other uses are so vague that no significant meaning can be attached.

THE WORD “SCIENCE” USED NARROWLY. A quack will often reply that his ideas have evidence, just not the kind accepted by “science.” The problem with this is that science is no more and no less than the sum total of what we have learned about evaluating general empirical claims and their evidence. (Its application to modern research and the need for a new word such as “science” is merely because so much progress in this area has been made in the last three centuries.) With regard to general empirical claims, asserting that there is no scientific evidence is the same as asserting that there is no good evidence. Quacks want to find some room in between, but they cannot explain why we should accept the kind of evidence in their case that has proven so bad in other cases. In essence, they engage in a kind of special pleading that hangs on attaching some odd meaning to the word “science”.

“SCIENTIFIC PARADIGM.” This phrase has almost no useful meaning. (Peter Kaminski take note!) If it is used by someone defending a controversial empirical claim, it is virtually guaranteed that the argument is bullshit.

MISCHARACTERIZATION OF THE STATE OF THE ART. Quack theorists often distort the rest of science in order to make their favored notions seem more equal in comparison. Thus, “conventional” physics is sometimes accused of ignoring the observer. (Hah!) “Allopathic” medicine is sometimes described as based on non-holistic principles, as practicing the notion of “one symptom, one diagnosis, one cure,” etc. ad nauseam. This is all bullshit.

“QUANTUM.” Unless the writer is referring to physics or chemistry, the use of terms such as quantum, the uncertainty principle, entropy, etc., is a warning sign. If they are combined with other words in novel ways (e.g., “quantum psychology,” “democratic entropy,” etc.), it is an almost sure sign of bullshit. (For Jeremy Rifkin, the rule is reversed. His writings about entropy are bullshit especially when he discusses physics and chemistry.)

CARTS BEFORE HORSES. Proponents of quack theories are full of excuses for why they have such meagre evidence of their beliefs. These range from “no one funds us” to “the conspiratorial and established institutions ignore us for political reasons.” These excuses would not be needed if there were good evidence for the notions in question. The fact that these excuses are offered is almost an admission that the proponent believes despite a lack of good evidence. If it were otherwise, the proponent would focus on the evidence and argue for funding or institutional change because the evidence is so good, rather than excusing the lack of evidence because of these other factors.

“MILLIONS OF CHINESE CANNOT BE WRONG.” This excuse usually comes in the defense of notions resurrected from older traditions, e.g., traditional Chinese medicine. In some sense, it falls under the “big lie” tradition. In a few minutes, someone with a modicum of historical knowledge should be able to think of several cases where millions of Chinese (or Amerindians or ancient Hellenes or …) and millennia of experience were wrong. The fact is that we have learned a lot about how to perform and evaluate empirical research in the last three centuries and that this gives us a significant advantage over previous traditions. (One of the curious things about the resurrection of older traditions is that foreign traditions are more interesting than native ones. Thus, one hears arguments for qi and traditional Chinese remedies, but almost never for the four humour theory of disease and the frequent bloodletting and purges it prescribes.)

Once a “school” has developed around poor theories, it essentially halts all useful progress by its practitioners until the “school” is reintegrated with the larger scientific community. The institutionalization of theories in an uncritical atmosphere and away from the larger scientific community almost guarantees that there will be a continuing sequence of “positive” results, sometimes for centuries, even though the phenomena remain slippery, understanding remains vague, and discovery of new knowledge is left to the rest of science. In short, a duck is born. Quack, quack.

That’s an excellent summary! Here are a few thoughts I had while reading it. Some of them overlap with things you have said, especially the first one, which overlaps several of your categories, and the second, which overlaps REVOLUTION WITHOUT SUPPORT.

MARGINAL RESULTS. When faced with marginal results, scientists will attempt to refine or replicate the experiments until stronger and more consistent results are found. When a researcher spends an inordinately large amount of time interpreting and reinterpreting old data, or new data from the same experimental setup, and relatively little time attempting to get better data, the results are suspect.

MISESTIMATION OF EFFECTS. Quack researchers frequently misestimate the effects their discoveries will have. While they may speak about grandiose social effects, they frequently underestimate the scientific effects. One example is homeopathy, which would cause a revolution in chemistry if true. Yet the supporters seldom grapple with the idea of these effects. Another example is the frequent claims for a carburetor or other gizmo which will make an automobile get an incredible number of miles per gallon. Simple calculations reveal that the engine needs to operate at higher than Carnot efficiency. Personally, if I knew a way to run a heat engine at higher than Carnot efficiency and thus ignore the 2nd law of thermodynamics, I would have better things to do than waste my time building a carburetor factory.
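The “simple calculations” mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the calculation the commenter had in mind: the function names are invented for this sketch, the road-load energy per mile and the reservoir temperatures are illustrative placeholders, and the fuel energy figure is the commonly cited equivalence of about 33.7 kWh per US gallon of gasoline. The idea is to compare the fraction of the fuel’s energy that a claimed mileage would require at the wheels against the Carnot bound, 1 - T_cold / T_hot.

```python
def carnot_efficiency(t_hot_k, t_cold_k):
    """Upper bound on heat-engine efficiency between two reservoirs (kelvin)."""
    if t_cold_k <= 0 or t_hot_k <= t_cold_k:
        raise ValueError("need t_hot_k > t_cold_k > 0")
    return 1.0 - t_cold_k / t_hot_k


def required_efficiency(claimed_mpg, fuel_kwh_per_gallon, road_kwh_per_mile):
    """Fraction of the fuel's energy that must reach the wheels to hit claimed_mpg."""
    if fuel_kwh_per_gallon <= 0:
        raise ValueError("fuel energy content must be positive")
    return claimed_mpg * road_kwh_per_mile / fuel_kwh_per_gallon


def exceeds_carnot(claimed_mpg, fuel_kwh_per_gallon, road_kwh_per_mile,
                   t_hot_k, t_cold_k):
    """True if the claimed mileage would need more than Carnot efficiency."""
    needed = required_efficiency(claimed_mpg, fuel_kwh_per_gallon, road_kwh_per_mile)
    return needed > carnot_efficiency(t_hot_k, t_cold_k)


if __name__ == "__main__":
    FUEL_KWH_PER_GALLON = 33.7   # commonly cited gasoline energy equivalence
    ROAD_KWH_PER_MILE = 0.25     # ASSUMPTION: illustrative energy needed at the wheels
    T_HOT_K, T_COLD_K = 2500.0, 300.0  # ASSUMPTION: illustrative reservoir temperatures

    for mpg in (30, 100, 200):
        flag = exceeds_carnot(mpg, FUEL_KWH_PER_GALLON, ROAD_KWH_PER_MILE,
                              T_HOT_K, T_COLD_K)
        print(mpg, "mpg exceeds the Carnot bound:", flag)
```

With other assumed figures the threshold moves, but the shape of the argument stays the same: once the required fraction exceeds the Carnot bound (or exceeds 1), no carburetor can deliver the claimed mileage.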

SCIENCE AS INSTITUTION. Philosophers, psychologists, and anthropologists, when they deal with science, currently view it as “that which scientists do.” Although this definition is possibly useful for what they are trying to study, when it is used as the meaning of “scientific” in “scientific evidence,” trouble starts. The conflation of meanings leads to the notion that all those things which any scientist does are valid science. This results in a combination of appeals to authority and ad hominem attacks which are wrongly presented as scientific inquiry.

ANALOGOUS THEORIES. Many scientific theories begin as analogies to existing well established theories or as attempts to apply the results of a field of study laterally to something new. Although this sometimes produces theories which hold up well on their own, it frequently gives undeserved credence to the new theories. Well established theories generally apply to a specific well-defined set of phenomena, and the support for the theory exists within that context. The analogy or lateral application discards the context entirely. The result is a sort of informal belief that the new theory is well supported, when there may be no reason to believe that the two situations have anything to do with each other. An example of this is Social Darwinism, whereby evolution by natural selection of organisms is assumed to apply equally well to social institutions.

DEFENSIVENESS. It is a common human tendency to take criticism of one’s work personally and respond defensively. Scientists must constantly be aware of this tendency and suppress it, because unchecked defensiveness is the death of scientific inquiry. When a researcher consistently interprets criticism of his or her theories, hypotheses, or data as personal insults, they become suspect. The researcher falls into the trap of considering it a personal conflict and naturally resists the kind of criticism that is absolutely necessary to test hypotheses. The first strong indication that I had of the problems with cold fusion, back when it still seemed plausible and exciting and everyone was trading speculations about mechanisms, was a letter by one of F&P [Fleischmann and Pons - whj] accusing all of their critics of attacking them personally.

I’d like to add something else, mostly because I ran across it yet again. Comments?

“IT WAS ONLY TO GET YOU TO THINK.” One common tactic of crackpots is to dismiss disproofs of their claims with the excuse that the claim was not intended seriously, but was meant only to get their opponents to think, to argue properly, or some similar meta-reason. Until the crackpot gives this excuse, it is not possible to distinguish between his serious claims and his non-serious ones. Furthermore, the crackpot’s claim may contain factual errors, or sufficiently elementary logical errors, which are too simple to be useful for encouraging thought.

Several possibilities suggest themselves, none of which indicates worthiness of the crackpot’s ideas.

One possibility is that the crackpot is working backwards from his conclusion. If he does not work far enough backwards, he will come up with problematic “support” for his claim; since he does not really believe the result because of the support, but rather believes the support because of the result, he uses this excuse to dismiss the problems. In his own mind, the support is not evidence, but only a means to convince others of what he already knows, so he doesn’t consider this unfair.

Another possibility is that the crackpot’s true claim is somewhat broader than apparent at first glance. Talk of paradigms, comparisons to Galileo, etc. may suggest a general dislike of the scientific method and of what the crackpot considers the scientific establishment. When the crackpot disputes some well-known scientific result, he mainly desires not just to disprove that result, but to take scientists in general down a peg. He argues many nonscientific positions not because he strongly believes particular ones, but rather because he holds an anti-science meta-position; to him, his argument is about scientists’ ability to determine truth, not about specific truths.

“Being a surgeon, he had no difficulty in diagnosing acute appendicitis,” says his son, Vladislav. “It was a condition he’d operated on many times, and in the civilised world it’s a routine operation. But unfortunately he didn’t find himself in the civilised world – instead he was in the middle of a polar wasteland.”

[…]

“Still no obvious symptoms that perforation is imminent, but an oppressive feeling of foreboding hangs over me… This is it… I have to think through the only possible way out – to operate on myself… It’s almost impossible… but I can’t just fold my arms and give up.”

[…]

“He was so systematic he even instructed them what to do if he was losing consciousness – how to inject him with adrenalin and perform artificial ventilation,” says Vladislav. “I don’t think his preparation could have been better.”

[…]

“A general anaesthetic was out of the question. He was able to administer a local anaesthetic to his abdominal wall but once he had cut through, removing the appendix would have to be done without further pain relief, in order to keep his head as clear as possible.”

“My poor assistants! At the last minute I looked over at them. They stood there in their surgical whites, whiter than white themselves,” Rogozov wrote later. “I was scared too. But when I picked up the needle with the novocaine and gave myself the first injection, somehow I automatically switched into operating mode, and from that point on I didn’t notice anything else.”

“Rogozov had intended to use a mirror to help him operate but he found its inverted view too much of a hindrance so he ended up working by touch, without gloves.”