Scientists are not robots recording events using mechanically objective methods. Like all human beings, scientists have biases. Indeed, the inherent bias in science has spurred debates over climate change, evolution, and other topics. Nonetheless, the reliability and validity of data used to draw conclusions are the linchpin for scientific progress.

Too often arguments rage after the analyses are run and conclusions drawn, rather than being addressed more thoroughly up front. We contend that many of the problems science and society face are traceable to a lack of rigor in both teaching and implementing widely recognized “best practice” methods in science. One field that has a history of this is animal behavior. A century ago, the scientific and public worlds were transfixed, and many leading physical and biological scientists misled, by Kluge Hans, the clever horse that purportedly could solve mathematical problems far beyond today’s counting apes and parrots. More than 100 years later, the essential lessons from that debacle, uncovered by an experimental psychologist who recognized that the horse was taking cues from its owner’s breathing, posture, and other inadvertent signals, have still not been absorbed. In fact, psychologist Robert Rosenthal of the University of California, Riverside, and others documented many recent examples, such as students reporting that rats from a “smart strain” learned faster than “dull” rats when, in fact, all were from the same population! We can and must do a better job of minimizing potential biases in science to ensure transparency and reliability in the data we collect, analyze, and interpret, and thus help close off avenues of criticism for controversial findings.

Many blind eyes

There are two major ways to alleviate problems stemming from observer bias. The first, particularly salient in experiments, is for the observer/experimenter to be blind to the conditions being tested. So, if one is evaluating whether one drug treatment rather than another alters a measure of anxiety in mice, then the observer should not know what treatment the mice have received. Thus, unconscious biases in recording data are eliminated. Alternatively, or in addition, at least two independent observers should record the data. High inter-observer agreement in the collected data helps ensure the reliability of the data and their subsequent interpretation.
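One common way to put a number on inter-observer agreement for categorical codes is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is only an illustration: the function name `cohens_kappa`, the behavior categories, and the scores are hypothetical, and nothing here is meant as the one statistic researchers should use.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers' categorical codes of the same trials."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of trials")
    n = len(codes_a)
    # Observed agreement: fraction of trials given the same code by both observers.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: computed from each observer's marginal category frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes from two observers scoring the same eight video clips.
observer_1 = ["freeze", "explore", "explore", "freeze", "groom", "explore", "freeze", "groom"]
observer_2 = ["freeze", "explore", "freeze", "freeze", "groom", "explore", "freeze", "explore"]
kappa = cohens_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the observers agree on six of eight clips, but because some of that agreement would arise by chance, kappa comes out lower than the raw 75 percent figure.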

Furthermore, with today’s easy and cheap access to video-recording equipment, it is often quite feasible to have observers evaluate the trials using any of a number of computerized observational programs, and avoid tedious hand recording. However, after evaluating a systematic sample of nearly 1,000 original data papers published since 1970 in each of 5 major journals in animal behavior, we found that, with one exception, fewer than 5 percent of the papers in each journal had employed the best techniques available for minimizing bias.

Even lab researchers using automated recording methods must be wary of the potential biases they introduce to the experiment. Rarely is the reliability of these contrivances measured across time to ensure that they are accurately measuring what the researcher assumes she or he is measuring. Moreover, these methodologies are often only applicable in research using standard kinds of behavioral measures amenable to such automation, and not to pioneering work using new measures in unusual contexts. Thus, even highly controlled laboratory experiments must incorporate additional measures to ensure unbiased and reliable data collection.

A pervasive problem

Those not studying behavior may think they are off the hook. They are not. Electrophoresis gels, neuron density in histological slides, measures of micro- and nanoscale molecules, and so forth also need to be coded blindly and reliably. This can often be done easily, but as in the field of behavior, these methods are not always used, or at least not reported. Furthermore, use of false colors in evaluating brain scans and the cosmos can conceivably distort qualitative interpretations depending on the bins used to pool responses and even the shades selected.
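As a rough illustration of how little effort blind coding can take, the sketch below replaces revealing sample names with opaque codes in a shuffled order. The function `blind_labels`, the code format, and the slide names are all hypothetical; a real lab would adapt the idea to its own files and record keeping.

```python
import random


def blind_labels(sample_ids, seed=None):
    """Map each sample ID to an opaque code; returns (codes, key)."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)  # scoring order no longer tracks treatment group
    width = len(str(len(shuffled)))
    # key maps opaque code -> original sample name
    key = {f"S{i:0{width}d}": sample for i, sample in enumerate(shuffled, start=1)}
    return list(key), key


# Hypothetical slide names that would otherwise reveal the treatment group.
slides = ["drugA_mouse1", "drugA_mouse2", "control_mouse1", "control_mouse2"]
codes, key = blind_labels(slides, seed=42)
print(codes)  # the scorer sees only these codes
# `key` stays with a colleague and is consulted only after all scoring is done.
```

The design choice that matters is the separation of roles: whoever scores the samples never holds the key until the data are locked.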

Field researchers may question our call for formal methods to reduce bias and unreliability: they are out in the field by themselves; certainly they know what population or individual they are observing; they are not able to do the kinds of things advocated by those lab purists who are just skeptics about field research; whatever happened to trust? But we argue that awareness of the problem and appropriate use of modern recording methods can allow both blind data collection and reliability testing of at least subsets of data.

Another criticism is that these techniques inhibit creativity, novelty, and the ability to uncover important phenomena due to the time-consuming, tedious, and often unnecessary constraints they entail. As a recent paper by John Lounsbury and others at the University of Tennessee shows, scientists are low on conscientiousness as measured by patience with following routine rules, structured procedures, conventional thinking, and related ways of dealing with their work and career. But impatience is not an excuse for less than thorough research practices.

Exacerbating the problem is the increasing role of multiple authors on papers, in which each component contributed by an author or team is often accepted without meticulous evaluation by the rest of the authors. Such evaluation can be difficult for scientists who have, by design, different backgrounds and, perhaps, a human desire to be collegial. Nonetheless, all authors should be concerned about their reputations being sullied by colleagues being, perhaps only inadvertently, less rigorous than should be expected in good science. The recent substantiation of intentional fraud in numerous areas of science, most recently in social psychology, is not what we are warning against. Rather, our concern is that such events may partly be a consequence of a culture in science and editorial policy that does not make clear the expectation that papers should show how they implemented methods to avoid problems with bias.

In short, more proactive procedures are needed to minimize the possibility of bias in our observational and experimental research. While blinding and reliability precautions do not ensure valid results, their absence may lead to, if not actual errors, the perception of undue bias. As we said at the outset, scientists are not robots. They are individuals caught too often in a competitive race and, like everyone else, are prone to self-deception. As scientists, we should be willing to turn the scientific lens on ourselves using not just intuition, eminence, or perceived integrity, but the hallmarks of science: public, replicable, reliable, and quantitative information.

Gordon M. Burghardt is Alumni Distinguished Service Professor of Psychology and Ecology & Evolutionary Biology at the University of Tennessee. Todd M. Freeberg is an Associate Professor of Psychology at the University of Tennessee. Read more about their thoughts on how to minimize observer bias in this recent Ethology perspective.

Comments

1) In ancient times, it was good form to replicate relevant published studies before beginning your own original work. This sort of diligence is one way of avoiding bias, but it doesn't usually attract funding.

2) Depending on your library facilities, your preliminary and obligatory literature search may omit items that happen to be behind a paywall. Since researchers with non-conformist ideas may be relatively more likely to find alternative ways of making their work available, your opinions on a subject may become biased.

The fundamental flaw in almost all scientific research is that the people who formulate a hypothesis and conceive an experiment to test that hypothesis are the very same people who conduct the experiment and collect the data. What is required is a change to a blinded approach to scientific research so that the people conducting the experiments have no foreknowledge of the hypothesis being tested. Only in this way can bias, unwitting or otherwise, be eliminated.

Coming from a background in clinical research, where blinding is pretty fundamental when comparing one treatment against another, I've always found it fairly shocking that there's no blinding in lab research when research end points are measured by subjective observations.