Science cannot advance on fraudulent publications, whether the problems are big or small. We all know the basics of honest research, but there are also things we need to be taught. Much of that teaching comes down to understanding our inadvertent tendency to accept data that support our hypotheses and to dismiss data that contradict them as aberrant.

Did you see the Comment by Glenn Begley in the 23 May 2013 Nature entitled “Six red flags for suspect work” (pages 433–434)? It is a follow-up on a couple of studies showing that most preclinical cancer papers in top journals could not be reproduced, even in the same lab (references in the paper cited above). The point of Begley’s commentary is that it is not so hard to identify the poorly done papers. He breaks it down into six thoughtful questions that are worth considering both in planning your own research and in evaluating the research of others. Some of the points may be more relevant to certain kinds of lab projects, but most of them apply generally.

I’ll list them here. Just to be clear, I’ll be using Begley’s exact words for the six headings, then my own words to explain them in contexts I understand.

1. Were the experiments performed blinded? This means that the person gathering the data does not know which case is part of which treatment, so the person cannot inadvertently boost some scores and lower others. You may think you wouldn’t do this, but you would, as the psychology studies consistently show, no matter how honest you want to be. So scramble those treatments and make sure the person doing the counting, measuring, or scoring does not know how any given score will affect the overall result.
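One way to make the scrambling concrete (a hypothetical Python sketch, not anything from Begley’s piece; the sample names and treatments are invented): give every sample an opaque code, hand the scorer only the codes, and keep the key linking codes back to treatments hidden until all scoring is done.

```python
import random

def blind_samples(samples, seed=None):
    """Assign each sample an opaque code so the scorer can't tell treatments apart.

    `samples` maps sample IDs to treatment groups. Returns the scoring sheet
    (codes only, for the scorer) and the key (code -> sample ID), which stays
    with someone not doing the scoring.
    """
    rng = random.Random(seed)
    ids = list(samples)
    rng.shuffle(ids)  # random order breaks any link between code and treatment
    key = {f"S{idx:03d}": sample_id for idx, sample_id in enumerate(ids, start=1)}
    scoring_sheet = sorted(key)  # the scorer sees only S001, S002, ...
    return scoring_sheet, key

# Hypothetical wasp-nest experiment: two treated nests, two controls.
samples = {"nest_A": "treated", "nest_B": "control",
           "nest_C": "treated", "nest_D": "control"}
sheet, key = blind_samples(samples, seed=42)
# Measurements get recorded against the codes; treatments are revealed
# only afterwards, by looking codes up in `key` and then in `samples`.
```

The point of the seeded `random.Random` instance is just reproducibility of the blinding key itself; in practice the key would be kept by someone other than the scorer.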

2. Were basic experiments repeated? Of course there need to be replicates, as many kinds as you can think of within the bounds of feasibility, and the paper should explain which replicates you did. Back when I was working on wasps, the replicates were simply different wasp nests treated the same way, which gave me a sample size; without it, I could not do statistics. You might have replicates that are locations: if you have a high-elevation site and a low one, you need more than one pair, or the differences could be due to something else. Now that I work on a micro-organism, I can do exact replicates of the exact same clones: a full replicate comes out of the freezer and goes through the entire test independently. In the lab we have started distinguishing replicates from duplicates to reflect different levels of independence. I can’t believe anything would be published in a high-quality journal, or anywhere, without repeated experiments.

3. Were all the results presented? One of the things we talk about a lot in the lab is that you can’t edit out a troubling datapoint unless you run the analyses both ways and report a concrete reason to remove it, like the incubator breaking. You can’t edit out bands on a gel, and of course you can’t differentially enhance areas of the gel image. I imagine you might do your analyses on all the gels but show only a representative one; these days you could put all the images in the supplement. If your data are more ecological or evolutionary, you might have done many kinds of analyses that you don’t present. I’ll have to think about what to do here. How do you distill hundreds of hours of videotape of wasp behavior, for example, to answer a question and still show all the results? Think about what this means for your field. Be uniform and as comprehensive as possible.
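The run-the-analyses-both-ways habit can be sketched in a few lines (Python, with made-up growth numbers; the function and field names are mine, purely for illustration): report the summary with and without the flagged point, plus the documented reason for the exclusion, and let readers judge.

```python
from statistics import mean, stdev

def summarize_both_ways(data, suspect_index, reason):
    """Summarize a dataset with and without one flagged datapoint.

    Instead of silently dropping the point, return both versions and the
    documented reason (e.g. an equipment failure) so the exclusion is
    transparent in the paper or supplement.
    """
    kept = [x for i, x in enumerate(data) if i != suspect_index]
    return {
        "all_points": {"n": len(data), "mean": mean(data), "sd": stdev(data)},
        "point_excluded": {"n": len(kept), "mean": mean(kept), "sd": stdev(kept)},
        "exclusion_reason": reason,
    }

# Invented example: the last measurement came from the run when the incubator broke.
growth = [4.1, 3.9, 4.3, 4.0, 9.8]
report = summarize_both_ways(growth, suspect_index=4,
                             reason="incubator failure during run 5")
```

If the conclusion only holds in one of the two versions, that itself is a result worth reporting.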

4. Were there positive and negative controls? This is much more of a lab-experiment thing. If you are doing PCR you should have a lane with no template DNA in it and a lane with a template you are confident will work: a negative and a positive control. With field experiments the exact nature of controls gets more complicated, but as much as possible, have positive and negative controls. Think hard about controls, and about randomized block designs so that conditions you haven’t thought of don’t affect your outcomes. I wrote earlier about a student who worked in a plastics lab after working in our lab. When I asked her if there was any carry-over, she said that she had a much better understanding of choosing appropriate controls than her fellow students had.

5. Were reagents validated? Begley’s description of this one talks about whether antibodies bind to the right thing or not. You should always be sure your reagents work; we have had lots of experiments fail because of bad reagents. I think this one is really tied up with the previous point: appropriate controls are what detect poor reagents.

6. Were statistical tests appropriate? In my lab we spend a ton of time worrying about statistics. We generally use R, so what students learn they can take with them anywhere. We worry about independence, about appropriate degrees of freedom, and about whether the data are normally distributed and therefore suitable for parametric tests. We get it that you can’t pick and choose which data go into the statistics. We know you can’t end the experiment just because you now have significance (p-hacking, written about earlier). We often have nested data. Everyone should take a statistics course, or several, and be careful. There is no such thing as curves so obviously different that statistics are not needed. In general, ecologists and evolutionary biologists are probably good at this one.
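My lab does this in R, but to keep the idea self-contained here is a pure-Python sketch of one approach that sidesteps the normality worry entirely: a permutation test on the difference in group means, which makes no distributional assumption (the elevation data below are invented for illustration).

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    No normality assumption: it asks how often a random relabeling of the
    pooled data yields a mean difference at least as extreme as observed.
    Returns the estimated p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the pooled measurements
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Invented measurements from a high-elevation and a low-elevation site.
high = [12.1, 11.8, 12.5, 12.0, 11.9]
low = [10.2, 10.8, 10.5, 10.1, 10.6]
p = permutation_test(high, low)
```

Note that a test like this still assumes the observations within each group are independent, so it is no escape from thinking about replication and nesting.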

This is Begley’s list. There are certainly other important points beyond it. So think about all of these as you design your study, and when you read a paper, see if you can determine whether it succeeds or fails on the Begley scale.

Glad for the emphasis on (1), blinding measurement. We tend to think we don’t need to worry about it, but its effects show up nearly any time someone looks for them, even when the measurers have no vested interest in the outcome.