4.6: Checklist for the final draft

This course teaches scientists to become more effective writers, using practical examples and exercises. Topics include: principles of good writing, tricks for writing faster and with less anxiety, the format of a scientific manuscript, peer review, grant writing, ethical issues in scientific publication, and writing for general audiences.

Instructor:

Dr. Kristin Sainani

Associate Professor

Transcript

In this last module, I want to give you a few last tips for your very final draft before you send something off to your editor or to a journal. There's a checklist that you should go through. Once the prose is sounding good, there are a few other things that I want to make sure you check. Before you submit your final draft, you should check for consistency and, in particular, for numerical consistency. You also want to make sure that your references are not what I call references to nowhere. I'll talk more about this in a minute.
Checking for consistency means making sure that you don't have things that are contradictory in different places in the manuscript. This happens often. I was editing somebody's work, and in the methods section they said, "We followed participants for a minimum of two years." But in the results section, they said, "The average follow-up time was one and a half years." Clearly, the average follow-up can't be one and a half years if the minimum follow-up is two years. So this was an inconsistency. Maybe they meant to say that they had a maximum follow-up of two years or that they aimed to follow participants for two years. But if a reviewer or editor sees this kind of inconsistency, it raises a lot of red flags.
I see numerical inconsistencies a lot when I'm reviewing papers. For example, I've seen a lot of papers where the abstract reports different numbers than the body of the paper. This may result from sloppy cutting and pasting, or because authors redo an analysis but forget to update the entire manuscript. I was reviewing one paper where the numbers in a table and figure should have been identical, but they didn't match. So I suspected that the authors had more than one version of their dataset running around. If you can't keep track of your dataset, that makes me worry about the whole analysis.
So you want to make sure that your basic numbers match throughout your paper; otherwise, it raises all sorts of red flags.
The final thing to check carefully is your references. You want to make sure that you don't have what I call references to nowhere. This is when the authors cite a reference, but when you look at that paper, it does not contain the information that the authors indicated was available there. I am often checking references for various reasons, such as to track down original sources, and I have found that the majority of the time, the reference does not in fact contain the promised information. I believe that this is the rule, not the exception. Oftentimes, authors will misinterpret or exaggerate the findings from the original source. If you go back to the original reference, it turns out that the citing authors were selective in the information that they chose to mention in their paper. Or authors cite a paper to support a particular statement, but that statement is not in fact supported by the original reference. It might be supported in some roundabout way but not directly.
Another common problem is that citing authors often cite secondary sources rather than original sources. I call this citation propagation. Maybe Jones et al. do the original study and come up with a statistic. Then Smith et al. want to cite that statistic, so they cite Jones et al. But then Barry et al. give the statistic in their paper and, instead of citing Jones et al., they cite Smith. They read the statistic in Smith's paper, so they don't bother to go back to get the original reference. Then James et al. cite Barry et al. for this statistic, and so on. It reminds me of the game of telephone that children play. If you're not familiar with that game, that's where children sit in a circle. The first child comes up with a sentence and whispers that sentence into the ear of the next child. And then that child whispers it into the ear of the next child, and so forth.
The last child says out loud what they heard, and it's always something that's garbled and funny and that has little resemblance to the original sentence. This is exactly what happens in the scientific literature when you cite secondary rather than original sources. You lose important pieces of information down the citation chain, and often things get completely garbled. And then sometimes authors just misnumber references, or they put the right reference in the wrong place in the paper. Using a reference manager program like EndNote can help avoid that kind of problem.
But here's an example. I was writing about UVC light a few years back, and I was reading a paper when I got to this sentence: "These data are particularly disturbing as the UVC emission is even larger than ambient sunlight on a mountain." I was writing for a lay audience and I thought, great, this is a very easy comparison that anybody can understand, ambient light on a mountain. I wanted to use this comparison in my story, so I needed to go back to the original references to verify its accuracy. I also had a question, because when they said it's larger than ambient sunlight on a mountain, they didn't give any time frame. Is it larger than the amount that you get in one minute on the mountain? Is it larger than the amount that you get in an hour? So I needed to go back to the original references to get more information.
Well, first I went to reference 13. It was a URL, but the link was broken. It brought me to a website, but I got an error message. I searched all over that website and I could find no relevant information about UVC emissions on a mountain. Fortunately, they had given two references, so I still had hope. So I went to reference 14, which was a paper. I scanned through the paper. I did a word search on the paper. It did not contain the words ambient sunlight, mountain, or UVC. It was a reference to nowhere, because the paper contained nothing to support the authors' statement.
I was out of luck. So always double-check your references.
And finally, here is an example of citation propagation, where citations get garbled through the literature, like a game of telephone, because authors fail to cite original sources. When I was a graduate student, I worked on something called the female athlete triad. One of the components of the triad is disordered eating. And people would always want to know how common disordered eating is in female athletes. At the time that I was a graduate student in the late 1990s, there was a hallmark statistic that everybody cited. The statistic was that 15 to 62 percent of female athletes have disordered eating. This statistic appeared in every paper on the female athlete triad or eating disorders in athletes. Everybody used this statistic. But everybody cited different sources for this statistic. At one point, I was trying to trace back to where this statistic came from, and I found about 50 different attributions for the statistic. Well, obviously that statistic came from somewhere. So everybody was just citing secondary sources.
And I'll just give you some examples. I found this in a paper from the Journal of General Internal Medicine: "It has been estimated that the prevalence of disordered eating in female athletes ranges from 15 to 62 percent." Well, this paper gave two citations: a book from 1996 and a paper from 1996. Neither of those is the original source; the original source is actually from the 1980s. I'll show you in a minute. Just another example, I found in a fact sheet on eating disorders: "Among female athletes, the prevalence of eating disorders is reported to be between 15 percent and 62 percent." They cite a book, so again a secondary source. In a 2000 review paper in the American Family Physician, they say: "Although the exact prevalence of the female athlete triad is unknown, studies have reported disordered eating behavior in 15 to 62 percent of female college athletes." No citation was given there.
And then in a 2004 paper in The Sport Journal: "Studies report between 15 percent and 62 percent of college women engage in problematic weight control behaviors." They cite a 2000 paper, Berry and Howe. Well, I actually went and pulled that 2000 paper, Berry and Howe, and guess what? It doesn't contain any mention of this statistic whatsoever. So it's a reference to nowhere. Not only is it not the original source, but it doesn't even talk about the statistic in that paper.
Interestingly, I found the following statement in a 1999 New York Times article. And actually they are the closest to getting this right. They say: "But informal surveys suggest that 15 to 62 percent of female athletes are affected by disordered behavior that ranges from a preoccupation with losing weight to anorexia or bulimia." "Informal surveys" is actually right on the money. So I know that some fact checker at the New York Times actually went back and found the original source for this statistic. A publication like The New York Times has a superb fact-checking department. Unfortunately, few scientific journals have fact-checking departments to double-check sources like this.
Out of curiosity, at some point as a graduate student, I got interested in where this statistic actually came from, and I tracked down the original sources, which took some sleuthing on my part. The statistic comes from three papers that were published in the 1980s. The 15 percent value comes from a 1987 paper. The 62 percent value comes from a 1988 paper, and in between those, there was a 1986 paper that found a value in between 15 and 62 percent. All three studies shared an author, Rosen, but do notice that all three studies were from the 1980s and this statistic was being cited decades later. What's more, if you look carefully at the studies, they are very limited in their design. These were cross-sectional surveys, self-report, with no non-athlete control groups, and they weren't random samples of athletes.
They were just convenience samples. The 1986 study surveyed varsity athletes from some midwestern universities in a whole bunch of sports. The 1987 study was on 9- to 18-year-old swimmers at a particular swim camp. These weren't even college athletes, though sometimes, when this statistic is cited, people say college athletes. And the final study looked at just 42 college gymnasts. So these are totally non-representative samples across a wide range of athletic groups with no controls. And what the studies considered disordered eating was also poorly conceived and poorly measured. I don't want to go into a lot of detail, but if you're curious, there was a wide range in what they were defining as disordered eating, and the definition differed for those three different surveys. Here are a few more details about the findings if you're curious, but the bottom line is, this statistic was widely used as if it were the gold standard. It was used everywhere in the literature. But it basically had no meaning, because it came from studies that were old and that were inherently limited by their low-quality design.
So my take-home message is, you always want to cite the primary source. Take the time to go back to the primary source. Don't just cut and paste other authors' citations and assume they are correct because, more often than not, those authors have made errors in their citing. Dig back to the original source and get it right.