After you have finished with your first pass and are now happy with the overall shape of things, it is time to polish the micro-level structure of words, sentences, and paragraphs. Once again, your overall goal should be simplicity, clarity, and readability. You can achieve this goal by cutting out everything that is not necessary. Condense and straighten your sentences so that they are short, to-the-point, and easy to follow. When editing your sentences, pay attention to these points:

Ensure that each sentence in a paragraph belongs to that paragraph: the first sentence defines the topic of the paragraph, and the rest stick to the topic. If a sentence goes off in a tangential direction, delete it or move it elsewhere; if the paragraph is long and its topic appears to change on the way, split the paragraph into two or more.

Check that your sentences follow one another logically and that there are no jarring, abrupt changes in direction. Tie your sentences together with transitional words. Begin your sentences by addressing the words or concepts that finished the previous sentence, or use connecting words that refer to the previous sentence (however, in addition, on the contrary, and so on).

Check sentence length; aim at short, precise sentences. Try to make every sentence shorter by rephrasing the idea using fewer words and cutting out words that do no work. Those lazy words include repetitions and unnecessary adverbs and adjectives—almost always, if you remove “very”, your sentence will be tighter. Also look out for wordy expressions involving the passive voice or nominalizations of perfectly good verbs (see below). Tighten your sentences, remove redundancy. Rejoice for every word that you cut!
If your sentence still feels too long after tightening, look for ways of splitting it. Long sentences are taxing to read because the reader has to hold a great number of words in her short-term memory: about 25 words is already at the limit. You do not need to count words for spotting sentences that are too long, though. You can spot those sentences visually (if a sentence spans several lines, it is too long) or, better, by reading your text aloud. Wherever you stumble or run out of breath, you have a problem.
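If you would rather let a script do a first sweep before reading aloud, the sketch below flags sentences that exceed the 25-word limit mentioned above. It is only a rough aid: the function name long_sentences and the sample draft are made up for illustration, and the sentence splitting is naive (it will stumble on abbreviations such as "e.g." or "Fig. 2").

```python
import re


def long_sentences(text, limit=25):
    """Return (word_count, sentence) pairs for sentences longer than `limit` words."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > limit:
            flagged.append((count, sentence))
    return flagged


# Hypothetical two-sentence draft: one short sentence, one rambling one.
draft = (
    "We compared the two models. "
    "In this work we performed a comparison of the two models in order to find out "
    "which of them is the one that gives the best fit to the data that we collected "
    "during the last year."
)

for count, sentence in long_sentences(draft):
    print(count, "words:", sentence[:40] + "...")
```

A script like this only finds candidates; whether a flagged sentence should be split or tightened is still your call.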

Make meaning early in the sentence, and keep your subject and verb close. If the first words tell who does what, it is easier to decipher the rest of the sentence. If the main point of the sentence becomes clear in the beginning, even wordy sentences can be comprehensible. But if the meaning of your sentence is only unlocked by its 27th word, the long and winding road there will be littered with the remains of readers who have perished from utter mental exhaustion.

Use active voice and avoid the passive (for exceptions, read on). There is a long tradition in the scientific literature of using the passive voice, probably because it sounds more distanced and impersonal, somehow more “academic”. But when you are writing about abstract concepts anyway, there is no reason to make them any more abstract, impersonal, or distant! So it is time to get rid of this tradition: avoid the passive wherever you can.
How to spot the passive when editing? Easy: a sentence is passive if it ends with the actor (“by X”) or if you can insert “by zombies” at the end of the sentence without violating grammar. Whenever you spot a passive sentence, try to rephrase it and activate the verb. “X influences Y” is better and shorter than “Y is influenced by X”. If you use active verbs, your sentences will be stronger, shorter and more readable.
You do not need to always use the active voice, though. There are times when the passive voice works better; some concepts and elements sound out of place if made actors. It is OK to use the passive voice when the researcher wants to remove herself (or other researchers) from the picture. Common examples include “it has been experimentally confirmed that” or “it has been argued that” (in particular if you disagree with the argument but do not want to name the culprits). Also, if you want to stress whatever is being acted upon, that is, the receiver (or victim) of action, use the passive voice. “The climate is influenced by greenhouse gases” stresses the word “climate”, whereas “greenhouse gases influence the climate” focuses more on the greenhouse gases.

Avoid nominalizations, that is, turning verbs into nouns. Nominalizations take the life out of perfectly good verbs. And because what was once a happy, active verb has now been shrunk into a sad noun that just sits there, doing nothing, a replacement verb is required. Replacement verbs are usually clunkier and duller, like “carry out”, “perform”, “conduct”, or plain “to be”. In addition to sounding like company-speak, these verbs make your sentences longer than they need to be.
When you happen to come across nominalizations, rescue and release the original verb from captivity and let it roam free again! Say “we compared” instead of “we performed a comparison”, say “we examined” instead of “we conducted an examination”, and say “we analysed” instead of “we carried out an analysis”.
How to spot a nominalized verb? Nouns that have a captive verb inside, waiting to escape its torment, often sound like French or Latin: they end with “-ion”, “-ence”, or “-ment”. Adjectives can also be nominalized into nouns, and there can even be chains where “to differ” becomes “different” that becomes “difference”. The difference between these forms is that “X differs from Y” is much simpler than “there is a difference between X and Y”. And shorter by 21 characters!

Avoid words that end in “-ive”: those are adjectives that have a verb inside, struggling to get out. Release the verb! Instead of “X is indicative of Y”, say “X indicates Y”.

Comb your text for clunky expressions that are simpler and shorter in plain English. Plain “because” is much more effective than “as a consequence of” or “due to the fact that”. “Although” works better than “despite the fact that.” Do not say “for the purpose of” when you can simply say “to”, or “for”. Search for “in order to” in your text; replace with “to”. Search for “such as” in your text; delete these words and rewrite the sentence without them and it will sound better. For more examples—and a convenient search-and-replace list—see http://plainenglish.co.uk/files/alternative.pdf.
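The search can also be scripted. The following is a minimal sketch, assuming your draft is available as plain text; the table PLAIN holds only the phrases named above, and the function name wordy_phrases and the sample draft are invented for illustration. It suggests rather than replaces, because the surrounding sentence usually needs a small rewrite as well.

```python
import re

# Phrase -> plainer alternative, taken from the paragraph above.
# Extend this table with your own pet peeves.
PLAIN = {
    "as a consequence of": "because",
    "due to the fact that": "because",
    "despite the fact that": "although",
    "for the purpose of": "to / for",
    "in order to": "to",
    "such as": "(rewrite without it)",
}


def wordy_phrases(text):
    """Return (line_number, phrase, suggestion) for every wordy phrase found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for phrase, suggestion in PLAIN.items():
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((number, phrase, suggestion))
    return hits


# Hypothetical two-line draft.
draft = "In order to test this, we ran X.\nThe fit failed due to the fact that N was small."

for number, phrase, suggestion in wordy_phrases(draft):
    print(f"line {number}: '{phrase}' -> {suggestion}")
```

Note that the sketch checks one line at a time, so a phrase that is broken across two lines of your source file will slip through.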

Search for “moreover” in your text. Delete it. Rewrite the sentence using perfectly good simple words (“besides”, “in addition”, “also”) that are common in everyday speech. No-one says “moreover” anywhere other than in scientific journals; people probably use this word only because they saw someone else use it.

Avoid using jargon and complicated words as a blanket, to feel secure. Excessive amounts of jargon often result from thinking that for something to sound academic and scientific, it has to be complicated, full of expressions that no-one uses in everyday speech. This is wrong. The simpler and clearer your writing is, the more authority it has. It is more difficult to trust a writer who hides her point behind a facade of long sentences and complicated words; these feel like smoke and mirrors, tricks to hide the absence of depth. Contrary to what many seem to believe, you do not appear more intelligent if your writing is too complex. However, if you use words that everyone can understand to explain complicated issues, Richard Feynman would be proud of you. Science is difficult enough as it is, so do not make it any more complicated with your writing.

Bonus tip: learn from the masters. Get The Elements of Style by Strunk & White, and do as the masters tell you. Your readers will thank you for it.

When editing and revising your paper’s first draft, my suggestion is to do two passes: first, a pass that focuses on the broader issues of structure and content, and then a second pass that focuses on the nitty-gritty, sentence-level details. In this post, I will present ten tips for revising your draft that can be used as a checklist for the first pass; this list contains issues that I frequently come across when working with students and revising papers.

If you have read this series this far, you won’t be surprised to see that most of the issues have to do with clarity and focus.

Check that the abstract follows the hourglass structure: broad context, narrower context, the research question, your result, implications of your result for your (sub)field, its broader implications. Also, do make sure that your abstract is as jargon-free as possible: only use words that most readers can understand.

Check that your paper is focused. Choose the point of the paper and its key conclusion before you begin writing, stick to your choice, and write the paper so that the reader gets the point already in the abstract and in the introduction. Leave out results that are not required for supporting the key conclusion, or safely tuck them away in the supplementary information document. When editing, if you feel that your paper loses its focus at some point, take a step back and do a major rewrite.

Check that there is a clear question and a clear answer. A good paper states and then solves a problem; your results are meaningful only if they solve a meaningful problem. Remember that your paper is neither an account of your work nor a lab diary; it should be a story of an important problem and its solution. Emphasise the problem, both in the Introduction where it should really stand out, and in the Results section and the Discussion. Make it clear to the reader how each result contributes to solving the problem, and what the implications of solving the problem are.

Check that the figures tell your story. If you just glance through the figures and skim their captions, do you get the point of the paper and its take-home message? If not, go back and revise—after all, skimming is what most of your readers do.

Check that the reader can replicate your results. Verify that your Methods section (and the supplementary sections if any) contains everything that the reader needs to know. Also, check that you provide links to your code and your data if they can be released without violating anyone’s privacy.

Check that you end the paper with something worth remembering. This means something concrete. “More research is needed” is a platitude and a vague one at that; better, go for something like “because of the results of this paper, we are now in a position to tackle problem X with method Y, bringing us closer to the ultimate goal of Z”. This is far more concrete and memorable. Endings have power; do not waste this power.

Check that you provide enough background information: your reader does not know what you know. Assuming that your reader knows much more than you and therefore omitting background information is a very common problem with students. A typical example would be a Methods section that directly launches into what you have done without first telling why. Although it is evident to you that to get from A to B you need to do X, this is probably far less obvious to the reader. If you only tell the reader that you did X, she will be confused. Why did you do X? Never assume that the reader knows your motivation, or the details of every method you used, or why your research question is important. Tell her.
Many students seem to think that they know little while everyone else knows a lot, and that they therefore shouldn’t explain things that everyone probably already knows. It is only later in their careers that they realise that no-one really knows that much! Besides, there will be readers from adjacent (sub)fields and readers who are just learning the tricks of the trade. Use a colleague who works on something slightly different from your topic as a test reader: ask her which parts of the text are hard to follow, and revise accordingly.

Make sure that you take the reader’s hand and lead her through the text with signposts. Or, in other words, check that your writing is not confusing. Writing is, in part, psychology, and it aims to modify your reader’s state of mind and to influence what your reader thinks. Feel empathy for your readers and try to get inside their heads, assuming that they know nothing or very little. Your empathy should be reflected at the level of sentences and paragraphs: present familiar things first before moving to new concepts, use leading sentences, glue your sentences together with expressions that guide the reader. Use subheadings. Gently lead the reader from result to result, from paragraph to paragraph, and from sentence to sentence. Never leave it to the reader to connect the dots; always connect them for her. Err on the side of caution: papers where things have been over-explained are rare (if they exist at all), but papers that are all too difficult to follow are frustratingly common.

Check that you are consistent with nomenclature and notation. Because you have been immersed all too long in the world of your paper, this problem may be hard for you to spot, so using an outside reader as a guinea pig is recommended. Problems with notation are easier to detect; problems with naming things are more difficult. Often, while doing research and while conceptualising the paper, there are a number of concepts floating around, and the very same things can have many names in your thinking. Writers of fiction are allowed to use synonyms for variation, but science should be precise: in the final version of your paper, everything should be called by one name only. While it may be evident to you that the thing you call the weight matrix is the same as the thing that was called the correlation matrix in the previous paragraph, your reader quickly gets confused. Never refer to the same thing with multiple terms.
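If you already suspect which names compete in your draft, a few lines of code can confirm it. The sketch below is an illustration only: the function name competing_terms, the concept label "W", and the sample draft are all hypothetical, and you have to supply the groups of suspected synonyms yourself, since a script cannot know which terms you mean to be the same thing.

```python
import re


def competing_terms(text, synonym_groups):
    """Report concepts for which the text uses more than one name.

    `synonym_groups` maps a concept label to the names you suspect
    you have been using interchangeably for it.
    """
    lowered = text.lower()
    report = {}
    for concept, names in synonym_groups.items():
        used = {}
        for name in names:
            pattern = r"\b" + re.escape(name.lower()) + r"\b"
            count = len(re.findall(pattern, lowered))
            if count:
                used[name] = count
        # One name per concept is fine; two or more is a problem.
        if len(used) > 1:
            report[concept] = used
    return report


# Hypothetical example based on the weight matrix / correlation matrix mix-up.
groups = {"W": ["weight matrix", "correlation matrix"]}
draft = (
    "We first compute the correlation matrix. "
    "The weight matrix is then thresholded. "
    "Finally, the weight matrix is clustered."
)

print(competing_terms(draft, groups))
```

The counts hint at which name has already won in your own usage, which is often the one to keep.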

If you feel that it is impossible to get some part of your text just right, this is often a sign, a message from you to you. When you are stuck with a paragraph that just won’t yield, stop trying to force it. Instead, ask yourself: why is this so difficult? Search your feelings. What would make the paragraph easy to write, what are you missing? Often, you will notice that you are not faced with a writing problem at all; rather, you are missing some important piece of understanding. Perhaps your result is not clear after all, or you have not thought enough about some tricky issue and that is why you cannot express it in words. So take a time out, and look for understanding first; the words will come more easily when you have found it.

“If you feel the urge of ‘very’ coming on, just write the word ‘damn’ in the place of ‘very.’ The editor will strike out the word, ‘damn,’ and you will have a good sentence.”—William Allen White

If you have followed the advice in the last chapter, you should now be the proud owner of a crappy first draft of your scientific paper—a draft that serves as raw material, a draft that is for your eyes only, a draft that was written quickly and without too much care.

Now it is time for you to put on another hat and play a different role. It is time to look at your draft critically and to examine each and every sentence and paragraph ruthlessly so that you can cut out everything that doesn’t carry its own weight.

Before that, however, unless you are in a big hurry, it might be a good idea to take some distance, because a fresh pair of eyes can better spot what needs to be done.

How to revise your paper’s first draft? The process of editing and revising a scientific paper is iterative and it can take many rounds: my most-cited paper was at version number 27 or so when it was finally submitted. This may sound a bit excessive, but hey, it worked! You don’t always need to go to that length, though; just be sure to do several rounds of revisions, first alone and then with the help of your co-authors and/or your supervisor.

Just like with writing the draft, I recommend using a top-down approach when revising—begin with addressing broader issues before homing in on the details. First, read the draft quickly, without getting stuck on sentences, words, or other nitty-gritty details. Then, go through your findings: is the story logical, clear, and exciting? Does the abstract do its job and entice the reader? Is it clear what problem the paper solves? Is it clear what the solution is? Are concepts introduced in the right order? Is the paper balanced, or are there sections that are too long or sections lacking in detail?

Is it clear that your results are backed by solid evidence? Are the figures of a high quality and free of common errors such as microscopic label fonts? Does the paper begin with a proper lede – a sequence of sentences that frame the topic of the paper and entice the reader to read the rest of the story? Does the paper end on a high note?

The answers to the above questions may result in a need to “remix” the paper: to shuffle its contents around, to reorder things, and to completely rewrite some sections. This is normal: if you feel the need, just do it. Then, repeat the top-level analysis of your draft: answer the above questions again, and see if you can think of ways to improve the text further. If the answer is yes, do it. Repeat this loop until you are happy with the outcome and satisfied with the overall structure and flow of your paper. At this stage, you may even feel like returning to your research, say, to look for new results that back up your conclusion even more strongly. If so and if there is time, great, just do it, but please do remember to stop at some point because there will always be something new just around the corner. Leave some of that for the next paper.

When the overall structure is there, you should focus on the level of paragraphs and sentences. Use the same rules as for writing the paragraphs. For each paragraph, check that its topic is made clear in the first sentence or two. Check that the paragraph doesn’t stray away from the topic. If it does, cut it into two, or revise it. For each sentence, check that its meaning is clear, that it connects with the previous sentence, and that the rules outlined in the section on sentences below are fulfilled. Split sentences that are too long. Check the grammar. Use a spell checker.

Check your notation and nomenclature, and straighten them out if necessary. Do you always use the same word for describing a concept, or do you use several names for things? Is your notation consistent, do you always use the same symbols? Do you explain every symbol used in every equation?

Check your figures. Are your axis labels large enough to be seen without a magnifying glass? (I repeat, this is the most common mistake in figures produced by PhD students, for reasons unknown to me: fonts whose size is measured in micrometers.) Are your axis labels clear, and is the notation consistent with your body text? Are the colour schemes you use clear and informative, and most importantly, consistent across figures? Do the figure captions explain what should be learned from the figures, instead of only describing what is being plotted?

Then, finally, when all else seems in place, do a shortening edit, with the target of removing extra clutter and superfluous words. Make every sentence shorter that can be made shorter. Remove all adjectives, unless really necessary. Remove all repetition. Remove words that exaggerate things, because you sound more confident without them. Remove every instance of the word “very”, because you never need it. Remove the words “in order” from “in order to”.

When you are ready to show your improved draft to others, you can apply a technique that my research group has borrowed from the software industry: Extreme Editing.

In the software industry, extreme programming is one of the fashionable agile techniques, and part of this technique involves programming in pairs. So edit in pairs! Or, if there are more coauthors, involve as many of them as possible. Force your PhD supervisor to reserve several hours of uninterrupted quality time; you can argue that this co-editing session takes less time than several rounds of traditional red-pencil-comments.

This is how extreme editing works: go to a meeting room with a large enough screen and open the draft on the screen. Then, go through your text together, paragraph by paragraph and sentence by sentence. Be critical of each word and each sentence; look for sentences that are unclear and that can be misunderstood. Try to find ways of reducing clutter and shortening sentences. Cut out fat wherever needed. In my group, we jokingly keep a tally of points scored for every removed word. The winner is the one who has most ruthlessly killed the largest number of words that just tagged along, doing no real service to the text.

In the following two posts, I will present some more tips on how to revise your draft, first on the level of meaning and structure, and then on the level of sentences.


“To write is human, to edit is divine” -Stephen King

The best and most productive writers do not write perfect first drafts. The best and most productive writers write crappy first drafts and they do this as quickly as possible. They then edit, revise, and polish their crappy first drafts until those are no longer crappy (and no longer drafts). Or until the deadline makes them stop, whichever comes first.

This is what you should do with your scientific paper too: write the first draft quickly, and then edit, revise, polish, rinse, and repeat, until you are satisfied with the outcome. Or until the deadline comes.

If you have followed the system outlined in this blog, you are now at the point where you are ready to write your very own crappy first draft. You have a story, you have a structure, and you have notes for each section and each paragraph. If you have read the previous chapter, you have some idea of how to organize the building blocks of paragraphs and sentences (recap: the first sentences/words tell what the paragraph/sentence is about; stick to this and keep it simple; put weighty stuff at the end). This is all you need to know for now; I’ll provide plenty of tips for editing later.

So for the time being, put all rules aside, and aim to produce a complete first draft quickly. Embrace the words of Stephen King quoted above and forget perfection when it comes to the first draft: let it be human, let it be imperfect. Let it be crappy! Why? Because producing and then polishing a crappy first draft is much, much faster than agonizing over every word and sentence and making only perfect choices that take forever to make. When all that time is spent on editing and revising instead, the outcome is much better.

Now that you finally have to produce some text, this is where the pain of writing typically hits you. Coming up with plans and storylines can be fun; writing rarely is. Writing is hard work. Writing the first draft is particularly hard work because not being self-conscious of your words is hard, and because not letting your inner critic stop you in mid-sentence is hard. These demons are difficult to wrestle, but wrestled they must be; otherwise there is no progress and the pages remain blank.

How to ease this pain, especially if you are a novice and it feels overwhelming? How to write all that text that needs to be written before you have a paper? There are some techniques that may help you.

First, make the first draft your own little (crappy) secret. It is not for your supervisor’s or co-authors’ eyes—it is for no-one else’s eyes, it is only for you, and it serves as raw material for editing only. When your supervisor asks you for the first draft, you should give her your second draft instead—by all means, call it the first draft! Keeping your first draft private should make you less self-conscious, at least in theory: no-one else will ever see it.

Second, aim at producing more text than you need. Just let the words come! At this stage it’s OK to have sentences that are too long, it’s OK to repeat yourself, it’s OK to explain the same thing over and over again with different words. In particular, if you are writing, say, one of those 4-page letters with a restricted word count, do not worry about the length at all. Just write. Cutting text is easier than producing it, and the editing phase easily reduces the length of your text by 10-30%. In my experience, the more text you start with, the better the final product.

Third, to be productive, schedule writing time and stick to it. Never wait for inspiration to strike, because it rarely strikes those who just sit there waiting. The Muses dislike idleness; they tend to show up when you are already engaged in work. Just sit down, put your phone on silent, remove all clutter from your screen, shut down your Internet access, and do it. Write. A good target is something like 30-45 minutes of uninterrupted writing, followed by a break. For a really good day’s work, four to five such sessions are already enough. Just keep on doing this daily until you find yourself at the end of your first draft.

Fourth, if you get stuck, try changing the way you write. Take a pen and a notepad and walk away from the computer. Sit down somewhere, get a cup of decent coffee, and sketch your sentences on paper. Try to write as if you were making lecture notes or just jotting down ideas. When unstuck, go back to your computer and use the material in your notes to continue. Or, instead of a notepad, try dictation, or go for a walk and play out imagined conversations in your head where you explain whatever it is that you are supposed to be writing to someone.

If you are very self-conscious and find it hard to make progress because of that nasty voice in the back of your head, you might want to try something along the lines of the Morning Pages technique. This technique provides desensitization by stream-of-consciousness writing: every morning you take a pen, a journal, and write longhand three whole pages, filling them with anything that comes to your mind. This may feel rather difficult at first; just keep on doing it. Morning Pages were introduced by Julia Cameron in her book The Artist’s Way as a tool for artists to connect with their creativity and overcome whatever fears hold them back. If you’d like to use this technique to help you write your paper, you can fill those three pages with thoughts on your research. See where this leads you.

If nothing else helps and it feels impossible to make progress, stop for a while and think about why this would be. What would need to change for the words to emerge from wherever it is that words come from? Usually, if I find myself in this situation, the answer is that the problem lies not with words or with writing but with thinking: there is something that I don’t yet understand, some pieces that don’t yet fit. Then, the solution is to stop writing (this part of the text, at least) and to solve the underlying problem instead. So take a time-out, and look for understanding first; the words will come more easily when you have found it.

How to write the Results section of a scientific paper? If you have followed the approach in the post on figures, you have now in practice chosen your order of presentation. You have categorized your results (and figures) into Setup, Confrontation, Resolution, and Epilogue. Or, if the movie script analogy is starting to annoy you, into categories that serve a similar purpose.

Results of the first category pave the way for what follows. They introduce the reader to your data; they make your final conclusions credible by showing what is in your data and letting the reader gauge whether it looks OK. Results of the second category show that there is an open scientific problem, that there is something surprising that needs to be sorted out. Results in the third category present the Resolution, the key finding that solves the problem (or opens doors to even more problems). Results of the last category, Epilogue, are there to show what follows from the main result and why the main result is important. They would not necessarily work as stand-alone results.

This will also be the order in which your results will be presented.

Please note that the above arc is not necessarily historically accurate (it almost never is): the aim is not to present your scientific findings in the order they came to be, but to present a compelling argument and to provide a bit of entertainment on the way in the form of a good story. And remember what we discussed already in Part I: the paper should only include those results that serve the story and play a role in one of these categories. The rest should be left aside for future papers (or for the Supplementary Information document).

This is the overall arc of the Results section, or, if you are writing a letter-format paper, the arc of the bulk of the paper, sandwiched between the introduction and the conclusion. Make sure that the reader can easily follow the arc. She should always know where she is and where she should head next.

One way of making sure that the reader is always on the map is to use informative subsection headings. This may require you to divide the four categories into further subsections. A great way of developing subsection headings is to compress each result into a single short sentence and use this sentence as the heading. This way each result gets its own subsection where it can be explained in detail. Note that here, “result” does not necessarily mean a single plot or figure, but rather a conclusion that may be based on several pieces of evidence.

If the results section is organized like this, the reader can get a quick overview of the whole section by just scanning the subsection headings; remember that most of the readers just skim. These skimmers include the editor who decides whether to desk-reject your paper or to send it out to referees.

The above technique is an example of so-called signposting, where the whole paper is made more accessible by covering it with signposts that tell the reader where she is. Clear section headings help, and so do clear figure captions, whose first sentence should tell what the figure is about. Clearly formulated key phrases are also very useful. As an example, it is good to have in the Introduction a sentence that begins with “in this paper, we show”. It is also good to begin each paragraph with a topic sentence that tells what the paragraph is about.

For the results section, the most important signposts are the section headers, the first sentences of figure captions, and something we haven’t discussed yet: the first sentences of all results subsections. The first sentence of each subsection should provide motivation and background: why was the analysis done that led to this result? Why are we discussing this result? This sentence should begin with “to understand why X, we measured Y…” or similar. Even if the motivation has been mentioned in the introduction, the reader should be reminded of it, unless the paper is a very short letter and the introduction is just a few paragraphs away.

So what else is there in a results subsection? Well, results, of course – but there are several layers here. First, the lowest layer contains your “pure” results. Those are, in a way, just data: you have measured X, here’s what you got. You have computed Y, here is a table. The second layer contains direct and unambiguous interpretations of these data: the distribution of X measured under condition A clearly has a lower mean than when measured under condition B. Y grows faster as a function of time than Z. And so on. While such statements may already contain the main conclusion, a third layer of interpretation is usually required – that of giving meaning to the findings, of asking (or telling) what the results mean, of presenting new hypotheses. How do the results bring you closer to answering the broad problem that your paper addresses?

The above layers form a logical order of presentation for each of your results subsections. Begin the subsection with motivation and background. Then briefly tell the reader what you have done, either referring the reader to the Methods section or, if you are writing mixed results and methods, presenting your methods here. Then explain the results that you have obtained and talk about how you interpret them. Make it clear to the reader what layer you are talking about: what is indisputable fact, what is interpretation, and what is speculation. Use signposts for this: saying “these results can be interpreted as follows” and continuing with interpretation helps the reader to understand that there might be other ways of interpreting the results. Do not exaggerate or overgeneralise: if there are limitations that haven’t been addressed already (say, in the Methods section), be open about them.

There are traditions in some disciplines where the Results section is strictly about results, and in the fundamentalist interpretation, this would mean only including the first layer (pure data) and perhaps some of the second layer (“these data have a lower mean than those data”). Then, the meaning and interpretation of the results would only be discussed in the Discussion section and not even mentioned in the results. To me, this is, well, insane. It must have been invented by the same evil people who run journals that send papers to referees where figures are separated from the text and the captions are separated from the figures. Why, oh why, should one make the reader’s life so difficult that she has to jump back and forth between Results and Discussion? I cannot think of any other explanation than purely evil intent. But if you work in one of those disciplines and have to publish in journals that demand this, then well, I guess you have to obey the rules. Or to look like you obey the rules. But don’t do it willingly: always try to sneak in at least one sentence that explains your findings. Rejoice if you get it through the editor and the referees!

Before we move on to discussing the Discussion section, one more trick. This has to do with understanding how the mind of the reader works; we’ll talk more about that later. It is very common to begin a paragraph in the results section with “In Figure X, we see that…” Now, next time you read a paper, try to be conscious of your own response to this. What do you do? Do you directly jump to Figure X to have a look, and then try to get back to the middle of the sentence that you were just reading? No? Can… you… resist the impulse? Noooo! OK, but you still feel the impulse, don’t you? It is impossible not to feel it! And the impulse makes it harder for you to follow the sentence to its conclusion. So always refer the reader to the figure last, not first: finish the sentence with “as we see in Figure X” or similar. This way, when the reader arrives at your mental hyperlink, she has already read your sentence and knows what to look for in Figure X.

[Finally: the book. Many have asked me to write a book based on these posts. I’m working on it in an on-and-off way whenever there is time. Having three small kids and being the vice head of a large department doesn’t exactly help. I’ll be happy if the book is out in 2018. My plan is to self-publish, first on Kindle Store and then as an on-demand print version (it is nowadays easy to sell printed-on-demand books through Amazon and other online stores at very reasonable prices; the technology is there). I am at the time being not even considering traditional publishers because i) they would slap a high price tag on the book, limiting the number of readers, and ii) they would then take 85%-90% of said high price without doing much else. I don’t see this as reasonable because, with modern tools, publishers can simply be circumvented. Sorry to say, but traditional publishers: for books like this, you are no longer needed… PS Does anyone know a reasonably-priced proofreader?]

“It doesn’t matter how beautiful your theory is. If it disagrees with experiment, it’s wrong. In that simple statement is the key to science.” -Richard Feynman

How to write the methods section? While much of this series has been about writing an exciting story, we now need to put excitement aside for a while. I’ve earlier claimed that scientific papers are not only containers of information. Their Methods sections, however, are. Their role is entirely utilitarian. So before we discuss form, let’s discuss function.

A Methods section of a scientific paper serves two purposes. First, it should let other researchers gauge whether your conclusions are justified and backed up by evidence—it should let other researchers assess how credible your data are, and how credible your analysis is. Second, it should allow other researchers to replicate your study and repeat whatever it is that you have done.

Unfortunately, as any experienced researcher knows, these goals are not always met. More often than not, the authors of a scientific paper do not explain the procedures that they have used in enough detail, even if there is a Supporting Information document with an unrestricted page count. It happens all too often that when the reader attempts to understand in detail how the authors have arrived at their results, she has to give up because that information is simply not there or it is too patchy.

Not being able to understand a paper’s methods or to replicate its pipeline leads to many problems. First, this contributes to the replicability crisis and therefore erodes the very foundation of science, the scientific principle itself: only those results that can be replicated by others can be taken as facts. Second, selling your discovery to the scientific community will be hard if your fellow scientists cannot trust your findings because they do not understand how they were obtained. Third, if your pipeline—from data collection to analysis—contains new methods or ideas, those will not be adopted by anyone unless they are clearly explained (or even wrapped up and served on a plate, say, as a software package). This leads to many lost citations and your work not being discovered. If you release data, someone can also use it for things that you didn’t think of, and if you release software, there will always be someone who needs it.

So please do take replicability and reuse seriously. Explain what you have done in as much detail as possible. Release your raw data. Release your intermediate results. Release your code. Reveal everything. Hide nothing. Be a good scientist. Don’t be an evil scientist.

If you release everything that there is to release, you will probably need to use external repositories. Some journals, however, do allow submitting supplementary data and code files, to be published together with the article. If you are thinking of hosting the data and code yourself, consider that we are talking about the scientific record here: your paper, your data, and your code should, in theory, be available forever. And forever is a mighty long time, as the late artist known as Prince once put it. It certainly is longer than the lifetime of the URL that points to your www homepage on your university’s server, or of the server daemon that runs on the Linux machine in your bedroom closet. So no DIY here, please—always use long-term data and code repositories, like Zenodo. While even those might not last forever, they’ll last longer than any self-hosted repository. Note that even GitHub is not futureproofed: it is run by a commercial company that can become extinct just like any other company.

Let’s return to the paper itself, and move from function to form. First, where to describe materials, data, and methods? This, of course, depends on the journal, and there are many options. The top-tier journal style (think PNAS, Nature, etc.) is to have Materials and Methods as a separate section at the end of the article, as an appendix of sorts. In these journals, methods are only briefly described in the main text and the reader is referred to the Materials and Methods appendix for details. While writing a paper this way may at first feel difficult, this structure does make sense: the short letter format is all about the story, and technical details that would get in the way of the story are pushed aside. This may make writing feel harder because one cannot hide behind technical details: there has to be a story. However, beware of the dark side: referring the reader to a Materials and Methods section that gives only superficial details and refers the reader further to a Supplementary Information that adds detail but still lacks essential information, or that hides the limitations of the chosen methods in a subordinate clause on page 28. This structure makes it dangerously easy to sweep something under the rug. Which is why it often happens.

So if you are writing for one of these journals, do resist the dark side: do not hide problems in the SI. Other than that, just strive for clarity in the Materials and Methods. Typically this section comprises independent subsections for different items, so there is not much storytelling involved. In the main text, when talking about methods, describe their purpose, not their details: “we measure the similarity of X and Y with the help of (insert name of fancy similarity measure), see Materials and Methods for details.”

Then there is the style common to biomedical journals, where Methods are described in all their detail straight after the Introduction. This makes it easier to describe everything properly and more difficult to hide problems, which is good. The downside is that being hit by several pages of painstakingly detailed method descriptions is something of a turn-off: the story suffers. While this cannot be entirely avoided, it helps if you remember to provide context: begin each subsection by reminding the reader why this data set was collected, why this experiment was done, or why you are about to describe some mathematical methods. Often, this is no more difficult than saying, e.g., “to measure the similarity of X with Y, we need some well-behaved distance measure for probability distributions that…” and then describing the chosen measure.

The third way, common, say, in the journals of the American Physical Society, is to happily mix methods with results, explaining how things were done and what the outcome was without making a distinction between the two. In this case, things like experimental setups or data collection procedures may still be explained separately, but typically all mathematical and statistical methods are described together with the results. In my view, this makes writing a smoothly flowing story easier than the biomedical style. It is easier to motivate the methods by saying that “next, we’ll investigate X, and to do that, we need to do Y, and look, here’s the result”. In the biomedical style, this connection is harder to make because the methods and results are separated, so one has to focus on making sure that the reader understands why the methods have been chosen and why the reader should understand their details.

Before concluding, let us return to being good versus being evil, and talk about discussing the limitations of your methods. All methods have limitations, as every scientist knows, and it is best to lay these out in the open. In my view, the Methods section is the best place for doing this: while even minor limitations of methods are often discussed in the Discussion section, it feels more natural if they are addressed when the methods are introduced. Strangely, this even feels more honest. First, at least to me, it feels a bit like being cheated if I have read a long paper and only the last paragraph mentions that, by the way, we’re not sure that things work the way we just said they would. Second, it is easier for the writer to explain the limitations together with the methods. Third, it is also easier for the reader to understand the limitations and their implications if the details of the methods are fresh in her memory.

When addressing limitations, you should tread carefully: being honest is different from making it sound like your study is flawed. Joshua Schimel’s “Writing Science” introduces a great principle: say “but, yes” instead of “yes, but”. Instead of saying that your quite clear results would be much more detailed if your experimental setup had a higher resolution (or similar), say that even though the resolution of your experimental setup is limited, your results are quite clear. The latter has a much more positive ring to it, although both sentences have the same information content. So don’t make it sound like there is something wrong with your work. If there is, fix it first, before writing your paper.

The first sentence of the first paragraph of any written piece of text is crucially important, as all writers of fiction know (“Call me Ishmael.”). Make it as strong as you can.

First impressions matter. The subset of potential readers who, after getting lured by the abstract of your scientific paper, have decided to have a closer look will first encounter the first sentence of the introduction. For them, this is another decision point: to read on or to stop. The second most important sentence is the second one, the third most important is the third one, and so on. The reader can choose to stop reading at any point, after each and every sentence. This means that the first sentence will be the most-read sentence of your paper. Your second sentence will be read by fewer readers than the first, and your third sentence will be read by fewer readers still, and so on (if we assume that readers do in fact begin at the beginning instead of jumping in at random points). You will lose readers sentence by sentence whatever you do. This cannot be avoided.

The stronger the sentences, however, the lower the rate of attrition, and the higher the chance that some readers will make it through to the last one. Make the sentences flow and your readers will stick around. Glue them together with transitional words for clarity; place signposts to guide the reader. Create contrast and tension for excitement. Use cliffhanger endings: pose a question. Answer it in the next sentence.
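To put rough numbers on this, here is a toy model of my own, not something measured: assume that every sentence keeps the same fraction of the readers who reach it. The function name `fraction_reaching` and the retention values are made up for illustration; real readers certainly do not behave this regularly.

```python
def fraction_reaching(sentence_index: int, retention: float) -> float:
    """Fraction of the initial readers who get as far as a given sentence.

    Toy assumption: each sentence keeps the same share `retention`
    (between 0 and 1) of the readers who read it. Sentences are
    1-indexed, so everyone reaches sentence 1.
    """
    if sentence_index < 1:
        raise ValueError("sentence_index must be at least 1")
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must be between 0 and 1")
    # One multiplication by `retention` per sentence already read.
    return retention ** (sentence_index - 1)


if __name__ == "__main__":
    # Hypothetical retention rates for weaker and stronger sentences.
    for retention in (0.90, 0.97):
        share = fraction_reaching(20, retention)
        print(f"retention {retention:.2f}: {share:.0%} reach sentence 20")
```

Under this assumption, small differences in sentence strength compound quickly: with 90% retention per sentence, only about 14% of readers reach the twentieth sentence, whereas with 97% retention about 56% do.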

Journalists use the term lede for the first few sentences of a news story. That is indeed how they spell it, instead of “lead”, presumably for historical reasons that involve mechanical typesetting and lead (the metal, that is). The lede is the lead portion of a news story: it gives the gist of the story, it sets up the story and, most importantly, entices the reader to read the rest. While the lede should give a clear picture of what the story is about, it should not give the whole game away. The lede should raise questions so that the next paragraphs of the story can satisfy the curiosity of the reader by providing answers and details. Journalists even have their standard schemas for ledes. The inverted-pyramid lede attempts to compress the who-what-where-when-why-how of a story into a single sentence or two, and then adds details in decreasing order of importance. The question lede begins with, well, a question, one that you absolutely need to hear the answer to.

Let us have a look at some great openings and powerful first sentences.

As the first example, consider the first sentence of Battiston et al., “The price of complexity in financial networks”, PNAS 113, 10031 (2016): “Several years after the beginning of the so-called Great Recession, regulators warn that we still do not have a satisfactory framework to deal with too-big-to-fail institutions and with systemic events of distress in the financial system”. This is a powerful beginning that immediately tells what the general problem addressed by the paper is. It also forces the reader to read on. After all, who wouldn’t want to know where this story is going?

Another example of a great opening, from Centola & Baronchelli, PNAS 112, 1989 (2015): “Social conventions are the foundation for social and economic life. However, it remains a central question in the social, behavioral, and cognitive sciences to understand how these patterns of collective behavior can emerge from seemingly arbitrary initial conditions.” The problem that drives the research is clearly spelled out in the second sentence. Note that in this paper, the exact research question will not appear before the fourth paragraph. The introduction forms a funnel from the broad problem to the more detailed question.

Finally, here is the first paragraph of Altarelli et al., Phys. Rev. Lett. 112, 118701 (2014): “Tracing epidemic outbreaks in order to pin down their origin is a paramount problem in epidemiology. Compared to the pioneering work of John Snow on 1854 London’s cholera hit [1], modern computational epidemiology can rely on accurate clinical data and on powerful computers to run large-scale simulations of stochastic compartment models. However, like most inverse epidemic problems, identifying the origin (or seed) of an epidemic outbreak remains a challenging problem even for simple stochastic epidemic models, such as the susceptible-infected (SI) model and the susceptible-infected-recovered (SIR) model.”

The above paragraph gets from the topic (tracing epidemic outbreaks) to the research question (identifying the origin of an epidemic) in three sentences, and the authors have even managed to include a brief historical detour of the you-know-nothing-John-Snow variety (sorry, I had to). This is a great opening. The reader gets a clear idea of what the paper is about, and becomes curious: how did they solve the seed identification problem?

In the next post, we’ll move from the introduction to methods & results.