Red flags when reading scientific papers

So, Violent Metaphors has a much better guide on how to read and understand scientific papers for non-scientists than I could ever write. It’s more concerned with biological and medical studies than chemistry, but I use a highly analogous process when I’m reading through a paper. I wish I’d received something like that back when I was in my undergrad. It would’ve made my honors project and my master’s studies a lot easier. I was debating writing up something similar, but since someone else has already done it better than I think I can, I’ll just link to their piece instead.

However, there was one oversight in the piece, and someone else already inspired me to write about it: Namely, what should be considered “red flags” when reading a scientific paper?

Red flags here don’t necessarily mean the paper in question is untrustworthy. But they do mean that something is fishy and you should approach the information laid out in the paper with caution. There are legitimate reasons behind many of the red flags here: Sometimes authors leave out details because they’re pursuing a patent and don’t want someone else to get there first. Sometimes people have had a paper shelved for years while they worked on something else and only just came back to it. But, in general, these are signs that you should view the paper with a bit more skepticism than you normally would. So, I’m going to include red flags, why they’re problematic, and plausible explanations of why they don’t necessarily mean that the paper in question is untrustworthy, just that you should give it the side-eye.

Old bibliography. By this, I mean a bibliography dominated by journal articles that are old enough to be out of date. How old that is depends on the field and how competitive said field is. In general, if there are fewer than a half dozen citations from the past two years in a competitive field, or fewer than that many from the past five years in a less competitive field, that’s a big red flag.

Why is it a red flag? It’s a red flag for several reasons: First, it indicates the authors possibly weren’t keeping up on the field’s developments, which means they might not have an up-to-date understanding of the material, which in turn can cause their experiments and conclusions to be flawed. Second, and more worrying, it indicates a possible intentional omission by the authors. “What don’t they want me to know?” is what I think when I see a dearth of recent papers in the bibliography, and then I go and look up recent papers from the field.

Why it’s not necessarily malicious: Maybe the authors had it shelved for a few years before they submitted it. That happens. This is more believable in a field that’s become competitive only recently or in a field that is not highly competitive. If you think you won’t get scooped on project A but you’re worried you will on project B, you work on B more to get it out faster, and then you come back to A. That’s the nature of science competition. This can and does mean that almost-completed papers can sit shelved for months or even years before you return to them. It doesn’t happen often, but it does happen.
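As a rough illustration, the rule of thumb for an old bibliography can be written as a quick check. This is only a sketch in Python: the function name, the `competitive` flag, and counting "the past N years" by calendar year are assumptions made for illustration, and the thresholds are the ballpark figures given above, not hard cutoffs.

```python
from datetime import date


def bibliography_looks_old(citation_years, competitive, current_year=None, min_recent=6):
    """Return True if the bibliography trips the 'old bibliography' red flag.

    citation_years: publication years of the cited works.
    competitive:    True for a fast-moving field (2-year window),
                    False for a slower one (5-year window).
    min_recent:     "a half dozen" recent citations, taken here as 6 (assumption).
    """
    if current_year is None:
        current_year = date.today().year
    window = 2 if competitive else 5
    # Count citations whose age in calendar years falls inside the window.
    recent = sum(1 for year in citation_years if 0 <= current_year - year <= window)
    return recent < min_recent


# Hypothetical bibliography, evaluated as if the paper were read in 2020.
years = [2001, 2003, 2005, 2010, 2012, 2019]
print(bibliography_looks_old(years, competitive=True, current_year=2020))
```

A flag from a check like this only means "go look up recent papers from the field," not that the paper is wrong.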

Vague language in the experimental section. The experimental section is where you are supposed to detail exactly what you did, what yields you obtained, and what methods you used. In a previous post, I gave a sentence of example academicese. That sentence would have been from an experimental section of a paper or technical analysis report. You can see that good experimental section writing contains hard numbers, not wishy-washy language. If the experimental section omits yields, doesn’t identify how analyses were done, and fails to identify instruments, that’s a red flag.

Why is it a red flag? The experimental section is supposed to give enough information that anyone literate in the jargon of the field could read it and then duplicate the work presented in a given paper. Vague language makes it hard-to-impossible to duplicate work – you can’t duplicate someone’s procedure if you don’t know what the procedure is.

Why it’s not necessarily malicious: While a person with a suspicious disposition might jump to, “Maybe they fabricated it!” and certainly fabrication of results does happen, that’s not the only possible explanation: Often, if you’re working with industry partners, that means dealing with patents. Patent lawyers will ask you not to publish specific details until the patent goes through. If the field is highly competitive, you want to get your results out ASAP so you get credit for your work. If the patent hasn’t gone through yet, that means publishing with the patent-sensitive information omitted.

Incorrect terminology and typos: For example, if someone calls a bicarbonate species a carbonate when it’s clearly not, that’s a red flag.

Why is it a red flag? Put simply: I don’t trust the expertise of anyone who can’t get basic terminology right. At best, it shows a carelessness in writing, which might carry over to other parts of the paper – if they have that jarring a basic terminology error, what’s to say their tabulated data is free of similar mistakes? At worst, it shows a fundamental misunderstanding of an integral part of the field.

Why it’s not necessarily malicious: Some fields have silly conventions. One I worked in called all (CO3) species “carbonate.” In that area, CH3CO3CH3 = CH3CO3– = HCO3– = CO3^2– = carbonate. Which is silly. Even if it is the convention of the field, though, tread with caution: overgeneral and clumsy conventions like that indicate the field in question isn’t overly concerned with the identity of that type of species, so long as it serves its purpose. If you’re going for something specific, don’t trust resources from a field that doesn’t care about what it is so long as it does the job.

Self-inconsistency. By this, I mean mutually exclusive claims: for example, someone claims to synthesize a strongly basic species but later mentions that the pH of their solution was neutral.

Why is it a red flag? Anyone should be able to see why. Mutually exclusive things can’t both be true.

Why it’s not necessarily malicious: See above about silly, silly field conventions. This is the kind of mind-boggling headache caused by inaccurate field-specific terminology.

Relying only on one or two methods of characterization, especially if those methods are not absolute or if a novel compound is claimed.

Why is it a red flag? Aside from the simplest compounds, you can’t prove beyond a shadow of a doubt the identity of anything with just one characterization method. This is worse if you have a combination of products that you’re having a hard time separating.

Why it’s not necessarily malicious: Depending on what you’re working with, full characterization can be extremely difficult or dangerous. If you have something that heats severely under a Raman laser, it might not be safe to get full vibrational spectroscopy. If it’s unstable, powder XRD might be unsafe. And if your stuff is impure or highly similar to other compounds, elucidating what yours is and what the impurities are can be difficult. Finally, some research groups are extremely limited on funding and analyses can be eye-poppingly expensive.

Computational methods are used, but source code and commentary for the algorithms, programs, and methods are not provided.

Why it’s not necessarily malicious: Some things are field-standard computational methods that an experienced programmer should be able to replicate on their own, easily. I defer to computer experts for how to tell the difference.

The article is published in a questionable journal.

Why is it a red flag? Journals that are questionable are questionable for a reason. They often do not have high standards for what they publish, and their peer review may be sloppy to non-existent. Some journals have published blatantly plagiarized articles, whilst others routinely publish sloppy work, or even work in support of hypotheses that are laughably wrong and long-debunked (consider all of the homeopathy studies that get published).

Why it’s not necessarily malicious: Sometimes, you want to publish something to get it out there, but it’s really just not that interesting. In such a case, a questionable journal might be the only one you can get to accept your paper.

The article claims results not closely tied to its data or fails to place its results in context with other studies.

Why is it a red flag? Failing to back up your claimed results with your data means your results are not justified. As well, failing to place your results in context with related studies can falsely inflate the perceived significance of your study.

Why it’s not necessarily malicious: Some people jump to conclusions. Even still, it’s at best sloppy writing and ethically questionable.

The article exhibits circular reasoning and/or begs the question. Example: Diagnostic criteria standardized by gender disparity (i.e., in order to be considered valid, they have to give the same gender disparity as is observed in literature) are used to screen a random population for [condition assumed to have a gender disparity in distribution], which is in turn used to back up existence of gender disparity.

Why is it a red flag? Justifying an assumption based on results from something based on your assumption means your study is meaningless, as your premise lacks external validity.

Why it’s not necessarily malicious: Certain fields have certain assumptions embedded in their history so deeply it’s entirely possible for an author to not realize their reasoning is circular. In ordinary life, it’s what I like to call the “Everybody” effect. There are a lot of things that “everybody knows” that are plain wrong. Case in point: There is no evidence that Marie Antoinette ever said “let them eat cake.”

One or more term(s), concept(s), or criterion(a) is defined unclearly, implicitly or not at all, or the definition used is not justified. Example: An article measures the success of free/libre open source software (FLOSS) but doesn’t define what “success” / “success rate” / “failure” / “failure rate” / etc. is supposed to mean in that article.

Why is it a red flag? Without clear criteria for what something means, a statement has no clear meaning.

Why it’s not necessarily malicious: It’s possible that you know what you mean so well that you forget that your reader doesn’t know what you know. Usually, this is an issue of inadequate peer-review more than intentional obfuscation, in my experience.

Unannounced conflicts of interest (COIs). For example: A private political think tank releases a paper supporting the political position of the party or private individual that funds them, without announcing where they get their funding from.

Why is it a red flag? If the authors have a conflict of interest, they have a vested interest in getting a certain result, and as such their paper deserves extra scrutiny. If the authors do not announce this conflict of interest for the sake of transparency, one has to wonder what they’re trying to hide.

Why it’s not necessarily malicious: The scientific literature views conflicts of interest differently from most people. For example, it is not a COI for Paul Offit to release a review on vaccine safety even though he’s developed and brought vaccines to market, as most of the vaccines he’d be reviewing are not central to his work. This is especially true if his vaccines are not within the scope of the review.