Preclinical Studies Don't Regularly Adhere to Best Practices

Animal experiments published in a handful of cardiovascular journals mostly ignore NIH guidelines.

May 8, 2017

Kerry Grens

In 2014, the National Institutes of Health established guidelines for preclinical experimental design—hoping to encourage researchers to adopt best practices, such as randomization and the inclusion of both sexes of lab animals. Yet, in the years preceding and since, the majority of papers published in several cardiovascular journals show a widespread disregard for these standards, according to a recent analysis published in Circulation Research.

“If we’re falling down at this early stage, we have very little hope of having a good translation rate,” said Benjamin Hibbert, a cardiologist and researcher at the University of Ottawa who led the study.

The notable exception to the trend, Hibbert and his colleagues found, was one journal, Stroke, which saw a marked increase in the adherence to best practices following the implementation of a Basic Science Checklist for authors. The checklist, which was introduced in 2011 and updated last year, asks for roughly the same standards as NIH’s guidelines: randomization, blinding, sample size estimation, and explanations of which animals were included, among other expectations.

“It shows that [our] required preclinical checklist for treatment related animal experiments have improved the quality of our preclinical papers,” Marc Fisher, Stroke’s editor-in-chief who developed the checklist, wrote in an email to The Scientist, “but we are still trying to make them even better.”

In light of concerns about reproducibility and the quality of animal studies, Hibbert’s team wanted to see how preclinical study design had changed in recent years. They gathered nearly 4,000 papers published from 2006 to 2016 in five journals under the umbrella of the American Heart Association: Circulation, Circulation Research, Hypertension, Stroke, and Arteriosclerosis, Thrombosis, and Vascular Biology. All of the papers included in the analysis reported the results of in vivo interventions on animals. Hibbert and his coauthors then picked through each paper to see how it complied with the four standards: randomization of animals, having a researcher blinded to the treatment groups, sample size estimation, and the inclusion of both sexes of model organism.

Of these four criteria, only sample size estimation became more common. But, as the researchers pointed out, it was reported in fewer than 7 percent of papers by the end of the decade of publishing they considered. Although it didn’t change over time, blinding was more commonly reported than sample size estimation—in more than 35 percent of papers in 2016—and randomization was present in more than 25 percent of papers in 2016.

As for the sex of the animals included, Hibbert and colleagues found that 80 percent of papers disclosed the sex, with 71.6 percent of those studies using males only. Around 13 percent used only female animals, and roughly 15 percent used both (they published these data separately in February in Circulation).

“I thought, given the progress made in clinical trials [regarding the inclusion of females], that would percolate into preclinical researcher circles,” said lead author Daniel Ramirez, a research fellow at the Ottawa Heart Institute. “But there was no signal we could find that males are being used exclusively less so than before. If anything it’s increasing.”

“This paper points out a number of methodological problems in the preclinical cardiovascular literature in various journals, including Circulation Research. The fact that we elected to publish this article is an eloquent testimony to the seriousness of our concern regarding rigor in preclinical research,” Roberto Bolli, the editor-in-chief of Circulation Research, wrote in an email to The Scientist. He added that his journal is taking steps to boost the scientific rigor of the papers it publishes, including laying out expectations for experimental design.

Nathalie Percie du Sert, the experimental design program manager for the National Centre for Replacement, Refinement, and Reduction of Animals in Research (NC3Rs) based in London, said she is not surprised by the results. “Researchers don’t realize that this can have an impact on their results, so they don’t know enough how important those measures are.”

Except for the inclusion of both sexes, Stroke outperformed other journals in the percentage of papers complying with the preclinical standards, and adherence became even better after the 2011 checklist rolled out. After circulating the checklist, for instance, 64 percent of papers randomized animals, nearly 78 percent included blinding of their treatment allocation, and close to 19 percent reported sample size estimation. That was up from roughly 38 percent, 54 percent, and 3 percent, respectively.

“It’s a credit to Stroke,” said Hibbert. “Implementing that checklist and incentivizing properly is the only way you’re going to get people to comply.”

It’s also a credit to the stroke research community at large, noted Percie du Sert, who said the stroke field has been leading the charge for better preclinical design. Her organization has also been working to develop standards for proper experimental design. NC3Rs established the ARRIVE guidelines (for Animal Research: Reporting In Vivo Experiments) in 2010, and since then numerous journals have endorsed them. NC3Rs is currently funding a trial to see if requiring authors to fill out an ARRIVE checklist upon submitting a manuscript can effect positive change. And the organization has developed an online tool to help guide scientists through study design.

While Hibbert said he was encouraged by Stroke's progress, one result was particularly discouraging: the citations garnered by papers with sub-par experimental design. He and his colleagues found no relationship between a paper's citation count and whether it adhered to the guidelines.

“We naively thought that people who are doing better science must be getting rewarded,” said Hibbert. “It has no impact. People don’t recognize high quality and low quality science.”