Scientific Method —

Research funding may be going to copy-cat grants

Nearly $70 million may have gone to overlapping projects since 1985.

Grant money is the lifeblood of academia. Without it, graduate students and postdocs don't get paid, professors don't draw summer salary, and experiments don't get done. On top of that, universities take a percentage off the top to support facilities and infrastructure. Obtaining grants is a major part of an academic researcher's job, perhaps second only to publishing papers as a measure of success.

That pressure to win grant funding, combined with ever-falling success rates, drives many scientists to submit more and more applications. It's fairly common practice to submit the same or similar applications to different funding agencies in the hope that one will be funded. Applicants usually have to disclose whether they've obtained other funding for a given project, but the potential for duplicate funding is clear.

To get a sense of how much funding duplication may have occurred in recent years, a group of three researchers led by Harold Garner of the Virginia Bioinformatics Institute analyzed over half a million grant application summaries using the automated text similarity engine eTBLAST. Most of the applications came from the National Institutes of Health (NIH) and National Science Foundation (NSF), with a smaller number from the Departments of Energy (DOE) and Defense (DOD) and the Susan G. Komen for the Cure foundation. In a recent Nature article, they estimate that nearly $70 million in grant money overlapped with previously funded efforts.

The text similarity tool has previously been used to look for plagiarism in biomedical research papers. It calculates a similarity score between two grant summaries based on shared words appearing in the same locations within sentences. Out of over half a million grant summaries, the tool flagged 1,300 with high similarity scores, which the authors then analyzed manually.
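The article doesn't spell out eTBLAST's internals, but a toy version of position-aware word overlap gives the flavor of this kind of scoring. This is a minimal sketch under my own assumptions (the function name, the sample fragments, and the scoring rule are all illustrative), not the actual engine:

```python
def positional_similarity(text_a: str, text_b: str) -> float:
    """Toy score: the fraction of word positions at which two texts
    share the same word. A crude stand-in for position-aware text
    similarity; eTBLAST itself is more sophisticated."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    if not words_a or not words_b:
        return 0.0
    matches = sum(1 for a, b in zip(words_a, words_b) if a == b)
    return matches / max(len(words_a), len(words_b))

# Two hypothetical grant-summary fragments that differ by one word
s1 = "We propose to study combustion kinetics in lean methane flames"
s2 = "We propose to study combustion kinetics in rich methane flames"
print(positional_similarity(s1, s2))  # 0.9
```

A real pipeline would compare every pair of summaries and flag those scoring above some threshold for manual review, which is roughly the workflow described here.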

Note that they could only use the grant summaries published openly; a more comprehensive look would require the full applications and award contracts, which would likely only be available through a Freedom of Information Act request. In fact, the DOE stopped publishing these summaries in 2009, so recent grants weren't available for analysis.

Through this manual analysis, they found 167 pairs of grant summaries (a little over 300 grants in total) with "suspicious" overlaps in research aims, hypotheses, or goals. These grants correspond to around $200 million in total funding, with an estimated $69 million in overlapping funds (the initial grant in each pair being, on average, 1.9 times larger than the potential duplicate).
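Those two figures are consistent with each other: if each pair's funding is split according to the reported 1.9:1 size ratio, the smaller grants sum to roughly the $69 million overlap estimate. A back-of-the-envelope sketch (assuming, unrealistically, that the averages apply uniformly to all 167 pairs):

```python
# Rough consistency check on the article's figures; assumes the
# averages hold uniformly across pairs, which they won't in practice.
pairs = 167
total_funding = 200e6   # ~$200 million across all flagged grants
size_ratio = 1.9        # initial grant ~1.9x the potential duplicate

avg_pair = total_funding / pairs              # ~$1.2M per pair
avg_duplicate = avg_pair / (1 + size_ratio)   # smaller grant's share
print(f"implied overlapping funds: ${avg_duplicate * pairs / 1e6:.0f}M")
# prints ~$69M, matching the reported estimate
```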

Now, there are a number of caveats to these figures. First, the analysis doesn't actually determine whether the potential duplication is inappropriate, or even whether true duplication occurred at all. Grant summaries are typically written in broad strokes for a non-technical audience. It's entirely possible for multiple grants, even to the same investigator, to fund research with similar overall goals but different details in terms of the specific experiments being done.

Furthermore, even if two grant applications were similar, the actual money handed out could have been reduced to account for the fact that two grants were awarded. A larger grant may fund most of the work, while a second, smaller grant provides for some supplemental research. Determining true funding duplication would require comparing the full grant proposals and funding contracts.

Out of the grants with suspicious overlap, only a third ran concurrently. The remaining two-thirds could have been "recycled," with a successful grant resubmitted either to the same funding program or elsewhere. Again, many research programs in this category may be legitimate: once a three-year grant concludes, a longer-term research program (say, curing pancreatic cancer, which isn't likely to be done in three years) could be funded again through a similar proposal.

However, in the manual study of duplicates, the group did find a number of worrying situations, such as follow-up applications that proposed studies to produce data that was already available; in fact, that data was cited in the initial applications (perhaps it was research on time travel). In other cases, similar grants from different agencies funded collaborators at different institutions to do similar-sounding work.

The authors believe their estimate of duplicate grants is conservative for a couple of reasons. For one thing, their text analysis tool couldn't handle summaries with fewer than 200 words, which excluded nearly 200,000 grants from the analysis. They also believe the tool missed some duplication in the summaries, based on Garner's previous look at plagiarism in biomedical research papers: there, eTBLAST found duplication in 0.04 percent of papers, even though 1.4 percent of scientists admitted to plagiarism in a survey.

If a similar fraction of duplication were missed here, then over $5 billion in grant funds awarded since 1985 could involve duplicate efforts. That's probably an overestimate, but even the $70 million found here is concerning: any illegitimate duplication in research funding cuts into the money available for efforts elsewhere.
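The article doesn't show the extrapolation, but one plausible reading scales the detected figures by the gap between eTBLAST's detection rate and the survey-reported plagiarism rate from the earlier paper study. A hedged sketch of that arithmetic (the choice of which dollar figure to scale is my assumption):

```python
# Speculative reconstruction of the extrapolation; the exact basis
# isn't given in the article, so treat these as bracketing estimates.
detected_rate = 0.0004   # eTBLAST flagged 0.04% of papers
admitted_rate = 0.014    # 1.4% of scientists admitted plagiarism
miss_factor = admitted_rate / detected_rate   # 35x undercount

print(f"scaling the $69M overlap:  ${69e6 * miss_factor / 1e9:.1f}B")
print(f"scaling the $200M flagged: ${200e6 * miss_factor / 1e9:.1f}B")
# ~$2.4B and ~$7.0B, which bracket the article's >$5 billion figure
```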

Garner and his coauthors argue that government agencies need to make more of an effort to avoid funding duplicate proposals. A good start, they say, would be a centralized database of proposals across all government agencies, which would enable direct comparison of applications for any duplication. Considering the ever-tightening budgets of funding agencies and the growing number of scientists competing for grants, it's imperative that we make smart decisions about where to spend the money.

Kyle Niemeyer
Kyle is a science writer for Ars Technica. He is a postdoctoral scholar at Oregon State University and has a Ph.D. in mechanical engineering from Case Western Reserve University. Kyle's research focuses on combustion modeling. Email: kyleniemeyer.ars@gmail.com // Twitter: @kyle_niemeyer

43 Reader Comments

Misapplication of funds is one thing... but I'm sorry... I'm just going to go ahead and say that making an issue out of this is bullshit.

Of course we should always be concerned about spending fiscal resources wisely; however, the more important issue is that we increase the abysmal percentage of money spent on science and research overall. Look at the bigger picture to realize that we could cut an extremely small percentage off of say, I dunno, defense spending, and almost double the total amount of public funds spent on scientific research.

This 70 million figure seems to be blown out of proportion (in terms of context, not correctness). Not to mention that duplication and repeatability are core concepts that allow science to remain accurate. Maybe I'm missing something here.

I'm not sure how much credence to put into this. I can't recall the number of times I've used "lab boilerplate" or seen it used for a wide spectrum of related, but distinct projects. Labs are commonly working toward single over-arching hypotheses with the same personnel and same facilities - the important differences in execution, particular outcome measures, or model changes can be easily missed by an unfamiliar eye. Not to mention the iterative nature of many projects - although a lot of this would be covered by the similar non-consecutive grants mentioned by the author.

And the NIH, at least, definitely polices this. For example, I once wrote an SBIR grant with a faculty member at a major research university. Unbeknownst to me and my colleagues, he submitted the exact same research plan as a subset of an R01 about a month later. We heard from CSR and NIAMS pretty quickly - they were not happy.

(snip) However, having one place to apply for grants from all of these government agencies would be a wonderful way of saving resources for both the funding agencies and the researchers. I would be fully behind something like that.

One other factor I've heard from multiple scientists: in some fields with high startup costs (equipment, test subjects, etc.), it's hard to get grants without data suggesting your proposal will yield interesting results, but almost impossible to get funding to collect that data. This means grant applications often copy previously successful projects but at least partially fund the next stage of work, along with a few derivative papers expanding previous results.

As in many areas, I think part of the problem is a misaligned accountability model: it's often better to fund credible proposals quickly and audit some small percentage after the fact than to spend more money putting everyone through the full, exhaustive application process.