Although our analysis focused on PIs from one institute funded in a single year, NIH Deputy Director for Extramural Research Mike Lauer and his colleagues have extended the analysis across NIH over a much longer period of time. This group recently posted their analysis, including all of the underlying data, on bioRxiv. Publicly sharing this data set is a very good practice that allows others to examine the results more thoroughly and to extend the analysis.

A key graph from this analysis, and one that has attracted much attention, is shown below:

The graph shows a curve fit to data for research productivity versus grant support, using recently developed measures for both quantities. Annual grant support is measured by the Grant Support Index (GSI). This measure was developed as an alternative to funding level in dollars, in recognition of the fact that some types of research are inherently more expensive than others. The GSI assigns point values to each grant type: 7 points for an R01 grant with a single PI, 5 points for a more limited R21 grant with a single PI, and so on. Research productivity is measured with the Relative Citation Ratio (RCR), a citation-based metric developed to correct for differences in citation behavior between fields. Both metrics are plotted on logarithmic scales in this graph.
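As a minimal sketch of how an annual GSI total is tallied (only the two point values quoted above are used; the real GSI table assigns values to many more grant types and to multi-PI configurations, and the dictionary and function names here are mine, not NIH's):

```python
# Hypothetical sketch of GSI tallying. Only the two point values quoted in
# the text are used (single-PI R01 = 7, single-PI R21 = 5); the full GSI
# table covers many more grant types and multi-PI arrangements.
GSI_POINTS = {"R01": 7, "R21": 5}

def annual_gsi(grants):
    """Sum the point values of a PI's active grants."""
    return sum(GSI_POINTS[g] for g in grants)

# A PI holding two single-PI R01s and one R21:
print(annual_gsi(["R01", "R01", "R21"]))  # 19
```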

The most noteworthy aspect of this curve is that it rises more steeply at lower values of GSI than at higher values. This suggests that, on average, funding an additional grant to an already well-funded investigator yields a smaller increase in productivity than providing a first grant to an unfunded investigator or a second grant to an investigator with only a modest amount of funding. The gap between this observed curve and a hypothetical straight line (with productivity strictly proportional to research support) has been referred to as “unrealized productivity.”

Before delving further into this point, let us take advantage of the data that were made available to plot the relationship, with two changes. First, rather than just plotting the curve fit to the data, we show the data points themselves (for all 71,936 investigators used in the analysis). Second, we plot the data with linear rather than logarithmic scales to avoid any distortion associated with this transformation. The results are shown below, with the top graph showing all of the data points and the bottom graph enlarging the region that includes almost all investigators and showing a “spline” curve fit to these data along with a linear fit for comparison.

These plots reveal that the underlying data show a large amount of scatter, consistent with my earlier observations with the NIGMS-only data set as well as with the intuitive sense that laboratories with similar amounts of funding can vary substantially in their output. The curve fit to these data again reveals that the slope of the productivity versus grant support relationship decreases somewhat at higher levels of grant support.

With these observations in hand, we can now examine some expected results of proposed NIH policies. Suppose an investigator with an annual GSI of 28 (corresponding to four R01 grants) is reduced to an annual GSI of 21 (corresponding to three R01 grants), and that the freed resources are used to fund a previously unfunded investigator (moving that investigator to GSI = 7). According to the fit curve, the expected annual weighted RCR values are 9.0 for GSI = 28, 7.1 for GSI = 21, and 2.6 for GSI = 7. The anticipated net change in annual weighted RCR is (7.1 − 9.0) + (2.6 − 0) = 0.7. Thus, the transfer of funding is predicted to increase overall productivity (measured by weighted RCR). This appears to be one of the primary foundations for the proposed NIH policy.
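The arithmetic behind this estimate can be written out directly; the weighted-RCR values are simply read off the fit curve at the GSI levels quoted above:

```python
# Weighted-RCR values read off the fit curve at the GSI levels quoted above.
rcr_at = {0: 0.0, 7: 2.6, 21: 7.1, 28: 9.0}

# One PI drops from GSI 28 to 21; the freed 7 points fund a PI moving 0 -> 7.
change = (rcr_at[21] - rcr_at[28]) + (rcr_at[7] - rcr_at[0])
print(round(change, 1))  # 0.7
```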

This approach depends on the accuracy with which the fitted curve represents the behavior of the population which, as noted, shows considerable scatter. An alternative method is to simulate the effects of the proposed policy on the population directly. For example, one can take the 968 investigators with annual GSI values over 21 and reduce them to annual GSI values of 21, scaling each investigator’s weighted RCR output in proportion to the reduction in annual GSI. The total number of annual GSI points above the threshold of 21 for these investigators is 4709, which corresponds to the ability to fund an additional 672 R01 grants (at 7 points each). If these grants are distributed to previously unfunded investigators, the anticipated weighted RCR output can be estimated by choosing a random set of 672 investigators with annual GSI values near 7 (say, 6 to 8). Because of this random element, the simulation can be repeated many times to generate a population of anticipated outcomes. This yields the distribution shown below, with an average increase in weighted RCR of 0.3.
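A minimal sketch of this kind of simulation, using a synthetic population in place of the actual bioRxiv data set (the population format, function name, and parameter names are mine, not from the posted analysis):

```python
import random

def simulate_cap(pop, cap=21.0, r01=7.0, band=(6.0, 8.0), trials=1000):
    """Monte Carlo sketch of the cap-and-redistribute policy described above.

    pop is a list of (annual_gsi, weighted_rcr) pairs, one per investigator.
    PIs over the cap keep output scaled in proportion to their reduced GSI;
    the freed GSI points fund new R01s whose anticipated output is sampled
    at random from PIs in the GSI ~7 band.
    """
    over = [(g, r) for g, r in pop if g > cap]
    lost = sum(r * (1 - cap / g) for g, r in over)   # output forgone by capped PIs
    freed = sum(g - cap for g, _ in over)            # GSI points released
    n_new = int(freed // r01)                        # additional R01s fundable
    donors = [r for g, r in pop if band[0] <= g <= band[1]]
    outcomes = []
    for _ in range(trials):
        gained = sum(random.choice(donors) for _ in range(n_new))
        outcomes.append(gained - lost)
    return sum(outcomes) / len(outcomes)
```

Repeating the random draw many times yields a distribution of anticipated changes in total weighted RCR, analogous to the one described in the text.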

For most simulations, there is an increase in average weighted RCR, although the average is somewhat less than that anticipated from the analysis based on the fit curve alone (0.3 versus 0.7). There are several possible explanations for this difference, including the limited ability of the fit curve to capture the features of the highly scattered distribution and the approach used to model the reduction in anticipated output from the well-funded investigators.

The same simulation method can be applied to funding an additional 672 PIs who already hold one R01 so that they each have two, by selecting 672 random PIs with an annual GSI of ~7 (6 to 8), removing them from the population, and adding 672 PIs drawn from the population with an annual GSI of ~14 (13 to 15). The results are shown below, with an average increase in weighted RCR of 0.4.
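This variant can be sketched the same way, again with synthetic inputs and names of my own choosing; the `lost_from_caps` parameter carries over the output forgone by the capped PIs:

```python
import random

def simulate_second_r01(pop, n=672, lost_from_caps=0.0, trials=1000):
    """Sketch of the second-R01 variant: n randomly chosen PIs in the GSI ~7
    band (6-8) are upgraded to two R01s; their current output is removed and
    replaced by output sampled from PIs already in the GSI ~14 band (13-15)."""
    band7 = [r for g, r in pop if 6 <= g <= 8]
    band14 = [r for g, r in pop if 13 <= g <= 15]
    outcomes = []
    for _ in range(trials):
        removed = sum(random.sample(band7, n))                # upgraded PIs' old output
        added = sum(random.choice(band14) for _ in range(n))  # anticipated new output
        outcomes.append(added - removed - lost_from_caps)
    return sum(outcomes) / len(outcomes)
```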

These simulations appear to confirm that, on average, transferring funding from very well-funded PIs to less well-funded PIs may result in a small increase in weighted RCR output.

Conclusions

I strongly favor the examination of appropriate data to guide policy development. Understanding the relationship between grant support and research output is, of course, one of the most fundamental questions for any funding agency, and the attempts by NIH to tackle this issue are laudable. However, as I discussed above, presenting simple curves fit to the data masks the considerable variation in output among PIs at all levels of funding.

The development of policies based on a hard cap at a particular level of GSI seems to me to be problematic. Well-funded investigators always have substantial histories of research accomplishments. NIH program officers and advisory councils should have access to data about previous research accomplishments and productivity when making recommendations about funding additional grants, and they should be encouraged to examine such data critically, even when the application under consideration has an outstanding peer review score. The opportunity costs of providing additional funding to an already well-funded PI at the expense of an early- or mid-career PI with less funding are considerable.

In addition, a hard cap amplifies the importance of the details of how the GSI is calculated, since the selection of particular parameters could discourage collaboration, training, and other desirable outputs, as has been the topic of ongoing discussions. It seems unwise to reduce the highly nuanced information contained in lists of grant support, publications, and other outputs to points on a graph, rather than empowering the trained scientists who serve as program officials and advisory council members to use their judgment to help fulfill the mission of the NIH.