This is going to have to be a quick post. I've got too many urgent things piling up around me. This post isn't urgent, but it is important. There will, without question, be mistakes. I trust my loyal readers to point out, preferably with glee and schadenfreude, where I'm wrong.

Another caveat: This is not about Doc Becca. I count her as a friend. I am upset about her situation and where she is and the stupidity of her university. I've been following her for years, since she was stumbling towards the tenure track. I love her writing, and from what I know of her career IRL, she deserves tenure, based on her science and productivity.

Again, this is not about her. This is about all the things that got said about study section last night. Some of this is my opinion, but lots of it is about how study sections (SS) work.

So on to content, and not in a necessarily good order:

Firstly, remember that review happens in two steps: Study Section and Council. Study section reviews proposals for scientific merit and assigns a score. Later on, NIH staff normalize this score into a percentile based on all the scores from the current and the last two iterations of this section. At SS, the proposal may be discussed (if it's in roughly the top 50%) or not (triaged, in the bottom 50%). Any reviewer on the SS may ask for a triaged proposal to be discussed, before SS or even at SS. I have done this. It is not common, but not rare, either.
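To make the normalization step concrete, here is a rough sketch of the percentiling arithmetic. This is my illustration, not the official CSR formula: the base scores, the midpoint-rank convention, and the rounding are all assumptions.

```python
def percentile(score, base_scores):
    """Rank a priority score against a base of scores from the current
    and two prior rounds of the section (lower score = better).
    Illustrative only; not the official CSR calculation."""
    n = len(base_scores)
    # count of base scores strictly better (lower) than this one
    better = sum(1 for s in base_scores if s < score)
    # midpoint-rank convention, then scale to 0-100
    rank = better + 0.5
    return round(100 * rank / n)

# hypothetical scores from three rounds of a section
base = [20, 25, 30, 31, 35, 40, 42, 50, 55, 60]
print(percentile(31, base))  # prints 35
```

The point of using three rounds as the base is that a single meeting's pool may be unusually strong or weak; pooling rounds smooths that out.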

SS does not decide funding. Council does, with much input from NIH staff. The person who runs SS (a Scientific Review Officer) is not part of the team that decides funding or ranking of grants to be funded. SROs usually work for OER (the Office of Extramural Research), whereas staff POs (Program Officers) work for individual ICs. ICs decide funding. There is variation in the score that is funded, and there is variation in who, at what score, gets funded. ICs do this because they have programmatic priorities. If Bunny Hopping is in this year, a worse percentile may get picked up. Something that is outside the "main mission" of an IC may not get funded.

Secondly (and we are now on our third espresso of the morning): the way that SSs work is not a big mystery. Junior people can do a round on SS to learn how. I've got some posts on that, and will try to dig them up for you. NIH has a program for this and it is very valuable. I learned the most about grantsmanship when I sat on SS. So now, stuff about how SSs work:

Who decides who reviews which proposals? The SRO does. They have a miserable job here, and no time to worry about screwing you. Truly. They have a list of reviewers, which may not include sufficient expertise. They have to go begging for reviewers. Outside reviewers. I've been to reviews where I've been an outside person and there have been more outside people than standing members, because of the range of proposals that come in.

Sometimes outside reviewers don't go to the meeting, and call in. They haven't heard the other reviews, and they are not "calibrated" to the section. Most outside people are aware of this and defer to the ones who are there. But not always. This is another source of variation. But not everyone can drop everything for 2-3 days and go to Washington. You can say "no phone reviews," but that may mean worse reviewers. What's worse than a phone review? Someone more removed from the area of the proposal, who has no appreciation of either the premise or the design.

People serve on SS for 3 or 4 years, and technically are supposed to be at all the meetings. Some SSs have a program where you do an extra year and come to 2 of the 3 meetings each year. I was told that option was so over-subscribed at the one I'm joining that it's not an option for now. But, people join, people leave. Maybe the proposal had an ad hoc to start with, and that ad hoc can't do it again. You may not get the same reviewers; you may have 2 of the same, or 1 of the same, or none. There is not a set rule here. BUT! When a reviewer gets a new-to-them proposal that has been reviewed before, they get the entire summary sheets that the PI received. IME, reviewers read these and consider them (but more on re-review below).

(BTW: how does your proposal go to a particular section? That's another post, but in short: you can request, there are key words that help determine it, and there are people who do this.)

SROs have to get expertise, but they also cannot give any reviewer significantly more proposals than others. What counts as more? It varies from SS to SS. For some, it's 4 or 5. But I've been on ones where I've got 8 or 9, and that's standard for the sitting or standing members. (I sit, because I'm old, and standing hurts my back. This is a joke, do not read anything into it). It's a lot of work for everyone, and you get paid squat for doing it, and it's one of those things you do. I don't think anyone relishes the power involved. Ok, maybe there are a few antediluvian bigdogs who do. I don't. I just try to do the best damn job I can.

IN MY EXPERIENCE: SS members care. They work hard. They are obsessively concerned with being fair and just and right. They are sensitive to the PI, and take the "Investigator" criterion seriously. No one is out to screw you. But, of course, they are human, and have biases, and have a lot to do. Sometimes they think proposals are bad. Sometimes they get irritated with a proposal partway through (make the reviewer your ally, your advocate). But for the most part, in the reviews that I've seen, even the ones that don't get discussed, reviewers are capable of partitioning their perspectives. They can find both good and bad in a proposal. They try to balance those things, and realistically evaluate their relative importance.

Finally, a bit about some of the things that got said last night.

Scores going up and down? As much as I want Doc Becca to get funded, I do not see how one can be protected against a dropping score. The reviewers have a different proposal in front of them. Maybe the proposal is worse, to one of the ones reading it. I've received conflicting advice on a first submission before: add human subjects, do not add human subjects. Take out Aim 2, expand Aim 2. And, yes, conflicting advice is horrible to deal with, and you can't know if the person who gave that advice is going to review again. NIH has tried to circumscribe that kind of advice with "review the proposal in front of you, do not write a new version for the PI." But even just saying "this is good" or "this is weak" can show up in the same review.

If I was told that my review had to be limited by the previous reviews, it would make reviewing very hard for me. I read each proposal. I try to give each proposal my very best thoughts. If I see something glaringly bad, that got missed (as far as I can tell) in the previous review, I am not going to give it a pass because some other reviewer didn't see the problem. But, if I think something is very very good, significant, innovative, and the previous review said "meh", I am also going to point that out, and advocate for it.

People last night said that this is a problem with reviewing, if it happens all the time. I don't know if it happens all the time. I don't have statistics. It has certainly happened to me, more than once, and I've been putting in proposals for a very long time.

You may agree, or not, with the idea that scores can drop. But if you believe that having external reviewers, peer reviewers, reviewers from the larger community assess proposals is a good thing, and that a proposal can get worse, then you must admit that a score may drop.

If you want to limit how scores can move, then the system will need to change. Maybe that would be a good thing. I tend to think not, as things that limit reviewers will not, in general, improve the system. In the end, I suspect that those limits would be co-opted by those in power, those with the most grants, and the most time and resources to submit.

Is the system broken? Once again, I say no. It is not perfect. There are problems, and individuals who get lost or hurt or destroyed in the grinding of the gears. But the alternative to people like me reviewing grants is letting the POs at NIH make all the decisions. Even if they could (which they can't; they physically don't have the time), this would not be a good thing. Right now there is some flexibility there, and as is true of everything else, those people are human beings with all the attendant flaws of human beings.

I've not edited this the way I usually do, because it's late and I want to get it out. There's a lot more to be said, so likely another post on this.

===================

Update:

Mike Lauer did an analysis, and said >75% of proposals improved in percentile. Median improvement was ~10%. https://t.co/GthwImV8N3

Most SROs in my study section experience insist that scores for revised apps should not be benchmarked to the prior score.

Review is relative to the current round; there may be a stronger field for the revision.

The outraged voices on Twitter appear to be unaware that this proposed rule would lead us back to where the A0 has no chance of being funded, b/c the section has to deal with the revisions all getting better scores. The old traffic pattern of the A2 days had quite a few problems for noobs associated with it.

Here's an aspect that I don't believe is addressed: there should be peer pressure within the SS to tamp down review suggestions/demands that will, if actually fulfilled, likely make the proposal worse.

Sounds straightforward, right? Of course we shouldn't ask for things that would make it worse! But no, really: I've seen and heard of and read many reviews that paraphrase to: "this proposal would be stronger if it included X."

So, naively, applicant adds an aim for X. This means some of: removing another aim. Adding expertise in X as a collaborator. Adding introduction space to support X. Creating a link between your previous aims and expected results, and X. Finding some way to establish feasibility for X.

All of those become huge pitfalls, for someone without direct expertise in X. And sometimes, just maybe sometimes, the original remark was .... a bit of a stock critique, a throwaway line for someone who just didn't like the grant but couldn't find something more specific or functional to say at 10pm on the plane the night before the meeting. Not necessarily someone giving detailed or deep thought to how, why, where X could be added, or what that would really look like or mean. In SS, heads are nodded in agreement. Grant gets triaged on revision.

We dance on command, hoping all the "move your knee to the left this time" variations have a lot more meaning than maybe they do.

But imagine if the reviewer were themselves required to provide an "expectations and alternatives" list for the X that they suggest should be added?

Current instructions (from NIH) to SS members include: do not suggest "new things" to be added. Review what is there. I don't recall many, if any recent, examples of telling someone to add "X" to the proposal.

Grantspersonship also dictates that adding an aim for which you have neither experience nor preliminary data is very seldom a good thing to do. Rather than adding something new, reconfiguring the proposal to exclude the need for the missing thing is often a better path.

Review of NIH grants is explicitly NOT to help the applicant make her proposal better the next time. So if one believes the job is to "dance on command" and then believes that a score should get better because one did what the reviewers said to do, this is discordant with the way NIH views the process.

As I said last night on Twitter, a big factor is churn in the study section membership. On the SS I recently left, there were 9 others who rotated off the same cycle (plus another couple the same year), so more than half of the membership changed within a couple of cycles. With that level of turnover it's difficult to maintain consistency. To help out, our SRO used to send "reviewer average" scores before the meeting, so everyone knew if they were scoring higher or lower than the overall pool. For some, this would drive re-calibration prior to or at the meeting. The key here is regression to the mean: the stated goal of the CSR higher-ups is to get the overall average score for the meeting in the 4-5 range.
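The "reviewer average" calibration described above is simple arithmetic, and a sketch may make it clearer. The reviewer names and preliminary scores below are made up for illustration:

```python
# hypothetical preliminary scores, keyed by reviewer
prelim = {
    "reviewer_a": [2, 3, 2, 4, 3],   # scores generously (low = good)
    "reviewer_b": [5, 6, 5, 7, 6],   # scores harshly
    "reviewer_c": [4, 4, 5, 4, 5],   # close to the pool
}

# pool mean across every preliminary score in the meeting
all_scores = [s for scores in prelim.values() for s in scores]
pool_mean = sum(all_scores) / len(all_scores)

# each reviewer sees how far their average sits from the pool,
# which is the signal that may drive re-calibration at the meeting
for name, scores in prelim.items():
    mean = sum(scores) / len(scores)
    print(f"{name}: mean {mean:.2f} vs pool {pool_mean:.2f} ({mean - pool_mean:+.2f})")
```

A reviewer whose average sits well above or below the pool isn't necessarily wrong, but the gap is what prompts the "am I scoring differently from everyone else?" conversation.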

And as you so rightly state here, it's all about the group of proposals in the individual reviewer's hands. If your reviewer has a pile of crap then maybe yours might rise to the top. If someone has a lot of very good grants then you're S.O.L. (as a reviewer I've experienced both extremes). Also consider types of applications - some reviewers get a stack of ESI proposals, while others may only get 1 or 2 mixed in with the BSD proposals. Which group would you rather have your proposal in with? Your grant doesn't have to be good, just better than the others in that person's pile.

Another thing that got discussed last night was "any SRO worth their salt..." There are some great SROs out there, and I have heard of (although not experienced personally) some weak ones too. The SRO and the chair need to work hard to make sure shenanigans are reined in fast. Simple reminders like "why did you score it a 3 if you found no weaknesses?" can go a long way to ensuring that the reviews are actually useful to the applicants. But things do slip through the cracks.

Finally, regarding the whole A0>A1>A0asA2 thing, it should be emphasized that the job of the reviewers is not to write the proposal for you. This is specifically WHY they got rid of A2s to begin with! The NIH didn't want an iterative process, and specifically wanted to avoid the whole "I did what they asked for so now I deserve to be funded" mentality. The science in the proposal, regardless of A#, needs to stand alone. If the score got worse then maybe the science is actually improved but the field moved on, and what was hot 6 months ago is now no longer so.

I am not sure what, if anything, can be done about churn. Well, actually, one thing is to start recruiting younger/more junior reviewers. But reviewing is hard, and even if you turn something around expeditiously, you will likely cross a "new year" boundary and have a change in 1/4 to 1/3 membership.

BUT! When a reviewer gets a new-to-them proposal that has been reviewed before, they get the entire summary sheets that the PI received. IME, reviewers read these and consider them.

One thing that remains mind-bogglingly insane to my way of thinking is that the NIH provides reviewers of revisions with the prior summary statements but without access to the prior version of the proposal itself. Someone who has reviewed the prior version has their memory to rely upon, as does anyone who was present for the discussion of that prior version (if it was not triaged). The new person to the panel has just the summary statement. There is an unevenness in this that disturbs me. Sometimes the only proper review response to a prior summary statement is "those guys were high, so it is no problem that the applicant blew it off." Sometimes you can only tell this by reading the prior version of the application.

This gets even nuttier with competing continuations where the A0 reviewers get the summary statement from five years prior but the A1 reviewers only see the one for the renewal version.

If review isn't supposed to be benchmarked to the prior review and reviewers aren't supposed to be "fixing" the proposal...why are we assessing responsivity to prior review?

In the olden dayes, one received a copy of the original proposal, too. When there were multiple submissions, one received multiple sets. I don't remember when it changed. Recently, I have not received the A0 for competing continuations, or whatever they are called now.

All NSF proposals are treated as new submissions and panelists are not given previous panel summaries. I've seen (and have myself written) responses to reviewers, but this is not required. If the point of the panel is to review the grant at hand, I think this is a better system. I worry that giving reviewers summary sheets and requiring responses to prior reviews introduces additional bias against grants that scored poorly in the first round.

I've done NSF, and there are many pluses to this system. On the other hand, proposals also were submitted over and over (five times, six times) and it tended to favor senior people over junior. Does this mean making junior, mid-career, and senior categories? Weighting scores inversely by age?

These are difficult questions. NSF is more flexible than NIH because of the lack of strict percentiles. I think this added flexibility can greatly benefit junior folks, as POs have more power to increase equity. Of course that flexibility can also invite bias. Weighting is difficult with only a small number of scoring categories, but perhaps targeted success rates across differing demographics would work. From my conversations with POs regarding "portfolio balance," I know that NSF cares about this, but I'm not sure that they have formal target numbers.
