05 October 2010

I try to be a good reviewer, but like everything, reviewing is a learning process. About five years ago, I was reviewing a journal paper and made an error. I don't want to give up anonymity in this post, so I'm going to be vague in places that don't matter.

I was reviewing a paper that I thought was overall pretty strong. I thought there was an interesting connection to some paper by Alice Smith (not the author's real name) from the past few years and mentioned this in my review. Not a connection that made the current paper irrelevant, but something the authors should probably talk about. In the revision response, the authors said that they had tried to find Smith's paper but could not figure out which one I was talking about, and asked for a pointer. I spent the next five hours looking for the reference and couldn't find it myself. It turned out that I was actually thinking of a paper by Bob Jones, so I provided that citation. But the Jones paper wasn't even as relevant as it had seemed when I wrote the review, so I apologized and told the authors they didn't really need to cover it that closely.

Now, you might be thinking to yourself: aha, now I know that Hal was the reviewer of my paper! I remember that happening to me!

But, sadly, this is not true. I get reviews like this all the time, and I feel it's one of the most irresponsible things reviewers can do. In fact, I don't think a single reviewing cycle has passed where I haven't gotten a review like this. The problem with such reviews is that they let a reviewer make whatever claim they want, without any expectation that they have to back it up. And the claims are usually wrong. The reviewers aren't necessarily being mean (I wasn't trying to be mean), but sometimes they are.

Here are some of the most ridiculous cases I've seen. I mention these just to show how often this problem occurs. These are all on papers of mine.

One reviewer wrote "This idea is so obvious this must have been done before." This is probably the most humorous example I've seen, but the reviewer was clearly serious. And no, this was not in a review for one of the "frustratingly easy" papers.

In an NSF grant review for an educational proposal, we were informed by 4 of 7 reviewers (who each wrote about a paragraph) that our ideas had been done in SIGCSE several times. Before submitting, we had skimmed or read the past 8 years of SIGCSE and could find nothing. (Maybe it's true and we were just looking in the wrong place, but that still isn't helpful.) In the end, it strongly seemed that this was basically their way of saying "you are not one of us."

In a paper on technique X for task A, we were told flatly, with no citations, that it's well known that technique Y works better. The paper was rejected; we went and implemented Y, and found that it worked worse on task A. We later found one paper saying that Y works better than X on task B, for a B fairly different from A.

In another paper, we were told that what we were doing had been done before and in this case a citation was provided. The citation was to one of our own papers, and it was quite different by any reasonable metric. At least a citation was provided, but it was clear that the reviewer hadn't bothered reading it.

We were told that we had missed an enormous amount of related work that could be found by a simple web search. I've written such things in reviews myself, often saying something like "search for 'non-parametric Bayesian'" or something like that. But here, no keywords were provided. It's entirely possible (especially when someone moves into a new domain) to miss a large body of related work because you don't know how to find it. That's fine -- just tell me how to find it if you don't want to actually provide citations.

There are other examples I could cite from my own experience, but I think you get the idea.

I'm posting this not to gripe (though it's always fun to gripe about reviewing), but to try to draw attention to this problem. It's really just an issue of laziness. If I had bothered trying to look up a reference for Alice Smith's paper, I would have immediately realized I was wrong. But I was lazy. Luckily this didn't adversely affect the paper's eventual acceptance (journals are useful in that way -- authors can push back -- and yes, I know you can do this in author responses too, but you really need two rounds to make it work in this case).

I've really really tried ever since my experience above to not ever do this again. And I would encourage future reviewers to try to avoid the temptation to do this: you may find your memory isn't as good as you think. I would also encourage area chairs and co-reviewers to push their colleagues to actually provide citations for otherwise unsubstantiated claims.

9 comments:

Anonymous said...

Hi Hal,

I think this reflects the asymmetrical relationship between authors and reviewers. The reviewers vet what the author says, but nobody vets the reviewers.

Perhaps we should ask area chairs to explicitly vet reviews to make sure that they don't contain unsubstantiated claims? (I'm not suggesting that the area chair check everything the reviewer says -- rather, the area chair should ensure that the reviews don't contain the vague references you mention).

If this is too much work for the area chairs, perhaps we might need a small group of meta-reviewers?

How about: "Nice paper that obviously involves a lot of work, but as this interests only the small community working on the X language, I suggest accepting it only if other papers on the same language are also accepted"

or even better: "This is interesting, but the fact that the grammatical framework X needs such machinery while the old framework Y, proposed 30 years ago, does not, suggests that something is obviously wrong with X, so I suggest rejecting the paper even though it contains interesting points"...

I'm not sure there's any way around relying on reviewers' good faith, but at some point I wonder whether being reviewed by anonymous peers is such a good thing.

There are other forms of laziness. I once got a review that rejected my paper on account of the difficulty of interpreting one of the effects I had reported. The reviewer even helpfully cited the region of interest in question.

The only problem was that there was no such effect. Not in the figure. Not in the text. Just not there. It wasn't even an issue of confusing one effect with another, since in the entire paper, there was only *one* significant effect reported!

I think, though, that the problem is deeper than lazy reviewers. The problem is with overworked reviewers. Reviewing is something we do out of the goodness of our hearts, but it's not usually in our individual interest to do it. Our individual interest would be better served by doing research, writing grants, or possibly even by sleeping. I'm not so sure people are lazy so much as busy. We have other things to do.

I recently attended a workshop consisting mainly of mathematicians and applied scientists with a strong connection to the *primary* literature on mathematics. The thing I was immediately struck by was the extent to which subjective judgments on relevance or aesthetics are separated from objective statements such as theorem A differs from theorem B by addressing class x instead of class y. If x > y or x < y in a formal sense then the two theorems both have a 'right to exist'.

One problem we have in more applied areas is the extent to which > or < are arbitrarily applied. On the one hand, algorithms are shot down because even when they work in a precise setting, reviewers may argue that since it doesn't solve "all other" problems with that level of efficiency the result is uninteresting. On the other hand, multiple approaches to solving the same problem are not really cherished in the same way as in mathematics - we feel the need to pick winners prematurely and relegate everything else to the dustbins. Is that wise?

Every time I look at the title of this post, I read "my grant reviewing error" and expect much more detail on NSF, SBIR or similar. I then blink a few times, since the "r" in "grant" seems rather thin.

I most strongly agree with the metareview idea, and further suggest that we implement something akin to "Multilevel Coarse-to-fine Paper Parsing." Reviewers at each level should still always give constructive criticism, but the first pass would have a relatively low threshold, yet cull close to half the papers, and an individual reviewer needn't know where they fit in the scheme. It could be dynamic - say, a given review may be interpreted at the third iteration, or at the sixth (with much finer teeth).