31 March 2007

Apologies for the long silence -- life got in the way. Part of life was AIstats, which I'll blog about shortly, but I'm waiting for them to get the proceedings online so I can (easily) link to papers.

A common (often heated) argument in the scientific community is the "blind" versus "not-blind" review process -- i.e., can the reviewers see the identity of the authors. Many other bloggers have talked about this in the past (here and here, for instance). I don't want to talk about this issue right now, but rather one specific consequence of it that I don't think actually needs to be a consequence. But it often is. And I think that, more than anything else, it serves to hurt research.

Recent anecdote: Less than a month ago, I was talking with my friend John Blitzer at UPenn over email about domain adaptation stuff. We both care about this problem and have worked on different angles of it. Initially we were talking about his NIPS paper, but the discussion diffused into more general aspects of the problem. At the time, I had been working furiously on a somewhat clever, but overall incredibly simple, approach to domain adaptation (it managed to make it into ACL this year -- draft here). The interesting thing about this approach was that it completely went against the theoretical bounds in his paper (essentially because those are crafted to be worst case). It doesn't contradict the bounds, of course, but it shows that good adaptation is possible in many cases where the bounds give no information.

Of course I told him about this, right? Well, actually, no. I didn't. In retrospect this was stupid and a mistake, but it's a stupid mistake I've made over and over again. Why did I do this? Because at the time I knew that the paper was on its way to ACL -- perhaps I had even submitted it at that point; I cannot remember. And if he happened to be one of the reviewers on it, then he would obviously know it was me, and the "double blind" aspect of ACL would be obviated. Or -- perhaps worse -- he would have marked it as a conflict of interest (though indeed I don't think it was) because he knew my identity. And then I would have lost the opinion of a pretty smart guy.

All this is to say: I think there's often a temptation not to talk about ongoing research because of the double-blind rule. But this is ridiculous because the people who are close to you in research area are exactly those to whom you should talk! I don't think the solution has anything to do with reversing double-blind (though that would solve it, too). I think the solution is just to realize that in many cases we will know the authorship of a paper we review, and we shouldn't try so hard to hide this. Hiding it only hinders progress. We should talk to whomever we want about whatever we want, irregardless of whether this person may or may not review a paper on the topic later. (As a reviewer, their identity is hidden anyway, so who cares!)

(Briefly about conflict of interest. I used to be somewhat liberal in saying I had a COI for a paper if I knew who the author was. This is a bad definition of COI, since it means that I have a COI with nearly every paper in my area(s). A true COI should be when I have something to gain by this paper being published. E.g., it is written by my advisor, student, or very very very close colleague -- i.e., one with whom I publish regularly, though even that seems a bit of a stretch.)

13 comments:

Anonymous said...

Irregardless. Should be regardless. The error results from a failure to see the negative in -less and from a desire to get it in as a prefix, suggested by such words as irregular, irresponsible, and, perhaps especially, irrespective.

I admit I face the same dilemmas: should I mention what I'm working on to someone who might review me? Should I send a draft of my work to someone who might be interested, but who might review it?

But looking at it from the side, it seems utterly dumb. By looking at your paper, it's quite obvious that you wrote it (you use Searn, and more than half of the refs are either to you or your blog). I believe this is true to a lesser extent for my current work as well.

Quite true, though this paper is a bit extreme in that regard. Part of the reason is that I was actually quite worried that reviewers wouldn't know it was me. Essentially, I wanted to say something like: "Look, two years ago we had this technique that worked well but was slow and really complicated. Now we can do exactly the same thing with zero effort." But I didn't want a reviewer to say "this paper is too mean to D+M 2006" (though presumably an area chair could catch that).

Blind reviewing is a joke perpetrated in the interest of "fairness". If you've ever been on a program committee, you know it's pretty easy to tell who wrote what. Both research foci and resource limitations (knowledge, equipment, corpora, previous system versions, etc.) contrive to make it easy. So what happens is that the well known researchers are not blind, and this only amplifies the problem it's intended to solve. You even start to recognize LaTeX styles (e.g., one categorial grammar researcher never capitalized section titles, thus making him easy to spot).

Note that neither NIH nor NSF reviewing is blind. In fact, it's just like the business world. CEOs sit on each other's boards and vote each other raises and other benefits. Prominent researchers sit on each other's tenure/promotion/hiring committees and on each other's grant review boards. All of these are invitation-only positions. I was so appalled at the insidery-ness of my first NSF panel (about ten years ago) that I've refused to sit on these panels ever since.

My fave experience with blindness was the first ACL with blind reviewing (1992, I think). I was on the PC and was asked to leave the room when my paper was reviewed. I guess that tipped the committee off. The paper was rejected because the reviewers said it replicated results in my feature structure book! It was Gerald Penn and I proving an unsolved conjecture, but I guess we did too good a job anonymizing ourselves and not pointing out clearly enough it was a new result.

The flip side of blind submission is blind reviewing. I like to know who "recommended" a paper, and I think it keeps reviewers more honest. JMLR is nice in this regard.

John and I just had a long conversation about reviewing yesterday. I'll just say some of the relevant parts to your post.

I am a fan of double blind reviewing, but not because it's really a total secret who wrote the paper. I just finished reviewing for a non-double-blind conference. It was strange to know the authors of the paper, especially since I knew some of them. I don't think there was any conflict of interest, but it was difficult to separate what I knew about the authors from the review. In contrast, the previous conference I reviewed for was double blind. In that setting, I was able to figure out the authors based on the work. However, I wasn't sure exactly who authored the paper (who was on it, who advised, who wrote it, etc.), though I could certainly identify which group was behind it. I found that this small amount of doubt made the difference. It was easier to separate author from paper, which is the way I think it should be.

There seems to be evidence that knowing author identity affects acceptance (based on gender, university, etc.). I'd hope this wasn't an issue, but since it is, it cannot be ignored. There are some excellent papers that come from single authors who haven't built a reputation yet. Unfortunately, I think there is a bias created by knowing this was the case on a paper.

Of course, if the issue is that not identifying yourself makes the paper worse (because you can't refer to previous work directly) then that is a problem. However, I haven't found this to be the case with what I have written or read.

In regards to your ACL paper, I don't know if John pointed out to you the title of our CoNLL paper: "Frustratingly Hard Domain Adaptation for Dependency Parsing." I hope you enjoy it.