Computational Complexity and other fun stuff in math and computer science from Lance Fortnow and Bill Gasarch

Friday, October 29, 2004

The Co-Author Conundrum

In theoretical computer science we traditionally list the co-authors
of our papers alphabetically. Done this way for "fairness",
it leads to a binary notion of authorship. Either you are an equal author of a
paper or you are off the paper. There is no middle ground.

In our publish-or-perish society, authoring papers helps you succeed
in getting hired and promoted and in receiving grants and awards. So
choosing who is an author of a paper, particularly important papers,
can be an important and sometimes messy decision complicated by the
fact that the authors have to do the choosing.

An author should have made significant contributions to a
paper. But how do we define significant? A person who produces key
ideas in the proof of a main result certainly becomes an author. A
person who simply writes up the proof should not be. But what about
the person who works out the messy but straightforward details in a
proof? What about the person who poses the questions but has no role
in the proof? Tricky situations that one needs to handle on a
case-by-case basis.

An advisor should hold himself or herself to a higher standard. A good advisor
guides the research for a student and should not become a co-author
unless the advisor had made the majority of the important ideas in the
proofs. Likewise we hold students to a slightly lower standard
to get them involved in research and exposition of their work.

Computer scientists tend to add co-authors generously. While seemingly
nice, this makes it difficult to judge the role authors have played in
a paper, and sometimes makes who you know or where you are more
important than what you know.

8 comments:

Sometimes I think that this is a product of a certain amount of conflict-averseness in our community. It's messy to argue about who contributed what, and so we don't. It goes back to the whole 'discussions should be objective since the work is' theme I had talked about earlier.

And incidentally, I think this phenomenon is true primarily for theory, not all of CS. In other areas of CS, papers are most often written in contribution order, with first authorship being the usual meaningful datum that it is in the natural sciences.

I am not sure how to interpret the term "majority" in your sentence "A good advisor ... should not become a co-author unless the advisor had made the majority of the important ideas in the proofs." For instance, among peers a 40:60 split or even a 1/3:2/3 split will probably be accepted as a significant contribution. Applying a very different rule in an advisor-student co-authorship may give some students an unfair advantage (e.g. a single-authored STOC paper, possibly even a best student paper award). Don't you make a distinction between guiding research and actively participating in it?

Of course, more than two authors might make things even more complicated.

I think a distinction should be made between beginning grad students and advanced grad students. In the first 2-3 years of grad school, a student is bound to have gaps in his knowledge of his area, and should be encouraged by not being held to the usual standard. Academia is a Darwinian world; some parenting may be required. There is no reason, however, for treating an advanced grad student any differently from the typical researcher, as far as co-authorship is concerned.

Unfortunately, the binary nature of coauthorship in CS creates a situation in which opportunists can flourish. If a person can just sit in on a discussion for long enough or sit quietly through enough meetings, they will probably be able to get their name on a paper to which they contributed almost nothing. Hopefully this kind of behavior eventually catches up to a person, when it becomes common knowledge to members of the community. This is why, in job applications for instance, all of a person's research should be judged with respect to their letters of reference, and those references should specifically address the _leadership_ role of the applicant in performing research. Theoretical CS has long been plagued by the curse of the "least publishable unit." Unfortunately, more and more young people in the field (i.e. graduate students) seem to be aiming for the "least acceptable contribution" (it has a nice acronym). Instead of deep, well thought-out research plans, we are seeing shotgun-style fill-up-your-cv-at-minimal-cost research that is quickly forgotten and which, in the long run, only serves to degrade the status of the community as a whole.

Even within theoretical CS, different schools seem to have different traditions of the level of contribution required for co-authorship. Here are two scenarios that I suspect would have varying answers:

Scenario I: Student X is working on a problem and has a conversation with Y who gives an insightful pointer to X who then in turn works out the solution. Without the insightful comment it is not clear whether X would have found the solution. Y did not have any role in working out the details.

What should the co-authorship decision be if:
- Y is another student?
- Y is a post-doc?
- Y is another outside researcher?
- Y is the student's advisor?

Scenario II: Y has spent a while working on a problem and has not solved it. Y discusses the problem with X and mentions details of failed attempts. Some time later X solves the problem via different techniques from the ones Y was using.

What should the co-authorship decision be if:
- X and Y are fellow students?
- X and Y are post-docs?
- X and Y are both senior researchers?
- The elapsed time is one week?
- The elapsed time is one month?
- The elapsed time is one year?

My personal opinion is that our method of "binary authorship", with the tendency when in doubt to add a person on the list, is to be much preferred over other methods.

Attempting to determine author order in papers with 3 or more authors can be a nightmare, in the worst case leading to fights that ensure those authors never work together again. I decided early on that I wanted to spend my time writing papers, not fighting about who did what and who should take credit for it. If I wanted that, I could have gone into business.

The worst case for the binary method is that, for external people who need or want to judge such things, assigning true credit is difficult. For most people, this will even out in the long run. For other people, I trust in higher-level reputation mechanisms to ensure proper accounting. You can tell by things like who gives the talk at a conference (or elsewhere) and who goes on to write further papers in the area. You could even get the information directly from the other authors if you really needed it. Or indirectly -- someone who contributes little to a group paper is less likely to be invited to the group the next time.

As someone from outside the Theory community, I do not see the advantage of the binary view of authorship, for the very reasons that have been mentioned previously. I suppose that in a few rare cases it might be difficult to determine order, but in the majority of circumstances where it is relatively clear, why not allow the freedom to give credit where it is due?

It seems to me that the "messiness" involved in determining who is _on_ the paper would be far more ugly than determining order, so I'm not sure what grief is being saved. A wrong decision in this regard could certainly cause more unease than a "3rd vs. 4th author" debate.

I'm also not sure if there is adequate justification for one of the responders making the claim that young people are aiming for "least acceptable contributions." This seems like crotchety-bitter-old-theorist talk to me. Besides, it's 2006 and most of the big problems have certainly been solved, so might concentrating on smaller problems be a necessary evil?