Monday, February 16, 2009

Some Notes on Collaboration

[For part of tomorrow's CS 222 class, we'll be talking about collaboration, inspired by Gowers's Polymath project. It fits in with one of the themes in the class, which is getting young grad students or senior undergrads into research mode. While I swear I've written something like this up before, now I can't find it, so I'm writing it down now and it will be part of my class "lecture" for CS 222 tomorrow. Think of this as a draft of lecture notes, which -- inspired in part by Luca's online blog notes -- I'm putting online. And following the theme of collaboration, feel free to add comments, suggestions, or advice to students that you think is useful.]

In this class, you'll be doing a final research project, most of you with one or more partners. I'd like to spend a little time talking about collaboration, some tricks for doing it, and how important it can be for you as a graduate student.

I think most undergraduates -- especially those who aren't scientists -- have the wrong impression that science and especially computer science isn't very collaborative. The common image is of a professor hiding away in his office, or a coder alone with the terminal. In contrast, those of you who have been in the terminal room late the night before a programming assignment probably know how collaborative, and social, coding can be. And the large majority of my papers, and most papers written in computer science, have more than one author.

In fact, as a graduate student, collaborating successfully is likely to be key to your success. Synergy is real. Research is inherently non-linear; it's not how much time you spend working on a problem, it's coming up with the right idea. And working with others, for many people, simply leads to better ideas more quickly. A key insight could require knowledge that each individual doesn't have alone; a different perspective can move someone forward when they're stuck; an idea one might be quick to discard as being the wrong path might prove fruitful in another's eyes. And beyond that, collaborating is often fun, and having fun while working on a problem can make people more productive on its own. So there are reasons House has his staff, Buffy has her Scooby gang, and even Holmes hangs out with Watson.

Some quick thoughts about collaborating. First, as a beginning graduate student, you are probably very concerned about who gets the credit. My advice is to put this aside. When you go into a project, you don't know who will hit on the right path -- and it's very possible that the thought you had that broke open the problem might not have happened if you hadn't been working with those other people. Think of the long term -- you'll be known for your body of work over your lifetime; people will see what you can do over a number of projects. You don't need to focus on specific credit for this specific project. And you want to foster an environment where you can collaborate with others easily and naturally. That becomes harder when you're always worried about who will get the credit. For more on this, see also Hardy and Littlewood's Four Axioms for Collaboration, or the end of Fan Chung's notes for graduate students.

Another thought about collaborating -- although, actually, you can use this idea even when you're working on a project yourself! I've generally found that when working with others in research, people implicitly tend to take on naturally paired, contrasting roles. In fact, I think people taking on these different roles can greatly enhance the process -- so it may be worthwhile for people to explicitly take on these roles (and switch off from time to time)! The sort of thing I'm thinking about includes:

Optimist/Pessimist: One person can be trying to think 3 steps ahead, making intuitive leaps forward, assuming other details will work out or as yet unproven lemmas will be proven. And another person can try to be the skeptic, making sure that the assumptions being made to move things forward aren't completely out of line, that details eventually get filled in, and that the proofs don't break.

Writer/Editor: When one person writes, the other should read like a hypercritical reviewer.

Implementer/Debugger: It can be time-consuming watching over someone's shoulder as they code trying to catch mistakes on the fly, but it can be an effective way for two people to code together.

There are other natural pairs of roles people can take on doing research,and I think it's useful to be aware of them so you and your collaboratorscan work together more smoothly.

Now we'll talk a bit about the Polymath project, which represents an extreme experiment in collaboration -- research via blog. Is this the future of research, as new communication tools make group research on this large a scale possible? Does this paradigm seem helpful or harmful to the research process? What is the right size for a collaborative group, and why? Let's discuss...

Dear Prof. Mitzenmacher, I read your blogs regularly; they are full of information for a graduate student like me. I have a request: could you blog about the guidelines regarding collaborative research and what goes into one's dissertation? Most of the work I have done in grad school has been in collaboration with others (other than my advisor), and sometimes I am not sure what I can or cannot put in my dissertation. Obviously I don't expect you to resolve my individual situation, but I want to know what the general rules are regarding this. Thanks.

Anonymous #1: I think you mean isn't author order NON-alphabetical in the systems community. (In the theory community, alphabetical is the default.)

When I'm on a paper I tell people I would prefer alphabetical order, as that is my standard. If they don't want that I'm happily at a stage in my career where I can simply not care. (Generally, when this happens, it is because somebody wants a student to be first author, and generally, that's appropriate, and I don't mind obliging.) However, I do encourage everyone to adopt the alphabetical order as the standard rather than fight about ordering.

1) Do internships whenever possible. (Which reminds me, I should be doing my annual post, telling graduate students to do internships whenever possible.)
2) Start a side project with your officemates or another group of graduate students -- make it a "no-professor" project if you can.
3) Take a "final project" class, even if it's not in your area. Projects in many grad classes can turn into papers. (And it's nice to have collaborators and a paper outside your direct area -- it shows you can do "other stuff" when you interview.)
4) Read everything you can, and when you think you have an idea that might improve a paper, contact one of the authors and say, "I have this idea after reading your paper..." Many people won't mind working with a student on a "follow-up" paper if they have a good starting idea.
5) Find a postdoc who looks like they have some time or need some help. Many postdocs don't get enough attention from their busy hosts, and they'd like to visibly "lead a project" before their next round of interviews. Maybe they could make use of an eager student?

In all these cases, best to (eventually) inform your advisor about your additional projects, but most advisors will get it if you make clear this is something you're doing "in addition to" working with them.

Thanks Michael! I find the out-of-area collaborations relatively easy to find (and definitely rewarding). It's finding other people interested in more specialized kinetic computational geometry problems that I'm finding tricky. Perhaps this means that I need to go ahead and try #4 on your list, though I worry that it's hard to begin a collaboration when not in the same place.

However, I do encourage everyone to adopt the alphabetical order as the standard rather than fight about ordering.

Hear, hear! Non-alphabetical order is corrosive. Whenever there is a (sub)cast of authors with about equal credit due, it forces them to split hairs as to how these people are listed. This does not add anything to the group dynamic, nor does it reflect any actual difference in contribution.

Anon #8: non-alphabetical order is not required in such situations, just as it's not required in any situation. It's a choice, and once you make that choice, you open yourself to arguments about credit. (A student may think an adviser has done nothing but proofread the paper; the adviser, however, might see it differently...)

Anonymous #2: My understanding is this can vary from institution to institution and even from committee to committee; my guess is that you should have an open conversation with your advisor/committee about their expectations (and yours).

Thanks for an informative article, Michael. I agree that developing collaboration relationships is very important for grad students these days. However, I think that "equal credit" collaborations are often at least as frustrating and damaging to relationships between researchers as the potential awkwardness involved in discussing the relative contributions. In my experience, in papers with 3 or more co-authors, issues with one or more of the co-authors "having other priorities" are very common ... especially when it comes to the mundane and time-consuming tasks like writing up/proof-reading/revising. In general I think that the disconnect between the credit and the effort becomes more and more problematic as the average number of collaborators per paper grows in TCS.

In addition, the absence of any explicit information about relative contributions adds a lot of uncertainty in selection decisions (e.g. for hiring or an award). It is very common in recent years to see a PhD graduate in theory with mostly 3+ authored publications and without a single single-authored one. It is true that insider information is often available (e.g. from recommendation letters or who got to present the result), but such information is often quite incomplete and not necessarily fully reliable. In my opinion, ordering of names is really not an adequate way to deal with this problem. It is too crude for such delicate matters. On the other hand, including an explicit summary of authors' contributions in works with 3+ authors would certainly do much more good than harm to collaboration practices and overall transparency in our area. If the collaborators decide that they do not want to discuss their contributions, they can always state something in the spirit of "contributed equally". So I see no real downsides or valid excuse not to follow the practice.

I believe that the appropriate way to introduce this practice is by requiring such a summary to be included in STOC/FOCS submissions. This is a standard practice for premier science journals like Science and Nature where multi author contributions are more common.

> "I believe that the appropriate way to introduce this practice is by requiring such a summary to be included in STOC/FOCS submissions. This is a standard practice for premier science journals like Science and Nature where multi author contributions are more common."

VG,

I agree with many of your points (especially regarding the difficulty of evaluating candidates, and the difficulty for candidates to prove that their contributions were significant).

However, for theoretical collaborations it is often nearly impossible to pin down who contributed what. We could definitely keep track of who did the writing, but because of the way we (or at least I) do research, there is no bright line separating my part of the paper from anyone else's.

A relevant question: can you also comment on the importance of being able to write papers independently (i.e., having papers with one single author) as a theory graduate student? Specifically, which skill do you think should be acquired first: being able to collaborate extensively or being able to work independently?

for theoretical collaborations it is often nearly impossible to pin down who contributed what...

I don't think this is the issue -- in this case it would be easy to say that everyone contributed equally. But there are many papers where the contributions are not equal.

I think the real issue is that our community is very liberal with co-authorship. (I know of papers where a co-author's contribution was being in the room at the time the question was posed.) Whether this is good or bad, I don't know. But authors would be reluctant to divulge this information.

Hi Michael, I'm a foreign student... and I think it's not so important where I come from... Anyway, I have just a simple question for you. I agree with everything you wrote and I am sure that collaboration and group work can improve efficiency and motivation, but only if you work with the right people. So, if you are allowed to, how do you choose the right team for you? I mean, random people probably will not lead to the expected result... Thank you!

I agree with Adam, or would go even further. I think there could be large disagreements in writing up "who should get what credit" pieces to go with articles, creating unnecessary ill will in the community. I can see the potential of introducing gamesmanship to various proceedings. (Advisors (with tenure) might be incentivized to exaggerate student contributions; advisors (without tenure) might be incentivized to exaggerate their own, and how can the student complain?) And I think its value would be minimal, so I'd see it as a waste of time. I wouldn't voluntarily go in such a direction.

Anon 13: Both skills are important. As a graduate student, I think the emphasis is on "producing your own results" -- that tends to be the pressure on students who need to produce a thesis, establish a reputation, and who are probably a bit misguided as to the importance of collaboration. Hence my thoughts on emphasizing the other side.

I'd also recommend all graduate students have one project that's an "on their own" project -- something they can think about or work on in their own time, at their own pace, in their free moments, without pressure from collaborators (or advisors).

How do you pick people for collaborations? I tend to pick by a few simple criteria (unordered):

1) Do I like working with the person? If so, we can find something to work on, at least it will be fun.
2) Skill set. If I'm faced with a problem where I think -- hey, I could use someone who knows XXX, I'll look for someone I know who knows XXX and talk to them about it.
3) Locality. Local interactions just happen more naturally. Also, these days, lots of my collaborations are working with students of some form or another... but this is also the basis for my "do a project with your officemates" suggestion.
4) Did they seek me out? Happily, I'm now at a point where people also seek me out to collaborate, and I'm open to working with people on interesting projects.

Finally, if you talk to enough people about problems you're interested in, collaborations will probably just happen (as you find they're interested too...)