Daniel Irvine

The problematic pull request

18 November 2016

Code reviews are a wonderful thing. Multiple pairs of eyes make for higher quality code. Reviews help spot errors and also serve as a teaching tool. You write a chunk of code, pass it on to someone else, and they comb through it for errors and possible improvements. They provide you with suggestions and you then go back and change the original code.

However, the processes by which we perform reviews can be deceptively ineffective. This is particularly true when we use tools that replace direct in-person communication with written online reviews, such as GitHub’s pull request system.

There are a couple of common issues with code reviews.

The first is that code reviews vary in effectiveness. The reviewer can be careful, careless, or somewhere in-between. Even the most meticulous of reviewers have days when they’re just too busy to devote enough time to the task. And sometimes the amount of code we’re asked to review is an issue: if there’s more than a few hundred lines of code then we’re likely to do a sub-par job of reviewing.

The second issue is that code reviews take up a great deal of time. In addition to the time spent reviewing code and correcting the code, there’s the hidden time cost of context switching. Both author and reviewer need to schedule time to do the extra pieces of work.

Code review tools

Tools like GitHub’s pull request system aim to solve these issues. They:

integrate with a revision control system like Git to automatically diff changes;

allow multiple users to acts as reviewers for a single review;

notify all participants of new comments and changes to reviews;

hold code from being merged into the main work stream until it has been accepted;

negate the need for scheduling review time, which is especially important for distributed projects.

Together these features relieve you of the organizational work associated with code reviews. Commenting becomes a breeze.

However, there are now three additional problems:

Delaying the merge stops code from being immediately useful. This is a good thing if the code contains errors, but if it doesn’t then the code becomes stale, and becomes more stale as time passes. If the rest of the development stream is changing in the meantime then extra work needs to be done to keep this code up to date.

Written feedback is slower to deliver than in-person feedback. Coalescing thoughts into writing is slower than simply having a conversation. Two people sitting beside each other can correct code together. With code review tools, unclear comments require clarifying follow-up messages, so conversations stutter along in a stop-start fashion.

The ease of commenting may engender a culture where each pull request ends with multiple nit-picky comments from multiple reviewers, overwhelming the original author with additional work.

All of these problems are quite serious. Submitting a code review becomes deeply demoralizing, unless you’ve got nerves of steel.

To counter these problems, authors begin batching up changes into larger, less-frequent reviews, which only makes the problem worse. Under these conditions, the average time from commit to merge can easily spiral out of control, and developers will be mentally exhausted.

Pull requests have their use

None of this is to say that pull requests are not useful. Open-source projects, for example, are reliant on tools to organize discussion between vast, weakly connected networks of developers. The success of these projects hinges on how easy and enjoyable the authoring experience is for contributors. So it’s critical that these projects do not fall into bad habits with their code review process. If there are any road blocks, then contributions will die off.

Of course, that’s a serious risk for all projects, but commercial projects generally do not get killed like that, they just trundle along until the money runs out.

Taming your pull requests

Here are a few ideas you can apply to streamline your pull requests.

If you’re not directly involved with the code, don’t comment on it. Within larger organizations, pull requests should be open to responses from team members only. Other staff are unlikely to understand the trade-offs and decisions that led to the code being the way it is, so their comments may miss the mark and cause blockage. If you feel you can give advice as an outsider, communicate it via some other channel.

Don’t split hairs. There may be twenty, thirty, a hundred things that you’d like to suggest. Prioritise your feedback and choose the top three suggestions, and only write more if the changes are absolutely critical to the success of the code.

Remember that there will be plenty of chances to improve the code later. The secret is knowing that other people’s code doesn’t need to look like your code, nor does it need to be in perfect shape in order for it to be merged. Full test coverage is enough to convince us that the code is in a working state and ready to merge.

Pair whenever possible. Pair programming and its cousin, mob programming, can make the entire code review process unecessary. It’s an effective method for increasing quality and can decrease the time taken to merge code, because it short-circuits the typical pull request feedback cycle.

As I said at the beginning, code reviews are a wonderful thing. They can be very worthwhile. But remember to frequently retrospect on your processes—on all of them, not just your review process. Critical thinking is always required. So ask yourself: are your tools helping you or are they hindering you?