Sunday, April 10, 2011

Making Effective Use of Code Reviews

I’ve been reading chunks of Coders at Work this weekend, and the topic of code reviews has come up a few times. Code reviews are clearly a very useful development technique, but it can be tricky to apply them in a way that improves code quality without slowing productivity.

Poking around the web, I don’t see a lot of great writing about code reviews, so I wanted to share the guidelines we use at Yext for effective code reviews. The guidelines below are captured from an email I sent to the engineering team almost a year ago, and I’m proud to say that our code reviews do a lot to improve code quality while keeping the team operating at peak efficiency.

These guidelines are heavily biased by my experience at Google. There, I saw how code reviews could identify and eliminate many preventable bugs, including many that the original developer never would have found. I also saw innumerable cases of reviewers who lost perspective on the larger goals of the team and the company, and thus acted to prevent progress on important projects as a result of matters of personal preference or sheer obstinacy. My goal for Yext is that we capture the best aspects of code reviews while eliminating the worst.

With that in mind, my recommendation for code reviews is that they address the following points:

Correctness: Does the code do what it claims to do? Is the code correct in both the nominal case and the boundary cases? As a reviewer, this is your opportunity to point out edge conditions of which the original developer may not have been aware. An important special circumstance is when you may be aware of legacy systems or features that interact with the modified code in some non-obvious way.

Complexity: Does the code accomplish its task in a reasonably straightforward way? If you can point out simpler approaches that do not compromise the correctness or performance of the code, you should.

Consistency: Does the code achieve its basic goals in a way that is consistent with how similar code in the codebase achieves those goals? Is it re-using the available libraries and utility classes? Where possible, has code been refactored for re-use instead of just copying and pasting?

Maintainability: Could the code be extended by another developer on the team with a reasonable amount of effort? More than any other item on the list, this is the karma investment you make by doing code reviews – the code you review today may be the code you have to update tomorrow, so taking the time to make sure it’s maintainable by others pays itself back to you.

Scalability: Will the code be performant at the expected volumes? It is important that this question always be asked in the context of expected volumes. When building a new product in an untested market, it is fine to write code that works for 100 users but not 10,000; if the product proves that successful, you can profile, optimize, and, when necessary, re-write the critical bits. The corollary is that you should not spend time optimizing code when the market demand is unproven.

Style: Does the code match the team style guide? This should rarely be controversial. The obvious assumption here is that your team should have a style guide.

There are some items I believe should only rarely be addressed during a code review:

Scope or mission feedback: “I don’t think you should be doing this project” is almost never a useful comment for a code review. If you think the team is embarking on projects that are not worthwhile, that is great feedback to share, but not in the context of a code review. The exception here is if someone is introducing a new way of doing something that is already well-handled in some other way.

Design review: A code review is not the time to evaluate the overall design of a project. For example, "I don't think you should be using the DB to store this data" is not useful. It is incumbent upon the developer to have their designs reviewed before implementation, and there will be scenarios in which the fundamental design is questioned during the implementation, but for a project that has been through a design review, let the results of that design stand.

Personal preference: “I would rather you do it my way” is an invitation to an unproductive debate. If you have a way that is demonstrably better, you should always argue for it. The hardest part about this point is identifying when a review has deteriorated to matters of personal preference; the hallmark I spot most often is when people are trading hypothetical scenarios in which alternative solutions might be advantageous, with no way of determining the likelihood of said scenarios. In these cases, the default is to use what the developer has already written.

How can you, as the developer, write your code in such a way as to make a code review go smoothly? A few simple practices help.

Correctness: Comprehensive unit tests are the best demonstration that code functions as intended.
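For illustration (the function and its edge cases here are hypothetical, not from the post), a small test that pins down both the nominal case and the boundaries might look like:

```python
import unittest

def parse_port(value):
    """Parse a TCP port number from a string, rejecting out-of-range values."""
    port = int(value)  # raises ValueError on non-numeric input
    if not 0 < port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return port

class ParsePortTest(unittest.TestCase):
    def test_nominal_case(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_boundary_cases(self):
        self.assertEqual(parse_port("1"), 1)
        self.assertEqual(parse_port("65535"), 65535)
        self.assertRaises(ValueError, parse_port, "0")
        self.assertRaises(ValueError, parse_port, "65536")

    def test_garbage_input(self):
        self.assertRaises(ValueError, parse_port, "http")

# Run with: python -m unittest <module>
```

Tests like these do double duty: they demonstrate correctness to the reviewer, and they document the edge conditions the developer already considered.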

Complexity: Favoring small methods and cleanly-separated functional units makes it easy for your reviewer to see how everything fits together.
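A hypothetical sketch of the idea: instead of one method that parses, filters, and formats in a single pass, split the steps so each piece can be reviewed (and tested) on its own:

```python
def parse_record(line):
    """Turn a 'name,score' line into a record."""
    name, score = line.split(",")
    return {"name": name.strip(), "score": int(score)}

def passing(records, cutoff=60):
    """Keep only records at or above the cutoff."""
    return [r for r in records if r["score"] >= cutoff]

def summarize(lines):
    """Compose the small pieces; the reviewer sees the whole flow at a glance."""
    return [r["name"] for r in passing([parse_record(line) for line in lines])]
```

Each function is small enough to verify in isolation, and the top-level composition reads like a summary of the algorithm.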

Consistency: When building new functionality, you can maximize the consistency of your code with existing work by taking the time to research how similar code solves similar problems. If you suspect someone else has solved the same problem before, ask!

Maintainability: Thorough commenting and the use of meaningful names throughout your code help ensure that others will be able to easily understand your code.

Scalability: My #1 recommendation for demonstrating the performance of new code is to take 30 minutes and write a little driver to run your code through its paces. This can be total throwaway code, but simply being able to tell your reviewer that you’ve done a performance test makes this topic less debatable.
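A driver along these lines takes minutes to write (the `lookup` function here is a stand-in for whatever code is under review):

```python
import time

def lookup(table, keys):
    """Stand-in for the code under review."""
    return sum(1 for k in keys if k in table)

def drive(n_items, n_lookups):
    """Exercise the code at a given volume and return elapsed seconds."""
    table = dict.fromkeys(range(n_items))
    keys = list(range(n_lookups))
    start = time.perf_counter()
    lookup(table, keys)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Throwaway harness: try the volumes you actually expect.
    for n in (1000, 100000):
        print("%8d items: %.1f ms" % (n, drive(n, n) * 1000))
```

The numbers don’t need to be rigorous benchmarks; the point is to replace speculation in the review with an observed measurement at the expected volume.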

Style: The most important thing you can do to maintain style consistency is to configure your editor to implement your style guide. (As an aside, this also means that your team should adopt a style guide that is simple to automate in the editors used by the team.)
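One way to automate the mechanical parts of a style guide is an EditorConfig file, which many editors read; a minimal example (the settings here are illustrative, not a recommendation) might be:

```ini
# .editorconfig -- illustrative settings; encode whatever your team's guide says.
root = true

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[*.py]
indent_style = space
indent_size = 4
max_line_length = 100
```

Whatever the mechanism, the payoff is the same: style issues get fixed as the code is typed, so they never reach the reviewer at all.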

Even when everyone on the team follows these guidelines, there will frequently be strong debate during code reviews, and that’s a great thing: the point of these guidelines is to focus the debate on what matters.

All of this ignores some very tactical questions about code reviews like what code gets reviewed and what tools we use to aid that process. If you’re interested in hearing more about that, leave a comment and I will follow up with another post.

4 comments:

Another best practice for developers sending out code reviews is to keep the review as small as possible. It's much easier to review 5 200-line patches (assuming logical separation points exist) than a single 1,000-line one.

Small patches can also be a forcing function for better design (code is less inter-dependent). Distributed VCSes with easy branching, like Git and Mercurial, make this send-out-a-review-and-keep-going workflow easier too.

I'd encourage you to follow up with something about code review antipatterns. Here are two opposing ones that I sometimes see:

1. The perfect is the enemy of the good: particularly when working on a "mature" (messy) code base, there's a wide variance between what could be done and what should be done. The former entails lots of cleanup, writing a lot more tests, etc., but it's a boil-the-ocean process where there's a huge amount of work necessary before you can get anything done. The latter is the reasonable middle ground where changes are for the good and move the codebase forward without getting too entangled, and positive changes get made quickly. I find in some code reviews it's too easy for the reviewer to ask for a lot of rework; it takes clarity on the tradeoffs to push back with "That's a good idea, but too much for right now."

2. You can't really educate someone to be a better programmer via code reviews; the engagement depends on the engineer and reviewer speaking the same language and understanding the same goals. If the original coder doesn't really understand the coding quality guidelines, addressing specific review comments may band-aid over the problems, but it reduces to having the code reviewer fix the code via e-mail, or, eventually, giving up and letting code they're not thrilled with get checked in. This is particularly true when the issues aren't particularly clear-cut but fall into qualitative evaluation categories, such as your points "complexity" and "maintainability".

I think there's a pattern that sometimes emerges where a code review drags on for way too many iterations. The reviewer thus has a responsibility to try to raise relevant feedback as early as possible, and to balance perfectionism with the goal of moving things forward.

Sadly, I've personally observed counterexamples of this where people were actually using code review to block progress intentionally.