31 July 2015

As Code Cop, I am sometimes asked to perform an independent code review. Before I start working on it, I ask for the goal of the review. What kind of results are expected and what will happen with these results? Which decisions will be made depending on the outcome? Usually people know when the quality of their work is not that great, so why bother with a review at all?

For example, I was asked to review a code base because the developers had problems with their client. They were arguing about costs and regressions. I did not approve of this reason, because in such a situation there is always a loser. When doing a code review, I do not want to shift blame; I want to make people aware of potential improvements and help developers learn and grow. So I persuaded the client to change the goal of the audit. Instead of looking for blame, I worked together with the whole team to come up with a list of necessary steps to correct their customer's problems. Given the way that software grows over time, it was reasonable to charge the client for at least some of that work. It became a win-win situation.

A Code Review is an Opportunity to Improve
So the goal of the review is highly relevant for the mechanism and success of such an external code audit. There must be a dialogue with the team and the audit's findings must be used for constructive feedback. That means discussing the results with development and creating a plan for fixing the critical issues. My findings are always concrete and raw - I do not like to create management summaries or elaborate slide decks. (When such reports were needed, development managers worked with me to create the slides they needed from my raw results.)

Sometimes I start a coaching engagement with a quick review of the team's code. It helps me to see the team's maturity and typical problems. With this knowledge I can target the top issues in Coding Dojos right from the beginning. Also, when I help teams during re-engineering efforts, at least a partial review of the code base helps me to understand the problems we are trying to solve.

But Is It Practical?
Checking the code quality is difficult. I am not able to look at every line in the system; that is impossible. So I use static code analysis tools and metrics to get an idea about the code. But metrics are controversial because most of them do not measure what I am really looking for - clean code which is readable and can be maintained easily. The actual review of code is always based on samples. This is far from ideal.

An outsider can never know all the technologies and all the reasons why a large piece of software is the way it is today. That is why the developers are essential. To get usable results I rely on them. I ask them about their code, let them create lists of their best and worst classes and discuss what I find and why I do not like it. To work like that, the audit must be a friendly action, performed in cooperation with the developers. When the goal of the review is to improve the whole project and developers are asked for their input right from the start, everybody is on board.

What I Do
For a code review of a large code base, e.g. 500k to 800k lines of code, I try to get as much information about the code as possible. First I read about the technologies used if I am not familiar with them. Then I try to build the complete project and play around, opening classes randomly and following references. This is just a warm-up for getting used to the code base.

When I feel comfortable in the code, I start with the heavy lifting: First I run some tools to get metrics, e.g. JavaNCSS to collect the size and complexity of the code, or the Chidamber and Kemerer metrics to calculate coupling and cohesion numbers. Then I use tools to scan for smelly code or potential bugs, e.g. Checkstyle, PMD and others. These tools are very common and it happens that they do not find much - because the developers already use them - which is a great sign. Unfortunately, that has happened only once so far. Then I move from these line-based analysis tools to higher-level ones. I look for violation hotspots, code duplication, unused classes, Singletons (because I do not like them) and cyclic dependencies, to name a few. There are many tools for Java, but depending on the programming language there might be fewer tools available. Either way, I still try to use at least one of each category.
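To make the idea of a cyclic dependency check concrete, here is a minimal sketch of what such a tool does internally. It is an illustration only, not the implementation of any tool named above: the class `DependencyCycles`, the method `findCycle` and the input format (a map from each package to the packages it depends on) are assumptions made for this example. It runs a depth-first search and reports the first cycle it finds.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any real analysis tool.
final class DependencyCycles {

    /**
     * Returns one cycle as a list of nodes whose first and last element
     * are the same, or an empty list if no cycle exists.
     */
    static List<String> findCycle(Map<String, Set<String>> dependsOn) {
        Set<String> done = new HashSet<>(); // nodes fully explored without finding a cycle
        for (String start : dependsOn.keySet()) {
            List<String> cycle = visit(start, dependsOn, done, new LinkedHashSet<>());
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return List.of();
    }

    private static List<String> visit(String node, Map<String, Set<String>> dependsOn,
                                      Set<String> done, LinkedHashSet<String> onPath) {
        if (done.contains(node)) {
            return List.of();
        }
        if (onPath.contains(node)) {
            // We came back to a node on the current path: cut the path at that node.
            List<String> path = new ArrayList<>(onPath);
            List<String> cycle = new ArrayList<>(path.subList(path.indexOf(node), path.size()));
            cycle.add(node);
            return cycle;
        }
        onPath.add(node);
        for (String next : dependsOn.getOrDefault(node, Set.of())) {
            List<String> cycle = visit(next, dependsOn, done, onPath);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        onPath.remove(node);
        done.add(node);
        return List.of();
    }
}
```

Calling `DependencyCycles.findCycle(Map.of("ui", Set.of("core"), "core", Set.of("ui")))` returns a list such as `ui, core, ui` (which node comes first depends on the map's iteration order). Extracting the dependency map from real code is the hard part and is left to the actual tools.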

The hard work is the manual part. I verify the critical findings of all tools, which includes a lot of navigation. Then I run some semi-automatic analysis, usually by searching. I look for compiler warnings, TODO markers, @SuppressWarnings, (too many) casts and instanceofs, catch blocks, ignored tests, useless JavaDoc comments and other things. Because I check each finding, by this point I have covered a lot of code - although somewhat out of context. Finally I select a small sample of classes and read them from top to bottom.
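The searching described above can be done with any IDE or with grep. As an illustration, here is a minimal Java sketch that counts the lines containing some of these markers. The class `ReviewMarkerScan`, its method names and the exact regular expressions are assumptions made for this example; `@Ignore` and `@Disabled` are the JUnit 4 and JUnit 5 annotations for ignored tests. A plain text search like this also matches inside comments and strings, so every hit still needs a manual look.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper for a first, rough search; not a real tool.
final class ReviewMarkerScan {

    // Assumed marker list; adjust it to the code base under review.
    private static final Map<String, Pattern> MARKERS = new LinkedHashMap<>();
    static {
        MARKERS.put("TODO", Pattern.compile("\\b(TODO|FIXME)\\b"));
        MARKERS.put("SuppressWarnings", Pattern.compile("@SuppressWarnings\\b"));
        MARKERS.put("instanceof", Pattern.compile("\\binstanceof\\b"));
        MARKERS.put("catch", Pattern.compile("\\bcatch\\s*\\("));
        MARKERS.put("ignored test", Pattern.compile("@(Ignore|Disabled)\\b"));
    }

    /** Counts, per marker, the lines on which it occurs at least once. */
    static Map<String, Integer> countMarkers(List<String> sourceLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : MARKERS.keySet()) {
            counts.put(name, 0);
        }
        for (String line : sourceLines) {
            for (Map.Entry<String, Pattern> marker : MARKERS.entrySet()) {
                if (marker.getValue().matcher(line).find()) {
                    counts.merge(marker.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Reads all .java files below root (assumed to be UTF-8) and counts the markers. */
    static Map<String, Integer> scanTree(Path root) throws IOException {
        List<String> allLines = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path path : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(path) && path.toString().endsWith(".java")) {
                    allLines.addAll(Files.readAllLines(path));
                }
            }
        }
        return countMarkers(allLines);
    }
}
```

The totals are only a starting point: a high number of catch blocks or instanceofs says where to look, not what is wrong.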

In the meantime, I schedule time to talk to the developers. I ask them about known issues and where I should look in particular. As I said, people usually know where the skeletons are hidden. I ask them to show me how they build their software, how they use Continuous Integration and whether they have coding and design conventions. If they have time, I ask them to run some of the analysis tools I mentioned above and remove false positives. I do not have to do everything myself, and every team needs someone who is familiar with these tools and can use them from time to time.

Before I report the results to the customer, usually a manager of some kind, I present my findings to the developers and discuss my conclusions. I explain the results in detail until they fully agree with me. Sometimes I drop findings if the team explains their conventions. I make sure that my final report only contains real issues and has the full support of the development team. Usually the team already starts working on the critical issues before the audit is officially concluded.