Huge progress has been made in the past decade in the field of code quality assurance. Test-Driven Development (TDD) and Pair Programming are two key practices that have improved code quality substantially. Many readers will agree, however, that there’s still plenty of room for improvement. One important factor is that the IT systems that we build today are among the most complicated systems mankind has ever made. On the other hand, many of the tasks related to code quality are simple yet very repetitive, and therefore also extremely boring. As a consequence, doing the simple tasks, like good and consistent code formatting or proper function and variable naming, is usually more a question of conscience and self-discipline than of knowledge and training. Is there an alternative to letting all these things slip, with a bad conscience, on a daily basis in our industry?

Why would you need assistance?

Although there are many good arguments for pair programming, one very popular argument is in my opinion wrong: the idea that the second person will act to keep you on the straight and narrow. Using a colleague to watch over your shoulder so that you give your functions appropriate names, comment your code where it’s necessary, and don’t write deeply nested functions is a waste of resources we really can’t afford in our industry. If you don’t bother doing it when you’re on your own, then why punish your colleague with this boring task? Besides, you have a friend who’s much better at it. In fact, he’s much faster, does the job better and in a more consistent and precise manner than any of your colleagues, and he never gets tired of boring, repetitive, trivial jobs. That’s right, I’m talking about your computer. Just think of how many lines of code it can analyse in a millisecond and compare it to how much time your colleague will need just to check that a single line of code is formatted correctly.

But even if you’re a disciplined programmer who always formats his code consistently and practises TDD according to the book, your computer can be of great help. Remember the principles of TDD? The flowchart in Figure 1 gives a quick overview: first you write a failing unit test, then you write only as much source code as needed to make the test pass. Next you refactor the code if needed, and unless you’re done, go back to the start and write a new failing unit test. But how do you know whether you should refactor your code or not, and how do you know when you’ve refactored it enough? Everybody has an idea about what ‘good code’ looks like but it’s often a very subjective thing. What looks ‘good enough’ today may look ‘just below the bar’ on another day. And once you’ve started refactoring it may be hard to stop when the code is just ‘good enough’, and tempting to continue polishing it that little bit more.

Figure 1

Just to give an example: unless you’re using a programming language in which calls to subroutines are very expensive, having fifteen levels of nesting in a single method is definitely a sign of bad code quality. But if you start refactoring the method how do you know when you should stop? Reducing the nesting to six or seven levels clearly isn’t ‘good enough’, but what about three or four levels? And is three levels ‘good enough’ so that you can move on to the next problem, or should you continue until all functions have no more than one level of nesting?
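Once a team has settled on a limit, checking it is exactly the kind of job a computer does well. The following is only a minimal sketch, written in Python and using its standard ast module; the choice of language, the list of statements that count as a level of nesting, and the limit of three are all assumptions made for the sake of illustration, not a recommendation.

```python
import ast

# Statement types that open a new level of nesting. This list is an assumption;
# note that an `elif` counts as an `if` nested inside the `else` branch.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node, depth=0):
    """Return the deepest level of nested control structures below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def functions_too_deep(source, limit=3):
    """List (function name, depth) for every function nested deeper than `limit`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting(node)
            if depth > limit:
                violations.append((node.name, depth))
    return violations


# Hypothetical code under inspection.
SAMPLE = '''
def shallow(items):
    for item in items:
        print(item)

def deep(grid):
    for row in grid:
        for cell in row:
            if cell:
                while cell > 0:
                    cell -= 1
'''

print(functions_too_deep(SAMPLE, limit=3))  # [('deep', 4)]
```

The value of such a check lies less in the number itself than in the fact that the answer is the same every day and for every developer.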

What’s even more difficult is getting four, five or even more programmers to agree on what will be the minimal code quality requirements for a project. And that’s before you want them to apply those minimal requirements to every piece of code in the project in a consistent way and without long (or even endless) discussions, throughout the lifetime of that project. Wouldn’t it be great to get help from an impartial judge, who can give his opinion about the quality of a certain piece of code based on a set of objective criteria? Again, this is where the computer can help you, e.g. by applying automated rules to your code in order to check that it meets the minimal code quality requirements you all agreed upon.

How your computer can help

There are many tools available that you can use to check the quality of your code. Some of them can be plugged into the build process so that all developers in your project team follow the exact same coding standard, and so that all code adheres to the same minimum code quality requirements. You’ll get the best results if you let the build process fail whenever violations are found, but that’s not always possible, or it may not fit with your organization or the project you’re working on right now. In my experience, however, when you use the tools only to generate reports, nobody will read the reports, let alone act on them to clean up the code.

So what are the tools that we can use to improve code quality? Automatic code formatting, static code analysis, test coverage reports and mutation testing are four examples of tools that can take over some of the work we otherwise leave to self-discipline. Let’s have a closer look at each of these tools, to see how they work and what they can do for us.

Automatic code formatting makes sure that all code conforms to the same coding standard all the time. Sure enough, these tools aren’t able to format all code exactly the way you wish, but on the other hand they never forget to format it, they’re never sloppy, and they never change their mind. That is unless you change the code formatting rules, but if you ever do that the tool will reformat all the code according to those new rules instantly. Many people are sceptical about automated code formatting tools, but I still think it’s better to have 100% of the code formatted in a consistent way all the time than to have 10% inconsistently formatted because a tool would not be able to format the last 0.1% of the lines exactly to your taste.

Most modern IDEs have some sort of code formatting tool included, but it’s often possible to do better than that. Indeed, one of the problems is that if you want to use an IDE’s code formatting tool in your project you’re still relying on your developers manually running the formatting tool against every code file they’ve modified before they check in. Furthermore, making sure every developer in your project uses the same code formatting rules can be a challenge too if the rules can’t be distributed easily and picked up by the IDE automatically.

In my experience you get the best results if you can run the code formatting tool completely automatically, virtually outside the control of the developers. Maybe it can be part of the build process, e.g. using the Jalopy plugin for Maven in a Java project [Jalopy]. Alternatively you may be able to fire the code formatting tool from a hook of your version control system, e.g. upon check-in of code files. But if neither of these is possible in your project, you may resort to using tools like Checkstyle [Checkstyle] to check whether the code conforms to all (or at least some) of your code formatting rules. And if even that’s not possible you can always run grep, e.g. using an extended regular expression like \{\s*\} (grep -E in GNU grep) to find empty blocks in curly-braced languages.
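Because grep works line by line, it will only find empty blocks whose braces sit on the same line. As a rough sketch of how the same regular expression could be applied to a whole file at once, here is a small Python example; the function name and the Java sample are hypothetical, and the check is deliberately naive in that it also matches braces inside strings and comments.

```python
import re

# The regular expression from the text: an opening brace, optional whitespace
# (which in Python's `re` includes newlines), and a closing brace.
EMPTY_BLOCK = re.compile(r"\{\s*\}")


def find_empty_blocks(source):
    """Return the 1-based line numbers on which an empty block starts."""
    return [source.count("\n", 0, match.start()) + 1
            for match in EMPTY_BLOCK.finditer(source)]


# Hypothetical Java source with two empty blocks.
JAVA_SAMPLE = """\
class Demo {
    void run() {
        try {
            work();
        } catch (Exception e) {
        }
    }
    void work() { }
}
"""

print(find_empty_blocks(JAVA_SAMPLE))  # [5, 8]
```

A script like this could return a non-zero exit code when the list is not empty, so that it can be called from a build script or a version control hook.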

Static code analysis can be used to find undesired patterns in your code. These patterns range from simple things like empty blocks and writing to the console, through deeply nested functions and confusing or overly complex code, to known ‘anti-patterns’ and potential bugs. Notice that many compilers have options to enable some static code analysis, usually in the form of warnings, but dedicated tools have a wider range of rules and patterns they can check. If you want to apply static code analysis to an existing project, start with a small rule set and from time to time pick a new rule that looks useful to you. Remove all violations of the rule from your project, one at a time. Then, when you’re done, consolidate the rule by including it in your rule set so that you (or your colleagues) won’t create new violations of it. When you get the hang of it and you’ve implemented most of the rules that looked useful to you, maybe you should consider creating your own rules to get rid of some of those particular bad habits you or your colleagues have. And if you’re using an open source tool, maybe you even want to donate them back to the project so that others can benefit from them too.

There’s a large variety of tools that can be used to analyse your code statically. As noted above, compilers often have options you can switch on to do some very basic static code analysis. The already mentioned Checkstyle focuses mostly on coding style but it also does some static code analysis. FindBugs [FindBugs] and PMD [PMD] are two other tools for the Java language, the former being a bit more oriented towards finding bugs per se whereas the latter casts its net more broadly. If you’re a .Net programmer you should probably look at FxCop [FxCop]. Lint is the original static code analysis tool for C, and Cppcheck [Cppcheck] is probably the de facto standard static code analysis tool for C/C++. If you program in another language, or you want to explore even more static code analysis tools, see Wikipedia’s overview [Wikipedia].

It should be noted that static code analysis on dynamically typed languages is a difficult task. In fact, one could almost say that’s so by definition if you notice the ‘static’ on the one side and the ‘dynamic’ on the other. Nevertheless, even for a language like Ruby there are some tools available, e.g. Roodi [Roodi]. It’s even possible to create your own static code analysis tools using, amongst others, regular expressions and string functions. A few years ago, I was on a project where we had a simple tool making sure all SQL scripts followed some basic rules.
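To give an idea of how little is needed for such a home-made tool, here is a minimal Python sketch of a checker for SQL scripts built from nothing but regular expressions and string functions. The two rules are invented for this illustration and are not the rules from the project mentioned above; splitting on semicolons is also a simplification that breaks as soon as a string literal contains one.

```python
import re

# Hypothetical rules, invented for illustration. Each rule is a description
# plus a predicate that returns True when a single statement violates it.
RULES = [
    ("avoid SELECT *",
     lambda stmt: re.search(r"\bSELECT\s+\*", stmt, re.IGNORECASE) is not None),
    ("DELETE without WHERE",
     lambda stmt: re.match(r"\s*DELETE\b", stmt, re.IGNORECASE) is not None
     and re.search(r"\bWHERE\b", stmt, re.IGNORECASE) is None),
]


def check_sql(script):
    """Return (statement number, rule description) for every violation found."""
    violations = []
    # Naive statement splitting: good enough for simple scripts only.
    statements = [s for s in script.split(";") if s.strip()]
    for number, stmt in enumerate(statements, start=1):
        for description, is_violated in RULES:
            if is_violated(stmt):
                violations.append((number, description))
    return violations


# Hypothetical script under inspection.
SCRIPT = """
SELECT * FROM customers;
DELETE FROM orders;
DELETE FROM orders WHERE id = 42;
select name from customers;
"""

print(check_sql(SCRIPT))
# [(1, 'avoid SELECT *'), (2, 'DELETE without WHERE')]
```

New rules can be added to the list one at a time, which fits well with the incremental approach described earlier.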

Test coverage tools will reveal which parts of your code aren’t tested by your automatic tests, or maybe even not in use at all. Sometimes low test coverage isn’t an issue, e.g. when it’s difficult to set up automatic tests against a simple and stable interface that’s easy to test manually. But the core of your system, the part where most of your business logic resides, should have a test coverage rate as close as possible to 100%. Aim for the high numbers in those parts of your system, not just 60% or 70%. If you can’t reach your goal, try again before you lower your goal or make a local exception. And don’t forget to consider deleting some code – you’ll be surprised how often that’s the right decision.
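Different goals for different parts of the system can be written down and checked automatically. The sketch below assumes that a coverage tool has already produced a percentage per module; the module names, the numbers and the function name are all hypothetical, and real tools usually offer their own way of configuring such thresholds.

```python
def coverage_failures(measured, required, default=0.0):
    """Return {module: (measured, required)} for modules below their threshold.

    `measured` maps module names to coverage percentages; `required` maps
    module names to the agreed minimum, with `default` for everything else.
    """
    failures = {}
    for module, rate in sorted(measured.items()):
        threshold = required.get(module, default)
        if rate < threshold:
            failures[module] = (rate, threshold)
    return failures


# Hypothetical figures: strict goals for the business logic, a lenient
# default for the rest of the system.
MEASURED = {"billing.core": 91.0, "billing.rules": 99.2, "billing.ui": 55.0}
REQUIRED = {"billing.core": 95.0, "billing.rules": 95.0}

print(coverage_failures(MEASURED, REQUIRED, default=50.0))
# {'billing.core': (91.0, 95.0)}
```

A build script could fail whenever the returned dictionary is not empty, which makes a local exception an explicit, visible decision rather than something that just happens.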

In this category too there are many tools available to help you in your project. I have used both EMMA [EMMA] and Cobertura [Cobertura] in Java projects with great success, and Rcov [Rcov] in some Ruby projects. If you’re a .Net programmer, NCover [NCover] is probably the tool for you, but there are many others. Just as for static code analysis tools, Wikipedia has a page [Wikipedia2] with a good overview of tools available in a number of programming languages.

Mutation testing is a very powerful tool, but it can sometimes be a bit irritating. It can be described as a sort of automated code critique, and in the beginning it can be hard to accept the results it produces. What it does is make simple changes (‘mutations’) to the source code that are guaranteed to change the behaviour of the system. Switching a condition or removing a line of code are good examples of changes that should be noticed somewhere. When it has made a mutation, the tool checks that at least one unit test starts to fail. If no test fails, clearly something is wrong. Maybe the tool found a condition that isn’t covered by a unit test, and you should add one? Or maybe it found a branch that can’t be reached and you can delete some code? I have to confess that, even though I have used it for years, there are still occasions where I have to manually apply the mutation to the code and run the unit tests again, just to accept that what it says is correct. On the other hand I’ve learned a lot from it, even though it can sometimes be very irritating that it so meticulously points out the mistakes I make in unexpected places.
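The principle can be shown in a few lines of code. The following Python sketch is not how Heckle or any other real tool is implemented; it is a toy that knows a single kind of mutation (negating a comparison), and all names in it are made up for the example. It mutates one comparison at a time, runs the tests against the mutated code, and reports the lines where no test noticed the change.

```python
import ast

# Each comparison operator is replaced by its negation, so the outcome of the
# mutated comparison always differs from the original one.
SWAPS = {ast.Lt: ast.GtE, ast.GtE: ast.Lt, ast.Gt: ast.LtE, ast.LtE: ast.Gt,
         ast.Eq: ast.NotEq, ast.NotEq: ast.Eq}


class SwapComparison(ast.NodeTransformer):
    """Negate the `target`-th swappable comparison and remember its line."""

    def __init__(self, target):
        self.target = target
        self.seen = 0
        self.lineno = None  # stays None if there is no such comparison

    def visit_Compare(self, node):
        self.generic_visit(node)
        if type(node.ops[0]) in SWAPS:
            if self.seen == self.target:
                node.ops[0] = SWAPS[type(node.ops[0])]()
                self.lineno = node.lineno
            self.seen += 1
        return node


def surviving_mutants(source, run_tests):
    """Return the line numbers of mutations that no test noticed."""
    survivors = []
    target = 0
    while True:
        mutator = SwapComparison(target)
        tree = ast.fix_missing_locations(mutator.visit(ast.parse(source)))
        if mutator.lineno is None:  # no mutation point left
            return survivors
        namespace = {}
        exec(compile(tree, "<mutant>", "exec"), namespace)
        if run_tests(namespace):  # tests still pass: the mutant survived
            survivors.append(mutator.lineno)
        target += 1


# Hypothetical code under test.
SOURCE = '''
def is_adult(age):
    return age >= 18

def is_positive(n):
    if n > 0:
        return True
    return False
'''


def run_tests(ns):
    """A deliberately incomplete test suite: is_positive is never tested."""
    try:
        assert ns["is_adult"](18)
        assert not ns["is_adult"](17)
    except Exception:
        return False
    return True


print(surviving_mutants(SOURCE, run_tests))  # [6]
```

The mutation on line 3 is caught by the tests, whereas the one on line 6 survives, which points directly at the missing tests for is_positive.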

Personally, I haven’t had much success yet running mutation testing in any language other than Ruby. There may be something particular about Ruby that makes it well suited to mutation testing, or it could be that the programmers behind Heckle [Heckle] found a clever way to make the set-up of the mutation testing easy. I would like to mention Jester [Jester], Jumble [Jumble] and PIT [PIT] as three mutation testing tools for the Java language that look promising, but don’t seem to be quite mature yet. I hope to see some evolution here, because mutation testing is one of the things I really miss when programming in Java.

Figure 2 explains how the code quality tools discussed fit in with TDD. Automatic code formatting doesn’t appear in the figure because it happens behind the scenes and is therefore totally transparent to whichever development method you use otherwise. Static code analysis, code coverage and mutation testing are part of the decision box in the middle and help to find an answer to the question of whether the code quality is good enough. Note that code coverage reports and mutation testing often will give you the inspiration, even if they don’t actively force you, to write new unit tests, and therefore in a sense can send you straight back to the ‘Write or modify a test’ box.

Figure 2

Getting started

If you want to use any of these tools, introduce them slowly in your project. It’s always a good idea to start with the generation of some reports just to see how you’re doing. Then try to fix the simple things, and start automating your code quality requirements. As you continue to use the tools and add more and more requirements, you’ll learn how the tools work, and you can start to create your own rules or extensions. But don’t add requirements unless you understand them and, maybe even more important, how their violations should be fixed, because otherwise they will only bring you problems. Over time, you’ll see that old problems and bad habits that have plagued your project for a long time will disappear and never come back.

It’s important, however, to understand that these tools won’t solve all your code quality issues. You and your colleagues will still have to use your brains while you’re programming, because not all software quality requirements can be expressed in rules that can be automated. But these tools can automate some of the most boring tasks, so that you can concentrate more on the creative, fun part of programming. And isn’t that why we all started programming in the first place?

References

This article was based on a lightning talk I held at the ROOTS 2011 conference in Bergen, Norway earlier this year.