Can Refactoring Produce Better Code?

One of the more common buzzwords thrown around by programmers and non-programmers over the past several years is "refactoring." Given that most English-language dictionaries don't include the word "refactor," a definition is probably a good idea.

Wikipedia defines refactoring as "any change to a computer program which improves its readability or simplifies its structure without changing its results." The final clause is the important one: the program's results stay the same. Simply put, refactoring is moving code about to make it more maintainable. It can be as simple as renaming a local variable, or it can entail complex design manipulations such as converting a switch statement into a polymorphic hierarchy.

Why is refactoring so important? All systems exhibit entropy, and code-based systems are no different. Without concerted efforts to keep them clean, systems will continually degrade. The predominant cost of a system lies not in its initial development but in its ongoing maintenance. Continual refactoring is essential to keeping a system easily comprehensible and thus maintainable by its developers.

More formal definitions of refactoring exist. The original use of the term refactoring likely derives from a doctoral thesis written by William Opdyke in 1992. This thesis demonstrated mathematically how code could be transformed from a state A to another state B while maintaining the same recognized behavior.

How do we know whether or not we've changed "recognized behavior"? In some sense, refactoring almost always does change recognized behavior, because most code manipulations change the performance characteristics of code. Most of the time, however, refactoring does not significantly impact performance. Developers can choose to verify performance if necessary.

Performance aside, other ways of baselining behavior include design by contract and unit testing. Design by contract allows programmatic specification of function preconditions, postconditions, and invariants. Unit testing allows behavior to be externally recognized by code that exercises function calls, comparing the results to baselined results.
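To make the contrast concrete, here is a minimal Java sketch. Java has no built-in design-by-contract support, so explicit checks stand in for preconditions, postconditions, and invariants. The Account and AccountBaselineCheck classes are hypothetical and exist only for illustration.

```java
// Hypothetical example: none of these names come from the article.
final class Account {
    private long balanceCents;

    Account(long openingBalanceCents) {
        if (openingBalanceCents < 0) {
            throw new IllegalArgumentException("opening balance must not be negative");
        }
        this.balanceCents = openingBalanceCents;
    }

    long balanceCents() {
        return balanceCents;
    }

    void withdraw(long amountCents) {
        // Preconditions: what the caller must guarantee.
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (amountCents > balanceCents) {
            throw new IllegalArgumentException("insufficient funds");
        }

        long before = balanceCents;
        balanceCents -= amountCents;

        // Postcondition: what the method guarantees in return.
        if (balanceCents != before - amountCents) {
            throw new IllegalStateException("postcondition violated");
        }
        // Invariant: must hold after every public operation.
        if (balanceCents < 0) {
            throw new IllegalStateException("invariant violated: negative balance");
        }
    }
}

// The unit-testing style: code outside the class exercises it and
// compares the outcome to a known, baselined result.
final class AccountBaselineCheck {
    static boolean withdrawReducesBalance() {
        Account account = new Account(10_000);
        account.withdraw(2_500);
        return account.balanceCents() == 7_500;
    }
}
```

The contract checks live inside the code and run on every call, while the baseline check lives outside the code and runs only when the tests are executed.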

Most developers today use unit tests instead of design by contract. In the remainder of this article, I'll talk about refactoring in the context of having supporting unit tests.

Once unit tests exist and are all passing, a developer can safely refactor the code. After changing the code, the developer runs the unit tests. If the code was properly refactored, meaning its recognized behavior did not change, all the unit tests still pass. If any unit test fails, the developer knows quickly that the change introduced a mistake.
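The cycle described above can be sketched in a few lines of Java. Everything here is a hypothetical illustration: the late-fee rule, the names, and the baseline values are assumptions, and a real project would normally put the checks in a test framework rather than in a helper method.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical example: the fee rule and all names are assumptions.
final class LateFeeExample {
    // Before refactoring: correct, but the intent is buried in nesting
    // and unexplained numbers.
    static int lateFeeBefore(int daysLate) {
        int fee = 0;
        if (daysLate > 0) {
            if (daysLate > 30) {
                fee = 30 * 25;
            } else {
                fee = daysLate * 25;
            }
        }
        return fee;
    }

    static final int CENTS_PER_DAY = 25;
    static final int MAX_BILLABLE_DAYS = 30;

    // After refactoring: same results, clearer structure.
    static int lateFeeAfter(int daysLate) {
        int billableDays = Math.max(0, Math.min(daysLate, MAX_BILLABLE_DAYS));
        return billableDays * CENTS_PER_DAY;
    }

    // The baseline: written against the original code and then re-run,
    // unchanged, against the refactored code.
    static boolean passesBaseline(IntUnaryOperator lateFee) {
        return lateFee.applyAsInt(-3) == 0
            && lateFee.applyAsInt(0) == 0
            && lateFee.applyAsInt(1) == 25
            && lateFee.applyAsInt(30) == 750
            && lateFee.applyAsInt(45) == 750;
    }
}
```

If passesBaseline holds for lateFeeBefore and still holds for lateFeeAfter, the recognized behavior, as far as these tests define it, has not changed.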

Without good unit tests, refactoring is a high-risk activity. It's very easy to break code by moving it around. The classic programmer's adage is "if it ain't broke, don't fix it!" Developers who follow it leave working code alone, and the result is that most code in most systems isn't kept clean over its lifetime. These systems continually degrade in code and design quality, which makes maintenance costlier with each passing day.

Refactoring: Improving the Design of Existing Code

Martin Fowler's book, Refactoring: Improving the Design of Existing Code, is the bible for the practice. It contains a catalog of refactoring "patterns," or mechanics. The patterns each describe how to transform code from one state to another state.

Many of these catalog patterns represent very simple code transforms. One of the simplest, named "Extract Method," describes how to break code out into its own method. Other refactoring patterns represent complex code transforms, such as "Replace Conditional with Polymorphism," an involved transform that replaces a switch statement with a class hierarchy and is composed of many smaller refactorings.
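As a rough illustration of the more involved transform, the following Java sketch shows a switch on a type code and a polymorphic equivalent. It shows only the start and end states, not Fowler's intermediate steps, and all of the shape types are hypothetical.

```java
// Hypothetical example: none of these types come from the article.
enum ShapeKind { CIRCLE, SQUARE }

// Before: a type code decides the behavior inside a switch statement.
final class SwitchedShape {
    private final ShapeKind kind;
    private final double size; // radius for a circle, side length for a square

    SwitchedShape(ShapeKind kind, double size) {
        this.kind = kind;
        this.size = size;
    }

    double area() {
        switch (kind) {
            case CIRCLE:
                return Math.PI * size * size;
            case SQUARE:
                return size * size;
            default:
                throw new IllegalStateException("unknown kind: " + kind);
        }
    }
}

// After: each case becomes its own class, and the switch disappears.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}
```

Adding a new shape to the second version means adding a class rather than editing every switch that inspects the type code.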

Most developers already have an ingrained approach to executing refactorings as simple as Extract Method. Many scoff at the notion of a specific set of detailed steps to accomplish this very common operation. Nonetheless, there is value in the patterns. First, they have names, and names allow developers to talk about them concisely. Second, Fowler's steps represent a carefully worked-out, low-risk approach to each refactoring. His steps also cover odd conditions that a developer might not have considered.

Although the book's subtitle includes the phrase "Improving the Design," it's important to keep in mind that the refactoring catalog patterns themselves are neither "good" nor "bad." It's up to each developer to determine whether a given code transformation improves the code or makes it worse.

A Sample Refactoring Pattern

Each refactoring pattern is given a name that concisely describes the goal of the refactoring. Take a further look at Extract Method, for which you can view a synopsis at Fowler's Refactoring site.

Each pattern first describes forces, or circumstances, that drive the need for applying the code transform. The force for Extract Method is: "You have a code fragment that can be grouped together." The force is followed by a brief description of the steps to be taken ("Turn the fragment into a method whose name explains the purpose of the method"). The remainder of the pattern is predominantly a detailed set of steps to accomplish the transform. Fowler also lists related refactoring patterns.
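As an illustration of Extract Method, here is a hypothetical before-and-after sketch in Java. Only the method names dateDue, getHoldingPeriod, and addDays are taken from the discussion in this article. The classes, the material type codes, and the period lengths are assumptions made up for the example.

```java
import java.time.LocalDate;

// Hypothetical sketch: only the names dateDue, getHoldingPeriod, and addDays
// appear in the article; the classes, type codes, and periods are assumptions.

// Before: one method mixes the lending policy with type checks and date math.
final class HoldingBefore {
    static final int BOOK = 0;
    static final int MOVIE = 1;

    private final int materialType;
    private final LocalDate dateCheckedOut;

    HoldingBefore(int materialType, LocalDate dateCheckedOut) {
        this.materialType = materialType;
        this.dateCheckedOut = dateCheckedOut;
    }

    LocalDate dateDue() {
        int period;
        if (materialType == MOVIE) {
            period = 7;
        } else {
            period = 21;
        }
        return dateCheckedOut.plusDays(period);
    }
}

// After: each fragment is extracted into a method named for its purpose.
final class HoldingAfter {
    static final int BOOK = 0;
    static final int MOVIE = 1;

    private final int materialType;
    private final LocalDate dateCheckedOut;

    HoldingAfter(int materialType, LocalDate dateCheckedOut) {
        this.materialType = materialType;
        this.dateCheckedOut = dateCheckedOut;
    }

    // Reads as the policy: checkout date plus the holding period.
    LocalDate dateDue() {
        return addDays(dateCheckedOut, getHoldingPeriod());
    }

    private int getHoldingPeriod() {
        if (materialType == MOVIE) {
            return 7;
        }
        return 21;
    }

    private static LocalDate addDays(LocalDate date, int days) {
        return date.plusDays(days);
    }
}
```

Both versions return the same due date for the same inputs, which is exactly what a unit test written against the first version would confirm after the extraction.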

There are, of course, more changes you could make to improve this code. The addDays method probably belongs on a date utility class. The code in getHoldingPeriod could be better structured. Most importantly, however, you've taken the first step toward improving the code. The dateDue method clearly states the business logic—the high-level policy.

Most refactorings are this small, and should be this small. Having unit tests that run rapidly means that you can incrementally make simple changes to your code. Each incremental change is cheap. If you keep this attitude from day one, refactoring can usually remain inexpensive, and just part of how you craft software. Your software slowly gets better, or at least it doesn't get any worse.