Test Driven Development in .NET Part 4: Refactoring

In this post we’ll look closer at the Refactor step of the TDD cycle. Refactoring means changing the internal implementation of a method or a class without changing the way the class can be consumed by others. In other words Refactoring shouldn’t involve changing the signatures of public methods, the name of the class, the existing constructors, i.e. anything that can be used externally. The changes you make should not break the code of the consumers of the class. These consumers include the tests as well, they should not be affected by the changes either.

The benefit of having the tests in place is that you can easily check if the functions still work as they should after refactoring.

Why refactor???

What is all this fuss about refactoring? We wrote a failing test, then made it pass, implemented the method under test so we should be done, right? Not quite. This may be true of some extremely simple methods, like adding two numbers. In real life applications if you leave your code unattended after the first refactoring cycle then eventually it will very likely grow into a monster over time due to the lack of maintenance. Unless you always write perfectly S.O.L.I.D. code all the time of course, in which case you can stop reading this post.

Think of everyday objects, such as a bicycle. You buy a bike and you may think: I can ride my bicycle every day forever after without any worries. In reality you will have to tend to your bike: oil the chain, change and pump the tires, remove the rust etc. If you don’t maintain your bike then it will decay over time. This is true of many other objects: cars, houses, computers etc., they all need regular maintenance and refinement. The code you write is also such an object. You cannot let it “rust”, you need to come back to it and ask some questions: does it perform well during stress testing? Is the method trying to achieve too much? Does it follow the Single Responsibility Principle? Can I break it down into smaller methods that each carries out one thing at a time?

The following questions are just as important: are you going to understand your code when you come back to it after a year? Is it easy to follow? Is it well documented? Is the documentation up to date?

Proper refactoring ensures that the answer to the questions is yes, or at least a lot closer to ‘yes’ than without refactoring.

You may not always have the time to refactor your code and write unit tests. Developers are often pressed by deadlines, emergency bug fixes, change requests and the like. In that case you may cut corners by forgetting about TDD altogether. Keep in mind though that you will have to repay the short term time gain in the long term with interest. The extra time you will need later comes in different forms: time spent trying to understand the code, trying to change the code, fixing the new bugs that show up due to the lack of tests.

How to refactor?

We will build on the Calculator project we started in the previous post. We’ll expand it to make it a bit more complex as it’s difficult to meaningfully refactor the extremely simple methods and classes the project currently has.

Select observations.xml in the Solution Explorer and press Alt+Enter. In the Properties window select ‘Content’ for Build Action and ‘Copy if newer’ for Copy To Output Directory.

Run the project to make sure it compiles and works. You should get the below output:

I will not go through the code in detail here as it is not our focus. Step through the code step by step using breakpoints and F11, it should be easy to follow.

How to refactor?

We will concentrate on Processor.cs and its LoadData() method. We pretend that we inherited this method from another developer. After all, you would not write anything like this, right? It’s only those other developers that don’t keep testability in mind who would put together such a method.

It’s an antithesis to the Single Responsibility Principle, although there are certainly a lot worse examples out there. We can identify several problems:

The name LoadData is not descriptive of what actually goes on in the method: it performs a lot more than just ‘load the data’

The method performs many things: what do we test?

The method introduces dependencies on external objects

You cannot specify the file name from the outside, you’re stuck with ‘observations.xml’

The method is difficult to test: which part of the method do we test? If it fails, then where does it fail exactly?

Phase 1: Cleanup refactorings

Visual Studio 2012 comes with several built-in refactoring tools that come very handy. Probably the easiest thing to do is to rename the method. Go ahead and rename the method to ‘LoadAndAggregateData’. You will see a small red rectangle appear underneath the method name:

Hover above the rectangle with the mouse – or press Ctrl+’-‘ while having the cursor on the method name – and you’ll see a small context menu which allows you to rename the method. This will also rename the method where it is referenced elsewhere in the code, such as in Program.cs. If you don’t do this then you have to go through the entire project and rename the references as well. You can rename parameter names, classes, namespaces etc. the same way.

A great step towards testability is to remove the inline dependency on the ‘document’ parameter. It should be injected in the method by way of method injection known from DI – Dependency Injection. It simply means that we let the method accept a parameter instead of directly building one itself. Modify the method as follows:

Now we can pass in any type of XML document, the method doesn’t need to get it from a file. This is a breaking change but has a very positive effect so we’ll keep it. The clients will need to adjust like in our case in Program.cs:

The code that reads the XML should go into a separate method so we’ll factor it out. Highlight all rows from ‘List observationList’ to the closing bracket of the foreach loop. Right-click the selection and then select Refactor, Extract Method…:

In the Extract Method window set the method name to ParseMeasurements and click OK. You should see that the highlighted code was factored out to a separate private method. You can remove the ‘static’ keyword from the method signature. The variable observationList is now built using the ParseMeasurements method.

We’ll do the same thing to the measurement aggregation bit. Highlight the rows from ‘Aggregator aggregator’ to the return statement. Extract a new method called AggregateMeasurements using the technique described above and you should come to the following code:

The last immediate change we’ll introduce has to do with the value 2 in the DefaultPartitioner constructor. This value can be confusing to those who read your code. Why 2? What’s 2? What does it mean? It’s more meaningful to introduce constants. Add the following constant to the class:

private const int groupSize = 2;

…and replace ‘2’ with the ‘groupSize’ constant.

Phase 2: factoring out abstractions

The above changes were a great relief: they have brought with them a higher degree of maintainability and readability. You may well be aware of abstractions – polymorphism – in an object oriented programming language: abstract classes and interfaces. They are useful in many ways: they introduce loose coupling between classes, increase testability as you can pass in any implementation to the method under test, they help factor out common attributes of related classes and they also increase the extensibility of your classes as many different implementations can be added to the class.

Let’s see what can do to improve the Processor class. We’ll first extract an interface from the Processor implementation. This will allow us to introduce different types of Processors in the future, ones that perhaps carry out the data aggregation in a different way. It will also enable us to pass in any IProcessor object into a method under test that is expecting an IProcessor parameter in its method signature. Then the unit test will not be tied to a single implementation.

Place the cursor on ‘Processor’, right-click it, select Refactor and Extract Interface… The Extract Interface window will provide reasonable defaults so just press OK. You’ll see a new file called IProcessor.cs in the solution. Locate it and modify it slightly to include some using statements instead of the long fully qualified class definitions:

We’ll also introduce a base class for the Processor so that all derived classes must follow the AggregateMeasurements + ParseMeasurements setup. Add a class called ProcessorBase to the Complex folder of Calculator.Domain: