Summary
In Part 2 of the series, we look under the hood of the legacy code we inherited and experience a moment of panic when we see how cryptic it is. We then present a simple strategy that does not require us to fully understand the code in order to write good characterization tests.

In Part 1, we introduced the concept of characterization tests, pretended to have been given responsibility for some legacy sales management software, and wrote our first characterization test. We finished by asking ourselves: “What’s next?” and “How many more characterization tests do we need to write before we can safely start modifying the code?” Let’s start answering these questions.

1. Write tests for the area where you will make your changes. Write as many test cases as you feel you need to understand the behavior of the code.

That sounds like a reasonable goal. I will create as many tests as I need to feel that I have understood – and captured – the behavior of the code. This would be a very tall order if I had to write black-box tests. Without being able to look at the implementation and without a specification it’s hard to know what’s required to get adequate test coverage and capture all behaviors. Fortunately, when writing characterization tests we are not only allowed, but encouraged, to look at the code.

I know what some of you are thinking: Trying to understand a system’s behavior by looking at a pile of code that you did not write is like trying to understand a cow’s behavior by looking at a butcher’s shelf. It’s hard to fully understand what’s actually going to happen by just looking at the code; you’d have to simulate the various paths of the code’s execution in your head, keep track of variable values, etc. – tough things to do for anything but the most trivial code. Fortunately, as you will see, with characterization testing we don’t have to do that. We don’t look at the code to gain complete understanding of what it does; we look at the code for clues and suggestions on what to test. Let’s continue with our example so you can see exactly what I mean.

The sample code we are working with is quite simple: a single method that calculates a sales commission by taking a numeric value as input and returning a numeric value as output. How hard can that be to understand? If we get lucky, the original developer will have been considerate enough to write some nice, clean, self-documenting (or at least well-commented) code. But in many cases, we get … this:
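The original listing is not reproduced in this copy of the article. As a stand-in, here is a minimal sketch of what such code might look like. Only the names calculateCommissionDue, totSales, BQ, BCR, OQM1 and OQM2 come from the text. The class name SalesUtil, every constant value other than the 10000 threshold, and the formulas themselves are assumptions, chosen so that the results agree with the figures quoted later ($5,000 on $20,000 of sales and $14,000 on $30,000).

```java
// Hypothetical stand-in for the legacy class; NOT the article's original listing.
class SalesUtil {
    static final double BQ = 10000.0;     // threshold mentioned in the text
    static final double BCR = 0.20;       // assumed value
    static final double OQM1 = 1.5;       // assumed value
    static final double OQM2 = OQM1 * 2;  // assumed value

    static double calculateCommissionDue(double totSales) {
        if (totSales <= BQ) {
            return totSales * BCR;
        } else if (totSales <= BQ * 2) {  // assumed second threshold
            return BQ * BCR
                 + (totSales - BQ) * BCR * OQM1;
        } else {
            return BQ * BCR
                 + (totSales - BQ) * BCR * OQM1
                 + (totSales - BQ * 2) * BCR * OQM2;
        }
    }
}
```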

Take a minute to see if you can understand the higher level specification, or description, of what this code is trying to accomplish by simply looking at it. If you had to write the documentation for this method in English, what would you write?

What I am hoping to get across with this example is that, even for a simple, fully self-contained method that uses only basic arithmetic, understanding its higher-level behaviors by just looking at the code is non-trivial. You can imagine how much more difficult this would be if, instead of dealing with simple addition and multiplication, you had to do the same thing for a more complicated method involving several other classes and method invocations whose behavior you don’t know.

Fortunately, our job at this moment is not to understand what every single variable and operation does, let alone to decide if it’s correct or not. Our job right now is to write some characterization tests that will capture and embody the end-result of all those operations. We can worry about making the code cleaner and more readable after we have tests that will ensure we have preserved the current behavior.

Here’s how I’d go about it.

I notice that the conditional expressions involve a comparison between the parameter totSales and the constant BQ. Since this code calculates sales commissions, I guess that the Q in BQ probably stands for quota. I have no idea what the B stands for, so I assume that it stands for basic. But none of this really makes much of a difference at this point. For all we know, BQ might stand for Banana Quarks. All we care about is that this particular code will exhibit three distinct behaviors based on the relative values of totSales and BQ. Since BQ is set to 10000, we have the following three behaviors to characterize:
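The list of behaviors is missing from this copy of the article. Here is a minimal sketch of how the three cases might be told apart, using the threshold of 10000. The second boundary (twice BQ) is an assumption, and the class and method names are illustrative only.

```java
// Illustrative only: reports which of the three branches an input would take.
class CommissionBranches {
    static final double BQ = 10000.0;

    // Returns 1, 2 or 3 for the first, second or third behavior.
    static int behaviorFor(double totSales) {
        if (totSales <= BQ) {
            return 1;  // sales at or below BQ
        } else if (totSales <= BQ * 2) {
            return 2;  // sales above BQ, up to an assumed bound of 2 * BQ
        } else {
            return 3;  // sales above the assumed second bound
        }
    }
}
```

One input from each range is enough to exercise each behavior once.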

We have already written a test that satisfies the condition for the first behavior in Part 1 (not intentionally; we simply picked a random value of 1000 and it just happened to be a value that would satisfy the first condition). We can satisfy the two remaining conditions and capture all three behaviors by invoking the method with the values of, say, 20000 and 30000.

We write the second test as follows (remember, at this point we are deliberately guessing a return value that will cause a failure, so we can see from the error message what the actual return value is):
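The original test listing is missing here. Below is a minimal sketch of such a test, written in plain Java so that it runs without a test framework; in a real project it would be a JUnit test method. The stand-in calculateCommissionDue and its constants, the assertEquals helper, the test name, and the guessed value of -1.0 are all assumptions.

```java
class SecondTestFirstGuess {
    // Hypothetical stand-in for the legacy method (constant values are assumed).
    static double calculateCommissionDue(double totSales) {
        final double BQ = 10000.0, BCR = 0.20, OQM1 = 1.5, OQM2 = OQM1 * 2;
        if (totSales <= BQ) return totSales * BCR;
        if (totSales <= BQ * 2) return BQ * BCR + (totSales - BQ) * BCR * OQM1;
        return BQ * BCR + (totSales - BQ) * BCR * OQM1
             + (totSales - BQ * 2) * BCR * OQM2;
    }

    // Minimal replacement for JUnit's assertEquals(expected, actual, delta).
    static void assertEquals(double expected, double actual, double delta) {
        if (Math.abs(expected - actual) > delta) {
            throw new AssertionError(
                "expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // Deliberately wrong guess: the failure message reveals the actual value.
    static void testCalculateCommissionDue2() {
        assertEquals(-1.0, calculateCommissionDue(20000.0), 0.01);
    }

    public static void main(String[] args) {
        try {
            testCalculateCommissionDue2();
        } catch (AssertionError e) {
            // Print the message instead of crashing, so the value is easy to read.
            System.out.println(e.getMessage());
        }
    }
}
```

With these assumed constants, the failure message reports 5000.0 as the actual value.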

The current behavior of the code is such that the calculated commission on $20,000 of sales is $5,000. Again, I don’t know if that’s wrong or right, but it’s the actual behavior of the code, so I am going to characterize that behavior by editing the assertion so that the test will pass:
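The edited test is also missing from this copy. A minimal sketch, again in plain Java rather than JUnit, with the expected value changed to the observed 5000.0. The stand-in method, its constants, the helper and the test name are assumptions.

```java
class SecondTestCharacterized {
    // Hypothetical stand-in for the legacy method (constant values are assumed).
    static double calculateCommissionDue(double totSales) {
        final double BQ = 10000.0, BCR = 0.20, OQM1 = 1.5, OQM2 = OQM1 * 2;
        if (totSales <= BQ) return totSales * BCR;
        if (totSales <= BQ * 2) return BQ * BCR + (totSales - BQ) * BCR * OQM1;
        return BQ * BCR + (totSales - BQ) * BCR * OQM1
             + (totSales - BQ * 2) * BCR * OQM2;
    }

    // Minimal replacement for JUnit's assertEquals(expected, actual, delta).
    static void assertEquals(double expected, double actual, double delta) {
        if (Math.abs(expected - actual) > delta) {
            throw new AssertionError(
                "expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // The assertion now records the observed behavior, right or wrong.
    static void testCalculateCommissionDue2() {
        assertEquals(5000.0, calculateCommissionDue(20000.0), 0.01);
    }

    public static void main(String[] args) {
        testCalculateCommissionDue2();
        System.out.println("testCalculateCommissionDue2 passed");
    }
}
```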

Repeating the same process with 30000, I find that the code returns 14000, and I record that value in a third test. If you ask me, a $14,000 commission on $30,000 in sales seems awfully high. This might be a bug, so I make a note of it – just because we are writing characterization tests does not mean that we should not keep an eye out for possible existing bugs.

I now have the luxury of three characterization tests for the calculateCommissionDue code:
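The listing of the three tests is missing from this copy. A minimal sketch in plain Java (a real project would use JUnit test methods) might look like the following. The stand-in method, its constants, the helper and the test names are assumptions. The expected values 5000.0 and 14000.0 come from the text; the expected value 200.0 for the first test follows from the assumed constants, not from anything stated here.

```java
class CommissionCharacterizationTests {
    // Hypothetical stand-in for the legacy method (constant values are assumed).
    static double calculateCommissionDue(double totSales) {
        final double BQ = 10000.0, BCR = 0.20, OQM1 = 1.5, OQM2 = OQM1 * 2;
        if (totSales <= BQ) return totSales * BCR;
        if (totSales <= BQ * 2) return BQ * BCR + (totSales - BQ) * BCR * OQM1;
        return BQ * BCR + (totSales - BQ) * BCR * OQM1
             + (totSales - BQ * 2) * BCR * OQM2;
    }

    // Minimal replacement for JUnit's assertEquals(expected, actual, delta).
    static void assertEquals(double expected, double actual, double delta) {
        if (Math.abs(expected - actual) > delta) {
            throw new AssertionError(
                "expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // First behavior: sales at or below the threshold.
    static void testCalculateCommissionDue1() {
        assertEquals(200.0, calculateCommissionDue(1000.0), 0.01);
    }

    // Second behavior: sales above the threshold.
    static void testCalculateCommissionDue2() {
        assertEquals(5000.0, calculateCommissionDue(20000.0), 0.01);
    }

    // Third behavior: sales well above the threshold (possible bug noted).
    static void testCalculateCommissionDue3() {
        assertEquals(14000.0, calculateCommissionDue(30000.0), 0.01);
    }

    public static void main(String[] args) {
        testCalculateCommissionDue1();
        testCalculateCommissionDue2();
        testCalculateCommissionDue3();
        System.out.println("all three characterization tests passed");
    }
}
```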

I run them with a code coverage analyzer and, as expected, they all pass and I get 100% statement and condition coverage. Great. I can now proceed to modify the code with much more confidence than I had before – even though I still don’t know exactly what the code is supposed to do, or what the identifiers BCR, OQM1, and OQM2 could possibly mean.

Time to wrap up Part 2. We have seen that, when it comes to characterization testing, looking at the code is very helpful even if you don’t understand everything that’s going on. At the very least, you can get some clues that will help you come up with relevant test data, improve your code coverage, and capture more behaviors.

In Part 3, we are going to start taking advantage of the characterization tests we have written so far. Time for some payoff. Make sure you check it out.

About the Blogger

Alberto Savoia is the founder and CTO of Agitar Software, and has been a lifelong agitator and innovator in the area of software development and testing tools and technology. Alberto's software products have won a number of awards, including the JavaOne's Duke Award, Software Development Magazine's Productivity Award, Java Developer Journal's World Class Award, and Java World Editor's Choice Award. His current mission is to make developer unit testing a broadly adopted and standard industry practice rather than a rare exception. Before Agitar, Alberto worked at Google as the engineering executive in charge of the highly successful and profitable ads group. In October 1998, he cofounded and became CTO of Velogic/Keynote (NASD:KEYN), the pioneer and leading innovator in Internet performance and scalability testing. Prior to Velogic, Alberto had a 13-year career at Sun Microsystems, where his most recent positions were Founder and General Manager of the SunTest business unit, and Director of Software Technology Research at Sun Microsystems Laboratories.