The Meaning of 100% Test Coverage

When I release components, example code or even just helper classes, I often
tout 100% test coverage as a feature. This (as I probably also state often
enough :P) means that my unit tests execute 100% of all lines in the source code.
But what advantage does this actually bring to a developer, and, just as
interesting, what does having complete test coverage not mean?

For people practicing true TDD (test first -> red, green, refactor),
100% coverage is nothing unusual, though even they may decide not to write tests
for every possible invalid input: if a piece of code satisfies all the tests and
the tests cover everything the code should do, that's enough. If you're
building a library, on the other hand, the use case of a customer providing invalid
inputs is a valid concern worthy of a test.

I, however, am currently adding unit tests to an existing code base and I decided
to go for 100% test coverage. In this short article, I will explain why I see
complete test coverage as a worthwhile goal, what effect going for that level of
test coverage has on a project and what it says about the code.

What Does 100% Coverage NOT Mean?

Correctness. While having 100% coverage is a strong statement about
the level of testing that went into a piece of code, on its own, it cannot
guarantee that the code being tested is completely error-free.
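For illustration, consider a method of the kind meant here (a hypothetical sketch; only the name ImCovered() comes from the article, the body is invented): two tests execute every single line and pass, yet the method is wrong.

```csharp
public static class Example {

  /// <summary>Returns the larger of two integers</summary>
  public static int ImCovered(int first, int second) {
    if(first > second) {
      return first;
    } else {
      return first; // bug: should return 'second', but the tests below never notice
    }
  }

}

// ImCovered(5, 3) covers the if branch and passes (returns 5).
// ImCovered(3, 3) covers the else branch and also passes (returns 3,
// which happens to equal the expected value). Coverage: 100%.
// ImCovered(1, 2), however, returns 1 instead of 2.
```

Both branches are executed, both assertions hold, and a coverage tool reports 100% — the bug survives untouched.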

The method ImCovered() has 100% test coverage. But I bet you can
spot one or the other problem in there, right?

That’s the problem. 100% test coverage does say that effort went into testing
something, but it doesn’t guarantee anything about the quality of the tests and
therefore, the quality of the code being tested.

This has led many developers to regard full test coverage as completely
useless. After all, if it doesn't guarantee correctness of the code and going
from maybe 95% to 100% takes that much more time, what good can it be?

Advantages of 100% Test Coverage

Having complete test coverage still has some things going for it:

Modular Design

Remember that unit testing is about design as much as it is about correctness?
If you unit test, you are forced to design your classes so they can be isolated
from each other. Why? Imagine you had written this bad guy, so to say:
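The original listing is not shown here, so the following is an illustrative sketch based on the description below; the XNA-style types (Game, ContentManager, HudComponent and so on) are reduced to minimal stand-ins, and all member names are assumptions.

```csharp
// Minimal stand-ins for the XNA-style types the sketch needs:
public class Texture2D { }
public class ContentManager {
  public Texture2D Load(string assetName) { return new Texture2D(); }
}
public class Radar {
  public void Register(object trackedObject) { ++this.TrackedCount; }
  public int TrackedCount;
}
public class HudComponent {
  public Radar Radar = new Radar();
}
public class Game {
  public ContentManager Content = new ContentManager();
  public HudComponent HudComponent = new HudComponent();
}
public class SpriteBatch { }

public class BadGuy {

  public BadGuy(Game game, SpriteBatch spriteBatch) {
    this.game = game;
    this.spriteBatch = spriteBatch;
  }

  public void LoadContent() {
    // Goes shopping for references in the game's object model:
    this.texture = this.game.Content.Load("BadGuy");
    // ...and assumes a HudComponent providing a Radar is in place:
    this.game.HudComponent.Radar.Register(this);
  }

  private readonly Game game;
  private readonly SpriteBatch spriteBatch;
  private Texture2D texture;

}
```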

It takes a reference to the game world and a sprite batch to render itself.
But when it’s time to load its content, this bad guy goes shopping for references
in your game’s object model. It accesses the game’s ContentManager
behind the scenes, makes assumptions about some HudComponent
being in place and of said component providing a Radar.

If you tried to unit test this class, you would have to initialize half your
game with it. And then you’re no longer unit testing, you’re integration testing
and writing small, focused test cases is out of the question.

The same class designed with unit testing in mind might look like this:
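Again a hypothetical sketch following the description below; the interface names (IContentLoader, IRenderer, IRadar) are assumptions, not taken from the original code.

```csharp
// Dependencies are narrow interfaces the class receives from outside:
public interface IContentLoader {
  object Load(string assetName);
}
public interface IRenderer {
  void Draw(object sprite);
}
public interface IRadar {
  void Register(object trackedObject);
}

public class BadGuy {

  public BadGuy(IContentLoader loader, IRenderer renderer, IRadar radar) {
    this.loader = loader;
    this.renderer = renderer;
    this.radar = radar;
  }

  public void LoadContent() {
    this.sprite = this.loader.Load("BadGuy");
    this.radar.Register(this); // a mocked radar can verify this call
  }

  private readonly IContentLoader loader;
  private readonly IRenderer renderer;
  private readonly IRadar radar;
  private object sprite;

}
```

A unit test can now hand in fakes for all three interfaces and assert that Register() was called, without touching the rest of the game.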

Still not a great design, but at least it’s testable. Unit tests can supply
mocked loader, renderer and radar
implementations and check whether the bad guy actually does register itself
on the radar.

So what 100% unit test coverage tells you about a project is that it
likely follows a design where classes can easily be isolated from
the rest of the system, meaning easier reuse and less resistance to
design changes.

Notice I’m not using absolute statements here. A sly programmer who has not
understood unit testing might just go ahead and write an integration test that
runs half the game, but still achieves 100% test coverage.

Testability

When a programmer who uses unit tests discovers a bug, he does the same as
any other programmer – he debugs his code until he has located the bug.
But then he doesn't immediately fix it – he first writes a test that checks
for the bug, and only once he has seen that test fail (proving it actually
reproduces the bug) does he fix the bug and verify that the test now passes.

This, of course, is to prevent the bug from ever coming back – a so-called
regression. It also increases the quality of the tests: a problem that
previously could not be caught by the unit tests can now be detected.

But we’re making an assumption here: that the programmer can actually write a
test that reproduces the bug.

If the code the error occurs in wasn’t designed with testability in mind,
it might be that the bug depends on, say, a System.Threading.Timer
that is triggered only once per three seconds. Unit tests should be fast, so
writing a unit test that waits 3+ seconds is out of the question.

Our programmer has to first refactor the design – possibly introducing other
bugs or hiding the bug he’s trying to fix – to allow for the timer to
be mocked.

That is something that will never happen if a code base has 100% test
coverage (and the unit tests are actually unit tests and not integration
tests). Because everything is testable, the programmer can write his test,
reproduce the bug, solve the bug and be done with it.
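The timer situation above can be sketched like this (all names here are assumptions, not from any real code base): instead of depending on System.Threading.Timer directly, the class depends on a small interface that a test can fire instantly.

```csharp
using System;

// Mockable stand-in for a timer; production code would wrap
// System.Threading.Timer with its three-second interval behind this:
public interface ITimer {
  event EventHandler Elapsed;
}

// Test implementation that fires on demand instead of on a schedule:
public class ManualTimer : ITimer {
  public event EventHandler Elapsed;
  public void Fire() {
    if(this.Elapsed != null) {
      this.Elapsed(this, EventArgs.Empty);
    }
  }
}

// Class under test: counts how often the timer has elapsed
public class Poller {
  public Poller(ITimer timer) {
    timer.Elapsed += delegate { ++this.PollCount; };
  }
  public int PollCount;
}
```

A unit test constructs a Poller with a ManualTimer, calls Fire() and asserts immediately – no three-second wait involved.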

Correctness

There, I said it. While 100% test coverage doesn't guarantee that some code
is error-free, it does say that someone cares greatly about the code.

Getting from maybe 95% coverage to 100% coverage can be a lot of work.
Take a look at this piece of code, for example:
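The original listing is not preserved here; the following is a hypothetical reconstruction, kept consistent with the refactored version shown further below (ITimeSource and the two time sources are minimal stand-ins).

```csharp
public interface ITimeSource { }
public class WindowsTimeSource : ITimeSource {
  // Stand-in for the real platform check:
  public static bool IsAvailable { get { return false; } }
}
public class GenericTimeSource : ITimeSource { }

public class Scheduler {

  /// <summary>Initializes a new scheduler</summary>
  public Scheduler() {
    // The time source is chosen and constructed right here,
    // so a test has no way to substitute a mock for it:
    if(WindowsTimeSource.IsAvailable) {
      this.timeSource = new WindowsTimeSource();
    } else {
      this.timeSource = new GenericTimeSource();
    }
  }

  private readonly ITimeSource timeSource;

}
```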

We can't mock the time source. So to allow for testability, we refactor
our constructor like this:

public class Scheduler {

  /// <summary>Initializes a new scheduler using the default time source</summary>
  public Scheduler() : this(CreateDefaultTimeSource()) { }

  /// <summary>Initializes a new scheduler using the specified time source</summary>
  /// <param name="timeSource">Time source the scheduler will use</param>
  public Scheduler(ITimeSource timeSource) {
    this.timeSource = timeSource;
  }

  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <returns>The newly created time source</returns>
  public static ITimeSource CreateDefaultTimeSource() {
    if(WindowsTimeSource.IsAvailable) {
      return new WindowsTimeSource();
    } else {
      return new GenericTimeSource();
    }
  }

}

Now we can test the scheduler with a mocked time source, we can get
coverage on the default constructor (which is trivial) and we can
test the method for creating the default time source.
But one branch of that if statement stays uncovered.

We either have to expose the decision logic or create another
mockable interface just for deciding whether the Windows time source
should be used. Because You Ain't Gonna Need It, we chose the former:

public class Scheduler {

  /// <summary>Initializes a new scheduler using the default time source</summary>
  public Scheduler() : this(CreateDefaultTimeSource()) { }

  /// <summary>Initializes a new scheduler using the specified time source</summary>
  /// <param name="timeSource">Time source the scheduler will use</param>
  public Scheduler(ITimeSource timeSource) {
    this.timeSource = timeSource;
  }

  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <param name="useWindowsTimeSource">
  ///   Whether the specialized windows time source should be used
  /// </param>
  /// <returns>The newly created time source</returns>
  internal static ITimeSource CreateTimeSource(bool useWindowsTimeSource) {
    if(useWindowsTimeSource) {
      return new WindowsTimeSource();
    } else {
      return new GenericTimeSource();
    }
  }

  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <returns>The newly created time source</returns>
  public static ITimeSource CreateDefaultTimeSource() {
    return CreateTimeSource(WindowsTimeSource.IsAvailable);
  }

}

Only now can a unit test achieve full coverage. As before, we can use a mocked time source,
we can test the default constructor, we can test whether a default time source can be
created and, at last, we can also test that both time sources can be created.

But this also demonstrates a risk we run into when going for 100% coverage: that of
writing unit tests that depend on internal implementation details of our classes. The test
for the CreateTimeSource(bool) overload was only introduced to get coverage
and tests an internal method.

Such tests are sometimes required if you want to keep encapsulation intact while still
testing the logic of some private algorithm of a concrete class, but you had better not
let them creep into the tests for your public interfaces, where unit tests should verify
the interface contract, not the implementation details. Otherwise, unit tests become a
roadblock instead of a tool that enables change.
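Such an implementation-detail test might look like this (sketched with NUnit; the test fixture name and the test assembly name are placeholders). The internal method only becomes reachable from the test assembly through the InternalsVisibleTo attribute in the tested assembly.

```csharp
// In the tested assembly (assembly name is a placeholder):
// [assembly: System.Runtime.CompilerServices.InternalsVisibleTo("Scheduling.Tests")]

using NUnit.Framework;

[TestFixture]
public class SchedulerTest {

  /// <summary>Covers both branches of the internal factory method</summary>
  [Test]
  public void TestTimeSourceCreation() {
    Assert.IsInstanceOf<WindowsTimeSource>(Scheduler.CreateTimeSource(true));
    Assert.IsInstanceOf<GenericTimeSource>(Scheduler.CreateTimeSource(false));
  }

}
```

Note how this test names a method the public interface never exposes – exactly the kind of coupling to implementation details the paragraph above warns about.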

16 Responses to “The Meaning of 100% Test Coverage”

Very nice, I think unit testing is often overlooked / discouraged in game development. I wonder if you would consider writing more about writing unit tests (as opposed to just writing code that is compatible with the concept of unit tests)?

Excellent article! Very accurate and honest representation of what 100% code coverage is and is not. I always have a hard time explaining this, but you do it perfectly here. Just because 100% code coverage does not mean that there are no bugs, that does not mean it has no value. One point that I think was briefly hit on is that a developer who has achieved 100% code coverage is likely to have written good code and used good practices. 100% code coverage says: someone cares about this code and about doing things right.

What I see in that dependency chart is a clean layering scheme with minimal dependencies — all of which are a direct result of controlled code reuse, not haphazard interactions between random classes.

What is it that scares you in that dependency chart?

I’d say my claim is validated by the fact that my unit tests succeed in isolating any of those classes from their dependencies to test them. Extracting a component is a matter of copying its associated classes into a separate project, *that’s* how loosely the components in my framework are coupled to each other.

If your claim in the modularity paragraph is that a unit-test-driven-design leads to testable units then I agree. That’s a platitude. I don’t get what being able to simulate the dependencies between modules has to do with eliminating dependencies between modules though. It seems like all 100% coverage is giving you is the ability to simulate them.

Being able to “simulate dependencies” is one and the same thing as breaking them.

If you can’t test component A without component B, you make an interface that is used by A instead.

Now you can mock (“simulate”) B. But you can also make A use any implementation that follows the interface set forth for B. In other words, A no longer depends on B, it now depends on *something* that fulfills the interface.

I think the results you achieve with 100% test coverage depend on the programmer’s skill a lot. If you care about your code (which, as you say, is likely if you go to great lengths to get 100% test coverage), you will have good code with good tests in the end. But it would be possible to have 100% test coverage with very badly written tests and ugly code as well if you aren’t motivated and it’s just company policy.

The problem is that you have to put a “probably” or “likely” in front of all the advantages the coverage gets you. It is not a hard guarantee for good code.

I think you might be confusing modularity with encapsulation. The difference is encapsulation means to wall off a chunk of code and provide public interfaces to it. Modularity means to ensure that the dependencies that one module has to other modules are minimized.

Even if I’m wrong and your definition of modularity is correct, you aren’t doing a great job of providing a public interface for your modules.

You provide public interfaces to your classes, but as you intend your framework to be used, your classes are not modules. Your modules are on the project level. However, there is no public interface between projects. They are all just thrown into the same namespace and use each other’s classes wantonly.

It appears our definitions of encapsulation and modularity agree with each other. In your definition of modularity, you also refer to modules, not classes, which I take as meaning you too understand that modularity applies at any level – methods, classes, components or even projects.

Now if modules can be isolated from their peers for testing, that is the very definition of modularity for me. If a test can replace a dependency with a mock, so can normal code replace the dependency with a different implementation. Thus, by making sure that external modules can be mocked, one also gets rid of dependencies, thereby improving modularity.

The rest of your comment gives me the feeling that you haven’t actually bothered to even look at my code.

I don’t want to come across as arrogant – I’m willing to consider constructive criticism and there surely are things I can still learn – but I can only consider those claims as outlandish.

If I had to name anything my framework is especially good at, it would be that it provides extremely clean and well-designed public interfaces to any of its modules.

My modules are not on the project level. I do build what I refer to as “components”: small groups of interacting classes. For example, my particle system consists of the classes IParticleAffector, AffectorCollection and ParticleSystem. Each of these has a carefully considered public interface, and the component’s single dependency is its internal use of the custom thread pool from my Utility library.

There are no public interfaces /between/ components because components usually do not interact with each other. Public interfaces are /to/ components – to be used from the application code. Most public interfaces are defined by the public methods a component exposes instead of through an explicit ISomething interface – simply because it wouldn’t add any value and YAGNI. Don’t let that fool you into thinking a component’s interface is anything but carefully considered and precisely engineered.

Some components do internally use other components, but I maintain tight control over such dependencies and they are always minimal, logical and intentional. This can mean that e.g. to use the SpecialEffects project, you need the Graphics and Support projects. There is no point in creating an interface for an internally used component (that makes no sense for the user to replace with another implementation) and letting the user wire up that association himself.

My classes are not thrown into the same namespace. Classes that share the same namespace belong to the same component or provide additional functionality related to the component. Not counting the unit test classes, a namespace on average has maybe 6 classes.

It sounds like I made you angry, or was approaching that. I made this discussion too much about your library instead of what you are trying to explain about 100% test coverage. And instead of just asking you about why you did things the way you did I made poorly thought out statements that sounded very accusative and critical. I’m sorry.

Anyway, you shed a lot of light on why you did things the way you did and how things work in your framework. I also had never heard of 100% coverage before, so that was good as well. I got a lot from this conversation, but I’m guessing talking to a noob beginning programmer like me really doesn’t benefit you. So thanks for giving me some of your time.

Anyway, you’ve convinced me on most of your points. Thanks for your time explaining your practices and thanks for your library. I really did get a lot out of this conversation and out of studying your library.