Testability Explorer: Measuring Testability

Testability Explorer: Using Byte-Code Analysis to Engineer Lasting Social Changes in an Organization’s Software Development Process. (Or How to Get Developers to Write Testable Code)

Presented at 2008 OOPSLA by Miško Hevery a Best Practices Coach @ Google

Abstract

Testability Explorer is an open-source tool that identifies hard-to-test Java code. Testability Explorer provides a repeatable objective metric of “testability.” This metric becomes a key component of engineering a social change within an organization of developers. The Testability Explorer report provides actionable information to developers which can be used as (1) measure of progress towards a goal and (2) a guide to refactoring towards a more testable code-base.Keywords: unit-testing; testability; refactoring; byte-code analysis; social engineering.

1. Testability Explorer Overview

In order to unit-test a class, it is important that the class can be instantiated in isolation as part of a unit-test. The most common pitfalls of testing are (1) mixing object-graph instantiation with application-logic and (2) relying on global state. The Testability Explorer can point out both of these pitfalls.

1.1 Non-Mockable Cyclomatic Complexity

Cyclomatic complexity is a count of all possible paths through code-base. For example: a main method will have a large cyclomatic complexity since it is a sum of all of the conditionals in the application. To limit the size of the cyclomatic complexity in a test, a common practice is to replace the collaborators of class-under-test with mocks, stubs, or other test doubles.

Let’s define “non-mockable cyclomatic complexity” as what is left when the class-under-test has all of its accessible collaborators replaced with mocks. A code-base where the responsibility of object-creation and application-logic is separated (using Dependency Injection) will have high degree of accessible collaborators; as a result most of its collaborators will easily be replaceable with mocks, leaving only the cyclomatic complexity of the class-under-test behind.

In applications, where the class-under-test is responsible for instantiating its own collaborators, these collaborators will not be accessible to the test and as a result will not be replaceable for mocks. (There is no place to inject test doubles.) In such classes the cyclomatic complexity will be the sum of the class-under-test and its non-mockable collaborators.

The higher the non-mockable cyclomatic complexity the harder it will be to write a unit-test. Each non-mockable conditional translates to a single unit of cost on the Testability Explorer report. The cost of static class initialization and class construction is automatically included for each method, since a class needs to be instantiated before it can be exercised in a test.

1.2 Transitive Global-State

Good unit-tests can be run in parallel and in any order. To achieve this, the tests need to be well isolated. This implies that only the stimulus from the test has an effect on the code execution, in other words, there is no global-state.

Global-state has a transitive property. If a global variable refers to a class than all of the references of that class (and all of its references) are globally accessible as well. Each globally accessible variable, that is not final, results in a cost of ten units on the Testability Explorer.

2. Testability Explorer Report

A chain is only as strong as its weakest link. Therefore the cost of testing a class is equal to the cost of the class’ costliest method. In the same spirit the application’s overall testability is de-fined in terms of a few un-testable classes rather than a large number of testable ones. For this reason when computing the overall score of a project the un-testable classes are weighted heavier than the testable ones.

3. How to Interpret the Report

By default the classes are categorized into three categories: “Excellent” (green) for classes whose cost is below 50; “Good” (yellow) for classes whose cost is below 100; and “Needs work” (red) for all other classes. For convenience the data is presented as both a pie chart and histogram distribution and overall (weighted average) cost shown on a dial.

Clicking on the class ClassRepository allows one to drill down into the classes to get more information. For example the above report shows that ClassRepository has a high cost of 318 due to the parseClass(InputStream) method. Looking in closer we see that the cost comes from line 77 and an invocation of the accept() method.

As you can see from the code the ClassReader can never be replaced for a mock and as a result the cyclomatic complexity of the accept method can not be avoided in a test — resulting in a high testability cost. (Injecting the ClassReader would solve this problem and make the class more test-able.)

4. Social Engineering

In order to produce a lasting change in the behavior of developers it helps to have a measurable number to answer where the project is and where it should be. Such information can provide in-sight into whether or not the project is getting closer or farther from its testability goal.

People respond to what is measured. Integrating the Testability Explorer with the project’s continuous build and publishing the report together with build artifacts communicate the importance of testability to the team. Publishing a graph of overall score over time allows the team to see changes on per check-in basis.

If Testability Explorer is used to identify the areas of code that need to be refactored, than compute the rate of improvement and project expected date of refactoring completion and create a sense of competition among the team members.

It is even possible to set up a unit test for Testability Explorer that will only allow the class whose testability cost is better than some predetermined cost.

4 responses so far ↓

Very nice. Alternatively, one could encourage test-first (followed by test-driven) development. Which also has additional benefits in design thinking and interaction with users.

Having to write the test before the code is a powerful disincentive to writing code that is hard to test. Having to retrofit tests, and worse yet having to re-write your code so that it is easier to retrofit tests seems like a disincentive to both testing and writing testable code.

[...] Many thanks to Reik Schatz who has writen a Testability Explorer plug-in for Hudson. Great job Reik! And many thanks for helping out. You can check out his experience about writing the plug-in here. To read abut the plug-in visit the Hudson repository of plugins here. To find out more about the Testability Explorer read the OOPSLA paper here. [...]