Test to Code Ratio

By Dave, on March 8th, 2012

I’ve just been watching the following talk over on InfoQ: Software Quality — You know it when you see it. Thanks to Craig over at SoftViz for pointing me to it. The talk is quite interesting, with the focus being primarily around using innovative visualizations of software to gauge quality.

But that’s not what I want to talk about. Rather, there was one thing in particular the presenter said which I found intriguing. He was talking about the test-to-code ratio: the number of Lines of Production Code (LPC) versus the number of Lines of Test Code (LTC). A ratio of, e.g., 1:4 indicates that, for every Line of Production Code, there are 4 Lines of Test Code.
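As a small illustration of the definition above, here is a minimal Python sketch that turns two line counts into an LPC:LTC ratio in lowest terms. The function name `lpc_to_ltc_ratio` and the sample counts are made up for this example; they don't come from the talk.

```python
from fractions import Fraction


def lpc_to_ltc_ratio(lpc: int, ltc: int) -> str:
    """Return the ratio of production lines to test lines as 'LPC:LTC'.

    Both arguments are hypothetical line counts for some codebase.
    """
    if lpc <= 0:
        raise ValueError("need at least one line of production code")
    if ltc < 0:
        raise ValueError("line counts cannot be negative")
    # Fraction reduces LTC/LPC to lowest terms for us.
    reduced = Fraction(ltc, lpc)
    return f"{reduced.denominator}:{reduced.numerator}"


# 1,000 production lines and 4,000 test lines give the 1:4 ratio from the text.
print(lpc_to_ltc_ratio(1000, 4000))
```

Note that the ratio says nothing about how the lines are counted (blank lines, comments, generated code), so two tools may well report different numbers for the same project.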

Now, here’s the thing: the presenter claimed that for Java (and possibly .Net), the ratio should be roughly 1:1, whereas for Ruby it should be around 1:2 or even 1:3. I should emphasise that he based these claims on the research of others (although it’s not clear exactly who). And, he went on to discuss the reason for the higher ratio required for Ruby (and other dynamically typed languages): that the greater expressivity of these languages makes it harder to write tests for them.

If you follow this blog at all, you’ll know I’m a fan of static typing. In fact, my language, Whiley, is about going even further along that spectrum. One of the main advantages claimed by proponents is that static typing catches errors ahead of time. In contrast, many detractors claim that, since static typing only catches a small class of error, you still have to rigorously test your code anyway — so why burden yourself with static types? Naturally, then, the above claim about the test-to-code ratio of Java versus Ruby leads to the question: in looking at the test-to-code ratio, are we also looking at the trade-off between static and dynamic types? Because, if we are, then it might seem to indicate that, actually, static typing does quite a lot for us.

But, obviously, it’s not that simple. For example, it could well be that Ruby programs are, on average, significantly shorter than their equivalent Java programs. If the size ratio were, say, 1:3 (that is, Java programs are three times as long as their Ruby equivalents) then the burden of having to write more tests wouldn’t seem so bad…

3 comments to Test to Code Ratio

“the presenter claimed that for Java (and possibly .Net), the ratio should be roughly 1:1”

I don’t agree. You’ll typically need a much higher ratio of LTC to LPC. Think how many lines of JUnit it takes to cover a single condition.

The only empirical evidence I’ve seen was a survey that Agitar commissioned of open-source projects and its own code base. This concluded that a ratio of about 1:4 (LPC:LTC) was sufficient for about 80% test coverage. [This was published at openquality.org, but the site has been taken down.]

And using a TDD approach, I’d say I write the same number of tests for static and dynamic languages.

When using dynamic languages, I do find that the test code is shorter. However, this is negated by the reduction in IDE support for generating production code from failing tests (e.g. create missing classes, create missing methods, add parameter, etc.). This is certainly the case for Groovy in Eclipse, maybe less so for other languages/IDEs?

In my opinion, in very large systems, the differences between Ruby/Python/Java/C# etc. tend to bleed together, and the question of test ratios in static vs dynamic languages is probably more relevant in the context of testing smaller domain-specific libraries.

Once you reach a certain scale, you’re no longer thinking in terms of a single codebase: the boundaries of any significantly large system are very blurred. It’s difficult to determine where the system starts and ends when it’s interacting with large volumes of data, external services, and real-time feeds from the outside world.

In these situations (especially where very large codebases are unavoidable), a strict compiler might be useful in constraining the local boundary conditions of a function, but in any case, regardless of static or dynamic, the architectural emphasis needs to be on immutability. If a codebase cannot be constrained to control mutable objects and data structures, then the number of unit and integration tests needed will increase enormously, and this is just as true for static languages as dynamic ones.

Here is another article (and research experiment) which also seems to indicate that using a statically typed, compiled language reduces the need for certain unit tests: http://evanfarrer.blogspot.ca/2012/06/unit-testing-isnt-enough-you-need.html The experiment involved comparing dynamically typed Python code to equivalent (?) strongly typed Haskell code, and showing that even full unit-test coverage of the Python cannot catch the API-misuse and input errors that static typing catches.