
The Measure Of Code

I’ve gotten a lot of questions about how big our codebase is, how fast it builds, how many tests we have… Fear not, Gentle Reader: all your burning questions will be answered here.

Size

Charles and I were priding ourselves on keeping things small and minimal. But truth be told, it’s not like we were keeping track of how many lines of code we had written. Were things as small as we hoped they were?

The most convenient way of counting lines of code that I know is CLOC. It’s an extremely easy-to-use open-source program that counts the lines of code in a codebase, gives very detailed information, strips out whitespace, breaks things down by language, and does just about everything you’d want from a program like that.

Running it on the latest version of our code (not including any third-party libraries) produces this:

Almost 60K lines of C++ code seemed very high. At first I thought it was because CLOC was counting files twice: once in their regular location and once in the .svn directory, but apparently it’s already removing all duplicates, so that wasn’t it.

Almost more scary than the amount of C++ code (which is all our runtime and some of our tools) is the amount of C# code. For a language that claims to be significantly higher level than C++, that’s quite a mouthful of code!

Another surprising count in there is the number of lines with comments. Since we make heavy use of TDD, I really didn’t expect more than a couple dozen lines of comments in the whole codebase. Still, I’m kind of proud that we have less than one comment line per file on average 🙂

Here’s a more detailed breakdown, with the line count just for our runtime (engine and game):

That means that our average C++ test is about 11.5 lines long, and our average C# test about 14.4. Frankly, that sounds rather high. We make heavy use of fixtures whenever possible, and each test usually only checks for a single condition (even if it involves a couple of CHECK statements). I suppose that number is higher than expected because it probably includes all the lines from #include statements and all the fixtures as part of the average.

Language    Lines    Non-test lines    Test lines    % of non-test code    Number of tests    Lines per test
C++         58156    33246             24910         57% *                 2163               11.5
C#          22966    12402             10564         54%                   735                14.4

* If we only count cpp files, that goes down to 49%

I was curious about that last part of checking a single thing per test, so I ran a grep for the number of CHECK statements in our code:

That’s 1.8 CHECK statements per TEST, which is about right. Even though we’re checking for a single condition, we’ll often check a couple of things about it (e.g. the camera stopped and it reached its final destination).

Build Times

So, given that amount of code, how long does it take to build it? Clearly it depends on your hardware. Since we’re not exactly rolling in money, we don’t have particularly powerful machines. Here at home, I’m using a modest Core 2 Duo E4300 (overclocked to 2.6 GHz) with fast memory and a relatively fast SATA hard drive, so that’s what I used for all my timings.

A full build of our game, plus all the libraries, all the tests, and running all the tests takes exactly 1 minute and 10 seconds. That’s pretty good for two reasons:

- When we work with the game we don’t build and run the unit tests for the engine; we have a separate solution for that. A full build of just the engine, the game, and the game unit tests takes only 43 seconds.

- The game itself is a fairly large project, and devenv doesn’t know how to parallelize that build, so it only uses half the available CPU power for about half the build time.

An incremental build after changing a single cpp file takes slightly over a second (including half a second of unit test execution).

As you can imagine, working with that codebase is a dream come true. Snappy, responsive. Nothing is so entrenched that it can’t be changed.

Unfortunately that’s where the fairy tale ends. The tools are another story altogether. Our C# tools, with all their unit tests, build in a mere 18 seconds, but the C++ tools take 1 minute and 10 seconds. That’s not too bad by itself, except that it’s a surprisingly long time given there aren’t that many C++ tools.

Here’s the kicker: doing another build without changing a single thing takes 38 seconds. Whoa! We’re doing some C++/CLI trickery, and apparently dependency checking is totally broken in VS2005 (either that, or we just don’t know how to set it up right).

Keeping things fast

What’s the secret of a lightning-fast build? Clearly, keeping the code size down is crucial. If your codebase is 2 million lines of code, builds are going to be painful no matter what. But they can be a little less painful with some gentle care.

One of the main build-time killers that we’re avoiding is the use of STL or Boost. Those libraries pull in everything and the kitchen sink, and their heavy use of templates makes build and link times go through the roof. No thanks.

Our template use is pretty minimal. We have a couple of containers (which I love and will write about one of these days) and that’s about it.

We’re pretty anal when it comes to keeping physical dependencies to a minimum. We forward-declare aggressively, and we only include the headers that are necessary for each cpp file (PC Lint is “kind” enough to remind us every time we have unnecessary #includes). We’re not using external include guards or #pragma once.

Precompiled headers are either not used or kept to a minimum. I think the only project that uses them is the game, and only for Havok headers. We don’t even have windows.h in a precompiled header (which would be a really bad idea, because it would make all the junk in windows.h visible to your whole program).

Finally, we use incremental linking whenever possible. I remember it was pretty broken a few versions of Visual Studio ago, but it’s not giving us any problems now. The only caveat is that modifying a static library your program links against forces a full link, so incremental links really only help when you’re modifying the executable itself.

We’re not using any distributed builds. First of all, we don’t have enough computers to make it worthwhile. And second, I had horrible experiences with distributed builds in the past. They would help with a badly structured codebase, at the cost of longer incremental builds and mysterious spurious bad builds. Besides, once they’re in place, they tend to encourage even further disregard for keeping dependencies to a minimum.

How about you?

So, that’s it for the Power of Two codebase. How about you? Want to share your size, build times, or any other data?

Comments

Have you (or do you regularly) run a code coverage tool? TDD should show near 100% depending on how religiously it’s practiced. If not 100%, any patterns arising that explain the remaining percentage points?

I’m not a fan of code coverage tools. To me, TDD is not about achieving 100% coverage, but about helping drive design. It might be an OK metric to run on a large team that is not sold on TDD to make sure things aren’t sliding, but I wouldn’t get anything from running it on my own code.

And guess what, if I did run it, I doubt my code coverage would be higher than 80%. There is some glue code that is more of a pain to test than any benefit I get from testing it. Same thing with some “leaf” code that nothing else depends on.