Published

Stepping Through the Looking Glass: Test-Driven Game Development (Part 1)

If you’re a typical game developer, you probably don’t write any tests for the code you create. So how would you feel about not just writing tests, but creating them before the code they are testing? What if I told you those tests don’t even verify that the code you write is correct? “It’s madness,” you might say; “it’s all backwards!” Not really. It all makes sense in its own way. Follow me through the looking glass and I’ll show you the wonderful upside-down world of test-driven development and how we can apply it to games.

Traditional development

Let’s take a moment to think about how you write code without using test-driven development. You decide you need to implement some piece of functionality, you mentally break the problem into smaller problems, and you start working on each of them. While you write code, your only guide for whether you’re heading in the right direction or not is whether your code continues to compile. Once you’ve assembled a minimum number of those smaller subtasks, your program might even be able to run. So you fire up the game or tool and “see” if it works. If it’s not obvious (or if it doesn’t work at all), maybe you break in the debugger and try to step into the code you just wrote to make sure the program is doing what you intended. Once you see it do its thing, you happily commit it to version control and move on to some other task.

Does that more or less describe how you write code? You can replace any specific details in that paragraph, but in my experience that sums up how most programmers approach writing code.

What is Test-Driven Development?

The idea behind Test-Driven Development (TDD) is extremely simple: before you write a piece of functionality, you write a test for it. Only once you have that test, which should fail at first, do you actually implement the functionality and see the test pass.

I know it sounds like a backwards approach if you’re not used to it, but it makes a lot more sense than it seems to at first. The thing to keep in mind is that you’re not writing a bunch of tests first and then making them pass. You’re just writing a single test for a very small bit of functionality that you can implement in a minute or two.

TDD has a very well-defined, short cycle that usually takes only a few minutes:

Write a single test for a very small piece of functionality.

Run it and see it fail (or not even compile in C++).

Write code to make the test compile and pass.

Run the test and see it pass.

Refactor code. See test pass.

Refactor tests. See tests pass.

Comparing it to the non-test-driven development approach, you’re replacing all the mental checking and debugger stepping with code that verifies that your program does exactly what you intended it to do.
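To make the cycle concrete, here is a minimal sketch of two passes through it in C++. The AmmoClip class, its methods, and the numbers are hypothetical, invented purely for illustration, and plain assert stands in for a real unit-testing framework.

```cpp
#include <cassert>

// Steps 3-4: the minimal code that makes the tests below compile and pass.
// AmmoClip is a hypothetical class invented for this sketch.
class AmmoClip
{
public:
    explicit AmmoClip(int rounds) : m_rounds(rounds) {}

    // Fires one round if any are left; returns whether a shot was fired.
    bool Fire()
    {
        if (m_rounds == 0)
            return false;
        --m_rounds;
        return true;
    }

    int Rounds() const { return m_rounds; }

private:
    int m_rounds;
};

// Steps 1-2 of the first cycle: this test is written first. Before
// AmmoClip exists it does not even compile, which counts as the failure.
void TestFiringUsesOneRound()
{
    AmmoClip clip(3);
    const bool fired = clip.Fire();
    assert(fired);
    assert(clip.Rounds() == 2);
}

// Second cycle: one more small behavior, again with the test written first.
void TestEmptyClipCannotFire()
{
    AmmoClip clip(0);
    const bool fired = clip.Fire();
    assert(!fired);
    assert(clip.Rounds() == 0);
}
```

In C++ the class has to appear above the tests in the file, but the order in which the code gets written is the reverse: test first, then just enough of AmmoClip to make it pass.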

There really isn’t much more to test-driven development than that. The devil, as they say, is in the details. You need to learn to effectively do that cycle over and over during your development, how to tackle the right task size, how to organize things so you can test them, how to write tests easily, how to make sure tests can be executed constantly, etc.

A very good starting point is Kent Beck’s book Test-Driven Development: By Example. He’ll take you through an example and give you an idea of the kind of pacing and techniques you’re likely to use. Don’t be put off by the use of Java as the development language. Java is close enough to C++, and besides, it’s not the language that’s important, it’s the concept.

Benefits of TDD

So, what exactly do you get by doing things apparently backwards? A lot, as it turns out.

Safety net

Developing your code using TDD means that every single piece of functionality you add will have test coverage (or close to it anyway). If a change ever breaks existing behavior, you’ll know right away. This is extremely important for many reasons.

Refactoring becomes a lot easier, so you can always keep your codebase healthy (I always like to say that the quality of a codebase is directly tied to how easy it is to change—if it ever becomes difficult or “a pain” to change, the quality is already on its way down and it’s just going to get worse if untreated).

Making changes late in the project also becomes a lot easier, effectively flattening out the change vs. cost curve.

This means you can be making significant changes late in the project based on real playtest feedback or finding out what really works and what doesn’t, and still be confident you’re not breaking anything major. This alone can make the difference between a so-so and an outstanding game.

We all know about those dusty corners of codebases. You know, the places with all the cobwebs, and sometimes gnarled oak trees and a troll or two hiding in the shadows. The places no sane programmer dares approach, especially as a milestone looms over the horizon. TDD gives you the courage and the tools to dive in head first into any part of the code, even the scary parts, make any necessary changes, and walk away like a hero.

Usable design

Here’s something that might come as a surprise: Code that you develop through TDD is going to end up looking very different from code you would have written otherwise. Why? After all, the only thing we’re doing differently is writing the tests first. TDD is not a design paradigm or anything like that.

That’s because with TDD, the first thing you’re forced to do is to think about how you want to use your code. You don’t start with a UML diagram, or a detailed class header file, or anything like that. You start by using the feature you’re about to implement. You could call it “extreme dog food eating.” You’re going to be the first user of your own code, and you’re going to think about how to use it from the beginning.

What does that mean in practice? You’ll find that it’s very easy to create new objects of the classes you’re designing. Most of the time you’ll be able to create them on the stack and that will be it. It also means you won’t have to have a complicated sequence of registration, creation, binding, yada, yada, yada. You create an object and it does what it’s supposed to do.

Quick test: Think about your current game codebase. Could you write something like the following code?
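As a sketch of the kind of code in question (GameEntity, its methods, and the default health of 100 are hypothetical names and values, not taken from any particular engine):

```cpp
#include <cassert>

// Hypothetical entity class, invented for this sketch. It can be
// constructed directly, with no world, renderer, or resource loading.
class GameEntity
{
public:
    GameEntity() : m_health(100) {}

    void SetHealth(int health) { m_health = health; }
    int GetHealth() const { return m_health; }
    bool IsAlive() const { return m_health > 0; }

private:
    int m_health;
};

// Create the entity on the stack, poke it, and check the result.
void TestEntityIsDeadAtZeroHealth()
{
    GameEntity entity;
    assert(entity.IsAlive());

    entity.SetHealth(0);
    assert(!entity.IsAlive());
}
```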

Chances are you can’t. You probably need a world for the entity to live in. And the world needs to pull in physics and graphics, and initialize the video card, and the sound system, and the networking… Then you probably have to go through a complex creation procedure. Or maybe you can’t even create an entity that way and you can only load it from disk along with some resources.

This is another side effect of having to eat your own dog food all the time and use the code you’re about to write. You’ll quickly see that testing code that depends on lots of other code is a nuisance that can be avoided most of the time. As a result, the code you write will be extremely modular. It’ll be very easy to use an object in isolation from other classes or systems.

To draw on the earlier example, you will be able to create a game entity without a world, and certainly without expensive operations like initializing the graphics system and pulling in the rest of the codebase.
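One way that kind of isolation can come about, sketched under assumptions: the entity asks for a narrow interface instead of the whole world, and the test supplies a trivial stand-in. ITerrain, FallingCrate, FlatTerrain, and the fall speed are all hypothetical names and values made up for this example.

```cpp
#include <cassert>

// Narrow, hypothetical interface: in this sketch the only thing the entity
// needs from the outside world is the ground height at a given point.
class ITerrain
{
public:
    virtual ~ITerrain() {}
    virtual float HeightAt(float x) const = 0;
};

class FallingCrate
{
public:
    FallingCrate(float x, float y) : m_x(x), m_y(y) {}

    // Moves the crate down at a fixed speed and stops it on the terrain.
    void Update(const ITerrain& terrain, float deltaSeconds)
    {
        const float fallSpeed = 10.0f; // units per second, arbitrary value
        m_y -= fallSpeed * deltaSeconds;

        const float ground = terrain.HeightAt(m_x);
        if (m_y < ground)
            m_y = ground;
    }

    float Y() const { return m_y; }

private:
    float m_x;
    float m_y;
};

// Test stand-in: flat terrain at a fixed height, no physics or graphics.
class FlatTerrain : public ITerrain
{
public:
    explicit FlatTerrain(float height) : m_height(height) {}
    virtual float HeightAt(float) const { return m_height; }

private:
    float m_height;
};

void TestCrateStopsOnGround()
{
    FlatTerrain terrain(2.0f);
    FallingCrate crate(0.0f, 5.0f);

    crate.Update(terrain, 1.0f); // would end up below the ground otherwise
    assert(crate.Y() == 2.0f);
}
```

The real game can pass in whatever implements ITerrain; the test never needs to know about it.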

Modularity, schmodularity. Why should you care? There are the usual reasons: refactoring, reusing the code, adding new features later on, etc. But there are also much more practical ones. A highly modular codebase will compile much more quickly than one that has grown organically (without resorting to the likes of Incredibuild). It’ll also allow you to break it up into smaller, separate libraries which can be tested and linked separately (which means really fast iteration times instead of always having a 2-minute link step if you’re dealing with the full codebase).

Documentation

What’s wrong with comments in code? I used to be a big one for having a header on every class and every function, generating cute help files with Doxygen and all that, but I eventually gave up. When you’re developing at a real-world pace, those pretty comments quickly fall by the wayside. Then you’re left either with obvious comments (assign a to b!), outdated ones, or, even worse, incorrect ones.

The tests you write while doing TDD are the ideal low-level documentation for your code. For every feature, you have at least one example on how to use it correctly and how it’s expected to be used. And the best part is that it can’t ever get out of date.

Comments in code still have their place, but they should be very rare. You should still add comments explaining why you did something a certain way, or including the URL of the web page where you got certain algorithms or code snippets. Other than that, I believe documentation should come in the form of good class and function names and a comprehensive set of TDD-generated unit tests.

WYGIWYM (What You Get Is What You Meant)

When you’re doing TDD, even though you’re writing unit tests, you’re not verifying that the algorithm you’re implementing is correct. That’s not the goal. What TDD gives you is the confidence that the code you just wrote does what you wanted it to do. If you meant to do the wrong thing, you’ll only be testing that you did the wrong thing. TDD isn’t going to help you in that respect (other than make you realize really early on that you’re doing the wrong thing, since you’re using your own code before you even write it). A bit more on this later on.
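A small sketch of what that looks like in practice. FramesToSeconds and the 30 frames-per-second figure are hypothetical; the point is only that the test encodes the programmer's intent, whatever that intent happens to be.

```cpp
#include <cassert>

// The programmer meant the game to tick at 30 frames per second. If the
// game actually runs at 60, this function is wrong for the game, yet it
// does exactly what was intended.
float FramesToSeconds(int frames)
{
    const float framesPerSecond = 30.0f; // the (possibly mistaken) intent
    return frames / framesPerSecond;
}

// Passes regardless of the real frame rate: it checks that the code
// matches the intent, not that the intent is right.
void TestThirtyFramesIsOneSecond()
{
    assert(FramesToSeconds(30) == 1.0f);
}
```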

Instant feedback

When you write any code doing TDD, you get instant feedback. You write a test, build, run the tests, and you see them fail. Then you write the code, and you see the tests pass. You refactor a few things, and still see the tests pass. Instant feedback is part of the TDD cycle, and since the cycle is so short, you’ll get feedback on how things are going every few minutes, or even several times per minute.

Apart from keeping you honest and showing that things are progressing in the right direction, this instant feedback can have a very curious effect on the mood of the programmer. I don’t know exactly how to describe it, but I hadn’t had as much fun programming in a long, long time. Somehow, TDD brought back the excitement and satisfaction I felt writing Basic programs on an Amstrad CPC twenty years ago.

I know that’s a very subjective “benefit” of TDD, but it’s one that can have a profound effect on the team and the overall productivity. Don’t underestimate the power of a happy programmer!

TDD in practice

Enough theory. I hope I have convinced you of the potential benefits and you’re willing to give it a try. Let’s get into the nitty-gritty details. How exactly do we go about doing TDD? I’m assuming you’re using C++ since that’s the most-used language for game development today.

Get a unit-testing framework.

You need a unit-testing framework. The goal is to make writing a test as painless as possible. The easier it is to write a test, the less people will resist writing new tests. Since unit tests are at the core of TDD, the more streamlined the process the better.

I recently looked at some of the most popular C++ unit-testing frameworks. If you’re just starting out, don’t worry about the details. Use whatever is most convenient. I’m currently using a modified version of CppUnitLite (which I’m hoping to put up here in the next few days) just because it’s so small, easy to use, and easy to modify and port.

With my current unit-testing framework, adding a new test is as easy as writing this:
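A minimal sketch of what that can look like, assuming a tiny homemade harness. The TEST and CHECK macros, Registrar, Registry, and RunAllTests below are hypothetical names written for this example; they are only loosely in the style of small frameworks such as CppUnitLite and are not the API of any real library.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical, minimal stand-in for a unit-testing framework.
struct TestCase
{
    const char* name;
    void (*run)(int& failures);
};

// A function-local static avoids static-initialization-order problems.
inline std::vector<TestCase>& Registry()
{
    static std::vector<TestCase> tests;
    return tests;
}

// Constructing a Registrar adds one test to the registry.
struct Registrar
{
    Registrar(const char* name, void (*run)(int&))
    {
        TestCase test = { name, run };
        Registry().push_back(test);
    }
};

// Declares a test function and registers it before main() starts.
#define TEST(name)                                              \
    static void name(int& failures);                            \
    static Registrar registrar_##name(#name, &name);            \
    static void name(int& failures)

// Reports a failed condition as "file(line): error: ..." and keeps going.
#define CHECK(condition)                                        \
    do {                                                        \
        if (!(condition)) {                                     \
            std::printf("%s(%d): error: CHECK(%s) failed\n",    \
                        __FILE__, __LINE__, #condition);        \
            ++failures;                                         \
        }                                                       \
    } while (0)

// Runs every registered test and returns the number of failed checks.
int RunAllTests()
{
    int failures = 0;
    for (std::size_t i = 0; i < Registry().size(); ++i)
        Registry()[i].run(failures);

    std::printf("%u tests run, %d checks failed\n",
                static_cast<unsigned>(Registry().size()), failures);
    return failures;
}

// With the scaffolding in place, adding a new test is just this:
TEST(AdditionWorks)
{
    CHECK(1 + 1 == 2);
}
```

A main function that simply returns RunAllTests() turns this into a test executable whose exit code is the number of failed checks.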

For each library, create a new executable that contains the tests for that library. My preferred directory structure looks something like this, but feel free to organize it in any way you like.

I definitely don’t want my tests in the same file as the class they’re testing. I try to keep the class as simple as possible, and that would just add clutter. Some people prefer to put the test files in the same directory as the files for the library itself. I like to keep them separate but nearby, especially because there’s no one-to-one correspondence between library files and test files.

Build and run tests often.

Make sure to make the test executable dependent on the library it’s testing (and linking with). As the final step of the build process, run the test itself. That way it’ll be impossible to build a library without building the tests and running them. Since the tests should be very fast (taking less than a few seconds), this shouldn’t get in the way of fast iteration.

By default, most unit-testing frameworks return the number of failed tests they encountered. Since calling that executable is part of the build process, if it returns anything other than zero, the build system considers the build failed, which is exactly what you want. If you’re using an IDE such as Visual Studio or KDevelop, you’ll also get a nicely formatted error message that you can select and it’ll go to the failing test. Tests have become an integral part of the build, and a failing test is treated just like a syntax error.

If you’re using make files, just run the executable as your last step of the build rule:

test: $(objects)
	${COMPILER} -o test $(objects) ${LIBFLAGS}
	./test

With SCons, just use AddPostAction:

test = env.Program('test', list)
env.AddPostAction(test, './test')

In Visual Studio, call $(TargetPath) as your post-build event.

Make sure the test executables are built and run on your automated build machine as well. Do both debug and release builds, and if you can also run them on your target platforms (Xbox, PS2, or whatever), so much the better.

Step size.

Before letting you loose with the TDD tools in hand, it’s important to talk about step size. Before you start a TDD cycle, you need to decide how much functionality you should tackle in that one cycle. It’s extremely important that you start by taking baby steps. Don’t worry about the fact that it looks silly or pointless. The point at first is to get comfortable with the cycle and to get the hang of it.

A full iteration of the TDD cycle should take just a few minutes (and sometimes less than a minute). If it takes significantly longer than that to get your tests running again (>15 minutes), it probably means that you tried taking too big a step. If you find yourself in that situation with no obvious way to make the tests pass, you might even want to consider undoing what you did and starting again with an even simpler task.

The concept of step size is very important. Choosing the right step size is probably the hardest part of TDD, and it only comes with experience and tripping a few times. At the beginning, you should start with trivial steps and stick with them. Once you’re totally comfortable with that, you can take slightly larger steps. But as soon as you find yourself with unexpected failing tests or taking too long before getting passing tests again, you should back off and take baby steps again for a while.

We’ll see some specific examples of TDD in action in the second part of this article, which will show how you can take tiny steps.

Now you should have a good idea of what TDD is and what’s involved in actually doing it. But TDD is like riding a bicycle. You can read all you want about it, but eventually, you’ll have to try it for yourself if you want to learn it. I hope this is enough information for you to go out and take it out for a quick spin.

The second part of this article will deal with how to apply TDD to game development (with specific examples) and will cover tips from the trenches and what you can expect when applying TDD to your projects.


In your future article, could you please discuss the problem cases of TDD? I mean, please debunk the sort of counterexamples given by people who don’t want to do TDD 🙂

For example, testing something complex like a reliable network layer, or graphics code – the sorts of things that are hard to test in small pieces. Since their performance depends on everything working together all at once, you can’t verify they’re working by checking the output of a single function.

Hi Noel! I’ve been waiting for this article for too long. You promised it days ago! 🙂

As you know I’ve been trying out TDD myself. I have found it to be quite effective for some problems but less effective for others. Specifically, I find it works when the solution to a problem is quite formulaic and the direction I am heading is clear to me. But when I don’t know what the solution is, e.g. when I am doing more exploratory programming, I find it is more effective to write throw-away prototypes in an ad hoc fashion. This allows me to quickly get a sense of direction before writing the code “properly”. Admittedly, this does not apply to 99% of production code in games.

So in your experience, when is TDD appropriate and when, if ever, is it inappropriate?

Very good articles, Noel. It’s good to read about people using agile techniques with C++ because there seems to be a rather conservative attitude in general in that environment, I think.

For Ian: graphics is indeed hard to test (though not impossible), whereas network layers are a bit more approachable. I would probably use mock objects (see mockpp for a framework that is rather well supported).

Ian: Yes, that’s going to be one of the main subjects of the second part (which I’m writing today and I’ll try to put up by the end of the day).

This is also by no means the last time I’m planning on writing about TDD (notice I created a whole category for it), so I’ll be coming back to it on a regular basis.

Al: I did have a section in the outline to talk about when TDD is not appropriate. As you said, exploratory programming does not benefit from TDD (being throwaway code that you just want to write quickly and learn from it–spikes in XP terminology).

There are also times where doing TDD is more trouble than it’s worth: writing a wrapper around an existing library, dealing very close to the hardware, the GUI layer of a tool, or super high-level game code in a scripting language. But I do believe that TDD is a great way to go for 99% of the C++ code written in games.

TDD is a great thing to do – except in a C++ environment it has quite an impact on build times. Even if you keep dependencies low. And if you keep dependencies low, you’re only testing on the lowest level, the unit – there’s no testing of the interop of all the components you have.

All this from a tools, not a game perspective – but I doubt things are that different. Most of the drawbacks only hit after a while – we’ve got about 700 tests or so covering our code base, and it’s slowly having an impact.

Out of curiosity – how much code that gets written at your place is actually TDDed? How do you keep build times low? (And what do you consider low?)

There are a few other drawbacks to TDD. It remains a great practice, but like all “best practices”, you’ve got to know the constraints. Since I’m doing the complaining, I guess that means I need to sit down and write a bit about it – stay tuned 🙂

Chris Bi

1.) F. Röken of Trinigy held a very interesting presentation at an IGDA chapter meeting in Frankfurt, Germany. They are with great success applying unit tests and test driven development in developing their game engine. He also shows that TDD can be very supportive in continuous integration and distribution.

2.) I personally build test/test-suites into modules (DLLs on win32) and store them in a directory hierarchy (engine/audio/oggStreaming.dll). That has in my opinion some advantages over using executables. Memory tracking and timing is implemented in one place, tests can be run with different “views”, a command line exe for a post build step in VS, a UI that supports result history comparing …. A test DLL’s source and project files are created from a template so that only the code for the test case itself needs to be written. Multiple tests can also easily be refactored into a test-suite contained in a single module.

Paul Higinbotham

Good article on TDD and I look forward to seeing more. My experience with TDD was in my last project building a middle tier and UI layers for a business application. We ended up with hundreds of unit-tests that took a fair amount of time to run. Nevertheless we maintained the discipline of requiring all unit-tests to pass before any new code is checked in. Overall this was a very good thing. We required unit-tests for the middle layer but couldn’t think of any way of writing unit-test for most of the UI. Our testers found and adopted some automated UI testing framework, but unfortunately it didn’t work out very well. Testers spent most of their time adjusting the automated tests to pass when small layout changes occurred (which happens regularly). Also many spit and polish problems weren’t found until near ship time, during hand testing of the UI.

One major problem was that we left out the “refactor code, refactor tests” iteration of TDD. Once the tests and code were written and passing they became a kind of spec/contract and were “locked down”. This was done to meet a very tight schedule. If you wanted to find out how an interface was supposed to work then go look at the unit-tests.

The problem, of course, was that owner-written unit-tests don’t always equate to how code is really used by customers. After some severe customer performance/functional problems we finally incorporated code/unit-test refactoring based on actual code use and need. Locking down the interfaces early just caused product slip as customers first tried to work around interface problems, until we finally allowed the code and unit tests to be refactored to meet customer needs. There is a fine balance between process driven development and maintaining flexibility.

Robert and Paul: Build times can be an issue if it’s not done correctly. I’ll definitely talk about that in the third part of the article. For now I’ll just say that using TDD gave me the fastest iteration times I’ve ever seen in my life because you’re working with one particular library and not with a whole bunch of code at once.

Each unit test should really be blazingly fast. Probably on the order of a microsecond. You should be able to run 100,000 of those in one second, so that shouldn’t be an issue. Remember, these are *unit tests*, so they work directly on one class, not on the whole program.

What do you see as some of the drawbacks of TDD? There’s the extra code, but frankly, I see that more as an advantage than a disadvantage.

Chris: Good idea about the DLLs! Using DLLs might make it easier to run all the tests. I hadn’t considered it before because most of my work had been with static libs, but it’s definitely worth looking into.

Do you have a link to Röken’s presentation? I Googled for it but nothing came up. I’d love to find out more about what they’re doing.

Noel, I didn’t mean to imply that I thought TDD was not a good method. I think it is a great way to create and test public methods and functions. Forcing developers to run/pass unit-tests before checking in code is a bit of a pain, but well worth the effort based on my limited experience.

The drawback I was talking about earlier is really a more general problem and not related directly to TDD. I guess it is more related to agile programming. I really like the idea of constantly refactoring code/classes/interfaces and corresponding unit-tests. I rarely get all of my class methods right the first time I create them. It is not until I start using the class that I realize the best methods I need and how they should work. Being able to refactor is crucial to getting to the right design and implementation.

But most projects I’ve been on tend to resist any changes to “public” code that could affect documents, developers, or testers. Instead developers are encouraged to work-around any deficiencies. Work-arounds can be Ok but they can also lead to hacky bloated code, duplicate functionality, and performance problems. The longer the work-arounds are in the more likely they become a permanent part of the code base. Changes to the code (and unit-tests) sometimes finally occur at the eleventh hour when ship stopping functional and performance problems become apparent.

This is really a management problem and not a TDD problem. I just brought this up because in my last project we used unit-tests to not only drive code development but to also document and specify public interfaces/classes. Later in the project during coding that actually used the public classes it was almost impossible to get a method changed or added. This in part caused severe end-game problems and a seven month slip in ship date.

Baraclese

Okay, I just worked a few hours on setting up a unit test suite for my project. Most of that time was spent in my SConstruct/script files: I pretty much had to throw out all of my old code to support more than one executable, get the build directories right, and handle different environments more flexibly. It’s much better now. I use the Boost testing framework because I love Boost.

Guess what?! I discovered a bug in my first test! I was a little puzzled at first because I thought “it can’t be! it works fine in the game!”. I made an unconscious assumption about a container never being empty, dereferencing the iterator in a tiny hidden loop. BOOM! I stepped in with gdb, found the function that was called and repaired the source code.