TDD Guided by ZOMBIES

Have you had a hard time figuring out where to start with Test-Driven Development? What if ZOMBIES could help you build code that does exactly what you think it is supposed to do? What if ZOMBIES at the same time helped you build a test harness that can help you keep your code clean and behaving properly for a long and useful life? What if ZOMBIES could help!

I’m not talking about those zombies! ZOMBIES is an acronym.

One of the seemingly odd things I saw back in 1999 when Kent Beck, Ron Jeffries and others demonstrated Test-Driven Development, was how they always started with the simplest cases, working their way deliberately to the more involved cases. For example, they’d first ask how would the object respond right after it is initialized? They would add one behavior at a time. Initially each behavior specified in a test scenario was an opportunity to try interface ideas. The early tests usually had hard-coded return results. The implementations were so simple ZOMBIES could do it.

People new to TDD often struggle with what test to write. It’s hard to know where to start or what test to write next. It’s hard to know when you are done, and it’s scary to think you will leave some incomplete code behind. Scary! How can ZOMBIES help with scary?!

I’ve come to rely on ZOMBIES to help explain how I figure out the next test, and what I am considering as I am writing the test. I find that ZOMBIES help me when I am stuck on a programming problem. ZOMBIES help find a logical next step. ZOMBIES help me keep on the firm footing of continually establishing cause and effect. ZOMBIES help me hone my procrastination skills, by suggesting what to do now and what to put off until later. Procrastination skills!? It is a skill to develop and master. Only use your procrastination skills for good.

Let me spell out ZOMBIES and explain the algorithm:

Z – Zero

O – One

M – Many (or More complex)

B – Boundary Behaviors

I – Interface definition

E – Exercise Exceptional behavior

S – Simple Scenarios, Simple Solutions

When test-driving, guided by ZOMBIES, the first test Scenarios are for Simple post-conditions of a just created object. These are the Zero cases. While defining the Zero cases, take care to design the Interface and capture the Boundary Behaviors in your test Scenarios. Keep it Simple, both Solutions and Scenarios. You’ll find that hard. Once progress is made on the Zero cases, move to the next special Boundary case, testing the Behavior desired when transitioning from Zero to One. To do so there are likely other Interfaces to define and use in new test Scenarios. Once the Boundary Behaviors between Zero and One (and possibly back to Zero from One) have been captured in tests, move on to start to generalize your design now dealing with More complex Scenarios and Many items being managed. Often there are new Boundary conditions to be concerned with. Finally review your work and make sure you consider and Exercise the Exceptional things that might happen.

(I was hoping to work ‘P’ for procrastination into the acronym. ZOMBIE Apocalypse?)

ZOMBIES is not your usual sequential acronym. It is only partially sequential. It has two dimensions. One axis is ZOM[ZOM]; the orthogonal axis is BIE, with Simple test Scenarios (several reasons for the S) bringing them together.

Zombies are chaotic, though ZOMBIES are orderly and purposeful.

Initial test Scenarios follow the ZOM pattern from simple to complex, while the things we consider come from BIE, all the time aiming for Simplicity in test Scenarios and production code Solutions.

Are you ready for some code? Nothing like an example to understand ZOMBIES. The rest of the article explains ZOMBIES’ role in test-driving a simple C module that implements a CircularBuffer or First-In First-Out (FIFO) data structure. This CircularBuffer will hold a series of integers. We can Put() a new integer in and Get() the oldest out. If it IsFull() it will reject all new attempts to put. If it IsEmpty(), a get returns a default value that you can specify during Create(). There are quite a few usage scenarios that have to be tested, especially around the boundaries and exceptional things that can happen. Just to refresh your memory, here is a diagram to illustrate a CircularBuffer implementation.

Before starting, make a list of Scenarios to test, in no particular order:

Wrap around

Overflow

Underflow

Empty

Full

Happy path – FIFO

I use this example in my TDD for Embedded C or C++ training courses. Engineers are drawn to the more challenging parts of the implementation, like wrap around, overflow and underflow. They were taught to go after the tough problems first. In TDD guided by ZOMBIES, start with the easy stuff and build a foundation of simple behaviors first, procrastinating skillfully. Then work out the more involved scenarios and behaviors one at a time.

For CircularBuffer, the Zero scenario focuses on the test cases for the newly created container: it is empty; it is not full. Testing that the new CircularBuffer is empty and not full leads to defining interfaces for the production code. The test cases record critical boundary behaviors. Let’s see what these tests look like in CppUTest, an open source test harness (designed with embedded C and C++ programmers in mind).

NOTE: In all the examples, the tests are written one at a time, and the code to pass each test is written incrementally. I’m just showing them in batches. I’ll also take advantage of calloc()’s behavior of initializing allocated memory to zero. So while using calloc() I won’t explicitly initialize member variables to zero.

The thing that really bothers people new to TDD, is that to pass these tests, this is all the code that is needed:
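As a minimal sketch of how little code the Zero Scenarios demand, the listing below hard codes both answers. It is an illustration under assumptions, not the article’s actual listing: the `CircularBuffer_` prefix, the `default_value` parameter of Create(), the `placeholder` member, and the `check_...` functions are all hypothetical names, and plain `assert()` stands in for CppUTest so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int placeholder; /* assumption: no real state is needed to pass the Zero tests */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int default_value)
{
    (void)default_value; /* accepted, but deliberately not stored yet */
    return calloc(1, sizeof(CircularBuffer));
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    (void)self;
    return true;  /* hard coded: the only Scenario so far is "just created" */
}

bool CircularBuffer_IsFull(const CircularBuffer *self)
{
    (void)self;
    return false; /* hard coded: a new buffer is never full */
}

/* Zero Scenario: a just created buffer is empty */
void check_is_empty_after_creation(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(-1);
    assert(CircularBuffer_IsEmpty(buffer));
    CircularBuffer_Destroy(buffer);
}

/* Zero Scenario: a just created buffer is not full */
void check_is_not_full_after_creation(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(-1);
    assert(!CircularBuffer_IsFull(buffer));
    CircularBuffer_Destroy(buffer);
}
```

Calling the two `check_...` functions from a `main()` or a test runner exercises both Scenarios. Each hard coded return is a reminder that another test is still owed.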

With that code, there are often gasps and shaking heads from the people new to TDD. The fear of past programming mistakes shows on their faces. “You are not using or storing the Create parameters.” “IsEmpty and IsFull are nowhere close to right.” “What if you forget to come back and change those hard coded results!?”

With ZOMBIES helping, it may seem scary, but the next action is Simple: attack one of the hard coded return statements right now, and add the other to your test list if it’s not already there.

If you look at the effort to get the code and tests to this point, most of the effort is spent keeping the compiler and linker happy. Due to the Simple Scenarios coming first, and the incomplete but Simple Solution that passes all the tests, we can be confident that tests pass for the intended Behavior and fail for unintended behavior.

If the next test were to Put() a value into the CircularBuffer, it would not be empty. Hard coding IsEmpty() would not work for both Scenarios. So write this Boundary Behavior test that defines the Put() Interface that stores One item.

Expanding the Interface further and defining another Boundary Behavior Scenario, this test transitions the CircularBuffer back to empty. Next make sure Get() returns what was Put() for this One item in the FIFO Boundary Behavior.

When we do the Simplest thing that moves the code toward the solution we have in mind[DTSTTCPW], very little production code is needed to pass these tests. Whenever we can get by with an incomplete solution in the production code, it means one or more tests are needed to fully exercise the code.
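A sketch of what “very little production code” can mean for the One Scenarios follows. It is a hedged illustration, not the article’s listing: the function names, the `check_...` helpers, and the test value 42 are assumptions, and `assert()` replaces CppUTest. Put() does not store anything and Get() returns a hard coded value, yet all three One Scenarios pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int index;  /* advanced by Put() */
    int outdex; /* advanced by Get() */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int default_value)
{
    (void)default_value;                      /* still not stored */
    return calloc(1, sizeof(CircularBuffer)); /* calloc() zeroes index and outdex */
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    return self->index == self->outdex; /* no longer hard coded */
}

void CircularBuffer_Put(CircularBuffer *self, int value)
{
    (void)value;   /* the value is not saved yet */
    self->index++;
}

int CircularBuffer_Get(CircularBuffer *self)
{
    self->outdex++;
    return 42;     /* hard coded: only one value has ever been Put() in a test */
}

/* One Scenario: Put() makes the buffer not empty */
void check_not_empty_after_put(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(-1);
    CircularBuffer_Put(buffer, 42);
    assert(!CircularBuffer_IsEmpty(buffer));
    CircularBuffer_Destroy(buffer);
}

/* One Scenario: Put() then Get() returns the value and empties the buffer */
void check_put_then_get_returns_value_and_empties(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(-1);
    CircularBuffer_Put(buffer, 42);
    assert(CircularBuffer_Get(buffer) == 42);
    assert(CircularBuffer_IsEmpty(buffer));
    CircularBuffer_Destroy(buffer);
}
```

The empty/not-empty mechanism is real even though the stored value is fake, which is why a test with more than one value is the natural next step.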

More gasps and groans, as another hard coded value is introduced. The horror of not even saving the value that is Put(). It can be pretty scary to program with ZOMBIES, until you get to know them.

I’ve seen thousands of programmers solve this problem (in my training classes). Many cannot resist putting in an implementation for IsFull() right now. I’ve seen virtually no programmers get it right on the first try, especially if they use the index and outdex to implement IsFull(). To the TDD learner, it is scary to procrastinate, but you can always add anything you think you might forget to the test list. I think it is scarier to leave behind untested code for such an important case.

What have we accomplished so far with help from ZOMBIES? To the novice, “you are testing nothing!”. Sure enough, but I think I’ve accomplished several important things:

The interface is nearly complete and we can see where it is going. If it was inconvenient to use, we’d know already!

The code is proving to be testable.

A lot of compiler syntax has been tamed for our needs.

Several important boundary conditions have been captured in tests we are confident in.

I can devote less of my brain to those boundary cases as I define the rest of the behaviors for the CircularBuffer. The tests will tell me if my code stops following the behaviors defined in the test scenarios.

We have explored a specific mechanism that the CircularBuffer can use to report that it is empty or not empty. Saving the value has nothing to do with determining IsEmpty().

Now that the Zero/One Boundary Behavior Scenarios have been cataloged and the Interface has evolved, let’s finally make this a FIFO as we define the first scenario for Many contained items.

There are a couple things to consider at this step. How should we dynamically size the array to hold the values? We could do two allocations or one. If we are using the heap, we better make sure we put the allocated memory back when the CircularBuffer is destroyed. Then we also have to save and retrieve the value in a First-In First-Out manner.

Keeping it Simple, let’s make those changes one at a time. First FIFO, then dynamic allocation. For now, we can hard code the size of the values array. It will be a little easier to keep the code working with a single change. Repeat after me: It’s easier to keep code working than to fix it after you break it.
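A hedged sketch of the FIFO-first step could look like this. The array size is hard coded, there is no wrapping, and there is no capacity yet. The constant `HARD_CODED_SIZE`, the function names, and the `check_...` helper are assumptions made for illustration, with `assert()` standing in for CppUTest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { HARD_CODED_SIZE = 10 }; /* assumption: any size big enough for the tests */

typedef struct CircularBuffer
{
    int index;                   /* where the next Put() stores */
    int outdex;                  /* where the next Get() reads */
    int values[HARD_CODED_SIZE]; /* fixed size for now; dynamic sizing comes later */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int default_value)
{
    (void)default_value; /* underflow behavior is a later Scenario */
    return calloc(1, sizeof(CircularBuffer));
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    return self->index == self->outdex;
}

void CircularBuffer_Put(CircularBuffer *self, int value)
{
    self->values[self->index++] = value; /* finally saving the value; no wrap yet */
}

int CircularBuffer_Get(CircularBuffer *self)
{
    return self->values[self->outdex++]; /* oldest value first; no wrap yet */
}

/* Many Scenario: values come out in the order they went in */
void check_put_get_is_fifo(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(-1);
    CircularBuffer_Put(buffer, 41);
    CircularBuffer_Put(buffer, 42);
    CircularBuffer_Put(buffer, 43);
    assert(CircularBuffer_Get(buffer) == 41);
    assert(CircularBuffer_Get(buffer) == 42);
    assert(CircularBuffer_Get(buffer) == 43);
    CircularBuffer_Destroy(buffer);
}
```

Using three distinct values is what forces the hard coded return out of Get().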

I’d like to add Boundary tests for IsFull(), but up to this point there is no notion of capacity. So let’s introduce capacity to the tests and code. It will be handy for callers to access the CircularBuffer’s capacity. Also we’ll have to add a capacity parameter to the Create() function. These tests drive the initial Interface, and the adding of capacity to Create().

Now the code is ready for dynamic allocation, given that 1) we already have a fixed size array working and 2) capacity has been introduced. There is not a really good way for a test to force the array to be allocated with malloc() or calloc(), so we treat it as a refactoring: changing the structure without changing the external behavior.

This step usually does not go too smoothly for people in my training. There are a lot of details to get right. I chose the single allocation implementation, where int values[] has to be the last member of the struct, and the malloc() size must take into account both the size of the struct and the space needed for capacity ints, not to mention that we can’t allow a memory leak. Thankfully, CppUTest has leak detection. You can see that changing from calloc() to malloc() has to be accompanied by explicit member variable initializations.
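The single allocation idea can be sketched as below, using a C99 flexible array member. This is an illustration under assumptions rather than the article’s code: the parameter order of Create(), the `CircularBuffer_Capacity()` accessor name, and the member names other than `values`, `index`, `outdex`, and `capacity` are hypothetical, and capacity is not validated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int capacity;
    int default_value;
    int index;
    int outdex;
    int values[]; /* flexible array member: must be the last member */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int capacity, int default_value)
{
    /* One allocation: the struct itself plus room for `capacity` ints.
       Assumption: the caller passes a positive capacity. */
    CircularBuffer *self =
        malloc(sizeof(CircularBuffer) + (size_t)capacity * sizeof(int));
    if (self == NULL)
        return NULL;

    /* malloc() does not zero memory, so every member is set explicitly */
    self->capacity = capacity;
    self->default_value = default_value;
    self->index = 0;
    self->outdex = 0;
    return self;
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self); /* one malloc(), one free(): nothing else to leak */
}

int CircularBuffer_Capacity(const CircularBuffer *self)
{
    return self->capacity;
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    return self->index == self->outdex;
}

void CircularBuffer_Put(CircularBuffer *self, int value)
{
    self->values[self->index++] = value; /* still no wrapping at this stage */
}

int CircularBuffer_Get(CircularBuffer *self)
{
    return self->values[self->outdex++];
}
```

Because the external behavior is unchanged, the existing FIFO and empty tests should keep passing across this refactoring; only the Create() calls in the tests need the new capacity argument.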

With capacity and dynamic sizing complete it is finally possible to completely fill the buffer.

fillItUp() is a helper function. It started life as an in-line for loop in the test case, primitively filling the buffer. I generally don’t like loops in unit tests. I’d rather read a test top to bottom as a scenario specification. Extracting fillItUp() from the test cleans up the test; fillItUp() could be handy for other tests too.

Here is a wrong IsFull() that works as long as we have not yet wrapped.
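A sketch of such a deliberately naive IsFull(), together with a fillItUp() helper, might look like the following. The comparison of `index` against `capacity`, the seed-based fill pattern, and the `check_...` helper are assumptions for illustration; `assert()` stands in for CppUTest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int capacity;
    int index;
    int outdex;
    int values[]; /* flexible array member, sized in Create() */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int capacity)
{
    /* calloc() zeroes index and outdex */
    CircularBuffer *self =
        calloc(1, sizeof(CircularBuffer) + (size_t)capacity * sizeof(int));
    if (self != NULL)
        self->capacity = capacity;
    return self;
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

void CircularBuffer_Put(CircularBuffer *self, int value)
{
    self->values[self->index++] = value; /* no wrapping yet */
}

bool CircularBuffer_IsFull(const CircularBuffer *self)
{
    /* Wrong in general: only true while index has never wrapped */
    return self->index == self->capacity;
}

/* Test helper: keeps the loop out of the test body */
void fillItUp(CircularBuffer *buffer, int seed)
{
    for (int i = 0; i < buffer->capacity; i++)
        CircularBuffer_Put(buffer, seed + i);
}

/* Boundary Scenario: a buffer filled to capacity reports full */
void check_is_full_after_filling(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(5);
    assert(!CircularBuffer_IsFull(buffer));
    fillItUp(buffer, 100);
    assert(CircularBuffer_IsFull(buffer));
    CircularBuffer_Destroy(buffer);
}
```

With fillItUp() extracted, the test reads top to bottom as a Scenario: create, fill, expect full.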

This implementation passes the test, but we know it won’t survive wrapping. This simple and wrong implementation makes me think of another Boundary Behavior that should be tested. Like we did with IsEmpty(), let’s transition away from being full.

CppUTest reports after writing this test that memory was corrupted. Because wrapping is not yet implemented, the int after the end of the allocated memory was overwritten. CppUTest overrides memory allocation and adds a guard value at the end of the allocated memory. If the guard is changed, CppUTest lets you know.

The existing tests help keep the code working during this change. It is a small change to Put() and Get().
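The small change to Put() and Get() can be sketched as follows: each index is reset to zero when it steps past the last cell. The names and the reduced struct are assumptions made so the example stands alone; they are not the article’s listing.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int capacity;
    int index;
    int outdex;
    int values[]; /* flexible array member, sized in Create() */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int capacity)
{
    /* calloc() zeroes index and outdex */
    CircularBuffer *self =
        calloc(1, sizeof(CircularBuffer) + (size_t)capacity * sizeof(int));
    if (self != NULL)
        self->capacity = capacity;
    return self;
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

void CircularBuffer_Put(CircularBuffer *self, int value)
{
    self->values[self->index++] = value;
    if (self->index >= self->capacity)
        self->index = 0;  /* wrap: never write past the last cell */
}

int CircularBuffer_Get(CircularBuffer *self)
{
    int value = self->values[self->outdex++];
    if (self->outdex >= self->capacity)
        self->outdex = 0; /* wrap: never read past the last cell */
    return value;
}

/* Boundary Scenario: FIFO order survives wrapping past the end of the array */
void check_fifo_order_survives_wrap(void)
{
    CircularBuffer *buffer = CircularBuffer_Create(3);
    CircularBuffer_Put(buffer, 1);
    CircularBuffer_Put(buffer, 2);
    CircularBuffer_Put(buffer, 3);          /* index wraps to 0 here */
    assert(CircularBuffer_Get(buffer) == 1);
    CircularBuffer_Put(buffer, 4);          /* stored in the freed first cell */
    assert(CircularBuffer_Get(buffer) == 2);
    assert(CircularBuffer_Get(buffer) == 3);
    assert(CircularBuffer_Get(buffer) == 4);
    CircularBuffer_Destroy(buffer);
}
```

The wrapping logic is duplicated in Put() and Get(), which makes it a candidate for extraction into a helper later.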

That failure may have been surprising. It’s time to look at the sketch of a wrapped full buffer, and what it means to our current implementation.

After wrapping self->index and self->outdex are the same! Full and empty can’t be the same! That’s not logical.

During my training exercise, many programmers get stuck here trying to get IsFull() working using only self->index, self->outdex and self->capacity. I usually suggest they look for a Simple Solution that will work.

In training, I’ll sometimes provide this nudge: “How many items are in an empty buffer?”. “How many are in a full buffer?”

A simple counter will do. (There are other solutions.)

Here is the code just after introducing self->count for IsEmpty() and IsFull(). Notice I also extracted the duplicate wrapping logic out of Put() and Get() into nextIndex(), a local helper function.
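A counter-based version could be sketched like this. It is an assumption-laden illustration, not the article’s listing: Put() returning `bool` to signal rejection, the parameter order of Create(), and the exact signature of nextIndex() are all hypothetical choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int capacity;
    int default_value;
    int index;
    int outdex;
    int count;    /* number of items currently held */
    int values[]; /* flexible array member, sized in Create() */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int capacity, int default_value)
{
    /* calloc() zeroes index, outdex and count */
    CircularBuffer *self =
        calloc(1, sizeof(CircularBuffer) + (size_t)capacity * sizeof(int));
    if (self != NULL)
    {
        self->capacity = capacity;
        self->default_value = default_value;
    }
    return self;
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    return self->count == 0;
}

bool CircularBuffer_IsFull(const CircularBuffer *self)
{
    return self->count == self->capacity;
}

/* Local helper: the wrapping logic shared by Put() and Get() */
static int nextIndex(const CircularBuffer *self, int i)
{
    return (i + 1 >= self->capacity) ? 0 : i + 1;
}

bool CircularBuffer_Put(CircularBuffer *self, int value)
{
    if (CircularBuffer_IsFull(self))
        return false; /* overflow: the new value is rejected */
    self->values[self->index] = value;
    self->index = nextIndex(self, self->index);
    self->count++;
    return true;
}

int CircularBuffer_Get(CircularBuffer *self)
{
    if (CircularBuffer_IsEmpty(self))
        return self->default_value; /* underflow: return the Create() default */
    int value = self->values[self->outdex];
    self->outdex = nextIndex(self, self->outdex);
    self->count--;
    return value;
}
```

Answering “how many items are in an empty buffer?” with 0 and “how many are in a full buffer?” with capacity removes the ambiguity that index and outdex alone could not resolve.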

Hmmm, while we are exploring things that can go wrong, let’s make sure putting to full and getting from empty do not harm the buffer’s integrity. These are belt and suspenders tests; I don’t really expect them to fail.

Notice how this test is not as simple as the earlier tests. We try to make all the tests Simple, though some thwart that goal. This test could be split into several tests, but I don’t think in this case it helps much.

What else can go wrong? Are there other Exceptional or abusive Scenarios? Could wrong parameters be passed to the CircularBuffer functions? A quick review suggests tests may be warranted for these abuse Scenarios:

Create with a zero or negative length

Passing in a NULL pointer where a buffer is expected

Running out of heap

How to react to these is outside the scope of this article. You’d have to consider these for your application.

The First Bug Fix!

Let’s toss a curve ball at the CircularBuffer. What if we discovered that this CircularBuffer had to be populated from an interrupt routine and read by an application task?

You may think nothing of it, but then again you might. Or you will get some very mysterious behavior using the CircularBuffer in that concurrent environment. Put() and Get() have a shared variable (self->count)! There is a race condition! This code will eventually experience a catastrophic failure because Put() and Get() are not atomic operations. self->count will eventually be corrupted.

Doing some research we come across this Wikipedia article on circular buffers. There is a solution where, instead of a shared counter, Put() is the only function to change self->index and Get() is the only function to change self->outdex. The algorithm requires that there be an extra cell in values[], while Put() will consider the buffer full when self->index gets within one cell of self->outdex.

This sounds like a significant change to the algorithm, but we have tests to notify us of any test scenarios that break during this Exceptional refactoring.
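The extra-cell approach can be sketched as below. This is a hedged illustration with assumed names (`cells`, the `bool` return of Put(), the nextIndex() signature). It shows only the indexing logic: a real interrupt/task deployment would also need to consider `volatile` or atomic access and memory ordering on the target, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct CircularBuffer
{
    int cells;         /* capacity + 1: one cell always stays unused */
    int default_value;
    int index;         /* written only by Put() */
    int outdex;        /* written only by Get() */
    int values[];      /* flexible array member, sized in Create() */
} CircularBuffer;

CircularBuffer *CircularBuffer_Create(int capacity, int default_value)
{
    /* calloc() zeroes index and outdex; note the extra cell */
    CircularBuffer *self =
        calloc(1, sizeof(CircularBuffer) + ((size_t)capacity + 1) * sizeof(int));
    if (self != NULL)
    {
        self->cells = capacity + 1;
        self->default_value = default_value;
    }
    return self;
}

void CircularBuffer_Destroy(CircularBuffer *self)
{
    free(self);
}

static int nextIndex(const CircularBuffer *self, int i)
{
    return (i + 1 >= self->cells) ? 0 : i + 1;
}

bool CircularBuffer_IsEmpty(const CircularBuffer *self)
{
    return self->index == self->outdex;
}

bool CircularBuffer_IsFull(const CircularBuffer *self)
{
    /* Full when index is within one cell of outdex */
    return nextIndex(self, self->index) == self->outdex;
}

bool CircularBuffer_Put(CircularBuffer *self, int value)
{
    if (CircularBuffer_IsFull(self))
        return false;
    self->values[self->index] = value;
    self->index = nextIndex(self, self->index); /* the only write to index */
    return true;
}

int CircularBuffer_Get(CircularBuffer *self)
{
    if (CircularBuffer_IsEmpty(self))
        return self->default_value;
    int value = self->values[self->outdex];
    self->outdex = nextIndex(self, self->outdex); /* the only write to outdex */
    return value;
}
```

Because the public behavior is the same as the counter-based version, the same Scenarios (empty, full, overflow, underflow, wrap around, FIFO order) can be run unchanged against it.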

That went really smoothly. Even though we changed how some of the key decisions in CircularBuffer were implemented, in the end we could take the changes in stride. Early behaviors were easy to get right and keep right as the actual final solution took form.

Final Thoughts

TDD guided by ZOMBIES helps me make progress in growing the behavior of code I am working on. It changed how I program long ago. Instead of writing out whole files and functions and then figuring out what is wrong, I define one behavior at a time and implement it as simply as possible, moving the code closer to the end goal as I envision it. Keeping the code working the whole time I am changing it. Hand in hand with that is refactoring when I see a way to make the code better. Again, keeping the code working because it is easier to keep a system working than to fix it after you break it.

Let me know if you would like to see the progression of the code. I could post that in another article or on GitHub.

Footnotes

[ZOM] When I wrote Test-Driven Development for Embedded C, I described a behavioral pattern of Test-Driven Developers that I called 0,1,N.. My friend Tim Ottinger said, oh yeah, ZOM. There are only three numbers important in computing: Zero, One and Many. Zero and One are special cases. Many is the first generalization. ZOM has been helping me for years as I practiced and taught TDD. Thanks Tim! [back]

[DTSTTCPW] Kent Beck showed this handy unpronounceable acronym to me back in 1999: DTSTTCPW. Spelling it out: Do The Simplest Thing That Could Possibly Work. Thanks Kent! [back]

10 thoughts on “TDD Guided by ZOMBIES”

Hello James,
Thank you very much for your post. I have a couple of questions about it.

Do you think ZOMBIES could be used as a guide for acceptance tests?

I have another question, unrelated to the main topic of the post. I liked the way in which you declared the CircularBuffer, hiding the definition of the structure from the interface. However, hiding the definition forces you to use CircularBuffer always with a pointer, using calloc and free each time and preventing you from declaring CircularBuffer variables by value (without a pointer). Do you think it’s a good approach to expose the definition of the CircularBuffer to allow declarations by value?

It is possible to implement a circular buffer like this without using dynamic allocation and still hiding the definition within the .c file.

The issue is: How can an external module statically declare a data structure, of the correct size, without knowledge of the structure’s internals? I think this is clearly an impossibility – isn’t it?

What I like to do in this circumstance is to pass in to CircularBuffer_Create a pointer to the memory to be used for the buffer, and the size of the buffer. CircularBuffer_Create can then internally validate that the passed in size is correct. If the size passed in is incorrect, an error would be returned from the create, in the same way that I would return an error if the calloc failed.
The one other step I would take would be to add a macro to the header that calculates the memory required. This would not use the CircularBuffer data structure in the macro, but would instead have to replicate the compiler’s rules on alignment, padding and type size.

Like any code I would develop the new CircularBuffer_Create and macro step by step using TDD. That way it is possible to verify that the macro is calculating the correct size without ever exposing the declaration of Circular buffer.

It would make quite a nice little practice task to use TDD to develop a CircularBuffer this way. What better way to try out ZOMBIES?

There may be a couple other things to do. One simple convention would be to also define CircularBufferPrivate.h, letting the struct be declared there, and implementing some macro like this in CircularBuffer.h:

Another thing that could be done is to configure a block of memory that all CircularBuffers could draw upon. In production, in your scenario, the individual circular buffers would never be freed. Depending on how much you really want to trim memory to the bone, you could.

I am sure we could come up with some other ideas.

To keep the code generic, I’d probably prefer the CircularBufferPrivate.h idea and warn users not to depend on stuff in CircularBufferPrivate.h.

Regarding what David commented, I’ve seen this approach in some POSIX implementations
to hide the internals of its types. For example, in the glibc implementation, pthread_mutex_t
is declared using a placeholder whose size is defined by

```C
#define __SIZEOF_PTHREAD_MUTEX_T 40
```

If I’m not wrong, I think this directive is generated via scripting. I think it’s a useful
way of hiding the internals in some cases, but I’ll find it a bit cumbersome if used regularly.

There are a number of reasons why I think being able to declare a variable statically is useful.
In general it’s because you can avoid the dreaded malloc. I posted an article where I mention a solution for polymorphism for C that you may find interesting https://softwaredailycrafts.com/2016/09/30/a-polimorphism-solution-c/ that takes advantage of this property.

Thank you for the talk in Copenhagen, and this article as an extension. It’s actually really useful, since I also used to start with “problems” (boundaries and such), like you say most trainees struggle with.

Even if C is not my strong suit, it comes across as easy to understand, and you arrive at the safety zone (where you trust your code because of tests) much earlier than I usually do, due to the simple early tests.