
Testing State Machines

State machines are a common design pattern. Matthew Jones separates their concerns to make testing easier.

Anyone can code up a state machine, but can you make such a machine fully testable? Can you prove that it is? Can you do this repeatably? In other words, can you present an inquisitor with a test suite that proves your state machine fully implements a given state transition diagram?

Depending on your background, you might choose to write your state machine using the C switch() idiom, with a #define per state, or better still, an enum. You might go so far as to have a function per state, with another switch() for each input event. It meets the requirements and it will probably work. But you know you can do better than that.

You might decide to go for a fully fledged GoF [Gamma] pattern-based design with a class per state and maybe a state factory. Now you are in the familiar territory of 'proper' OO and patterns. If you take this approach, surely the result will be perfect.

The trouble with testing

Whichever approach is taken, most people will naturally include code, in the state machine, that is concerned with implementing the outputs of the states. It makes a lot of sense, and it is the path of least resistance. The effort has already been put into decoding the state and handling the new input event. Having got that far it is very easy to simply add a line or two of code to finish the job, i.e. implement the 'action' part of the transition. While this code might be trivial, its impact is often not so. In the worst case (e.g. an embedded system), it might involve writing to hardware, turning on motors, lighting lights etc. Whatever the context, the state machine will be exposed to the application, and will therefore have dependencies on that application code. And this in turn makes the testing complicated.
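To make the problem concrete, here is a minimal sketch of that 'traditional' style, trimmed to two states of the security light used later in this article. The names `LAMP_PORT`, `LightState`, `LightEvent` and `HandleEvent` are invented for illustration. On a real target `LAMP_PORT` would be a memory-mapped register rather than an ordinary variable, which is exactly what stops this function from running in a desktop test harness.

```cpp
#include <cstdint>

// Stand-in for a memory-mapped output port (hypothetical name). On real
// hardware this would be a fixed address, not an ordinary variable.
volatile std::uint8_t LAMP_PORT = 0;

enum class LightState { Off, Moving };
enum class LightEvent { Movement, NoMovement };

// Transition logic and output code are interleaved: the function cannot be
// exercised without whatever LAMP_PORT really is.
LightState HandleEvent(LightState state, LightEvent event) {
    switch (state) {
    case LightState::Off:
        if (event == LightEvent::Movement) {
            LAMP_PORT = 1;              // action: drive the lamp directly
            return LightState::Moving;
        }
        break;
    case LightState::Moving:
        if (event == LightEvent::NoMovement) {
            LAMP_PORT = 0;              // action: again tied to the hardware
            return LightState::Off;
        }
        break;
    }
    return state;                       // unhandled events leave the state unchanged
}
```

The only way to check a transition here is to look at the port afterwards, so every test depends on the hardware, or on something pretending to be it.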

To test a 'traditional' state machine, i.e. one where the output code is mixed with the state transition code, typically you would have to run the application, stimulate it somehow, and look for secondary evidence that the state machine is working. You might even have to resort to primary evidence: good old printf(). It is even worse in the embedded scenario: you could be forced to have real equipment, or an adequate simulation, just to run the code. This is obviously not good. You will have to do an awful lot of work to produce any form of automated test. TDD will be hard because of the intrusion of the application. In the worst cases, where 'application' involves hardware, repeatable testing could even be impossible. At this point we would naturally give up on the goal of automated testing and resort to the bad old ways of testing the code once, manually, declaring it fit, and never going back. And with this approach comes the inevitable fear of later changes to that area since it is rightly considered fragile.

There are any number of tricks to get round this problem, but they will all emit 'bad smells' [Fowler]. You might stub the application by substituting a test version of application.cpp. You might add test instrumentation to the application. It might even be conditionally compiled so you can switch it off in the 'real' system. These are all poor solutions and will have you tying yourself in knots of test-only code, which will pollute the deliverable code and make it hard to read, understand, and maintain.

So by starting out innocently enough and harmlessly mixing state machine logic with the application, you can easily end up seriously compromising your development. But there is a better way. Of course there is: there are probably many, and if you are already a master or mistress of testable state machines, congratulations: stop reading now. For everyone else, the rest of this article describes an approach I developed recently. The background to this work is embedded software, and so the problem of testing a state machine is far more apparent than where hardware is not involved.

What is a state machine?

If we ask ourselves 'what is a state machine?', the answer (in a software context) should be something like 'code that manages the state of something, responding to external events, and translating them into actions to be implemented by the system'. State transitions will result in output actions that are communicated to the system, but only in an abstract, or event-like way. The detail of carrying out these actions is not part of the state machine, because 'detail' implies exposure of the state machine to knowledge of the application. It is this last point that is usually overlooked, leading to the blurring of the state machine and the application. It might appear to be somewhat picky, but if we allow the state machine to do two jobs (state management and controlling the application), we lose separation of concerns [Wikipedia1] and reduce cohesion [Wikipedia2].

This stripped down definition translates perfectly to an object oriented approach: we have an interface describing the input events, and an interface describing the output actions. The state machine implements the events interface, and the application implements the actions interface. It really can be as simple as that. See Figure 1.

Figure 1

We have isolated the state machine from the application with two interfaces: Events and Actions. This is one of the fundamental principles of good design: partitioning [Griffiths]. We reduce the coupling between the state machine and the application to two simple interface classes. This allows us to test the state machine with mock, or test, objects [Mackinnon]. Later, we can implement the 'proper' version of the interface in all its application-ridden glory, safe in the knowledge that the state transition logic is perfect. It also allows the application to be tested with a mock state machine, should we wish to, by substituting the implementation of the Events interface.

An example

At this point we need to introduce an example and start talking in more direct terms. Figure 2 shows a state transition diagram for an external security light. The example is obviously trivial but I tried working through a few larger ones (e.g. 10 states) and it quickly turns from a useful example to 500+ lines of code showing most of a real system. Crucially, this example also includes interaction with hardware, so that a traditional implementation would require manual testing.

Figure 2

The security light moves between two high level states: day, when the lamp is off; and night, when the lamp is controlled by a movement sensor. Transition between these states is controlled by an ambient light level sensor. In the night state, when movement is sensed, the lamp is turned on and a timer is started. When movement ceases, the lamp is turned off by the timer. Note that although the sensors and timers might have thresholds, or return variable readings, in the realm of this state machine they are reduced to valueless events.

To turn this into code, we need four main classes: the Events interface, the Actions interface, the StateMachine and a State base class.

The Events interface declares the events that stimulate the state machine. These are the state machine inputs.

The Actions interface declares the actions that the state machine may cause. These are the state machine outputs.
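As a rough sketch, the two interfaces for the security light might look like the following. The member names are assumptions derived from the event and action names used in the text (dark, movement, no_movement, timeout; lamp_on, lamp_off, start_timer). Only `LampOn` is named explicitly later in the article, and `Light` is a guess at the name of the event that returns the light to the day state.

```cpp
// Inputs: one parameterless member per event on the state transition diagram.
class Events {
public:
    virtual ~Events() = default;
    virtual void Dark() = 0;        // ambient light sensor: it has become dark
    virtual void Light() = 0;       // ambient light sensor: it has become light
    virtual void Movement() = 0;    // movement sensor triggered
    virtual void NoMovement() = 0;  // movement sensor cleared
    virtual void Timeout() = 0;     // lamp timer expired
};

// Outputs: one parameterless member per action the state machine may request.
class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
    virtual void StartTimer() = 0;
};
```

Every member is `void (void)`: sensor readings and thresholds have already been reduced to plain events before they reach the state machine.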

The State class is the base class from which all states are derived. It inherits the Events interface because every state must be able to react to every event. There are a lot of details missing here. (If you are really interested, the fully worked example is available here: [TestableStateMachines.zip]). For instance the state must have some way to change to a new state. In practice each state should be constructed with a StateContext, which includes a StateFactory for creating new states; a StateChanger, to allow the new state to be passed to the state machine; and an Actions instance. There is one important detail, though, and that is that all State classes are themselves state-less. They are constructed with sufficient context to function, but no more. It might be that in a more complex system this would not be practical, but in this example, and all my real world implementations so far, it has held true. Incidentally, having stateless State classes also simplifies the problem of creating and changing state: one permanent instance of each State can be created by the StateFactory, and repeatedly handed out when required. There is no need to create new objects dynamically.

The StateMachine class brings everything together. It inherits the Events interface so that the application can signal events to it. Every event is delegated to the current State. This is the classic State pattern [Gamma]. The StateMachine must be constructed with an Actions instance. The Actions instance is added to the StateContext (not shown) which is passed to every State on construction.
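The relationship between State, StateContext and StateMachine can be sketched as follows. This is a trimmed illustration rather than the article's listing: it covers three states and three events only, it folds the StateFactory into the StateMachine (which simply owns one permanent instance of each State), and `StateId`, the member names and the exact shape of `StateContext` are assumptions.

```cpp
class Events {
public:
    virtual ~Events() = default;
    virtual void Dark() = 0;
    virtual void Light() = 0;
    virtual void Movement() = 0;
};

class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
};

enum class StateId { Day, Off, Moving };

// Lets a State request a transition without knowing about StateMachine.
class StateChanger {
public:
    virtual ~StateChanger() = default;
    virtual void ChangeState(StateId next) = 0;
};

// Everything a State needs in order to act. States hold no other data.
struct StateContext {
    Actions& actions;
    StateChanger& changer;
};

// Base class: every state accepts every event; the default is to ignore it.
class State : public Events {
public:
    explicit State(StateContext& context) : context_(context) {}
    void Dark() override {}
    void Light() override {}
    void Movement() override {}
protected:
    StateContext& context_;
};

class DayState : public State {
public:
    using State::State;
    void Dark() override { context_.changer.ChangeState(StateId::Off); }
};

class OffState : public State {
public:
    using State::State;
    void Light() override { context_.changer.ChangeState(StateId::Day); }
    void Movement() override {
        context_.actions.LampOn();
        context_.changer.ChangeState(StateId::Moving);
    }
};

class MovingState : public State {
public:
    using State::State;
    // Transitions out of Moving are omitted from this trimmed sketch.
};

class StateMachine : public Events, public StateChanger {
public:
    explicit StateMachine(Actions& actions)
        : context_{actions, *this}, day_(context_), off_(context_),
          moving_(context_), current_(&day_) {}

    // Every event is delegated to the current State.
    void Dark() override { current_->Dark(); }
    void Light() override { current_->Light(); }
    void Movement() override { current_->Movement(); }

    StateId ReportState() const { return id_; }

    void ChangeState(StateId next) override {
        id_ = next;
        switch (next) {
        case StateId::Day:    current_ = &day_;    break;
        case StateId::Off:    current_ = &off_;    break;
        case StateId::Moving: current_ = &moving_; break;
        }
    }

private:
    StateContext context_;
    DayState day_;          // one permanent instance per state, reused forever
    OffState off_;
    MovingState moving_;
    State* current_;
    StateId id_ = StateId::Day;
};
```

Because the State objects carry nothing but a reference to the shared context, changing state is just a pointer swap and nothing is allocated at run time.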

Given this framework, and a number of helper classes already alluded to, we can concentrate on implementing the state transition diagram correctly. The realisation of the state transition diagram is the implementation of the Events interface in each of the State classes. Given the State class hierarchy in Listing 2, the translation of Figure 2 into code is completed in Listing 3.

The simple example has turned into two interfaces and eight classes. It should already be obvious that there is one thing missing: the application, and this is precisely the point of this whole approach. Describe and write the state machine in terms of state transition logic and nothing more.

Testing the example

All we need to do now is write a test implementation of the Actions interface, and then we can start some serious testing. Listing 4 shows one way to do this.
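A test implementation of this kind can be very small. The sketch below assumes an Actions interface with the three members `LampOn`, `LampOff` and `StartTimer`, and assumes that the recorded actions are stored as strings. The member name `v` follows the `TestActions::v` mentioned in the text; the element type is a guess.

```cpp
#include <string>
#include <vector>

class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
    virtual void StartTimer() = 0;
};

// Test double: records each requested action instead of touching hardware.
class TestActions : public Actions {
public:
    void LampOn() override { v.push_back("lamp_on"); }
    void LampOff() override { v.push_back("lamp_off"); }
    void StartTimer() override { v.push_back("start_timer"); }

    std::vector<std::string> v;     // actions seen so far, in call order
};
```

A test can then compare `v` with the expected list of actions after each event, with no lamp, timer or sensor in sight.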

In our test harness, we are now able to construct a StateMachine and pass in a TestActions object. We can then devise a set of state transition tests, run them, and inspect the contents of TestActions::v. The term 'devise' is rather strong, in fact: since we should be pedantic and test all the inputs to all the states, there isn't much to tax the imagination. In other words we should extract a complete state transition table from the code and compare this to what is expected. Listing 5 shows an abbreviated version of such a harness. It makes assumptions about a number of features to aid testability, such as StateMachine::ChangeState() and StateMachine::ReportState(). Although it is clearly excessive to test such a simple example, it scales very well to realistic levels of complexity. The important point to note is that given the enum, struct, and helper functions, main() is straightforward, clear, and self-documenting. With a bit more effort, the helper functions can also output helpful information when tests fail, helping debugging.
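The shape of such a table-driven harness might be sketched like this. To keep the block self-contained, the StateMachine here is a compact switch-based stand-in rather than the State-pattern implementation, and events are injected through a hypothetical `Inject(EventId)` member instead of one member per event. The transitions are inferred from the prose description of the security light; the `Light` transitions out of Moving and Timing in particular are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
    virtual void StartTimer() = 0;
};

// Test double: records requested actions in call order.
class TestActions : public Actions {
public:
    void LampOn() override { v.push_back("lamp_on"); }
    void LampOff() override { v.push_back("lamp_off"); }
    void StartTimer() override { v.push_back("start_timer"); }
    std::vector<std::string> v;
};

enum class StateId { Day, Off, Moving, Timing };
enum class EventId { Dark, Light, Movement, NoMovement, Timeout };

// Compact stand-in for the machine under test (hypothetical; the article's
// version delegates to State classes instead of using a switch).
class StateMachine {
public:
    explicit StateMachine(Actions& actions) : actions_(actions) {}
    void ChangeState(StateId s) { state_ = s; }       // test aid: force a state
    StateId ReportState() const { return state_; }    // test aid: observe state

    void Inject(EventId e) {
        switch (state_) {
        case StateId::Day:
            if (e == EventId::Dark) state_ = StateId::Off;
            break;
        case StateId::Off:
            if (e == EventId::Light) state_ = StateId::Day;
            else if (e == EventId::Movement) { actions_.LampOn(); state_ = StateId::Moving; }
            break;
        case StateId::Moving:   // Light handling here is an assumption
            if (e == EventId::Light) { actions_.LampOff(); state_ = StateId::Day; }
            else if (e == EventId::NoMovement) { actions_.StartTimer(); state_ = StateId::Timing; }
            break;
        case StateId::Timing:   // Light handling here is an assumption
            if (e == EventId::Light) { actions_.LampOff(); state_ = StateId::Day; }
            else if (e == EventId::Timeout) { actions_.LampOff(); state_ = StateId::Off; }
            break;
        }
    }

private:
    Actions& actions_;
    StateId state_ = StateId::Day;
};

// One row of the expected state transition table.
struct Transition {
    StateId from;
    EventId event;
    StateId to;
    std::vector<std::string> actions;   // expected contents of TestActions::v
};

// Forces the start state, injects one event, then checks state and actions.
bool CheckTransition(const Transition& t) {
    TestActions actions;
    StateMachine machine(actions);
    machine.ChangeState(t.from);
    machine.Inject(t.event);
    return machine.ReportState() == t.to && actions.v == t.actions;
}

// Runs an abbreviated table and returns the number of failing rows.
std::size_t RunTransitionTests() {
    const std::vector<Transition> table = {
        {StateId::Day,    EventId::Dark,       StateId::Off,    {}},
        {StateId::Day,    EventId::Movement,   StateId::Day,    {}},
        {StateId::Off,    EventId::Light,      StateId::Day,    {}},
        {StateId::Off,    EventId::Movement,   StateId::Moving, {"lamp_on"}},
        {StateId::Moving, EventId::NoMovement, StateId::Timing, {"start_timer"}},
        {StateId::Moving, EventId::Timeout,    StateId::Moving, {}},
        {StateId::Timing, EventId::Timeout,    StateId::Off,    {"lamp_off"}},
    };
    std::size_t failures = 0;
    for (const Transition& t : table) {
        if (!CheckTransition(t)) ++failures;
    }
    return failures;
}
```

A complete table would have one row for every combination of state and event, including those that are expected to do nothing.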

At this point we have exhaustively tested our state machine, which is designed to control real hardware, in a software-only test harness. We can prove that it is a faithful implementation of the original design. Armed with this powerful approach to testing, we can start to write state machines with a new level of confidence.

When I was working all this out for the first time, I stopped at this point and offered up my 'perfect' new module of code for system testing. It worked, of course, but system testing revealed a number of subtle defects in the design of the state machine itself. I had perfectly implemented a flawed design, and I could prove it. Fortunately, the solution was close at hand.

Testing transition sequences

The test mechanism can be very easily extended to provide a second extremely useful facility. If more than one input event is allowed in a test vector, and it tests more than one expected output action, we can test sequences of transitions. This means we can test what amounts to use cases for the state machine. For example a single sequence test might be:

day --> off(night) --> moving --> timing --> off --> day

For realistic levels of complexity this testing offers more value than simply proving correct transition logic. Of course we would still retain the simple transition tests. This is what I did for my development system: I worked out the normal, and abnormal, routes round the state transition diagram, and added them to the test harness. And the problems jumped out immediately in the form of unexpected actions. Although each transition on the diagram seemed right, I had not worked through real examples, and the results of combinations of transitions. The original design allowed the state machine to get the application into an illegal state. But now that we had automatic transition and use case tests, it was very easy to change the design, and then prove it again.

Like all good examples, our simple security light has a bug, and the tests in Listing 5 do not reveal it. A carefully chosen extended sequence test would show that it does not restart the timer each time movement is detected while the lamp is still on. There should be a transition from timing to moving for the movement event. For example the sequence of events dark, movement, no_movement, movement, no_movement, timeout (i.e. a second movement while the lamp was still on) would result in lamp_on, start_timer, lamp_off, when it should cause lamp_on, start_timer, start_timer, lamp_off. Therefore we find that we need to add:
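A minimal sketch of that extra transition, and of a sequence test that exposes the need for it, is given below. It uses a compact switch-based stand-in rather than the State classes, and the constructor flag `restart_on_movement` is purely a device for showing the behaviour with and without the fix side by side. `RunSequence`, `Inject`, `StateId` and `EventId` are hypothetical names.

```cpp
#include <string>
#include <vector>

class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
    virtual void StartTimer() = 0;
};

// Test double: records requested actions in call order.
class TestActions : public Actions {
public:
    void LampOn() override { v.push_back("lamp_on"); }
    void LampOff() override { v.push_back("lamp_off"); }
    void StartTimer() override { v.push_back("start_timer"); }
    std::vector<std::string> v;
};

enum class StateId { Day, Off, Moving, Timing };
enum class EventId { Dark, Light, Movement, NoMovement, Timeout };

// Compact stand-in machine; only the transitions needed here are modelled.
class StateMachine {
public:
    StateMachine(Actions& actions, bool restart_on_movement)
        : actions_(actions), restart_on_movement_(restart_on_movement) {}

    void Inject(EventId e) {
        switch (state_) {
        case StateId::Day:
            if (e == EventId::Dark) state_ = StateId::Off;
            break;
        case StateId::Off:
            if (e == EventId::Light) state_ = StateId::Day;
            else if (e == EventId::Movement) { actions_.LampOn(); state_ = StateId::Moving; }
            break;
        case StateId::Moving:
            if (e == EventId::NoMovement) { actions_.StartTimer(); state_ = StateId::Timing; }
            break;
        case StateId::Timing:
            if (e == EventId::Timeout) { actions_.LampOff(); state_ = StateId::Off; }
            // The added transition: movement while timing returns to Moving,
            // so that the next NoMovement restarts the timer.
            else if (e == EventId::Movement && restart_on_movement_) state_ = StateId::Moving;
            break;
        }
    }

private:
    Actions& actions_;
    bool restart_on_movement_;
    StateId state_ = StateId::Day;
};

// Feeds a whole sequence of events in, starting from Day, and returns the
// actions the machine requested along the way.
std::vector<std::string> RunSequence(const std::vector<EventId>& events,
                                     bool restart_on_movement) {
    TestActions actions;
    StateMachine machine(actions, restart_on_movement);
    for (EventId e : events) machine.Inject(e);
    return actions.v;
}
```

Running the sequence dark, movement, no_movement, movement, no_movement, timeout with the flag off yields lamp_on, start_timer, lamp_off; with the flag on it yields lamp_on, start_timer, start_timer, lamp_off.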

Dealing with values

The overriding theme throughout has been to keep the application at arm's length. Reducing the world outside the state machine to void (void) actions and events is an extreme simplification. In many cases it might appear to be a step too far. What about actions that need parameters? What about events that carry information? I would argue that a state machine deals with logic, not quantities. The code immediately surrounding the state machine, its immediate context, needs to deal with these quantities, and translate them to events and from actions on behalf of the state machine.

A very contrived example might be that our security light should control the brightness of the lamp according to the speed of movement. This simple control function would sit outside the state machine, storing speed and converting it to brightness when LampOn is called.
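One way to picture this is an Actions implementation that holds the quantity on the state machine's behalf. In the sketch below, `LampController`, `SetSpeed`, `Brightness` and the speed-to-brightness mapping are all invented for illustration; the only point being made is that the state machine still calls a parameterless `LampOn`.

```cpp
#include <algorithm>

class Actions {
public:
    virtual ~Actions() = default;
    virtual void LampOn() = 0;
    virtual void LampOff() = 0;
    virtual void StartTimer() = 0;
};

// Sits between the state machine and the lamp driver (hypothetical class).
class LampController : public Actions {
public:
    // Called by the sensor-handling code, never by the state machine.
    void SetSpeed(double metres_per_second) { speed_ = metres_per_second; }

    // The state machine only says "lamp on"; the quantity is applied here.
    void LampOn() override { brightness_ = BrightnessFor(speed_); }
    void LampOff() override { brightness_ = 0; }
    void StartTimer() override {}       // timer handling omitted from the sketch

    int Brightness() const { return brightness_; }   // percent, 0 to 100

private:
    // Illustrative mapping only: 20% at rest, rising linearly to 100% at 2 m/s.
    static int BrightnessFor(double speed) {
        const double clamped = std::min(std::max(speed, 0.0), 2.0);
        return static_cast<int>(20.0 + 40.0 * clamped);
    }

    double speed_ = 0.0;
    int brightness_ = 0;
};
```

The mapping can be unit tested on its own, and the state machine's transition tests remain exactly as they were.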

It would be feasible to allow properly encapsulated application logic inside the state machine, but validating the outputs would turn a simple test harness into a monster. I suspect the resulting pressure to revert to the bad old ways would be great when faced with such a complex task.

Further work

The example above has rather a lot of code for such a simple state machine. This is because it is a condensed version of a real implementation that was complex enough to warrant that approach. Now that we have a complete regression test harness, we could easily, and safely, refactor it into a leaner and more concise version.

Something I have not tried yet is to add the Actions and Events interfaces to an existing state machine as a way to instrument it and help to bring it under better coverage of unit tests. Once a full set of transition tests has been written they provide enough of a safety net to allow refactoring.

Conclusions

This all started as an innocent attempt to write a 'nice clean' state machine using principles of coding to interfaces, and good separation of the roles of classes. It quickly turned into a revelation that 'there is a better way' to approach state machines and their testing in general. We all think we know how to write a state machine, but it is healthy to challenge this every now and then.

In the embedded world that spawned this work, faulty state machines are often the root cause of defects. It is the inability to test them effectively and repeatably that is the root cause of their unreliability. Eliminating this problem yields a significant improvement in the intrinsic quality of the code. By lifting the state machine up to the same level as more general application code, to which TDD is easily applied, it is no longer a poor relation, and can be treated equally.

I have only applied this technique a couple of times so far, but with no problems. I would be foolish to assume it will always work, but I look forward to confirming the assumption.