
Recently I’ve been looking into Specification by Example, which people keep defining to me as BDD done the right way. Fully implemented, Specification by Example includes the idea of an executable specification, a concept that has led me back to FitNesse after giving it the cold shoulder for the last six or seven years.

I’ve always thought of FitNesse as a great idea but I struggled to see how to use it correctly when as a developer I was mainly focused on continuous delivery and Dev Ops. I didn’t see where in the development cycle it fit or how tests in a wiki could also be a part of a CD pipeline. Revisiting FitNesse with a focus on Specification by Example gave me the opportunity to work some of this out and I think I’m very likely to suggest using this tool in future projects.

Before it’s possible to talk about FitNesse in a CI environment, there are a few basics to master. I don’t want to go into a full breakdown of all functionality, but I do want to communicate enough detail to allow someone to get started. In this post I’ll concentrate on getting FitNesse working locally and using it to execute different classes of test. In a later post, I’ll cover how to make FitNesse work with CI/CD pipelines and introduce a more enterprise level approach.

This post will focus on using FitNesse with the Slim test runner against .NET code but should be relevant for a wider audience.

Installing FitNesse

Installing and running FitNesse locally is really easy and even if you’re intending to deploy to a server, getting things running locally is still important. Follow these steps:

Create a batch file to run FitNesse when you need it. My command looks like:

java -jar .\fitnesse-standalone.jar -p 9080

Once you’re running the FitNesse process, hit http://localhost:9080 (use whichever port you started it on) and you’ll find a lot of material to help you get to grips with things, including a full set of acceptance tests for FitNesse itself.

What FitNesse Isn’t

Before I get into talking about how I think FitNesse can be of use in agile development, I’d like to point out what it isn’t useful for.

Tracking Work

FitNesse is not a work tracking tool; it won’t replace a dedicated ticketing system such as Jira, Pivotal Tracker or TFS. Although it may be possible to see what has not yet been built by seeing what tests fail, that work cannot be assigned to an individual or team within FitNesse.

Unit Testing

FitNesse can definitely be used for unit testing and some tests executed from FitNesse will be unit tests, so this might seem a bit contradictory. When I say FitNesse isn’t for unit testing, I mean that it isn’t what a developer should be using for many unit tests. A lot of unit testing is focused on a much smaller unit than I would suggest FitNesse should be concerned with.

This test should never be executed from FitNesse. It’s a completely valid test but other than a developer, who cares? Tests like this could well be written in a TDD fashion right along with the code they’re testing. Introducing a huge layer of abstraction such as FitNesse would simply kill the developer’s flow. In any case, what would the wiki page look like?
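To make the granularity concrete, here is a minimal sketch of the sort of test meant here. Everything in it is hypothetical: UsernameFormatter and its test are invented for illustration, and NUnit is assumed to be referenced by the test project.

```csharp
using NUnit.Framework;

// Hypothetical helper, invented purely for this illustration.
public static class UsernameFormatter
{
    // Trims surrounding whitespace and lowercases the username.
    public static string Normalise(string username)
    {
        return username.Trim().ToLowerInvariant();
    }
}

[TestFixture]
public class UsernameFormatterTests
{
    [Test]
    public void Normalise_TrimsWhitespaceAndLowercases()
    {
        // A developer-level detail: nobody outside the team would specify this.
        Assert.That(UsernameFormatter.Normalise("  ASmith "), Is.EqualTo("asmith"));
    }
}
```

A test like this runs in milliseconds from the IDE and has no obvious audience on a wiki page, which is why it belongs with the code rather than in FitNesse.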

Deploying Code

FitNesse will execute code and code can do anything you want it to, including deploying other code. It is entirely possible to create a wiki page around each thing you want to deploy and have the test outcomes driven by successful deployments. But really, do you want to do that? Download Go or Ansible – they’re far better at this.

What FitNesse Is

OK, so now we’ve covered a few things that FitNesse is definitely not, let’s talk about how I think it can most usefully fit into the agile development process.

FitNesse closes the gap between specification and executable tests, creating an executable specification. This specification will live through the whole development process and into continued support processes. Let’s start with creating a specification.

Creating a Specification

A good FitNesse-based specification should be all about behaviour. Break down the system under test into chunks of behaviour in exactly the same way as you would when writing stories for a Scrum team. The behaviours that you identify should be added to a FitNesse wiki. Nesting pages for related behaviours makes a lot of sense. If you’re defining how a service endpoint behaves during various exception states, then a parent page of ‘Exception States’ with one child page per state could make some sense, but you aren’t forced to adhere to that structure and what works for you could be completely different.

Giving a definition in prose is great for communicating a generalisation of what you’re wanting to define and gives context for someone reading the page. What prose doesn’t do well is define specific examples – this is where Specification by Example can be useful. Work as a team (not just ‘3 amigos’) to generate sufficient specific examples of input and output to cover the complete behaviour you’re trying to define. The more people you have involved, the less likely you are to miss anything. Create a decision table in the wiki page for these examples. You now have the framework for your executable tests.

Different Types of Test

In any development project, there are a number of different levels of testing. Different people refer to these differently, but in general we have:

Unit Tests – testing a small unit of code ‘in process’ with the test runner

Component Tests – testing a running piece of software in isolation

Integration Tests – testing a running piece of software with other software in a live-like environment

UI Tests – testing how a software’s user interface behaves, can be either a component test or integration test

Load Tests – testing how a running piece of software responds under load, usually an integration test

Manual tests are by their nature not automated, so FitNesse is probably not the right tool to drive these. Also FitNesse does not natively support UI tests, so I won’t go into those here. Load tests are important but may require resources that would become expensive if run continuously, so although these might be triggered from a FitNesse page, they would probably be classified in such a way that they aren’t run constantly, perhaps under a top level wiki page such as ‘Behaviour Under Load’.

In any case, load tests are a type of integration test, so we’re left with three different types of test we could automate from FitNesse. So which type of test should be used for which behaviour?

The test pyramid was conceived by Mike Cohn and has become very familiar to most software engineers. There are a few different examples floating around on the internet, this is one:

This diagram shows that ideally there should be more unit tests than component tests, and more integration tests than system tests etc. This assertion comes from trying to keep things simple (which is a great principle to follow); some tests are really easy to write and to run whereas some tests take a long time to execute or even require someone to manually interact with the system. The preference should always be to test behaviour in the simplest way that proves success, but when we come to convince ourselves of a specific piece of behaviour, a single class of test may not suffice. There could be extensive unit test coverage for a class, but unless our tests also prove that the class is being called, we still don’t have our proof. So we’re left with the possibility that any given behaviour could require a mix of unit tests, component tests and integration tests to prove things are working.

Hooking up the Code

So, how do we get each of our different classes of tests to run from FitNesse? Our three primary types of test are unit, component and integration. Let’s look at unit tests first.

Unit Tests

A unit test executes directly against application code, in process with the test itself. External dependencies to the ‘unit’ are mocked using dependency injection and mocking frameworks. A unit test depends only on the code under test; calls will never be made to any external resources such as databases or file systems.

Let’s assume we’re building a support ticketing system. We might have some code which adds a comment onto a ticket.

This will look for a class called DoCommentsGetAdded, set the properties Comment, Username and CommentedOn and call the method CommentWasAdded() from which it expects a boolean. There is only one test line in this test which tests the happy path, but others can be added. The idea should be to add enough examples to fully define the behaviour but not so many that people can’t see the wood for the trees.

We obviously have to create the class DoCommentsGetAdded and allow it to be called from FitNesse. I added the Ticketing.CommentManager to a solution called FitSharpTest; I’m now going to add another class library to that solution called Ticketing.Test.Unit. I’ll add a single class called DoCommentsGetAdded.
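As a rough sketch of how such a fixture class can line up with the decision table described above: the table layout in the comment, the sample values and the stand-in CommentManager are all assumptions made for illustration, since the real Ticketing.CommentManager code isn’t reproduced here. Only the class name, the three properties and CommentWasAdded() come from the description.

```csharp
using System;
using System.Collections.Generic;

// Decision table assumed on the wiki page (values are illustrative only):
//
//   |DoCommentsGetAdded|
//   |Comment       |Username|CommentedOn|CommentWasAdded?|
//   |A test comment|asmith  |2016-01-01 |true            |

namespace Ticketing
{
    // Hypothetical stand-in for the real comment manager under test.
    public class CommentManager
    {
        private readonly List<string> _comments = new List<string>();

        public int Count { get { return _comments.Count; } }

        public void AddComment(string comment, string username, DateTime commentedOn)
        {
            if (string.IsNullOrWhiteSpace(comment))
                throw new ArgumentException("A comment is required.", "comment");
            _comments.Add(username + " @ " + commentedOn.ToString("s") + ": " + comment);
        }
    }
}

namespace Ticketing.Test.Unit
{
    // One instance is populated per table row: input columns set the
    // properties, and the column ending in '?' calls the method.
    public class DoCommentsGetAdded
    {
        public string Comment { get; set; }
        public string Username { get; set; }
        public DateTime CommentedOn { get; set; }

        public bool CommentWasAdded()
        {
            var manager = new Ticketing.CommentManager();
            int before = manager.Count;
            manager.AddComment(Comment, Username, CommentedOn);
            return manager.Count == before + 1;
        }
    }
}
```

The fixture holds no logic of its own beyond translating table cells into a call on the code under test, which keeps the wiki page readable and the technical detail in the assembly.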

Now make sure your solution is built (if you have referenced the debug assemblies then make sure you build in debug), save your changes in FitNesse and hit the ‘Test’ button at the top of the page.

There are of course lots of different types of tables you can use to orchestrate different types of tests. The full list is beyond the scope of this post but if you dip into the documentation under ‘Slim’ then you’ll find it quite easily.

The mechanism always remains the same, however – the diagram below outlines the general pattern.

This is not really any different to what you would see if you were to swap Slim for NUnit. You have the system driving the test (Slim or NUnit), the test assembly and the assembly under test. The difference here is that rather than executing the test within your IDE using ReSharper (or another test runner), you’re executing the test from a wiki page. This is a trade-off, but like I said at the beginning: not all tests should be in FitNesse; we’re only interested in behaviour which a BA might specify. There will no doubt be other unit tests executed in the usual manner.

Integration Tests

Integration tests run against a shared environment which contains running copies of all deployables and their dependencies. This environment is very like production but generally not geared up for high load or the same level of resilience.

The sequence diagram for unit tests is actually still relevant for integration tests. All we’re doing for the integration tests is making the test assembly call running endpoints instead of calling classes ‘in process’.

Let’s set up a running service for our ‘add comment’ logic and test it from FitNesse. For fun, let’s make this a Web API service running in OWIN and hosted with TopShelf. Follow these steps to get running:

Add a new command line application to your solution. Call it TicketingService (I’m using .NET 4.6.1 for this).

Right, run this service and you’ll see that you can add as many comments as you like and retrieve the count.

Going back to our sequence diagram, we have a page in FitNesse already, and now we have an assembly to test. What we need to do is create a test assembly to sit in the middle. I’m adding Ticketing.Test.Integration as a class library in my solution.

It’s important to be aware of what version of .NET FitSharp is built on. The version I’m using is built against .NET 4, so my Ticketing.Test.Integration will be a .NET 4 project.

I’ve added a class to the new project called DoCommentsGetAdded which looks like this:
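A minimal sketch of what such an integration fixture might look like follows. The service address and both routes are assumptions (the real ones depend on how the TicketingService was configured), and WebClient is used here simply because it is available to a .NET 4 project without extra packages.

```csharp
using System;
using System.Net;

namespace Ticketing.Test.Integration
{
    public class DoCommentsGetAdded
    {
        // Assumed address and routes; change these to match the running service.
        private const string CommentUrl = "http://localhost:9000/api/comment";
        private const string CountUrl = CommentUrl + "/count";

        public string Comment { get; set; }

        // Called by Slim for the 'CommentWasAdded?' column.
        public bool CommentWasAdded()
        {
            int before = ReadCount();
            PostComment(Comment);
            return ReadCount() == before + 1;
        }

        private static int ReadCount()
        {
            using (var client = new WebClient())
            {
                // Assumes the endpoint returns a bare integer.
                return int.Parse(client.DownloadString(CountUrl).Trim());
            }
        }

        private static void PostComment(string comment)
        {
            using (var client = new WebClient())
            {
                client.Headers[HttpRequestHeader.ContentType] = "application/json";
                client.UploadString(CommentUrl, "POST", ToJsonString(comment));
            }
        }

        // Wraps a value in quotes, escaping backslashes and double quotes only.
        public static string ToJsonString(string value)
        {
            return "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
        }
    }
}
```

Comparing the count before and after the call keeps the check independent of whatever comments other runs have already added to the shared service.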

Now, make sure the service is running and hit the ‘Test’ button in FitNesse.

Obviously, this test was configured to execute against our localhost, but it could just as easily have been configured to execute against a deployed environment. Also, the code in the test class isn’t exactly what I would call first class; this is just to get stuff working.

The important thing to take from this is that the pattern in the original sequence diagram still holds true, so it’s quite possible to test this behaviour as either a unit test or as an integration test, or even both.

Component Tests

A component test is like an integration test in that it requires your code to be running in order to test. Instead of your code executing in an environment with other real life services, it’s tested against stubs which can be configured to respond with specific scenarios for specific tests, something which is very difficult to do in an integration environment where multiple sets of tests may be running simultaneously.

I recently wrote a short post about using Mountebank for component testing. The same solution can be applied here.

There are so many things wrong with this code, but remember this is just for an example. The functioning of this controller depends on an HTTP endpoint which is not part of this solution. It cannot be tested by an integration test without that endpoint being available. If we want to test this as a component test, then we need something to pretend to be that endpoint at http://localhost:9999/ticketservice.

To do this, install Mountebank by following these instructions:

Make sure you have an up to date version of node.js installed by downloading it from here.

Our heavily hacked TicketController class will function correctly only if this imposter returns 201, anything else and it will fail.

Now, I’m going to list the code I used to get this component test executing from FitNesse from a Script Table. I’m very certain that this code is not best practice – I’m trying to show the technicality of making the service run against Mountebank in order to make the test pass.

I’ve added a new project to my solution called Ticketing.Test.Component and I updated the config.xml file with the correct namespace. I have two files in that project: one is called imposters.js, which contains the JSON payload for configuring Mountebank; the other is a new version of DoCommentsGetAdded.cs which looks like this:

A Script Table basically allows a number of methods to be executed in sequence and with various parameters. It also allows for ‘checks’ to be made (which are your assertions) at various stages – it looks very much like you would expect a test script to look.

In this table, we’re instantiating our DoCommentsGetAdded class, calling Setup() and passing the path to our imposters.js file, calling AddComment() to add a comment and then checking that CommentWasAdded() returns true. Then we’re calling TearDown().

Setup() and TearDown() are there specifically to configure Mountebank appropriately for the test and to destroy the Imposter afterwards. If you try to set up a new Imposter at the same port as an existing Imposter, Mountebank will throw an exception, so it’s important to clean up. Another option would have been to set up the Imposter in the DoCommentsGetAdded constructor and add a ~DoCommentsGetAdded() destructor to clean up – this would mean the second and last lines of our Script Table could be removed. Bear in mind, though, that .NET finalizers run whenever the garbage collector gets to them rather than at a predictable point, so the explicit TearDown() call is the more dependable way to free the port. I am a bit torn as to which approach I prefer, or whether a combination of both is appropriate. In any case, cleaning up is important if you want to avoid port conflicts.
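The steps described above can be sketched roughly as follows. The Script Table in the comment, the service URL and the file path are assumptions for illustration. The Mountebank calls rely on its REST API listening on the default port 2525, where a POST to /imposters creates an Imposter and a DELETE to /imposters/{port} removes it.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Script Table assumed on the wiki page (the file path is a placeholder):
//
//   |script    |DoCommentsGetAdded    |
//   |Setup     |C:\path\to\imposters.js|
//   |AddComment|A test comment        |
//   |check     |CommentWasAdded  |true|
//   |TearDown  |

namespace Ticketing.Test.Component
{
    public class DoCommentsGetAdded
    {
        private const string ImpostersUrl = "http://localhost:2525/imposters";
        private const int ImposterPort = 9999; // port the controller calls
        private const string ServiceUrl = "http://localhost:9000/api/comment"; // assumed

        private bool _added;

        // Registers the Imposter described by the JSON file with Mountebank.
        public void Setup(string imposterFile)
        {
            Send("POST", ImpostersUrl, File.ReadAllText(imposterFile));
        }

        public void AddComment(string comment)
        {
            string json = "\"" + comment.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
            try
            {
                Send("POST", ServiceUrl, json);
                _added = true;
            }
            catch (WebException)
            {
                // GetResponse() throws for 4xx/5xx responses and connection failures.
                _added = false;
            }
        }

        public bool CommentWasAdded()
        {
            return _added;
        }

        // Removes the Imposter so the port is free for the next run.
        public void TearDown()
        {
            Send("DELETE", ImposterUrl(ImposterPort), null);
        }

        public static string ImposterUrl(int port)
        {
            return ImpostersUrl + "/" + port;
        }

        private static void Send(string method, string url, string body)
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = method;
            if (body != null)
            {
                byte[] bytes = Encoding.UTF8.GetBytes(body);
                request.ContentType = "application/json";
                request.ContentLength = bytes.Length;
                using (Stream stream = request.GetRequestStream())
                {
                    stream.Write(bytes, 0, bytes.Length);
                }
            }
            using (request.GetResponse()) { }
        }
    }
}
```

Keeping Setup() and TearDown() as explicit table rows makes the clean-up visible on the wiki page, at the cost of two lines of plumbing in an otherwise behavioural script.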

OK, so run your service, make sure Mountebank is also running and then run your test from FitNesse.

Again, this works because we have the pattern of our intermediary test assembly sat between FitNesse and the code under test. We can write pretty much whatever code we want here to call any environment we like or to just execute directly against an assembly.

Debugging

I spent a lot of time running my test code from NUnit in order to debug and the experience grated because it felt like an unnecessary step. Then I Googled to see what other people were doing and I found that by changing the test runner from:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}

to:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\RunnerW.exe}

FitSharp helpfully popped up a .NET dialog box and waited for me to click ‘Go’ before running the test. This gives an opportunity to attach the debugger.

Summary

A high level view of what has been discussed here:

Three types of test which can be usefully run from FitNesse: Unit, Integration and Component.

FitNesse is concerned with behaviour – don’t throw NUnit out just yet.

The basic pattern is to create an intermediary assembly to sit between FitNesse and the code under test. Use this to abstract away technical implementation.

Clean up Mountebank Imposters.

You need FitSharp to run FitNesse against .NET code.

Debug with FitSharp by referencing the RunnerW.exe test runner and attaching a debugger to the process while its dialog box is waiting.

Not Quite Enterprise

In this first post on FitNesse, I’ve outlined a few different types of test which can be executed and listed code snippets and instructions which should allow someone to get FitNesse running on their machine. This is only a small part of the picture. Having separate versions of the test suite sat on everyone’s machine is not a useful solution. A developer’s laptop can’t be referenced by a CI/CD platform. FitNesse can be deployed to a server. There are strategies for getting tests to execute as part of a DevOps pipeline.

This has been quite a lengthy post and I think these topics along with versioning and multi-environment scenarios will be best tackled in a subsequent post.

I also want to take a more process oriented look at how tests get created, who should be involved and when. So maybe I have a couple of new entries to work on.

Probably not exactly how everyone structures their delivery pipelines but probably not that far off. It allows instant feedback on whether what a developer is writing actually works with the code other developers are writing. And that’s a really good thing. Unfortunately, it misses something…

Each environment (other than the developer’s own machine) is shared with other developers who are also deploying new code at the same time. So how do you get an integration test for component A that relies on component B behaving in a custom manner (maybe even failing) to run automatically, without impacting the people who are trying to build and deploy component B?

If we were writing a unit test we would simply inject a mocked dependency. Fortunately there’s now a fantastic piece of kit available for doing exactly this but on an integration scale: enter Mountebank.

This clever piece of kit will stand in for a remote service over a range of protocols (HTTP, HTTPS, TCP and SMTP out of the box) and respond in the way you ask it to. It is transparent to your component and as easy to use as most mocking frameworks. I won’t go into detail about how to configure ‘Imposters’ as their own documentation is excellent, but suffice to say it can be easily configured in a TestFixtureSetUp or similar.

So where does this fit into our pipeline? Personally, I think the flow should be:

The step where Mountebank comes in is obviously ‘integration testing’.

Keep in mind that installing the component and running it on the build agent is probably not a great idea, so make good use of the cloud or Docker or both to spin up a temporary instance which has Mountebank already installed and running. Push your component to it, and run your integration tests. Once your tests have run, the instance can be blown away (or if constantly destroying environments gets a bit slow, maybe have them refresh every night so they don’t get cluttered). Docker will definitely help keep these processes efficient.

This principle of spinning up an isolated test instance can work in all kinds of situations, not just where Mountebank would be used. Calls to SQL Server can be redirected to a .mdf file for data dependent testing. Or DynamoDb tables can be generated specifically scoped to the running test.

What we end up with is the ability to test more behaviours than we can in a shared environment where other people are trying to run their tests at the same time. Without this, our integration tests can get restricted to only very basic ‘check they talk to each other’ style tests which, although they have value, do not cover everything we’d like.