After the last instalment I thought that the code I listed for generating a SQL select query based on a large quantity of values in an ‘IN’ clause could probably be generalised. So I had a bash at that here: Helpful.SQL.

I’m attracted to the proposition of scaling out the consumer, so that many individual requests are made rather than one huge result being returned from the service. It places the complexity with the consumer rather than with the service, which really shouldn’t be bothered about how it’s being used. On the service side, scaling happens with infrastructure, which is a pattern we should be embracing. Making multiple simultaneous requests from the consumer is reasonably straightforward in most languages.

But let’s say our service isn’t deployed somewhere which is easily scalable and that simultaneous requests at a high enough rate to finish in a reasonable time would impact the performance of the service for other consumers. What then?

In this situation, we need to make our service go as fast as possible. One way to do this would be to pull all the data in one huge SQL query and build our response objects from that. It would certainly be about as quick as we can go, but there are some issues with this:

Complexity in embedded SQL strings is hard to manage.

From a service developer’s point of view, SQL is hard to test.

We’re using completely new logic to generate our objects which will need to be tested. In our example scenario (in Large JSON Responses) we already have tested, proven logic for building our objects but it builds one at a time.

Complexity and testability are pretty big issues, but I’m more interested in issue 3: ignoring and duplicating existing logic. APIs in front of legacy databases are often littered with crazy unknowable logic tweaks; “if property A is between 3 and 10 then override property B with some constant, otherwise set property C to the value queried from some other table but just include the last 20 characters” – I’m sure you’ve seen the like, and getting this right the first time around was probably pretty tough going, so do you really want to go through that again?

We could use almost the same code as for our chunked response, but parallelise the querying of each record. Now our service method would look something like this:
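A minimal sketch of what that might look like (the repository, the Customer type and the method names are assumptions):

// Requires System.Collections.Concurrent, System.Collections.Generic, System.Linq and System.Threading.Tasks.
public List<Customer> Get(int birthYear)
{
    var customerIds = _customerRepository.GetCustomerIdsByBirthYear(birthYear);

    // A thread-safe collection, because many threads will be adding customers at once.
    var customers = new ConcurrentBag<Customer>();

    // Parallel.ForEach ramps the degree of parallelism up and down based on the
    // resources available - we don't have to manage any threads ourselves.
    Parallel.ForEach(customerIds, customerId =>
    {
        customers.Add(_customerRepository.Get(customerId));
    });

    // We can't yield from inside the parallel lambda, so the full list is built
    // up and returned in one go.
    return customers.ToList();
}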

Firstly, the loop through the customer IDs is no longer a plain foreach loop; we’ve added a call to Parallel.ForEach. This method of parallelisation is particularly clever in that it gradually increases the degree of parallelism to a level determined by available resources – it’s one of the easiest ways to achieve parallel processing. Secondly, we’re now populating a full list of customers and returning the whole result in one go. This is because it’s simply not possible to yield inside the parallel lambda expression. It also means that responding with a chunked response is pretty redundant and probably adds a bit of extra complexity unnecessarily.

This strategy will only work if all code called by the Get() method is thread safe. Something to be very careful with is the database connection: SqlConnection is not thread safe.

Don’t keep your SqlConnection objects hanging around; new up an object each time you want to query the database unless you need to continue the current transaction. No matter how many SqlConnection objects you create, the number of connections is limited by the server and by what’s configured in the connection string. A new connection will be requested from the pool but will only be retrieved when one is available.
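As a sketch of what I mean, with an illustrative connection string name and query; the important part is creating and disposing the connection per query:

// Requires System.Configuration and System.Data.SqlClient.
public int GetBankDetailsId(int customerId)
{
    var connectionString = ConfigurationManager.ConnectionStrings["Customers"].ConnectionString;

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT BankDetailsId FROM Customer WHERE Id = @id", connection))
    {
        command.Parameters.AddWithValue("@id", customerId);
        connection.Open();

        // Disposing the connection hands it back to the pool; it doesn't close
        // the underlying physical connection.
        return (int)command.ExecuteScalar();
    }
}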

So now we have an n+1 scenario where we’re querying the database possibly thousands of times to build our response. Even though we may be making these queries on several threads and the processing time might be acceptable, all the complexity now sits in our service, so we can take advantage of the direct relationship with the database to make this even quicker.

Let’s say our Get() method needs to make 4 separate SQL queries to build a Customer record, each taking one integer value as an ID. It might look something like this:
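Something along these lines, where the repository and property names are assumptions; what matters is that each repository takes a single integer ID and issues its own query:

public Customer Get(int customerId)
{
    // Query 1: the core customer record.
    var customer = _customerRepository.Get(customerId);

    // Queries 2-4: related data looked up by the IDs held on the customer record.
    customer.Address = _addressRepository.Get(customer.AddressId);
    customer.BankDetails = _bankDetailsRepository.Get(customer.BankDetailsId);
    customer.ContactHistory = _contactHistoryRepository.Get(customer.ContactHistoryId);

    return customer;
}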

To stop each of these .Get() methods hitting the database we can cache the data up front, one SQL query per repository class. This preserves our logic but presents a problem – assuming we are using Microsoft SQL Server, there is a practical limit to the number of items we can add into an ‘IN’ clause, so we can’t just stick thousands of customer IDs in there (https://docs.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql). If we can select by multiple IDs, then we can turn our n+1 scenario into just 5 queries.

It turns out that we can specify thousands of IDs in an ‘IN’ clause with a sub-query. So our problem shifts to how to create a temporary table with all our customer IDs in it to use in our sub-query. Unless you’re using a very old version of SQL Server, you can have multiple rows in a basic ‘INSERT’ statement. For example:

INSERT INTO #TempCustomerIDs (ID)
VALUES
(1),
(2),
(3),
(4),
(5),
(6)

This will result in 6 rows in the table, with the values 1 through 6 in the ID column. However, we will once again hit a limit – it’s only possible to insert 1000 rows this way with each INSERT statement.

Fortunately, we’re working one level above raw SQL, and we can work our way around this limitation. An example is in the code below.
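Something like the following sketch, in which CustomerQuery.SelectBase, ToPagedList(), MapCustomer() and the Cache dictionary are assumptions explained below:

// Requires System.Collections.Generic, System.Data.SqlClient, System.Linq and System.Text.
public void LoadCache(IEnumerable<int> customerIds)
{
    // Build the INSERT statements in pages comfortably below the 1000-row limit.
    var insertCustomerIdsQuery = new StringBuilder();
    foreach (var page in customerIds.ToPagedList(500))
    {
        insertCustomerIdsQuery.AppendLine(
            "INSERT INTO #TempCustomerIDs (ID) VALUES " + string.Join(",", page.Select(id => "(" + id + ")")) + ";");
    }

    var sql = @"
IF OBJECT_ID('tempdb..#TempCustomerIDs') IS NULL
    CREATE TABLE #TempCustomerIDs (ID INT NOT NULL);
" + insertCustomerIdsQuery + @"
" + CustomerQuery.SelectBase + @"
WHERE CustomerId IN (SELECT ID FROM #TempCustomerIDs);
DROP TABLE #TempCustomerIDs;";

    using (var connection = new SqlConnection(_connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                var customer = MapCustomer(reader);
                Cache[customer.Id] = customer; // Cache is a Dictionary<int, Customer>.
            }
        }
    }
}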

ToPagedList() is an extension method that breaks a list down into multiple lists, each containing at most the number of items passed in. So .ToPagedList(500) will break a list into multiple lists of up to 500 items each. The idea is to use a number which is less than the 1000 row limit for inserts. You could achieve the same thing in different ways.
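One possible implementation of that extension method, purely for illustration:

// Requires System.Collections.Generic and System.Linq.
public static class EnumerableExtensions
{
    public static List<List<T>> ToPagedList<T>(this IEnumerable<T> source, int pageSize)
    {
        return source
            .Select((item, index) => new { item, index })
            .GroupBy(x => x.index / pageSize)
            .Select(g => g.Select(x => x.item).ToList())
            .ToList();
    }
}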

The string insertCustomerIdsQuery is the result of concatenating all the insert statements together.

CustomerQuery.SelectBase is the select statement that would have had the ‘select by id’ predicate, with that predicate removed.

The main SQL statement first checks whether the temp table exists, and then creates it if it doesn’t. We then insert all the IDs into that table. Then we select all matching records where the IDs are in the temp table, and finally delete the temp table.

Cache is a simple dictionary of customers by ID.

Using this method, each repository can have the data we expect to be present loaded into it before the request is made. It’s far more efficient to load these thousands of records in one go rather than making thousands of individual queries.

In our example, we are retrieving addresses and bank details by the IDs retrieved on the Customer objects. To support this, we need to retrieve the bank detail IDs and address IDs from the cache of Customers before loading those caches. Then all subsequent logic will run, but blindingly fast, seeing as it’s only accessing memory and not having to make calls to the database.

Summing Up

The strategy for the fastest response is probably to hit the database with one big query, but there are downsides to doing this. Specifically, we don’t want to have lots of logic in a SQL query, and we’d like to re-use the code we’ve already written and tested for building individual records.

Loading all the IDs from the database and iterating through the existing code one record at a time would work fine for small result sets where performance isn’t an issue, but if we’re expecting thousands of records and we want it to run in a few minutes then it’s not enough.

Caching the data using a few SQL queries is far more efficient and means we can re-use any logic easily. Most of the SQL can even be refactored out of the existing queries.

Running things asynchronously will speed things up even more. If you’re careful with your use of database connections, the largest improvement can probably be found by running the queries in parallel, as these will be your longest-running processes.

I recently published a post about how to stream large JSON payloads from a web service using a chunked response; before reading this post it’s probably best to read that one first. Streaming is a fantastic method of sending large amounts of data with only a small memory overhead on the server, but for JSON data there could well be a better way. First of all, let’s think about the strategy for generating the JSON stream which was discussed in the earlier post:

Query the IDs of all records that should be included in the response.

Loop through the IDs and use the code which backs the ‘get by ID’ endpoint to generate each JSON payload for the stream.

Return each JSON payload one at a time by yielding from an enumerator.

Seems straightforward enough, and I’ve seen a service sit there for nearly two hours ticking away returning objects. But is that really a good thing?

Let’s list some of the things which might go wrong.

Network interruption.

In these days of DevOps, someone re-deploying the service while it’s ‘mid stream’.

A crash caused by a different endpoint, triggering a service restart.

An exception in the consuming app killing the instance of whatever client is processing the stream.

These things might not seem too likely but the longer it takes to send the full stream, the greater the chance that something like this will happen.

Given a failure in a 2 hour response, how does the client recover? It isn’t going to be a quick recovery, that’s for sure. Even if the client keeps track of each payload in the stream, in order to get back to where the problem occurred and continue processing, the client has to sit through the entire response all over again!

Another Way

Remember that nifty bit of processing our streaming service is doing in order to loop through all the records it needs to send? If we move that to the consumer, then it can request each record in whatever order it likes, as many times as it needs to.

The consumer requests the collection of IDs for all records it needs in a single request.

The consumer saves this collection, so even if it gets turned off, re-deployed, or if anything else happens, it still knows what it needs to do.

The consumer requests each record from the service by ID, keeping track of which records it has already processed.

This strategy actually doesn’t require any more code than streaming; in fact, given that you don’t have to set up a handler to set the chunked encoding property on the response, it’s actually less code. Not only that, but because there will now be many discrete requests made via the HTTP protocol, these can be load balanced and shared between as many instances of our service as is necessary. The consumer could even spin up multiple processes and make parallel requests, getting through the data transfer more quickly than if it had to sit there and accept each record one at a time from a single instance.

We can even go one step further. Our client is going to have to retrieve a stack of IDs which it will use to request each record in turn. Well, it’s not that difficult to give it not just a collection of IDs but a collection of URLs. It’s not a step that everyone wants to take, but it has a certain cleanliness to it.
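As a rough sketch of the consumer side, where the endpoint paths, the Customer type, the Process() method and the IProgressStore used to remember progress are all assumptions:

// Requires System.Collections.Generic, System.Net.Http and Newtonsoft.Json.
public void SyncCustomers(HttpClient client, IProgressStore progressStore)
{
    // One small request for the full list of IDs, saved somewhere durable so a
    // restart or redeploy doesn't mean starting again from scratch.
    var idsJson = client.GetStringAsync("customers/ids").Result;
    progressStore.SaveOutstanding(JsonConvert.DeserializeObject<List<int>>(idsJson));

    foreach (var id in progressStore.GetOutstanding())
    {
        var customerJson = client.GetStringAsync("customers/" + id).Result;
        Process(JsonConvert.DeserializeObject<Customer>(customerJson));

        // Record progress per record, so a failure only costs the request in flight.
        progressStore.MarkProcessed(id);
    }
}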

And So

If you’re faced with a requirement for a large JSON response, unless you’re running your applications on the most stable tech stack in the world and writing the most reliable, bug-free code in the world, you could probably build something much better by throwing out the idea of returning a single huge response, even if streamed, in favour of multiple single requests for each record individually.

The long slog from a 15 year old legacy monolith system to an agile, microservice based system will almost inevitably include throwing some APIs in front of a big old database. Building a cleaner view of the domain allows for some cleaner lines to be drawn between concerns, each with their own service. But inside those services there’s usually a set of ridiculous SQL queries building the nice clean models being exposed. These ugly SQL queries add a bit of time to the responses and can lead to a bit of complexity, but this is the real world; often we can only do so much.

So there you are after a few months of work with a handful of services deployed to an ‘on premises’ production server. Most responses are pretty quick, never more than a couple of seconds. But now you’ve been asked to build a tool for generating an Excel document with several thousand rows in it. To get the data, a web app will make an HTTP request to the on-prem API. So far so good. But once you’ve written the API endpoint and requested some realistic datasets through it, you realise the response takes over half an hour. What’s more, while that response is being built, the API server runs pretty hot. If more than a couple of users request a new Excel doc at the same time then everything slows down.

Large responses from API calls are not always avoidable, but there are a couple of things we can do to lessen the impact they have on resources.

Chunked Response

Firstly, let’s send the response a bit at a time. In .NET Web Api, this is pretty straightforward to implement. We start with a simple HttpMessageHandler:
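Something like this sketch, whose only job is to flag the response as chunked (the class name is mine):

// Requires System.Net.Http, System.Threading and System.Threading.Tasks.
public class ChunkedResponseHandler : DelegatingHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = await base.SendAsync(request, cancellationToken);

        // Mark the response as chunked so it's streamed to the client rather
        // than buffered and sent with a Content-Length header.
        response.Headers.TransferEncodingChunked = true;
        return response;
    }
}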

Now we need to associate this handler with the controller action which is returning our large response. We can’t do that with attribute based routing, but we can be very specific with a routing template.
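The registration might look something like this, with illustrative route and controller names; note the handler being passed as the last parameter and given an HttpControllerDispatcher as its inner handler:

// Inside WebApiConfig.Register(HttpConfiguration config).
// Requires System.Net.Http, System.Web.Http, System.Web.Http.Dispatcher and System.Web.Http.Routing.
config.Routes.MapHttpRoute(
    name: "CustomersByBirthYear",
    routeTemplate: "customers",
    defaults: new { controller = "Customers", action = "GetByBirthYear" },
    constraints: new { httpMethod = new HttpMethodConstraint(HttpMethod.Get) },
    handler: new ChunkedResponseHandler { InnerHandler = new HttpControllerDispatcher(config) });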

In this example, the URL ‘customers’ points specifically at CustomersController.GetByBirthYear() and will only accept GET requests. The handler is assigned as the last parameter passed to MapHttpRoute().

The slightly tricky part comes when writing the controller action. Returning everything in chunks won’t help if you wait until you’ve loaded the entire response into memory before sending it. Also, streaming results isn’t something that many database systems natively support. So you need to be a bit creative about how you get data for your response.

Let’s assume you’re querying a database, and that your endpoint is returning a collection of resources which you already have a pretty ugly SQL query for retrieving by ID. The scenario is not as contrived as you might think. Dynamically modifying the ‘where’ clause of the ‘select by ID’ query and making it return all the results you want would probably give the fastest response time. It’s a valid approach, but if you know you’re going to have a lot of results then you’re risking memory issues which can impact other processes, and you’re likely to end up with some mashing of SQL strings to share the bulk of the select statement and add different predicates, which isn’t easily testable.

The approach I’m outlining here is best achieved by breaking the processing into two steps. First, query for the IDs of the entities you’re going to return. Second, use your ‘select by ID’ code to retrieve them one at a time, returning them in an enumerator rather than a fully realised collection type. Let’s have a look at what a repository method might look like for this.
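A sketch of the shape of it, where the Customer type, the ID query and the property used to carry a load failure are assumptions:

// Requires System and System.Collections.Generic.
public IEnumerable<Customer> GetByBirthYear(int birthYear)
{
    // Step one: a cheap query that returns only the IDs of the customers we need.
    IEnumerable<int> customerIds = GetCustomerIdsByBirthYear(birthYear);

    foreach (var customerId in customerIds)
    {
        Customer customer;
        try
        {
            // Step two: reuse the existing, tested 'get by ID' logic, one record at a time.
            customer = Get(customerId);
        }
        catch (Exception ex)
        {
            // Don't kill the whole response because one record failed, but let the
            // caller know something went wrong with this one.
            customer = new Customer { Id = customerId, LoadException = ex };
        }

        yield return customer;
    }
}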

Each customer is loaded from the same Get(int customerId) method that is used to return customers by ID.

We don’t want to terminate the whole process just because one customer couldn’t be loaded. Equally, we need to do something to let the caller know there might be some missing data. In this example we simply return an empty customer record with the exception that was thrown while loading. You might not want to do this if your API is public, as you’re leaking internal details, but for this example let’s not worry.

The controller action which exposes this functionality might look a bit like this:
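Along these lines (again, the names are illustrative):

public class CustomersController : ApiController
{
    private readonly ICustomerRepository _customerRepository;

    public CustomersController(ICustomerRepository customerRepository)
    {
        _customerRepository = customerRepository;
    }

    // GET customers?birthYear=1970
    public IEnumerable<Customer> GetByBirthYear(int birthYear)
    {
        // Nothing here forces the sequence into a list - each customer is passed
        // on as soon as the repository yields it.
        foreach (var customer in _customerRepository.GetByBirthYear(birthYear))
        {
            yield return customer;
        }
    }
}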

There’s no attribute based routing in use here. Because we need to assign our HttpHandler to the action, we have to use convention based routing.

At no point are the results loaded into a collection of any kind. We retrieve an enumerator and return one result at a time until there are no more results.

JSON Stream

Using this mechanism is enough to return the response in a chunked manner and start streaming the response as soon as there’s a single result ready to return. But there’s still one more piece to the puzzle for our service. Depending on what language the calling client is written in, it can either be straightforward to consume the JSON response as we have here or it can be easier to consume what’s become known as a JSON stream. For a DotNet consumer, sending our stream as a comma delimited array is sufficient. If we’re expecting calls from a Ruby client then we should definitely consider converting our response to a JSON stream.

For our customers response, we might send a response which looks like this (but hopefully much bigger):
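Something along these lines, with made-up field names:

{"id":1,"name":"First Customer","addressId":10,"bankDetailsId":100}
{"id":2,"name":"Second Customer","addressId":11,"bankDetailsId":101}
{"id":3,"name":"Third Customer","addressId":12,"bankDetailsId":102}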

This response is in a format called Line-Delimited JSON (LDJSON). There’s no opening square bracket to say this is a collection, because it isn’t a collection. This is a stream of individual records which can be processed without having to wait for the entire response to be evaluated. Which makes a lot of sense; just as we don’t want to have to load the entire response on the server, we also don’t want to load the entire response on the client.

A chunked response is something that most HTTP client packages will handle transparently. Unless the client application is coded specifically to receive each parsed object in each chunk, then there’s no difference on the client side to receiving an unchunked response. LDJSON breaks this flexibility, because the response is not valid JSON – one client will consume it easily, but another would struggle. At the time of writing, DotNet wants only standard JSON whereas it’s probably easier in Ruby to process LDJSON. That’s not to say it’s impossible to consume the collection in Ruby or LDJSON in DotNet, it just requires a little more effort for no real reward. To allow different clients to still consume the endpoint, we can add a MediaTypeFormatter specifically for the ‘application/x-json-stream’ media type (this isn’t an official media type, but it has become widely used). So any consumer can either expect JSON or an LDJSON stream.
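A sketch of such a formatter; the details are assumptions but the shape matches the description below:

// Requires System, System.Collections, System.IO, System.Net, System.Net.Http,
// System.Net.Http.Formatting, System.Net.Http.Headers, System.Text,
// System.Threading.Tasks and Newtonsoft.Json.
public class JsonStreamFormatter : MediaTypeFormatter
{
    public JsonStreamFormatter()
    {
        SupportedMediaTypes.Add(new MediaTypeHeaderValue("application/x-json-stream"));
    }

    public override bool CanReadType(Type type)
    {
        return false;
    }

    public override bool CanWriteType(Type type)
    {
        return typeof(IEnumerable).IsAssignableFrom(type);
    }

    public override async Task WriteToStreamAsync(Type type, object value, Stream writeStream,
        HttpContent content, TransportContext transportContext)
    {
        var writer = new StreamWriter(writeStream, new UTF8Encoding(false));
        foreach (var item in (IEnumerable)value)
        {
            // One JSON document per line - line-delimited JSON.
            await writer.WriteLineAsync(JsonConvert.SerializeObject(item));
            await writer.FlushAsync();
        }
    }
}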

This formatter only works for types implementing IEnumerable, and uses Newtonsoft’s JsonConvert object to serialise each object in turn before pushing it into the response stream. Enable the formatter by adding it to the Formatters collection:

config.Formatters.Add(new JsonStreamFormatter());

DotNet Consumer

Let’s take a look at a DotNet consumer coded to expect a normal JSON array of objects delivered in a chunked response.
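A sketch of that consumer, where the Customer type and URL are assumptions:

// Requires System.Collections.Generic, System.IO, System.Net.Http,
// System.Net.Http.Headers and Newtonsoft.Json.
public IEnumerable<Customer> GetCustomers(string url)
{
    var client = new HttpClient();
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

    // Work with the response as a stream rather than buffering the whole body.
    using (var stream = client.GetStreamAsync(url).Result)
    using (var streamReader = new StreamReader(stream))
    using (var jReader = new JsonTextReader(streamReader))
    {
        var serializer = new JsonSerializer();
        while (jReader.Read())
        {
            // Ignore the array's opening and closing square brackets; deserialize
            // each object as soon as it has fully arrived.
            if (jReader.TokenType == JsonToken.StartObject)
            {
                yield return serializer.Deserialize<Customer>(jReader);
            }
        }
    }
}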

Here we’re using the HttpClient class and we’re requesting a response as “application/json”. Instead of using a string version of the content, we’re working with a stream. The really cool part is that we don’t have to do much more than just throw that stream into a JsonTextReader (part of Newtonsoft.Json). We can yield the response of each JSON token as long as we ignore the first and last tokens, which are the opening and closing square brackets of the JSON array. Calling jReader.Read() at the next level reads the whole content of that token, which is one full object in the JSON response.

This method allows each object to be returned for processing while the stream is still being received. The client can save on memory usage just as well as the service.

Ruby Consumer

I have only a year or so experience with Ruby on Rails, but I’ve found it to be an incredibly quick way to build services. In my opinion, there’s a trade off when considering speed of development vs complexity – because the language is dynamic the developer has to write more tests which can add quite an overhead as a service grows.

To consume our service from a Ruby client, we might write some code such as this:
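A sketch, assuming the URL is supplied by the caller and that ActiveSupport is available for with_indifferent_access:

require 'httparty'
require 'yajl'

def each_customer(url, &block)
  parser = Yajl::Parser.new

  # Called every time the parser has accumulated one complete JSON object.
  parser.on_parse_complete = lambda do |customer|
    block.call(customer.with_indifferent_access)
  end

  # stream_body passes each chunk to the block as it arrives; the Accept header
  # asks the service for the LDJSON version of the response.
  HTTParty.get(url,
               stream_body: true,
               headers: { 'Accept' => 'application/x-json-stream' }) do |chunk|
    parser << chunk
  end
end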

Here we’re using HTTParty to manage the request, passing ‘stream_body: true’ in the options and setting the ‘Accept’ header to ‘application/x-json-stream’. The option tells HTTParty that the response will be chunked; the header tells our service to respond with LDJSON.

From the HTTParty.get block, we see that each chunk is being passed to a JSON parser Yajl::Parser, which understands LDJSON. Each chunk may contain a full JSON object, or several, or partial objects. The parser will recognise when it has enough JSON for a full object to be deserialized and it will send it to the method assigned to parser.on_parse_complete where we’re simply returning the object as a hash with indifferent access.

The Result

Returning responses in a chunked fashion is more than just a neat trick: the amount of memory used by a service returning data in this fashion, compared to loading the entire result set into memory before responding, is tiny. This means more consumers can request these large result sets and other processes on the server are not impacted.

From my own experience, the Yajl library seems to become unstable after handling a response which streams for more than half an hour or so. I haven’t seen anyone else having the same issue, but on one project I’ve ended up removing Yajl and just receiving the entire response with HTTParty and parsing the collection fully in memory. It isn’t ideal, but it works. It also doesn’t stop the service from streaming the response, it’s just the client that waits for the response to load completely before parsing it.

It’s a nice strategy to understand and is useful in the right place, but in an upcoming post I’ll be explaining why I think it’s often better to avoid large JSON responses altogether and giving my preferred strategy and reasoning.

A friend tweeted recently about how it isn’t always possible to decide late on which product to use for data storage as different products often force an application to use different patterns. This got me thinking about making other decisions in software design. In general it’s accepted that deciding as late as possible is usually a good thing, but I think people often misinterpret ‘as late as possible’ to mean ‘make it an after-thought’.

‘As late as possible’ is a wonderfully subjective term. It suggests that although we want to wait longer before making a decision, we might not be able to. We might need to make a decision to support the design of the rest of the system. Or perhaps in some cases, making the decision early might be more important than making the ‘perfect’ decision.

I started thinking about how to decide whether a decision should be put off and I was reminded of the work of Roy Osherove. He suggests that development teams transition through different states and should be managed differently in each state. There is a similar methodology which relates the approach to software design to different categories of problem space. It’s called Cynefin (pronounced like ‘Kevin’ but with an ‘n’ straight after the ‘K’).

To quote Wikipedia:

The framework provides a typology of contexts that guides what sort of explanations or solutions might apply.

There’s a diagram that goes with this and helps give some context (thanks Wikipedia):

I don’t want to do a deep dive into Cynefin in this article (maybe another day) but to summarise:

Obvious – these solutions are easy to see, easy to build, and probably available off the peg. They’re very low complexity and shouldn’t require a lot of attention from subject matter experts.

Complicated – these solutions are easier to get wrong. Maybe no-one on the team has implemented anything similar before, but experience and direction is available possibly from another team.

Complex – these solutions have lots of possible answers but it’s unclear which will be best. While this is new to your company, other companies have managed to build something similar so you know it is possible.

Chaotic – these solutions are totally new and you have no point of reference as to what would be the best way to implement. You may not even know if it’s possible.

In general, an enterprise’s core domain will sit in both Complicated and Complex. Chaotic might be something you do for a while, but the focus is to move the solution back into one of the other categories.

So what does this have to do with making decisions?

Well, I suggest that the decision making process might change depending on what category your solution falls into.

Obvious – obvious solutions probably don’t have many difficult decisions to make. This is not your core domain (unless you work in a really boring sector) so throwing money and resources at obvious solutions is not sensible. The driver for choosing tech stack might well be just “what’s already available” and might be a constraint right up front. You may want to buy an off the shelf product, in which case a lot of decisions are already made for you. If SQL Server is the quickest path to delivery then it might well be the right thing to use here, even if you’re longing to try a NoSQL approach.

Complicated – complicated solutions are often solved by taking advice from an expert. “We found a separate read concern solved half our problems.” and “A relational database just doesn’t work for this.” are both great nuggets of advice someone who’s done this before might put forward. These solutions are in your core domain, you want to avoid code rot and inflexible architectures – deciding late seems generally sensible, but the advice from your experts can help scope those decisions. Focus on finding the abstractions on which to base the solution. You might know that you’ll need the elasticity that only the cloud can provide, but you might leave the decision on which provider until as late as possible.

Complex – complex solutions are where experts are harder to get involved. They might be in different teams or hired from a consultancy. The focus should still be on finding the right abstractions to allow critical decisions to be delayed. Running multiple possible solutions in parallel to see what works best is a great approach which will give your team confidence in the chosen option. A subject matter expert might be more useful in explaining how they approached defining a solution rather than just what the solution was.

Chaotic – it might seem a terrible idea to try and make decisions in this situation but there are advantages. Chaotic can become Complex if you can find an anchor for the solution. “How do we solve this with ‘x’?” is a lot easier for a team to decide than a more general approach. You’ll almost certainly want to run with two or three possible options in parallel. Keep in mind that whatever decision you make may well eventually be proved incorrect.

I think this shows how the approach to decision making can be affected by what category of solution you’re working on. By picking the right strategy for the right kind of problem, you can focus resources more cost effectively.

Recently I’ve been looking into Specification by Example, which people keep defining to me as BDD done the right way. Specification by Example fully implemented includes the idea of an executable specification. A concept that has led me back to FitNesse having given it the cold shoulder for the last six or seven years.

I’ve always thought of FitNesse as a great idea but I struggled to see how to use it correctly when as a developer I was mainly focused on continuous delivery and DevOps. I didn’t see where in the development cycle it fit or how tests in a wiki could also be a part of a CD pipeline. Revisiting FitNesse with a focus on Specification by Example gave me the opportunity to work some of this out and I think I’m very likely to suggest using this tool in future projects.

Before it’s possible to talk about FitNesse in a CI environment, there are a few basics to master. I don’t want to go into a full breakdown of all functionality, but I do want to communicate enough detail to allow someone to get started. In this post I’ll concentrate on getting FitNesse working locally and using it to execute different classes of test. In a later post, I’ll cover how to make FitNesse work with CI/CD pipelines and introduce a more enterprise level approach.

This post will focus on using FitNesse with the Slim test runner against .NET code but should be relevant for a wider audience.

Installing FitNesse

Installing and running FitNesse locally is really easy and even if you’re intending to deploy to a server, getting things running locally is still important. Follow these steps:

Create a batch file to run FitNesse when you need it. My command looks like:

java -jar .\fitnesse-standalone.jar -p 9080

Once you’re running the FitNesse process, hit http://localhost:9080 (use whichever port you started it on) and you’ll find a lot of material to help you get to grips with things, including a full set of acceptance tests for FitNesse itself.

What FitNesse Isn’t

Before I get into talking about how I think FitNesse can be of use in agile development, I’d like to point out what it isn’t useful for.

Tracking Work

FitNesse is not a work tracking tool; it won’t replace a dedicated ticketing system such as Jira, Pivotal Tracker or TFS. Although it may be possible to see what has not yet been built by seeing what tests fail, that work cannot be assigned to an individual or team within FitNesse.

Unit Testing

FitNesse can definitely be used for unit testing and some tests executed from FitNesse will be unit tests, so this might seem a bit contradictory. When I say FitNesse isn’t for unit testing, I mean that it isn’t what a developer should be using for many unit tests. A lot of unit testing is focused on a much smaller unit than I would suggest FitNesse should be concerned with.
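For example, a made-up test like this (the User class is purely illustrative):

// Requires NUnit.Framework.
[Test]
public void Constructor_TrimsWhitespaceFromUsername()
{
    var user = new User(" jsmith ");

    Assert.AreEqual("jsmith", user.Username);
}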

This test should never be executed from FitNesse. It’s a completely valid test but other than a developer, who cares? Tests like this could well be written in a TDD fashion right along with the code they’re testing. Introducing a huge layer of abstraction such as FitNesse would simply kill the developer’s flow. In any case, what would the wiki page look like?

Deploying Code

FitNesse will execute code and code can do anything you want it to, including deploying other code. It is entirely possible to create a wiki page around each thing you want to deploy and have the test outcomes driven by successful deployments. But really, do you want to do that? Download Go or Ansible – they’re far better at this.

What FitNesse Is

OK, so now we’ve covered a few things that FitNesse is definitely not, let’s talk about how I think it can most usefully fit into the agile development process.

FitNesse closes the gap between specification and executable tests, creating an executable specification. This specification will live through the whole development process and into continued support processes. Let’s start with creating a specification.

Creating a Specification

A good FitNesse based specification should be all about behaviour. Break down the system under test into chunks of behaviour in exactly the same way as you would when writing stories for a SCRUM team. The behaviours that you identify should be added to a FitNesse wiki. Nesting pages for relating behaviours makes a lot of sense. If you’re defining how a service endpoint behaves during various exception states then a parent page of ‘Exception States’ with one child page per state could make some sense but you aren’t forced to adhere to that structure and what works for you could be completely different.

Giving a definition in prose is great for communicating a generalisation of what you’re wanting to define and gives context for someone reading the page. What prose doesn’t do well is define specific examples – this is where specification by example can be useful. Work as a team (not just ‘3 amigos’) to generate sufficient specific examples of input and output to cover the complete behaviour you’re trying to define. The more people you have involved, the less likely you are to miss anything. Create a decision table in the wiki page for these examples. You now have the framework for your executable tests.

Different Types of Test

In any development project, there are a number of different levels of testing. Different people refer to these differently, but in general we have:

Unit Tests – testing a small unit of code ‘in process’ with the test runner

Component Tests – testing a running piece of software in isolation

Integration Tests – testing a running piece of software with other software in a live-like environment

UI Tests – testing how a software’s user interface behaves, can be either a component test or integration test

Load Tests – testing how a running piece of software responds under load, usually an integration test

Manual tests are by their nature not automated, so FitNesse is probably not the right tool to drive these. Also FitNesse does not natively support UI tests, so I won’t go into those here. Load tests are important but may require resources that would become expensive if run continuously, so although these might be triggered from a FitNesse page, they would probably be classified in such a way so they aren’t run constantly. Perhaps using a top level wiki page ‘Behaviour Under Load’.

In any case, load tests are a type of integration test, so we’re left with three different types of test we could automate from FitNesse. So which type of test should be used for which behaviour?

The test pyramid was conceived by Mike Cohn and has become very familiar to most software engineers. There are a few different examples floating around on the internet, this is one:

This diagram shows that ideally there should be more unit tests than component tests, and more integration tests than system tests etc. This assertion comes from trying to keep things simple (which is a great principle to follow); some tests are really easy to write and to run whereas some tests take a long time to execute or even require someone to manually interact with the system. The preference should always be to test behaviour in the simplest way that proves success, but when we come to convince ourselves of a specific piece of behaviour, a single class of test may not suffice. There could be extensive unit test coverage for a class, but unless our tests also prove that class is being called, we still don’t have our proof. So we’re left with the possibility that any given behaviour could require a mix of unit tests, component tests and integration tests to prove things are working.

Hooking up the Code

So, how do we get each of our different classes of tests to run from FitNesse? Our three primary types of test are unit, component and integration. Let’s look at unit tests first.

Unit Tests

A unit test executes directly against application code, in process with the test itself. External dependencies to the ‘unit’ are mocked using dependency injection and mocking frameworks. A unit test depends only on the code under test, calls will never be made to any external resources such as databases or file systems.

Let’s assume we’re building a support ticketing system. We might have some code which adds a comment onto a ticket.
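Something like this sketch, where the names are assumptions; it gives the fixtures below something to call:

// Requires System and System.Collections.Generic.
public class Comment
{
    public string Text { get; set; }
    public string Username { get; set; }
    public DateTime CommentedOn { get; set; }
}

public class CommentManager
{
    private readonly List<Comment> _comments = new List<Comment>();

    public void AddComment(string text, string username, DateTime commentedOn)
    {
        _comments.Add(new Comment { Text = text, Username = username, CommentedOn = commentedOn });
    }

    public int CommentCount()
    {
        return _comments.Count;
    }
}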

The decision table for this behaviour will look for a class called DoCommentsGetAdded, set the properties Comment, Username and CommentedOn and call the method CommentWasAdded() from which it expects a boolean. There is only one test line in this test which tests the happy path, but others can be added. The idea should be to add enough examples to fully define the behaviour but not so many that people can’t see the wood for the trees.

We obviously have to create the class DoCommentsGetAdded and allow it to be called from FitNesse. I added Ticketing.CommentManager to a solution called FitSharpTest; I’m now going to add another class library to that solution called Ticketing.Test.Unit. I’ll add a single class called DoCommentsGetAdded.
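The fixture might look something like this, with FitNesse setting the three properties and then calling CommentWasAdded() and expecting a boolean back:

// Requires System and a reference to the assembly containing CommentManager.
public class DoCommentsGetAdded
{
    public string Comment { get; set; }
    public string Username { get; set; }
    public DateTime CommentedOn { get; set; }

    public bool CommentWasAdded()
    {
        var manager = new CommentManager();
        var countBefore = manager.CommentCount();

        manager.AddComment(Comment, Username, CommentedOn);

        return manager.CommentCount() == countBefore + 1;
    }
}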

Now make sure your solution is built (if you have referenced the debug assemblies then make sure you build in debug) save your changes in FitNesse and hit the ‘Test’ button at the top of the page.

There are of course lots of different types of tables you can use to orchestrate different types of tests. The full list is beyond the scope of this post but if you dip into the documentation under ‘Slim’ then you’ll find it quite easily.

The mechanism always remains the same, however – the diagram below outlines the general pattern.

This is not really any different to what you would see if you were to swap Slim for NUnit. You have the system driving the test (Slim or NUnit), the test assembly and the assembly under test. The difference here is that rather than executing the test within your IDE using ReSharper (or another test runner), you’re executing the test from a wiki page. This is a trade off, but like I said at the beginning: not all tests should be in FitNesse, we’re only interested in behaviour which a BA might specify. There will no doubt be other unit tests executed in the usual manner.

Integration Tests

Integration tests run against a shared environment which contains running copies of all deployables and their dependencies. This environment is very like production but generally not geared up for high load or the same level of resilience.

The sequence diagram for unit tests is actually still relevant for integration tests. All we’re doing for the integration tests is making the test assembly call running endpoints instead of calling classes ‘in process’.

Let’s set up a running service for our ‘add comment’ logic and test it from FitNesse. For fun, let’s make this a Web API service running in OWIN and hosted with TopShelf. Follow these steps to get running:

Add a new command line application to your solution. Call it TicketingService (I’m using .NET 4.6.1 for this).
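A minimal sketch of the remaining wiring, assuming the Microsoft.AspNet.WebApi.OwinSelfHost and Topshelf NuGet packages; the class names and port are mine:

using System;
using System.Web.Http;
using Microsoft.Owin.Hosting;
using Owin;
using Topshelf;

public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        var config = new HttpConfiguration();
        config.MapHttpAttributeRoutes();
        app.UseWebApi(config);
    }
}

public class TicketServiceRunner
{
    private IDisposable _webApp;

    public void Start()
    {
        _webApp = WebApp.Start<Startup>("http://localhost:8080");
    }

    public void Stop()
    {
        if (_webApp != null) _webApp.Dispose();
    }
}

public class Program
{
    public static void Main()
    {
        HostFactory.Run(host =>
        {
            host.Service<TicketServiceRunner>(s =>
            {
                s.ConstructUsing(settings => new TicketServiceRunner());
                s.WhenStarted(service => service.Start());
                s.WhenStopped(service => service.Stop());
            });
            host.RunAsLocalSystem();
            host.SetServiceName("TicketingService");
        });
    }
}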

Right, run this service and you’ll see that you can add as many comments as you like and retrieve the count.

Going back to our sequence diagram, we have a page in FitNesse already, and now we have an assembly to test. What we need to do is create a test assembly to sit in the middle. I’m adding Ticketing.Test.Integration as a class library in my solution.

It’s important to be aware of what version of .NET FitSharp is built on. The version I’m using is built against .NET 4, so my Ticketing.Test.Integration will be a .NET 4 project.

I’ve added a class to the new project called DoCommentsGetAdded which looks like this:
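A sketch of what such a class could contain, assuming hypothetical ‘addcomment’ and ‘commentcount’ endpoints and a port of 8080:

// Requires System and System.Net; WebClient keeps this compatible with .NET 4.
public class DoCommentsGetAdded
{
    private const string BaseUrl = "http://localhost:8080/";

    public string Comment { get; set; }
    public string Username { get; set; }
    public DateTime CommentedOn { get; set; }

    public bool CommentWasAdded()
    {
        using (var client = new WebClient())
        {
            var countBefore = int.Parse(client.DownloadString(BaseUrl + "commentcount"));

            client.Headers[HttpRequestHeader.ContentType] = "application/json";
            client.UploadString(BaseUrl + "addcomment",
                "{\"comment\":\"" + Comment + "\",\"username\":\"" + Username + "\"}");

            var countAfter = int.Parse(client.DownloadString(BaseUrl + "commentcount"));
            return countAfter == countBefore + 1;
        }
    }
}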

Now, make sure the service is running and hit the ‘Test’ button in FitNesse.

Obviously, this test was configured to execute against our localhost, but it could just as easily have been configured to execute against a deployed environment. Also, the code in the test class isn’t exactly what I would call first class, this is just to get stuff working.

The important thing to take from this is that the pattern in the original sequence diagram still holds true, so it’s quite possible to test this behaviour as either a unit test or as an integration test, or even both.

Component Tests

A component test is like an integration test in that it requires your code to be running in order to test. Instead of your code executing in an environment with other real life services, it’s tested against stubs which can be configured to respond with specific scenarios for specific tests, something which is very difficult to do in an integration environment where multiple sets of tests may be running simultaneously.

I recently wrote a short post about using Mountebank for component testing. The same solution can be applied here.

There are so many things wrong with this code, but remember this is just for an example. The functioning of this controller depends on an HTTP endpoint which is not part of this solution. It cannot be tested by an integration test without that endpoint being available. If we want to test this as a component test, then we need something to pretend to be that endpoint at http://localhost:9999/ticketservice.

To do this, install Mountebank by following these instructions:

Make sure you have an up to date version of node.js installed by downloading it from here.
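Then Mountebank itself is a single npm package; installing and starting it (it listens on port 2525 by default) looks like this:

npm install -g mountebank
mb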

Our heavily hacked TicketController class will function correctly only if this imposter returns 201, anything else and it will fail.

Now, I’m going to list the code I used to get this component test executing from FitNesse from a Script Table. I’m very certain that this code is not best practice – I’m trying to show the technicality of making the service run against Mountebank in order to make the test pass.

I’ve added a new project to my solution called Ticketing.Test.Component and I updated the config.xml file with the correct namespace. I have two files in that solution, one is called imposters.js which contains the json payload for configuring Mountebank, the other is a new version of DoCommentsGetAdded.cs which looks like this:
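A sketch along those lines; Mountebank’s admin API lives at http://localhost:2525/imposters, while the service endpoints, the imposter port and the payloads are assumptions:

// Requires System.IO and System.Net.
public class DoCommentsGetAdded
{
    private const string MountebankUrl = "http://localhost:2525/imposters";
    private const string ServiceUrl = "http://localhost:8080/";

    public void Setup(string impostersFilePath)
    {
        // Create the imposter on port 9999 by posting the contents of imposters.js
        // to Mountebank's admin endpoint.
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.ContentType] = "application/json";
            client.UploadString(MountebankUrl, File.ReadAllText(impostersFilePath));
        }
    }

    public void AddComment(string comment, string username)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.ContentType] = "application/json";
            client.UploadString(ServiceUrl + "addcomment",
                "{\"comment\":\"" + comment + "\",\"username\":\"" + username + "\"}");
        }
    }

    public bool CommentWasAdded()
    {
        using (var client = new WebClient())
        {
            return int.Parse(client.DownloadString(ServiceUrl + "commentcount")) > 0;
        }
    }

    public void TearDown()
    {
        // Delete the imposter so the next run can reuse the port.
        using (var client = new WebClient())
        {
            client.UploadString(MountebankUrl + "/9999", "DELETE", string.Empty);
        }
    }
}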

A Script Table basically allows a number of methods to be executed in sequence and with various parameters. It also allows for ‘checks’ to be made (which are your assertions) at various stages – it looks very much like you would expect a test script to look.

In this table, we’re instantiating our DoCommentsGetAdded class, calling Setup() and passing the path to our imposters.js file, calling AddComment() to add a comment and then checking that CommentWasAdded() returns true. Then we’re calling TearDown().

Setup() and TearDown() are there specifically to configure Mountebank appropriately for the test and to destroy the Imposter afterwards. If you try to set up a new Imposter at the same port as an existing Imposter, Mountebank will throw an exception, so it’s important to clean up. Another option would have been to set up the Imposter in the DoCommentsGetAdded constructor and add a ~DoCommentsGetAdded() destructor to clean up – this would mean the second and last lines of our Script Table could be removed. I am a bit torn as to which approach I prefer, or whether a combination of both is appropriate. In any case, cleaning up is important if you want to avoid port conflicts.

Ok, so run your service, make sure Mountebank is also running and then run your test from FitNesse.

Again, this works because we have the pattern of our intermediary test assembly sat between FitNesse and the code under test. We can write pretty much whatever code we want here to call any environment we like or to just execute directly against an assembly.

Debugging

I spent a lot of time running my test code from NUnit in order to debug and the experience grated because it felt like an unnecessary step. Then I Googled to see what other people were doing and I found that by changing the test runner from:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\Runner.exe}

to:

!define TEST_RUNNER {D:\Programs\Fitnesse\FitSharp\RunnerW.exe}

FitSharp helpfully popped up a .NET dialog box and waited for me to click ‘Go’ before running the test. This gives an opportunity to attach the debugger.

Summary

A high level view of what has been discussed here:

Three types of test which can be usefully run from FitNesse: Unit, Integration and Component.

FitNesse is concerned with behaviour – don’t throw NUnit out just yet.

The basic pattern is to create an intermediary assembly to sit between FitNesse and the code under test. Use this to abstract away technical implementation.

Clean up Mountebank Imposters.

You need FitSharp to run FitNesse against .NET code.

Debug with FitSharp by referencing the RunnerW.exe test runner and attaching a debugger to the dialog box when it appears.

Not Quite Enterprise

In this first post on FitNesse, I’ve outlined a few different types of test which can be executed and listed code snippets and instructions which should allow someone to get FitNesse running on their machine. This is only a small part of the picture. Having separate versions of the test suite sat on everyone’s machine is not a useful solution. A developer’s laptop can’t be referenced by a CI/CD platform. FitNesse can be deployed to a server. There are strategies for getting tests to execute as part of a DevOps pipeline.

This has been quite a lengthy post and I think these topics along with versioning and multi-environment scenarios will be best tackled in a subsequent post.

I also want to take a more process oriented look at how tests get created, who should be involved and when. So maybe I have a couple of new entries to work on.