High-Quality Load Testing Models

Well-written, well-executed load tests are critical to ensuring a good level of service for your customers and users.

Sadly, many load tests are hampered right out of the gate by lacklustre test plans that don't accurately model production-like traffic. In this post, I'll explain how to develop a load testing plan that will enable you to run load tests that closely mimic your production traffic.

Population not Individual

It's likely that all your users behave in slightly different ways. Some users of a shopping site might browse through the catalogue before taking a buying action, others might buy straight from the home page, yet others might not buy at all.

Trying to reduce this behaviour down to some kind of idealised user is one of the worst mistakes you can make when designing a test plan. Rather than capturing how such an ideal user behaves, capture how your user population behaves, and use that population behaviour to drive your tests.

Probabilistic not Deterministic

Once you start modelling your user population rather than individual users, it helps to start thinking probabilistically rather than deterministically.

A traditional test plan has each virtual user follow the same deterministic set of test actions. Armed with knowledge of how your user population behaves, you can make each virtual user in your test a probabilistic simulation of your actual user population. No more load tests made up of 5000 virtual users all doing exactly the same thing.

Model not Plan

I like to think of these population-based, probabilistic load tests as being driven by a load test model rather than a load test plan.

A load test model describes what kinds of actions users can take (the web requests), how often they take these actions, and what orderings exist. A load test plan describes how a single user behaves; a load test model describes how the user population behaves.

A plan is prescriptive and deterministic. Each load test you run from such a plan is going to look much like all the others you can run. A model is descriptive and probabilistic.

A load test model allows you to simulate any number of virtual users. This in turn allows you to run any number of interesting scenarios with virtual audiences that look like the real deal.

Designing a Load Testing Plan

An effective load testing plan is made up of four things:

The set of request templates that the virtual users can make

The relationships between these requests

The probability of transition between the requests

The waiting times after each request

As a good approximation, these load models can be represented graphically as Markov chains. Consider this basic load model for the skipjaq.com website:

Each ellipse in this diagram is called a state and represents a request that can be made by a virtual user. The transitions between the states indicate valid navigation actions that the virtual user can perform. Each transition is labelled with the probability that the user will take that particular transition.

Note that the transition probabilities leaving the Home state don't sum to 100%; they sum to 80%. At every state there is a chance that the user simply leaves the site. In this case, there is a 20% chance that the user will leave after visiting the home page.
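To make the arithmetic concrete, here is a minimal sketch of a load model as a transition table. Only the 80%/20% split at Home comes from the diagram described above; the Features and Pricing states and every other number are hypothetical placeholders, as is the exit_probability helper.

```python
# Hypothetical transition table. Only the 80% total leaving Home is taken
# from the article; the state names and other values are made up.
TRANSITIONS = {
    "Home": {"Features": 0.5, "Pricing": 0.3},
    "Features": {"Pricing": 0.4},
    "Pricing": {},  # no outbound transitions
}


def exit_probability(state: str) -> float:
    """Return the chance that a user in `state` leaves the site."""
    total = sum(TRANSITIONS[state].values())
    if total > 1.0 + 1e-9:
        raise ValueError(f"probabilities leaving {state} exceed 100%")
    # Whatever is not assigned to a transition is the chance of leaving.
    return 1.0 - total


for name in TRANSITIONS:
    print(name, round(exit_probability(name), 2))
```

Keeping the exit probability implicit means the table only has to list real navigation actions, and a simple check catches rows that accidentally add up to more than 100%.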

Holding Distributions

Missing from the skipjaq.com load model diagram is how long, on average, the users spend waiting in each state. Let's add that in to create the full picture:

The average waiting times are shown in red. When we come to realise the load model in our load testing tool, we need to choose the distribution of these waiting times. A useful starting point for web traffic is to assume that wait times are exponentially distributed.
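If you want a feel for what exponentially distributed waiting times look like, the following sketch samples them with Python's standard library. The 1500 ms mean is just an example value, and holding_time_ms is a hypothetical helper name.

```python
import random


def holding_time_ms(mean_ms: float, rng: random.Random) -> float:
    """Draw one exponentially distributed waiting time with the given mean."""
    # expovariate takes the rate, which is the reciprocal of the mean.
    return rng.expovariate(1.0 / mean_ms)


rng = random.Random(7)  # fixed seed so repeated runs give the same numbers
samples = [holding_time_ms(1500, rng) for _ in range(100_000)]

mean_ms = sum(samples) / len(samples)
median_ms = sorted(samples)[len(samples) // 2]
print(f"mean ~ {mean_ms:.0f} ms, median ~ {median_ms:.0f} ms")
```

The median comes out well below the mean (roughly 0.69 times the mean for an exponential distribution), which matches the intuition that most users move on quickly while a few linger for a long time.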

Constructing the Plan in JMeter

You might not be surprised to find that constructing a high-quality load test model in your favourite open source load runner isn't as easy as it should be. It's unusual to find load testing tools that support the Markov-based model natively. Thankfully, it's not impossible to construct a Markov-like plan in most test tools if you're prepared to get a bit creative.

Let's walk through constructing a test plan for the skipjaq.com load model in Apache JMeter.

Basic Configuration

From a blank test plan, we start by adding some default settings. I prefer to set the host name for the System Under Test (SUT) once per test plan, rather than adding it to each sampler:

At the root of the plan we add the master Thread Group:

For demo purposes, I configured the Thread Group with 100 threads, each of which will loop 10 times. Tuning the size of the Thread Group is a critical part of running a performance test with JMeter, and I'll talk more about this in a later post.

We're going to implement the Markov-based load model using a loop with an embedded switch statement. The loop exits once the virtual user has completed their navigation. The switch statement selects the correct page to visit on each iteration.
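Before building this in JMeter, it may help to see the loop-with-switch shape in ordinary code. This is only a sketch of the control flow, not what JMeter executes; run_virtual_user, the handlers dictionary and the two-page site are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def run_virtual_user(
    handlers: Dict[str, Callable[[], None]],
    next_state: Callable[[str], Optional[str]],
    start: str = "Home",
) -> List[str]:
    """Drive one virtual user until it has finished navigating."""
    active, state = True, start
    visited = []
    while active:                      # the loop: runs while the user is active
        handlers[state]()              # the switch: pick the page for this state
        visited.append(state)
        following = next_state(state)  # transition logic supplied by the caller
        if following is None:
            active = False             # nowhere to go, so the user leaves
        else:
            state = following
    return visited


# Hypothetical two-page site with a fixed path, just to exercise the loop.
log: List[str] = []
handlers = {
    "Home": lambda: log.append("GET /"),
    "About": lambda: log.append("GET /about"),
}
path = run_virtual_user(handlers, {"Home": "About", "About": None}.get)
print(path)  # ['Home', 'About']
```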

Virtual User State Variables

For each virtual user, we maintain an active variable tracking whether the user is still actively traversing the system, and a state variable tracking which state they are in. These variables are re-initialised each time a virtual user 'enters' the system. The User Parameters pre-processor does the job here:

Make sure to check the Update Once Per Iteration check box; otherwise, each thread will terminate after the first virtual user executing in that thread exits.

The active variable is initialised to true and the state variable to Home so that each virtual user will start out in the Home state.

The Virtual User Loop

The JMeter While Controller executes all its children while some condition is true. For our virtual user loop, we set the loop to execute while the active variable is true:

The Condition setting of the While Controller allows for boolean variable evaluation using the ${var_name} syntax. Here we evaluate the active variable with ${active}.

Selecting the Correct State

On each iteration of the loop we want to execute the right test actions for the state the user is in. The Switch Controller selects from its children based on the value of some variable. Recall that the virtual user's current state is stored in the state variable. Conditioning our switch on this allows us to pick the child controller named for the state:

Here I'm setting the switch value to ${state}, a variable expression that evaluates to the current value of the state variable for each virtual user.

Nested inside the Switch Controller I've added one Simple Controller per state, each with the correct name for that state. It is these names that we are going to assign to the state variable to move a virtual user through our Markov chain.

Modelling the States

Each state is modelled as a Simple Controller containing a nested HTTP Request sampler and a Poisson Random Timer. The sampler is the actual request that the virtual user makes when entering the state:

Here we see that a virtual user entering the Home state makes a GET request to /. The random timer captures the holding time in that state:

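The Poisson Random Timer generates delays centred on its Lambda parameter, so here the holding time has a mean of 1500ms. Strictly speaking, these delays follow a Poisson distribution, which clusters them close to the mean, so treat the timer as a convenient approximation of the exponentially distributed holding time rather than an exact match.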

Jumping Between States

With the basic structure of the test plan in place, all that remains is to encode the transition probabilities and update the state and active variables appropriately at each step. This transition logic isn't easily encoded using the standard JMeter constructs, but thankfully we can drop down to code to calculate our next state. Using the JMeter JSR223 PostProcessor, we can run some code after each iteration of the virtual user loop:

I'm using Groovy for my script, but you can use any JSR 223-compatible scripting engine. There are myriad ways of encoding the state transitions into Groovy, but if you're writing by hand, a simple switch with some nested ifs is certainly the clearest:

The vars variable provides implicit access to the variables for the current virtual user. From here we can access our active and state variables. Inside the switch we have a case for each state that has an outbound transition. The nested if statements capture the probability of transition, setting active to false if no transition occurs.

We set active to false in the default case to avoid having to model that explicitly for states that have no outbound transitions.
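To show the shape of that transition logic outside JMeter, here is a rough Python equivalent. It is not the Groovy script itself: the user_vars dictionary stands in for JMeter's vars object, the Features and Pricing states are hypothetical, and the only probability taken from the model above is the 20% chance of leaving from Home.

```python
import random


def update_state(user_vars: dict, rng: random.Random) -> None:
    """Pick the next state for one virtual user, or mark the user inactive."""
    r = rng.random()  # uniform in [0, 1)
    state = user_vars["state"]
    if state == "Home":
        # Hypothetical split of the 80% of users who stay on the site.
        if r < 0.5:
            user_vars["state"] = "Features"
        elif r < 0.8:
            user_vars["state"] = "Pricing"
        else:
            user_vars["active"] = "false"  # remaining 20%: the user leaves
    elif state == "Features":
        if r < 0.4:  # hypothetical probability
            user_vars["state"] = "Pricing"
        else:
            user_vars["active"] = "false"
    else:
        # Default case: states with no outbound transitions end the session.
        user_vars["active"] = "false"


rng = random.Random(42)
user = {"active": "true", "state": "Home"}  # stored as strings, as in JMeter
path = [user["state"]]
while user["active"] == "true":
    update_state(user, rng)
    if user["active"] == "true":
        path.append(user["state"])
print(path)
```

Drawing a single random number per step and comparing it against cumulative thresholds keeps the branches mutually exclusive, so each transition fires with exactly the probability written in the model.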

Robust Models with Assertions

The plan we have constructed is a good first cut and can be used to run some basic load tests. However, there is one glaring hole: we don't verify that responses to our requests are correct. When running a load test, monitoring the error rate is just as important as monitoring the throughput and latency.

JMeter will check that we get a valid HTTP response code, but beyond that it does no checking of the actual response. For each request you make, you'll want to add at least one assertion.

For static page requests like those in the skipjaq.com load model, a sensible initial assertion is to check that the page title is correct. To do this, we add a Response Assertion as a child of the home page request sampler:

This assertion uses a simple substring check to assert the contents of the HTML <title> tag are as we expect.
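The check itself is tiny. As an illustration only, here is the same idea in Python; title_matches is a hypothetical helper, and the page bodies and the "SKIPJAQ" title text are invented for the example.

```python
def title_matches(body: str, expected_title: str) -> bool:
    """Substring check: does the body contain the expected <title> element?"""
    return f"<title>{expected_title}</title>" in body


# Invented response bodies. Both could plausibly arrive with a 200 status.
good_page = "<html><head><title>SKIPJAQ</title></head><body>Welcome</body></html>"
error_page = "<html><head><title>Something went wrong</title></head></html>"

print(title_matches(good_page, "SKIPJAQ"))   # True
print(title_matches(error_page, "SKIPJAQ"))  # False
```

The second case is the one that matters: a friendly error page served with a successful status code would slip past a status-only check but fails the title assertion.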

Calculating Model Parameters

Constructing the JMeter plan, while tedious, is mostly mechanical. It does get messy the larger your system gets, but where things really start to get difficult is when you have to start calculating the model parameters.

If you're building a new system, you're going to have to choose some holding times for your load model. If we assume that user wait times are approximately exponentially distributed, it helps to ask yourself roughly how long each user spends on average at each state.

You don't have to get this exactly right, because once you're up and running with real users, you're going to determine these wait times empirically. Likewise, it helps to select some transition probabilities and then refine them once you have real data.

In a production system, you have plenty of useful data sources from which to calculate your model parameters. Your HTTP access logs contain a record of every request into your system. Threading these together based on client IP, you can calculate both the average holding times and the transition probabilities.
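As a rough sketch of that calculation, the code below groups already-parsed log records by client IP and counts what each client requested next. The record format, the estimate_model name and the sample data are all assumptions for illustration. A real implementation would also need to split sessions on long gaps and cope with many users sharing one IP.

```python
from collections import defaultdict


def estimate_model(records):
    """records: iterable of (client_ip, unix_seconds, path) tuples."""
    by_client = defaultdict(list)
    for ip, ts, path in records:
        by_client[ip].append((ts, path))

    visits = defaultdict(int)                      # requests seen per page
    moves = defaultdict(lambda: defaultdict(int))  # page -> next page -> count
    gaps = defaultdict(list)                       # page -> seconds before next request
    for hits in by_client.values():
        hits.sort()  # order each client's requests by time
        for (ts, path), (next_ts, next_path) in zip(hits, hits[1:]):
            moves[path][next_path] += 1
            gaps[path].append(next_ts - ts)
        for _, path in hits:
            visits[path] += 1

    # Dividing by all visits leaves the remainder as the exit probability.
    probabilities = {
        src: {dst: n / visits[src] for dst, n in dsts.items()}
        for src, dsts in moves.items()
    }
    mean_hold = {src: sum(g) / len(g) for src, g in gaps.items()}
    return probabilities, mean_hold


# Invented sample data: four clients, one of whom leaves straight from "/".
records = [
    ("10.0.0.1", 0, "/"), ("10.0.0.1", 2, "/pricing"),
    ("10.0.0.2", 5, "/"),
    ("10.0.0.3", 10, "/"), ("10.0.0.3", 11, "/pricing"),
    ("10.0.0.4", 20, "/"), ("10.0.0.4", 24, "/features"),
]
probabilities, mean_hold = estimate_model(records)
print(probabilities["/"], mean_hold["/"])
```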

If you're using Google Analytics, the API provides you with easy access to the number of transitions between each page and the average waiting time for each page.

Of course, you don't really want to have to do this yourself. Thankfully, we're here to help. Sign up for our free load model generator to automatically generate a skeleton load model from your Google Analytics or access log data. Combined with our open source JMeter generator, you can be up and running with a high-quality load test in no time.

Summary

Good-quality load test models accurately capture the behaviour of real user populations. Load tests that spin up hundreds of identical virtual users all hammering the system without respite are dangerously far from the real world.

Markov chains provide a handy way to model user load graphically. These chains can be translated - albeit in a clunky fashion - into JMeter test plans for use in your organisation.

Creating the skeleton of your load model can be tedious and time-consuming, so why not let SKIPJAQ do it for free by signing up for our load model generator?