Automating API Load Tests

I recently had to run a few load tests on a service that my employer is launching soon. The company is sizable and has many services, so I thought: what if I could just automate the hell out of that, in such a way that I could point it at another service and shoot? Then everyone could have sad graphs, er, useful information about their services! What I wanted from my solution was a graph of performance built from CSV-shaped data. The data had to include a few latency percentiles (the 50th, 90th, and 99th, for example), plus means and standard deviations, for increasing values of concurrency. Error rates would be a plus. So off I went on my merry way to learn what existed on the market.

This article gathers my findings. I’ll compare the functionality of a few tools I looked at, go on a bit about the strategies I considered and eventually employed, and then spend a few moments pondering whether any of this is necessary at all.

HTTP API load testing tools examined

I already knew about Apache Benchmark (ab), and I wondered what else existed out there, so I started asking around. Someone pointed me in the direction of Apigee’s apib, and there’s also Vegeta. I saw another tool that I didn’t end up using, because it seemed heavily GUI-oriented and I was looking for automatable CLI programs (JMeter, if you are ever so inclined).

Apache Benchmark

Results look as follows when using Apache Benchmark:

I couldn’t find a way to configure the output to be as rich as I wanted, so I had to drop it. You can get it to spit out the percentiles in a CSV file, which is good for some use cases. For what I had in mind, though, I needed a few important percentiles together with the mean and standard deviation. Those last two are at least available in the default output, but I’d have to parse them out. No biggie, but I still decided to look at other stuff.
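To give an idea of the percentile half of that, here is a minimal Python sketch that picks a few percentiles out of the CSV file ab can write with its -e option. It assumes the file has a header row followed by two columns (percentage served, time in milliseconds), which is my recollection of the format and worth checking against your own output. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def pick_percentiles(csv_text, wanted=(50, 90, 99)):
    """Return {percentile: latency_ms} for the wanted percentiles."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (assumed to be present)
    picked = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        percentile = int(float(row[0]))
        if percentile in wanted:
            picked[percentile] = float(row[1])
    return picked


# Made-up rows in the assumed two-column layout, not real measurements.
sample = "Percentage served,Time in ms\n50,12.5\n90,30.0\n99,80.25\n100,120.0\n"
print(pick_percentiles(sample))  # {50: 12.5, 90: 30.0, 99: 80.25}
```

The mean and standard deviation would still have to come from somewhere else, which is exactly the annoyance described above.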

Apigee’s apib

Now THIS is fun. You can provide a set of options very similar to those ab provides, plus you get CSV output. You can even print only the header line by calling apib -T, which is useful for things like Google Charts (which I ended up using).

Spoiler alert: This is what I ended up using, with a simple shell script:

This is absolutely what I’m looking for. Excellent. I can totally feed this into Google Charts.
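For readers who would rather drive apib from Python than from the shell, here is a rough sketch of a concurrency sweep. It is not the script I used. The -T flag is the one mentioned above; -c (concurrency), -d (duration in seconds) and -S (single CSV line) are recalled from apib's help text and should be verified against your build. The function names and the injectable run parameter are my own inventions, the latter so the sweep can be tried without apib installed.

```python
import subprocess


def apib_command(url, concurrency, duration_s):
    # Flags assumed from apib's help text: -c, -d, -S. Verify before relying on them.
    return ["apib", "-c", str(concurrency), "-d", str(duration_s), "-S", url]


def run_command(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def sweep(url, levels, duration_s=30, run=run_command):
    """Run one test per concurrency level; return CSV lines, header first."""
    lines = [run(["apib", "-T"]).strip()]  # header line only, no test is run
    for concurrency in levels:
        lines.append(run(apib_command(url, concurrency, duration_s)).strip())
    return lines


# Hypothetical usage (requires apib on the PATH and a service to hit):
#   print("\n".join(sweep("http://localhost:8080/health", [1, 10, 50, 100])))
```

The result is one CSV document with a row per concurrency level, which is the shape I was after.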

Load testing strategy

I might be overselling this a bit, calling this section “strategy”. It’s mostly about how to attack the problem of automating the load tests in such a way that the tool is reusable for any API that we have.

This is fairly easy if all of your APIs provide a spec, as you can with OpenAPI (formerly known as Swagger). A spec lets you easily figure out what the exposed routes are. It doesn’t provide a solution for POST, PUT, or DELETE routes, mostly because those are usually much more complex. It’s a much more complicated business, for example, to automatically figure out the order in which you need to create the resources, then generate fake data to create items at load-testing speeds, and also keep tabs on all the returned values so you can test the PUT and DELETE routes, again with fake data…
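As an illustration of the easy part, here is a small Python sketch that lists the GET routes with no path parameters from an already-parsed OpenAPI document (a dict with a top-level "paths" object). The function name and the sample spec are hypothetical.

```python
def simple_get_routes(spec):
    """List paths that expose GET and contain no {placeholder} segment."""
    routes = []
    for path, operations in spec.get("paths", {}).items():
        if "get" in operations and "{" not in path:
            routes.append(path)
    return sorted(routes)


# Hypothetical spec fragment, trimmed to what the function reads.
spec = {
    "paths": {
        "/pets": {"get": {}, "post": {}},
        "/pets/{petId}": {"get": {}, "delete": {}},
        "/health": {"get": {}},
    }
}
print(simple_get_routes(spec))  # ['/health', '/pets']
```

Each returned path can then be appended to the service's base URL and handed to the load-testing tool.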

I mean, it’s technically hard even to load-test routes that have required parameters, unless the Swagger file either has an example with a valid ID, or declares per-endpoint ID types so that you can infer which collection route to query for one…

I didn’t have a lot of time, so we’re considering the “IDs in the examples” approach as a first step.
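A minimal sketch of the “IDs in the examples” idea, assuming each path parameter carries an example field on the operation (OpenAPI 3 defines that field; older Swagger 2 files often use a vendor extension instead, which this sketch does not handle). It also ignores parameters declared at the path level. The function name is hypothetical.

```python
import re


def fill_path(path, operation):
    """Replace {name} placeholders with each path parameter's example value.

    Returns None when any placeholder has no example to fill it with.
    """
    examples = {
        param["name"]: param["example"]
        for param in operation.get("parameters", [])
        if param.get("in") == "path" and "example" in param
    }
    try:
        return re.sub(r"\{([^}]+)\}", lambda m: str(examples[m.group(1)]), path)
    except KeyError:
        return None  # at least one placeholder could not be resolved


operation = {"parameters": [{"name": "petId", "in": "path", "example": 42}]}
print(fill_path("/pets/{petId}", operation))  # /pets/42
print(fill_path("/pets/{petId}", {}))         # None
```

Routes that come back as None are simply skipped for now.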

In conclusion: Of the necessity of it all

There are plenty of alternative tools, and the approach is not complete; for now, I’ll let good enough be good enough. The fact of the matter is:

HTTP services are, in general, read-heavy

This approach will give me a good idea of whether or not the service is fundamentally broken

This approach will also give me a good idea of when to scale up, given a single node of known size
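To make the “when to scale up” point concrete, here is a small Python sketch that scans the collected CSV for the first concurrency level at which a latency column exceeds a budget. The column names (concurrency, p99_ms), the budget, and the sample rows are all made up for illustration; the real column names depend on the tool that produced the CSV.

```python
import csv
import io


def first_breach(csv_text, column, budget, level_column="concurrency"):
    """Return the first concurrency whose `column` value exceeds `budget`, else None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row[column]) > budget:
            return int(row[level_column])
    return None


# Made-up numbers, not measurements from a real service.
sample = "concurrency,p99_ms\n1,20\n10,45\n50,210\n100,900\n"
print(first_breach(sample, "p99_ms", budget=200))   # 50
print(first_breach(sample, "p99_ms", budget=1000))  # None
```

This assumes rows are ordered by increasing concurrency, which is how the sweep produces them.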

Essentially, as long as we can gather the data in a shape that’s easy to pass around, it’s going to be easy to get these performance graphs going, whether in gnuplot, any spreadsheet software, or online through libraries like Google Charts.