Friday, April 08, 2005

HTTP performance testing with httperf, autobench and openload

Update 02/26/07: The link to the old httperf page wasn't working anymore. I updated it and pointed it to the new page at HP. Here's a link to a PDF version of a paper on httperf written by David Mosberger and Tai Jin: "httperf -- a tool for measuring Web server performance".

Also, openload is now OpenWebLoad, and I updated the link to its new home page.

In this post, I'll show how I conducted a series of performance tests against a Web site, with the goal of estimating how many concurrent users it can support and what the response time is. I used a variety of tools that measure several variables related to HTTP performance.

httperf is a benchmarking tool that measures the HTTP request throughput of a web server. The way it achieves this is by sending requests to the server at a fixed rate and measuring the rate at which replies arrive. Running the test several times and with monotonically increasing request rates, one can see the reply rate level off when the server becomes saturated, i.e., when it is operating at its full capacity.

autobench is a Perl wrapper around httperf. It runs httperf a number of times against a Web server, increasing the number of requested connections per second on each iteration, and extracts the significant data from the httperf output, delivering a CSV format file which can be imported directly into a spreadsheet for analysis/graphing.

openload is a load testing tool for Web applications. It simulates a number of concurrent users and it measures transactions per second (a transaction is a completed request to the Web server) and response time.

I ran a series of autobench/httperf and openload tests against a Web site I'll call site2 in the following discussion (site2 is a beta version of a site I'll call site1). For comparison purposes, I also ran similar tests against site1 and against www.example.com. The machine I ran the tests from is a Red Hat 9 Linux server co-located in downtown Los Angeles.

I won't go into details about installing httperf, autobench and openload, since the installation process is standard (configure/make/make install or rpm -i). The main httperf command-line arguments of interest are:

server: the name or IP address of your Web site (you can also specify a particular URL via the --uri argument)

rate: specifies the number of HTTP requests/second sent to the Web server -- indicates the number of concurrent clients accessing the server

num-conns: specifies how many total HTTP connections will be made during the test run -- this is a cumulative number, so the higher the number of connections, the longer the test run

Here is a detailed interpretation of an httperf test run. In short, the main numbers to look for are the connection rate, the request rate and the reply rate. Ideally, you would like all of these numbers to be very close to the request rate specified on the command line. If the actual request rate and the reply rate start to decline, that's a sign your server has become saturated and can't handle any new connections. It could also be a sign that your client has become saturated, which is why it's better to first test your client against a fast Web site, in order to gauge how many outgoing HTTP requests the client itself can sustain.
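To sketch what reading a run looks like in practice, here is a small Python helper that pulls those three numbers out of an httperf report. The command line in the comment and the output fragment are illustrative (hypothetical host and made-up numbers), not taken from the runs below:

```python
import re

# Output fragment in the shape httperf prints; the numbers are made up.
# It would come from a run such as (hypothetical host, not executed here):
#   httperf --hog --server www.example.com --rate 30 --num-conns 200 --timeout 5
SAMPLE = """\
Connection rate: 29.9 conn/s (33.4 ms/conn, <=1 concurrent connections)
Request rate: 29.9 req/s (33.4 ms/req)
Reply rate [replies/s]: min 28.8 avg 29.9 max 30.2 stddev 0.5 (4 samples)
"""

def extract_rates(report):
    """Return (connection rate, request rate, average reply rate)."""
    conn = float(re.search(r"Connection rate: ([\d.]+)", report).group(1))
    req = float(re.search(r"Request rate: ([\d.]+)", report).group(1))
    reply = float(re.search(r"avg ([\d.]+)", report).group(1))
    return conn, req, reply

print(extract_rates(SAMPLE))  # ideally all three are close to the demanded rate
```

With a healthy server and client, all three extracted numbers track the --rate you asked for; divergence is the saturation signal described above.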

Autobench is a simple Perl script that facilitates multiple runs of httperf and automatically increases the HTTP request rate. It can be configured, for example, via the ~/.autobench.conf file. Here is what my file looks like:

# uri1, uri2
# The URI to test (relative to the document root). For a fair comparison
# the files should be identical (although the paths to them may differ on the
# different hosts)

uri1 = /
uri2 = /

# port1, port2
# The port number on which the servers are listening

port1 = 80
port2 = 80

# low_rate, high_rate, rate_step
# The 'rate' is the number of connections to open per second.
# A series of tests will be conducted, starting at low_rate,
# increasing by rate_step, and finishing at high_rate.
# The default settings test at rates of 20,30,40,50...180,190,200

low_rate = 10
high_rate = 50
rate_step = 10

# num_conn, num_call
# num_conn is the total number of connections to make during a test
# num_call is the number of requests per connection
# The product of num_call and rate is the approximate number of
# requests per second that will be attempted.

num_conn = 200
#num_call = 10
num_call = 1

# timeout sets the maximum time (in seconds) that httperf will wait
# for replies from the web server. If the timeout is exceeded, the
# reply concerned is counted as an error.

timeout = 60

# output_fmt
# sets the output type - may be either "csv", or "tsv";

output_fmt = csv

## Config for distributed autobench (autobench_admin)
# clients
# comma separated list of the hostnames and portnumbers for the
# autobench clients. No whitespace can appear before or after the commas.
# clients = bench1.foo.com:4600,bench2.foo.com:4600,bench3.foo.com:4600

clients = localhost:4600

The only variable I usually tweak from one test run to another is num_conn, which I set to the desired number of total HTTP connections to the server for that test run. In the example file above it is set to 200.

I changed the default num_call value from 10 to 1 (num_call specifies the number of HTTP requests per connection; I like to set it to 1 to keep things simple). I started my test runs with low_rate set to 10, high_rate set to 50 and rate_step set to 10. What this means is that autobench will run httperf 5 times, starting with 10 requests/sec and going up to 50 requests/sec in increments of 10.
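The schedule autobench will follow can be computed directly from low_rate, high_rate and rate_step; a quick sketch of that arithmetic:

```python
def rate_schedule(low_rate, high_rate, rate_step):
    """Demanded request rates autobench will run httperf at, inclusive."""
    return list(range(low_rate, high_rate + 1, rate_step))

# The settings from the config file above yield 5 httperf runs:
print(rate_schedule(10, 50, 10))  # -> [10, 20, 30, 40, 50]
```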

Here is a graph generated via Excel from the CSV file obtained when running autobench against www.example.com for a different test run, with 500 total HTTP connections (the CSV file is here):

A few things to note about this typical autobench run:

I chose example.com as an example of how an "ideal" Web site should behave

the demanded request rate (in requests/second) starts at 10 and goes up to 50 in increments of 5 (x-axis)

for each given request rate, the client machine makes 500 connections to the Web site

the achieved request rate and the connection rate correspond to the demanded request rate

the average and maximum reply rates are roughly equal to the demanded request rate

the response time is almost constant, around 100 msec

there are no HTTP errors

What this all means is that the example.com Web site is able to easily handle up to 50 req/sec. The fact that the achieved request rate and the connection rate increase linearly from 10 to 50 also means that the client machine running the test is not the bottleneck. If the demanded request rate were increased to hundreds of req/sec, the client would not be able to keep up with the demanded requests and would become the bottleneck itself. In these situations, one would need to use several clients in parallel in order to bombard the server with as many HTTP requests as it can handle. However, the client machine I am using is sufficient for request rates lower than 50 req/sec.

Next, I ran autobench against site1. Some things to note about this run:

I specified only 200 connections per run, so that the server would not be over-taxed

the achieved request rate and the connection rate increase linearly with the demanded request rate, but then level off around 40

there is a drop at 45 req/sec which is probably due to the server being temporarily overloaded

the average and maximum reply rates also increase linearly, then level off around 39 replies/sec

the response time is not plotted, but it also increases linearly from 93 ms to around 660 ms

To verify that 39 is indeed the maximum reply rate that can be achieved by the Web server, I ran another autobench test starting at 10 req/sec and going up to 100 req/sec in increments of 10 (the CSV file is here):

Observations:

the reply rate does level off around 39 replies/sec and actually drops to around 34 replies/sec when the request rate is 100

the response time (not plotted) increases linearly from 97 ms to around 1.7 sec

We can conclude that the current site1 Web site can sustain up to around 40 requests/second before it becomes saturated. At higher request rates, the response time increases and users will start experiencing time-outs.
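The "level off" reading above can be automated once the autobench CSV is in hand. Here is a rough sketch that flags the last demanded rate at which the server still kept up; the sample data approximates the site1 numbers quoted above, and the 0.9 tolerance is an arbitrary choice of mine, not an autobench setting:

```python
def saturation_rate(samples, tolerance=0.9):
    """Given (demanded_rate, avg_reply_rate) pairs sorted by demanded rate,
    return the last demanded rate at which the server kept up, i.e. where
    the reply rate was at least tolerance * demanded rate."""
    last_ok = None
    for demanded, replies in samples:
        if replies >= tolerance * demanded:
            last_ok = demanded
    return last_ok

# Figures approximating the site1 runs described above:
site1 = [(10, 10), (20, 20), (30, 29.5), (40, 39), (50, 39), (100, 34)]
print(saturation_rate(site1))  # -> 40
```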

The initial autobench run against site2 told a very different story:

the achieved request rate and the connection rate do not increase with the demanded request rate; instead, they are both almost constant, hovering around 6 req/sec

the average reply rate also stays relatively constant at around 6 replies/sec, while the maximum reply rate varies between 5 and 17

there is a dramatic increase in response time (not plotted) from 6 seconds to more than 18 seconds

From this initial run, we can see that the average reply rate does not exceed 6-7 replies/second, so this seems to be the limit for the site2 Web site. In order to further verify this hypothesis, I ran another autobench test, this time going from 1 to 10 requests/second, in increments of 1. Here is the report (the CSV file is here):

Some things to note about this autobench run:

the achieved request rate and the connection rate increase linearly with the demanded request rate from 1 to 6, then level off around 6

the average reply rate is almost identical to the connection rate and also levels off around 6

the maximum reply rate levels off around 8

the response time (not plotted) increases from 226 ms to 4.8 seconds

We can conclude that the site2 Web site can sustain up to 7 requests/second before it becomes saturated. At higher request rates, the response time increases and users will start experiencing time-outs.

Finally, here are the results of a test run that uses the openload tool in order to measure transactions per second (equivalent to httperf's reply rate) and response time (the CSV file is here):

Some notes:

the transaction rate levels off, as expected, around 6 transactions/sec

the average response time levels off around 7 seconds, but the maximum response time varies considerably from 3 to around 20 seconds, reaching up to 30 seconds

These results are consistent with the ones obtained by running httperf via autobench. From all these results, we can safely conclude that in its present state, the site2 Web site is not ready for production, unless no more than 6-7 concurrent users are ever expected to visit the site at the same time. The response time is very high and the overall user experience is not a pleasant one at this time. Also, whenever I increased the load on the site (for example by running autobench with 200 through 500 connections per run), the site became almost instantly unresponsive and ended up sending HTTP errors back to the client.

Conclusion

The tools I described are easy to install and run. The httperf request/reply throughput measurements in particular prove to be very helpful in pinpointing HTTP bottlenecks. When they are corroborated with measurements from openload, an overall picture emerges that is very useful in assessing HTTP performance numbers such as concurrent users and response time.

Update

I got 2 very un-civil comments from the same Anonymous Coward-type poster. This poster called my blog entry "amateurish" and "recklessly insane" among other things. One slightly more constructive point made by AC is a question: why did I use these "outdated" tools and not other tools such as The Grinder, OpenSTA and JMeter? The answer is simple: I wanted to use command-line-driven, lightweight tools that can be deployed on any server, with no need for GUIs and distributed installations. If I were to test a large-scale Web application, I would certainly look into the heavy-duty tools mentioned by the AC. But the purpose of my post was to show how to conduct a very simple experiment that can still furnish important results and offer a good overall picture about a Web site's behavior under moderate load.

Let me add that none of the "modern" tools is particularly easy to use to test varying loads as you have done here. Neither JMeter nor Grinder makes it very easy to look at how response time varies with request rate (without manually varying the request rate or going through some very un-obvious gyrations with the tools). As you indicate, the suite of tools you are using falls a bit short for testing web applications (parsing server responses for forms etc.) but it is great for getting quick raw performance numbers. Thanks for the helpful post!

Just found your blog, and enjoyed reading this article. Have you any idea what causes the consistent quirk at 45 requests per second? It's present on all of your graphs that get to the 45 mark - some are more obvious than others. Any idea what could be causing the throughput reduction (if that's what it is)? It seems odd that both the example.com webserver and yours should consistently hiccup at the same point.

Andrew -- not sure what's going on when the number of concurrent users reaches the magical 45; I suspect the Linux client where I was running these tests gets in a funky state at that point. It's definitely on the client side.

Hi, I am new to httperf/autobench. Thanks for your nice article. But I have a doubtful point. In your example of the "autobench report for site1 (200 connections per run)", you explained that the response time increases linearly from 93 ms to around 660 ms; also in your example "autobench report #2 for site1 (200 connections per run)", the response time increases linearly from 97 ms to around 1.7 sec, etc. I want to know how you got those figures (how to calculate them), thanks!!

I looked briefly at funkload, but it had a lot of pre-requisites for its installation, and its configuration seemed kind of complicated, so I chose to go with simpler tools. I haven't abandoned it altogether though, so I may go back to it at some point. Do you have any pointers to tutorials/howtos about it?

I will concede that funkload is rather tedious to get into, but on the other hand I don't think it's any harder than any of the tools you discuss here. Don't get me wrong, I'm not praising funkload, but it has worked quite well for me. I don't have an example at hand, but I have discussed it in the CherryPy book as my example of load testing (not pushing to get the book BTW, just telling ;))

In any case I really enjoy your articles and this one in particular. Well done.

I am very happy to have found your blog. I am trying my first performance project and have no idea how to do it. I am planning to use Silk Performer if my company purchases it, because I am not as good at using the open source tools as you. I do have a main question on how to select scenarios for load tests. I have been doing functional QA for almost 10 years. I know that test scenarios in performance testing should be very different than in functional tests. But how should I select scenarios? My company's product is a web application. So would this kind of scenario be OK: for example, on this blog website, to do load tests, one scenario could be posting a blog entry. Another scenario would be post a blog entry, post a comment, edit a comment -- which one would be a valid scenario for performance tests? Really appreciate your answers.

Hi, I have just spent some time playing around with httperf and found it quite simple and easy to use. Stumbled upon your interesting and informative blog when searching for more information on httperf. I am looking for some option with httperf that will allow one to run a test for a specified length of time, i.e., some option that will allow one to run a test (a set of HTTP posts/requests) with (say) 10 connections for 30 minutes. It would be great if you could throw some light.

Layla -- I'm not sure how to run httperf for a specified period of time. But one solution I see is to write your own script that sits in a loop and calls httperf repeatedly with as many connections as you want. At the end of each iteration, you check the elapsed time, and if the total running time is less than what you need, you go through the loop again. Would that work?
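To make that answer concrete, here is a minimal Python sketch of such a wrapper loop; the httperf arguments in the comment are placeholders, not a tested invocation:

```python
import subprocess
import time

def run_for_duration(cmd, duration_s):
    """Call cmd in a loop until at least duration_s seconds have elapsed.
    Returns the number of completed runs."""
    runs = 0
    start = time.monotonic()
    while time.monotonic() - start < duration_s:
        subprocess.run(cmd, check=True)
        runs += 1
    return runs

# Hypothetical usage (not executed here): roughly 30 minutes of httperf runs.
# run_for_duration(["httperf", "--server", "www.example.com",
#                   "--rate", "10", "--num-conns", "600"], 30 * 60)
```

Note that the loop overshoots slightly: the final httperf run is allowed to complete even if the deadline passes mid-run.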

I've used grinder many times, JMeter a few times, whilst both are useful, there are some significant reasons to choose httperf.

Both grinder and jmeter suffer from the same flaw. A slow server will act as a gate, restricting the test client from sending more requests until those pending have been processed. This means that these tools won't effectively simulate an overload situation.

Httperf is one of the few load generation tools that don't have this restriction, and this is why the results gathered from httperf are more realistic than those gathered with jmeter or grinder.

As a note: httperf only collects reply rate samples once every 5 seconds. If your test run finishes faster than that, you'll get 0s (zeros) for the reply rates -- I wasted an hour figuring this out. Boost num_conns and/or num_calls to get results.
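Building on that note: with num_call = 1, a run lasts roughly num_conns / rate seconds, so you can size num_conns to guarantee httperf records at least a couple of reply-rate samples. A small sketch of that arithmetic (the 5-second interval is taken from the comment above):

```python
SAMPLE_INTERVAL_S = 5  # httperf records a reply-rate sample every 5 seconds

def min_num_conns(rate, samples_wanted=2):
    """Smallest --num-conns that keeps a run (at num_call = 1) going long
    enough for httperf to record at least samples_wanted reply-rate samples."""
    return rate * SAMPLE_INTERVAL_S * samples_wanted

print(min_num_conns(50))  # -> 500
```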