Ibrahim tests the performance of three open-source web servers on a typical Ericsson Research Linux cluster platform.

The Benchmarking Environment

The performance of web servers and client-server systems
depends on many factors: the client platform, client software,
server platform, server software, network and network protocols.
Most of the performance analysis of the Web has concentrated on two
main issues: the overall network performance and the performance of
web server software and platforms.

Our benchmarks consist of a mechanism to generate a
controlled stream of web requests with standard metrics to report
results. We used 16 Intel Celeron 500MHz 1U rackmount units (see
Figure 2) that come with 512MB of RAM and run Windows NT. These
machines generate traffic using WebBench, a freeware tool available
from
www.zdnet.com.

Figure 2. Benchmarking Units

The basic benchmark scenario is a set of client programs
(load generators) that emit a stream of web requests and measure
the system response. The stream of requests is called the workload.
WebBench provides a way to measure the performance of web servers.
It consists of one controller and many clients (see Figure 3). The
controller provides means to set up, start, stop and monitor the
WebBench tests. It is also responsible for gathering and analyzing
the data reported from the clients.

Figure 3. Architecture of WebBench

On the other hand, the clients execute the WebBench tests and
send requests to the server. WebBench uses the client PCs to
simulate web browsers. However, unlike actual browsers, the clients
do not display the files that the server sends in response to their
requests. Instead, when a client receives a response from the
server, it records the information associated with the response and
then immediately sends another request to the server.
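The client behavior described above can be pictured as a simple loop. This is an illustrative sketch, not WebBench's actual code; `fetch` is a hypothetical stand-in for the blocking HTTP GET a real client performs:

```python
import time

def run_client(fetch, num_requests):
    """Simulate a WebBench-style client: issue requests back to back,
    recording each response without any think time or page rendering."""
    records = []
    for _ in range(num_requests):
        start = time.perf_counter()
        body = fetch()  # in a real client, a blocking HTTP GET to the server
        elapsed = time.perf_counter() - start
        # record bytes received and latency, then immediately loop again
        records.append((len(body), elapsed))
    return records
```

The absence of any delay between requests is what lets a handful of client PCs saturate a server.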

Several metrics are commonly used to evaluate web servers. For
our testing, we report the number of connections or requests served
per second, as well as throughput, the number of bytes served per
second (see Figure 4).
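Both reported metrics derive directly from the per-response records the clients gather; this small computation is our own illustration, assuming each record is a (bytes_received, latency) pair:

```python
def summarize(records, duration_seconds):
    """Compute requests per second and throughput (bytes per second)
    from a list of (bytes_received, latency) tuples over a test run."""
    requests_per_second = len(records) / duration_seconds
    throughput = sum(size for size, _ in records) / duration_seconds
    return requests_per_second, throughput
```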

Figure 4. WebBench Control Window

WebBench uses a standard workload tree to benchmark the
server. The workload tree comes as a compressed file that we need
to move to the server and expand in the HTML document root on the
web server (this is where the web server looks for its HTML files).
This will create a directory called WBTREE that contains 61MB of
web documents that will be requested by the WebBench clients. Since
some of our CPUs are diskless, we installed the workload tree on
the NFS server and modified the web server configuration to use the
NFS directory as its document root.
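The setup reads roughly as follows. The paths, archive name, and NFS server name here are hypothetical, and the DocumentRoot directive shown is Apache's; this is a sketch of the procedure, not the exact commands used on our cluster:

```shell
# Expand the WebBench workload tree into the web server's document root
# (paths are illustrative).
cd /var/www/html
tar xzf /tmp/wbtree.tar.gz        # creates the WBTREE directory (~61MB)

# For diskless CPUs: mount the tree from the NFS server instead
mount -t nfs nfs-server:/export/wbtree /mnt/wbtree

# Then point the web server at the NFS directory, e.g. in httpd.conf:
#   DocumentRoot "/mnt/wbtree"
```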

As part of WebBench configuration, we specified that the
traffic generated by the benchmarking machines would be distributed
equally among the targeted CPUs. Figure 5 shows how we specify each
server node and the percentage of the traffic it will
receive.
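Splitting the load equally simply means assigning each targeted node 100/N percent of the traffic; a quick sketch of the arithmetic behind the configuration in Figure 5 (node names hypothetical):

```python
def equal_distribution(nodes):
    """Assign each server node an equal percentage share of the
    generated traffic, as in our WebBench target configuration."""
    share = 100.0 / len(nodes)
    return {node: share for node in nodes}
```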

Figure 5. Sample WebBench Configuration

Test Cases

After setting up our Linux cluster and the benchmarking
environment, we were ready to define our test cases. We tested all
three web servers (Apache, in both versions 1.3.14 and 2.08a;
Tomcat 3.1; and Jigsaw 2.0.1) running on 1, 2, 4, 6, 8, 10 and 12
CPUs. For every test case, we specified in the RAM disk loaded by
the CPUs which web server to start at boot time. As a result, we
ran four types of tests, each with a different server and on
multiple CPUs.
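The per-test-case server selection can be pictured as a small boot-time script baked into the RAM disk. The installation paths and the WEB_SERVER variable below are our own illustration, not the actual cluster scripts:

```shell
#!/bin/sh
# Map the server chosen for a test case to its start command
# (paths are hypothetical).
start_cmd() {
    case "$1" in
        apache-1.3.14) echo "/usr/local/apache-1.3.14/bin/apachectl start" ;;
        apache-2.08a)  echo "/usr/local/apache-2.08a/bin/apachectl start" ;;
        tomcat-3.1)    echo "/usr/local/tomcat-3.1/bin/startup.sh" ;;
        jigsaw-2.0.1)  echo "java -jar /usr/local/jigsaw-2.0.1/Jigsaw.jar" ;;
        *)             echo "unknown server: $1" >&2; return 1 ;;
    esac
}

# At boot, the RAM disk image would name the server for this test case:
# eval "$(start_cmd "$WEB_SERVER")"
```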

For the purpose of this article, we will only show three
comparison cases: Apache 1.3.14 vs. Apache 2.08a on one CPU, Apache
1.3.14 vs. Apache 2.08a on eight CPUs and Jigsaw 2.0.1 vs. Tomcat
3.1 on one CPU.

The first benchmark we did was to test all the web servers on
one CPU. In WebBench configuration, we specified that all the
traffic generated by all the clients be directed to one CPU. Figure
6 shows the results of the benchmark for up to 64 simultaneous
clients. On average, Apache 1.3.14 was able to serve 828 requests
per second vs. 846 requests per second serviced by Apache 2.08a.
The latter showed a performance improvement of 2.1%.

Figure 6. Apache 1.3.14/2.08a Benchmarking Data on One CPU

Figure 7 plots the results of the benchmarks of Apache 1.3.14
and Apache 2.08a. As we can see, both servers have almost identical
performance.

Figure 7. Apache 1.3.14/Apache 2.08a Benchmarking Results on One
CPU

As for the Java-based web servers, Tomcat and Jigsaw, Figures
8 and 9 show the resulting benchmarking data. The maximum number of
requests per second Jigsaw was able to achieve was 39 vs. 60 for
Tomcat. We were surprised by Jigsaw's performance; however, we
should remember that Jigsaw was designed for experimenting with new
technologies rather than as a high-performance web server for
industrial deployment.

Figure 8. Tomcat/Jigsaw Benchmarking Data on One CPU

Figure 9. Tomcat/Jigsaw Benchmarking Results on One CPU

When we scaled the test to eight CPUs, Apache 2.08a was more
consistent in its performance, servicing more requests per second
as we increased the number of concurrent clients, without
fluctuations in the number of serviced requests (see Figure
10).

Figure 10. Apache 2.08a/Apache 1.3.14 Benchmarking Data on Eight
CPUs

Figure 11 clearly shows how consistent Apache 2.08a is
compared to Apache 1.3.14. On eight CPUs, Apache 2.08a was able to
maintain an average of 4,434 requests per second vs. 4,152 for
Apache 1.3.14, a 6.8% performance improvement.
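The improvement figure follows directly from the two averages:

```python
def improvement_pct(new, old):
    """Percentage improvement of the new server's request rate over the old."""
    return (new - old) / old * 100.0

# Apache 2.08a vs. Apache 1.3.14 on eight CPUs
print(round(improvement_pct(4434, 4152), 1))  # prints 6.8
```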

Very useful article - however, comparing Tomcat and Apache is like comparing apples and oranges: Apache is designed to serve static content, while Tomcat is primarily a JSP/Servlet engine, and contains a standalone web server as a convenience.

Good article. Would like to know if Apache 2.0 was set up in this test to run threaded, or multi-process. The
similarity in performance makes me think both Apache 2.0 and 1.3 versions were running multiple Apache processes, with
resulting overhead from spawning new processes. Under Linux this isn't huge, but other unices have problems with
this model.

I'm also interested in Apache 2.0's multithreaded performance when running as an app server - mod_perl, mod_php or
mod_python for example. Does threading allow sharing of persistent database connections, and what effect does that
have on memory usage, speed, and behaviour under heavy loads?

I'm very pleased we got to run an article like this. This is our best defense against FUD from vendors of "less capable" web servers. When I got into Linux I never expected to see IBM running TV ads about Linux but what we see here shows me that IBM (and the rest of us) are on the right team.
