Web and File Server Comparison:

Mindcraft has issued an Open Benchmark Invitation to the leaders of
the Linux community to participate in a retest of the Linux and Windows NT
Server benchmarks we published. We hope that they will accept this
invitation.

Mindcraft has received a great deal of e-mail and
press coverage about this benchmark. Take a look at our rebuttals and commentary
about several of the press articles. You'll learn a great deal about the
NetBench benchmark.

Microsoft Windows NT Server 4.0 is 2.5 times faster
than Linux as a File Server and 3.7 times faster as a Web Server

Mindcraft tested the file-server and Web-server
performance of Microsoft Windows NT Server 4.0 and Red Hat Linux 5.2
upgraded to the Linux 2.2.2 kernel (in this report referred to simply as
Linux) on a Dell PowerEdge 6300/400 server. For Linux, we used Samba 2.0.3
as the SMB file server and Apache 1.3.4 as the Web server. For Windows NT
Server we used its embedded SMB file server and Internet Information
Server 4.0 Web server.

Figure 1 summarizes the file server peak
throughput measured for each system in megabits per second (Mbits/Sec). It
also shows how many test systems were needed to reach peak performance.
The results show that, as a file-server, Windows NT Server 4.0 is 2.5
times faster than Linux with Samba. In addition, Windows NT Server reaches
its peak performance at 2.3 times the number of test systems that Linux
with Samba does.

Figure 2 shows the Web server peak performance
measured in HTTP GET requests per second and throughput measured in
megabytes per second (MB/Sec). The Web server results show that Windows NT
Server 4.0 is over 3.7 times faster than Linux with Apache. As discussed
in the Web Server Performance section below, the
performance of Linux with Apache dropped to 7% of the peak level when we
increased the number of test threads above 160. Thus, Linux/Apache
performance becomes unreliable under heavy load. Windows NT Server, on the
other hand, continues to increase its performance up through 288 test
threads. We believe that we did not reach the true peak performance of the
system under Windows NT Server 4.0 because we did not have more test
systems available.

Mindcraft tested file server performance using the
Ziff-Davis Benchmark Operation NetBench 5.01 benchmark. We used the Ziff-Davis
Benchmark Operation WebBench 2.0 benchmark to test Web server
performance. We tuned each operating system, file server, and Web server
according to available documentation and tuning parameters available in
published benchmarks. The Products Tested section gives the detailed
operating system tuning we used.

Although much has been written about the
performance and stability of Linux, Samba, and Apache, our tests show that
Windows NT Server 4.0 performs significantly faster and handles a much
larger load on enterprise-class servers.

Looking at NetBench Results

The NetBench 5.01 benchmark measures file server
performance. Its primary performance metric is throughput in bytes per
second. The NetBench documentation defines throughput as "The number of bytes a client transferred to and from
the server each second. NetBench measures throughput by dividing the
number of bytes moved by the amount of time it took to move them. NetBench
reports throughput as bytes per second." We report throughput in
megabits per second to make the charts easier to compare to other
published NetBench results.
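
The conversion applied in the charts can be sketched as follows; the decimal convention (1 Mbit = 1,000,000 bits) is our assumption, since NetBench itself reports raw bytes per second:

```python
def bytes_per_sec_to_mbits_per_sec(bytes_per_sec):
    # NetBench reports throughput in bytes per second; the charts in
    # this report use megabits per second. We assume the decimal
    # convention of 1 Mbit = 1,000,000 bits.
    return bytes_per_sec * 8 / 1_000_000

# Example: a server moving 35,837,500 bytes/sec corresponds to
# roughly 286.7 Mbits/sec, the Windows NT Server peak reported below.
```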

We tested file-sharing performance on Windows NT
Server 4.0 and Linux on the same system. We used Samba 2.0.3 to provide
SMB file sharing for Linux. Figure 3 shows the throughput we measured
plotted against the number of test systems that participated in each data
point.

Understanding how NetBench 5.01 works will help
explain the meaning of the NetBench throughput measurement. NetBench
stresses a file server by using a number of test systems to read and write
files on a server. A NetBench test suite is made up of a number of mixes.
A mix is a particular configuration of NetBench parameters, including the
number of test systems used to load the server. Typically, each mix
increases the load on a server by increasing the number of test systems
involved while keeping the rest of the parameters the same. We modified
the standard NetBench NBDM_60.TST test suite to increase the number
of test systems to 144 and the increment in test systems for each mix to
16 in order to test each product to its maximum performance level. The NetBench
Test Suite Configuration Parameters show you exactly how we configured
the test.

NetBench does a good
job of testing a file server under heavy load. To do this, each NetBench
test system (called a client in the NetBench documentation) executes a
script that specifies a file access pattern. As the number of test
systems is increased, the load on a server is increased. You need to be
careful, however, not to correlate the number of NetBench test systems
participating in a test mix with the number of simultaneous users that a
file server can support. This is because each NetBench test system
represents more of a load than a single user would generate. NetBench
was designed to behave this way in order to do benchmarking with as few
test systems as possible while still generating large enough loads on a
server to saturate it.

When comparing NetBench
results, be sure to look at the configurations of the test systems
because they have a significant effect on the measurements that NetBench
makes. For example, the test system operating system may cache some or
all of the workspace in its own RAM causing the NetBench test program
not to go over the network to the file server as frequently as expected.
This can significantly increase the reported throughput. In some cases,
weve seen reported results that are 75% above the available network
bandwidth. If the same test systems and network components are used to
test multiple servers with the same test suite configuration, you can
make a fair comparison of the servers.

With this background, let us analyze what the
results in Figure 3 mean (the supporting details for this chart are in NetBench
Configuration and Results). The three major areas to look at
are:

Peak
Performance

This tells you the maximum throughput you can
expect from a file server. NetBench throughput is primarily a function
of how quickly a file server responds to file operations from a given
number of test systems. So a more responsive file server will be able to
handle more operations per second, which will yield higher
throughput.

Shape of the
Performance Curve

How a product performs as a function of load is
perhaps the most meaningful information NetBench produces. If
performance drops off rapidly after the peak, users may experience
significant unpredictable and slow response times as the load on the
server increases. On the other hand, a product whose performance is flat
or degrades slowly after the peak can deliver more predictable
performance under load.

Where Peak
Performance Occurs

How quickly these products reach their peak
performance depends on the server hardware performance, the operating
system performance, and the test system performance. In this case, we
tested a fast server platform with significantly slower clients. This
test lab setup meant that small numbers of clients could not generate
enough requests to utilize the server processors fully. So the part of
the throughput performance curve to the left of the peak does not tell
us anything of interest. The performance curve after the peak shows how
a server behaves as it is overloaded.

File Server Performance Conclusions

Windows NT Server 4.0 is a high-performance file
server that helps users be more productive than a Linux/Samba file server
would. We base this conclusion on the following analysis:

The peak performance for Windows NT Server 4.0
was 286.7 Mbits/second at 112 test systems
while Linux/Samba reached a peak of 114.6
Mbits/second at 48 test systems. Thus, Windows NT Server reached a
peak performance level that was 2.5 times that of Linux/Samba. The test
results also show that Windows NT Server 4.0 is 43.5% faster than
Linux/Samba at 48 test systems. Only on a lightly loaded server, with 1
or 16 test systems, does Linux/Samba outperform Windows NT Server, and
even then by only 26%.

The shapes of the performance curves for both
Windows NT Server 4.0 and Linux/Samba indicate that we reached peak
performance and went beyond it. Performance for both Windows NT Server
4.0 and Linux/Samba degrades slowly as the load is increased past the
peak performance load. So both systems should deliver predictable
performance even under overload conditions.

The peak performance for Windows NT Server 4.0
occurs at 112 test systems while that for Linux/Samba occurs at 48 test
systems. This means that the Windows NT Server 4.0 can handle over 2.3
times the load of Linux/Samba while delivering significantly better
performance.
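
The ratios quoted in these conclusions follow directly from the reported peak figures; a quick check:

```python
# Peak figures as reported in this section.
nt_peak_mbits, nt_peak_clients = 286.7, 112        # Windows NT Server 4.0
samba_peak_mbits, samba_peak_clients = 114.6, 48   # Linux/Samba

# Peak-throughput ratio: the "2.5 times" claim.
throughput_ratio = nt_peak_mbits / samba_peak_mbits
# Ratio of test systems at which each peak occurs: the "2.3 times" claim.
load_ratio = nt_peak_clients / samba_peak_clients
```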

Looking at WebBench Results

In order to understand what the WebBench
measurements mean you need to know how WebBench 2.0 works. It stresses a
Web server by using a number of test systems (called clients in the
WebBench documentation) to request URLs. Each WebBench test system can be
configured to use multiple worker threads (threads for short) to make
simultaneous Web server requests. By using multiple threads per test
system, it is possible to generate a large enough load on a Web server to
stress it to its limit with a reasonable number of test systems. The other
factor that will determine how many test systems and how many threads per
test system are needed to saturate a server is the performance of each
test system.

The number of threads needed to obtain the peak
server performance depends on the speed of the test systems and the
server. Because of this, it is not meaningful to compare performance
curves generated using different test beds. However, it is meaningful to
compare the peak server performance measurements from different test beds,
as long as the true peak has been reached, because each server sees enough
requests from WebBench test systems to make it reach its maximum
performance level. In addition, it is meaningful to compare performance
curves for different servers based on the number of threads, not systems,
at each data point only if the same test bed is used. That is why our
graphs below show the number of test threads for each data point.

WebBench can generate a heavy load on a Web server.
To do this in a way that makes benchmarking economical, each WebBench
thread sends an HTTP request to the Web server being tested and waits
for the reply. When it comes, the thread immediately makes a new HTTP
request. This way of generating requests means that a few test systems
can simulate the load of hundreds of users. You need to be careful,
however, not to correlate the number of WebBench test systems or threads
with the number of simultaneous users that a Web server can support
since WebBench does not behave the way users
do.
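
The closed-loop behavior described above (send a request, wait for the reply, immediately send the next) can be sketched as a small driver. This is an illustrative model, not WebBench's actual implementation; the `send_request` callable is a stand-in for a real HTTP GET:

```python
import threading
import time

def closed_loop_worker(send_request, stop, counts, idx):
    # Each worker thread sends a request, waits for the reply, then
    # immediately sends the next request -- there is no think time.
    while not stop.is_set():
        send_request()
        counts[idx] += 1

def run_load(send_request, n_threads, duration):
    # Drive n_threads closed-loop workers for `duration` seconds and
    # return the aggregate request rate they achieved.
    stop = threading.Event()
    counts = [0] * n_threads
    workers = [
        threading.Thread(target=closed_loop_worker,
                         args=(send_request, stop, counts, i))
        for i in range(n_threads)
    ]
    for w in workers:
        w.start()
    time.sleep(duration)
    stop.set()
    for w in workers:
        w.join()
    return sum(counts) / duration  # requests per second
```

Because each worker re-requests the instant a reply arrives, a handful of client machines can keep a server saturated, which is why the load axis is measured in threads rather than simulated users.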

WebBench's throughput metric is the number of bytes per second that a
Web server sends to all test systems.

We tested both Web servers using the standard
WebBench zd_static_v20.tst
test suite, modified to increase the number of test systems to 144
and the increment in test systems for each mix to 16 in order to test each
product to its maximum performance level. This standard WebBench test
suite uses the HTTP 1.0 protocol without keepalives.

Figure 4 shows the total number of requests per
second for both Windows NT Server 4.0/IIS 4 and Linux/Apache 1.3.4. The
x-axis shows the total number of test threads used at each data point; a
higher number of threads indicates a larger load on the server. Figure 5
gives the corresponding throughput for each platform.

With this background, let us analyze
what the results in Figure 4 and Figure 5 mean (the supporting detail data
for these charts are in the WebBench Configuration and Results section).
As with NetBench, the three major areas to look at are:

Peak Performance

This tells you the maximum requests
per second that a Web server can handle and the peak throughput it can
generate. A more responsive Web server will be able to handle more
requests per second, which will yield higher throughput.

Shape of the Performance
Curve

The shape of the performance curve
shows how a Web server performs as a function of load. If performance
drops off rapidly after the peak, users may experience significant
unpredictable and slow response times as the load on the Web server
increases. On the other hand, a Web server that degrades performance
slowly after the peak will deliver more predictable performance under
load.

Where Peak Performance
Occurs

How quickly a Web server reaches its
peak performance depends on the performance of the server hardware, the
operating system, the Web server software, and the test systems. For
this report, we tested a fast server system with significantly slower
clients. This test bed setup meant that small numbers of clients could
not generate enough requests to utilize the server processors fully. So
the part of the performance curves to the left of the peak does not tell
us anything of interest. The performance curves after the peak show how
a server behaves as it is overloaded.

Web-Server Performance
Conclusions

Windows NT Server 4.0/IIS 4
significantly out-performs Linux/Apache 1.3.4 and provides much more
predictable and robust performance under heavy load. On a given large
workgroup or enterprise-class computer, Windows NT Server/IIS will satisfy
a much larger Web server workload than Linux/Apache will. We base these
conclusions on the following analysis:

The peak performance for Windows NT
Server 4.0/IIS 4 was 3,771 requests per
second at 288 threads while Linux/Apache 1.3.4 reached a peak
of 1,000 requests per second at 160
threads. Thus, Windows NT Server/IIS reached a peak performance
level that was almost 3.8 times that of Linux/Apache. Based on the
increasing performance for Windows NT Server/IIS from 256 to 288
threads, we believe that peak performance would have increased if we had
more test systems available to us.

The shapes of the requests per second
and throughput performance curves for Windows NT Server 4.0/IIS 4
indicate that we probably did not reach the maximum performance levels
possible with the Dell PowerEdge 6300 system. On the other hand, the
performance curves for Linux/Apache indicate that we did reach peak
performance and went beyond it. These results show very serious
performance degradation from 1,000 requests per second at 160 threads to
68 requests per second at 224 threads. Please see our comments in the
next section, Observations, for more information
about this.

The peak performance we measured for
Windows NT Server/IIS occurred at 288 threads while that for
Linux/Apache occurred at 160 threads. This means that the Windows NT
Server/IIS can handle over 1.8 times the load of Linux/Apache. In
addition, the test results show that Windows NT Server/IIS is 140%
faster than Linux/Apache at 160 threads, the peak for
Linux/Apache.
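
These figures can be cross-checked the same way; the inputs below are taken directly from the peaks and the collapse point reported above:

```python
# Peak figures as reported in this section.
nt_peak_rps, nt_peak_threads = 3771, 288          # Windows NT Server 4.0/IIS 4
apache_peak_rps, apache_peak_threads = 1000, 160  # Linux/Apache 1.3.4
apache_collapsed_rps = 68                         # Apache at 224 threads

peak_ratio = nt_peak_rps / apache_peak_rps              # the "almost 3.8 times" claim
thread_ratio = nt_peak_threads / apache_peak_threads    # the "1.8 times the load" claim
collapse_fraction = apache_collapsed_rps / apache_peak_rps  # the drop to ~7% of peak
```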

Observations

The comments in this section are based on
observations we made during the testing.

Linux Observations

The Linux 2.2.x kernel is not well supported
and is still changing rapidly. The following observations led us to this
conclusion:

We started the tests using Red Hat Linux 5.2
but had to upgrade it to the Linux 2.2.2 kernel because its Linux
2.0.36 kernel does not support hardware RAID controllers and SMP at
the same time. In addition, there are comments in the Red Hat Linux
5.2 source code noting that the SMP code is effectively Beta-level
code and should not be used at the same time as the RAID driver. For
this reason, we upgraded to the Linux 2.2.2 kernel, which has full
support for both hardware RAID controllers and SMP to be used
simultaneously. As of the date this report was written, Red Hat did
not ship or support a product based on the Linux 2.2.x kernel.

The instructions on how to update Red Hat
Linux 5.2 to the Linux 2.2.x kernel at the Red Hat Web site were
complete but required care from the user. It is quite possible to put
the system in a state where you must reload all software from scratch
since you need to recompile and reinstall the kernel.

We contacted Red Hat for technical support
after we saw that Linux was getting such poor performance. They told
us that they only provided installation support and that they did not
provide any support for the Linux 2.2.2 kernel.

We posted notices on various Linux and Apache
newsgroups and received no relevant responses. Also, we searched the
various Linux and Apache knowledge bases on the Web and found nothing
that we could use to improve the performance we were observing.

Linux kernels are available over the Internet
from www.kernel.org and its mirror sites. The
issue is that there are many updates to the kernel. For example, as of
the time of writing this report, we found the following kernel update
history:

Linux Kernel Version    Release Date
Linux 2.2.0             January 25, 1999
Linux 2.2.1             January 28, 1999
Linux 2.2.2             February 22, 1999
Linux 2.2.3             March 8, 1999
Linux 2.2.4             March 23, 1999
Linux 2.2.5             March 28, 1999

Linux performance tuning tips and tricks must
be learned from documentation on the Net, newsgroups, and
trial and error. Some tuning changes require you to recompile the kernel. We came
to this conclusion from the following observations:

The documentation on how to configure the
latest Linux kernel for the best performance is very difficult to
find.

We were unable to obtain help from various
Linux community newsgroups and from Red Hat.

We were unable to find any books or web sites
that addressed performance tuning in a clear and concise manner. At
best we found bits and pieces of information from dozens of sites.

Samba Observations

Samba was easy to set up for file sharing once
you spent a day or two learning how it fits with Linux. For people not
familiar with UNIX/Linux systems, it may take longer to do the
installation.

The documentation available with Samba and in
books is clear and easy to follow.

Apache Observations

Apaches performance on Red Hat Linux 5.2
upgraded to the Linux 2.2.2 kernel is unstable under heavy load. We came
to this conclusion from the following observation:

Performance collapses with a WebBench load
above 160 threads. We verified that the problem was with Apache, not
Linux, by restarting Apache at the 256 threads data point during a
WebBench test run. After the restart, Apache performance climbed back
to within 30% of its peak from a low of about 6% of the peak
performance.

We tried many configurations suggested in
Apache books and in comments in the Apache high performance
configuration file.

There were no error messages in the Web
server error log or operating system logs to indicate why Apache
performance collapsed.

Some parameters below (the number of test systems and the per-mix
increment) were modified from the standard NetBench 5.01 NBDM_60.TST
test, as described above.

NetBench Test Suite Configuration Parameters

Ramp Up: 30 seconds
    The amount of time at the beginning of a test mix during which
    NetBench ignores any file operations that occur.

Ramp Down: 30 seconds
    The amount of time at the end of a test mix during which NetBench
    ignores any file operations that occur.

Length: 660 seconds
    The total time for which NetBench will run a test. It includes both
    the Ramp Up and Ramp Down times.

Delay: 5 seconds
    How long a test system waits before starting a test after the
    controller tells it to start. Each test system picks a random number
    less than or equal to this value to stagger the start times of all
    test systems.

Think Time: 2 seconds
    How long each test system waits before performing the next piece of
    work.

Workspace: 20 MB
    The size of the data files used by a test system; each test system
    has its own workspace.

Save Workspace: Yes
    The last mix sets this parameter to No to clean up after the test
    is over.

Number of Mixes: 10
    Each mix tests the server with a different number of test systems.
    Mix 1 uses 1 system, Mix 2 uses 16 systems, and subsequent mixes
    increment the number of test systems by 16.

Number of Clients: 144
    The maximum number of test systems available to any test mix. The
    actual number that participates in a mix depends on the number
    specified in the mix definition and on whether an error removed a
    test system from that mix.

Some parameters below (the number of test systems and the per-mix
increment) were modified from the standard WebBench 2.0
zd_static_v20.tst test, as described above. This is a 100% static
workload that uses HTTP 1.0 without keepalives.

WebBench Test Suite Configuration Parameters

Ramp Up: 30 seconds
    The amount of time at the beginning of a test mix during which
    WebBench ignores any operations that occur.

Ramp Down: 30 seconds
    The amount of time at the end of a test mix during which WebBench
    ignores any operations that occur.

Length: 300 seconds
    The total time for which WebBench will run a test. It includes both
    the Ramp Up and Ramp Down times.

Delay: 0 seconds
    How long a test system waits before starting a test after the
    controller tells it to start. Each test system picks a random number
    less than or equal to this value to stagger the start times of all
    test systems.

Think Time: 0 seconds
    How long each test system waits before performing the next piece of
    work.

Number of Threads: 2
    The number of worker threads used on each test system to make
    requests to a Web server. The total number of threads in a mix is
    the number of threads times the number of clients in that mix.

Receive Buffer: 4096 bytes
    The size of the buffer WebBench uses to receive data sent from a
    Web server.

% HTTP 1.0 Requests: 100%
    The percentage of HTTP requests that are made according to the
    HTTP 1.0 protocol. WebBench does not support keepalives for
    HTTP 1.0.

Number of Mixes: 10
    Each mix tests the server with a different number of test systems.
    Mix 1 uses 1 system, Mix 2 uses 16 systems, and subsequent mixes
    increment the number of test systems by 16.

Number of Clients: 144
    The maximum number of test systems available to any test mix. The
    actual number that participates in a mix depends on the number
    specified in the mix definition and on whether an error removed a
    test system from that mix.
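
Given these parameters, the thread counts on the x-axes of the Web-server charts follow arithmetically; a sketch of the mix schedule (function names here are illustrative, not WebBench's own):

```python
def mix_clients(n_mixes=10, increment=16):
    # Mix 1 uses 1 test system; Mix 2 uses 16; later mixes add 16 each.
    return [1] + [i * increment for i in range(1, n_mixes)]

def mix_threads(threads_per_client=2, n_mixes=10, increment=16):
    # Total threads in a mix = worker threads per client x clients.
    return [c * threads_per_client for c in mix_clients(n_mixes, increment)]

# mix_clients() -> [1, 16, 32, ..., 144]
# mix_threads() -> [2, 32, 64, ..., 288]: 288 threads is the largest
# load tested, and 160 threads (80 clients) is where Linux/Apache peaked.
```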

Linux/Apache 1.3.4 on a Four-Processor Dell
PowerEdge 6300/400

The information in this publication is subject to
change without notice.

MINDCRAFT, INC. SHALL NOT BE LIABLE FOR ERRORS OR
OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES
RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL.

This publication does not constitute an
endorsement of the product or products that were tested. This test is not
a determination of product quality or correctness, nor does it ensure
compliance with any federal, state or local requirements.

The Mindcraft tests discussed herein were
performed without independent verification by Ziff-Davis and Ziff-Davis
makes no representations or warranties as to the results of the tests.