In this chapter, I will assume the following locations for the specified types of files:

Binaries and supporting files
/usr/local/apache

Public files
/var/www/htdocs (this directory is referred to throughout this book as the web server tree)

Private web server or application data
/var/www/data

Publicly accessible CGI scripts
/var/www/cgi-bin

Private binaries executed by the web server
/var/www/bin

Log files
/var/www/logs

Installation locations are a matter of taste. You can adopt any layout you like as long as you use it consistently. Special care must be taken when deciding where to store the log files since they can grow over time. Make sure they reside on a partition with enough space and where they won’t jeopardize the system by filling up the root partition.

Different circumstances dictate different directory layouts. The layout used here is suitable when only one web site is running on the web server. In most cases, you will have many sites per server, in which case you should create a separate set of directories for each. For example, you might create the following directories for one of those sites:
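A per-site directory set along these lines could be created as sketched below. The site name www.example.com and the BASE variable are illustrative assumptions, not names taken from this chapter; the subdirectory names simply mirror the single-site layout above. BASE defaults to a scratch directory so the sketch can be tried without root privileges.

```shell
#!/bin/sh
# Hypothetical per-site layout; SITE and BASE are illustrative assumptions.
SITE="www.example.com"
BASE="${BASE:-$(mktemp -d)}"   # set BASE=/var/www (as root) for a real layout

# One subdirectory per file type, mirroring the single-site layout above.
for dir in bin cgi-bin data htdocs logs; do
    mkdir -p "$BASE/$SITE/$dir"
done

# Show the directories that were created.
ls "$BASE/$SITE"
```

Repeating the loop with a different SITE value gives each site its own isolated set of directories.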

Before the installation can take place Apache must be made aware of its environment. This is done through the configure script:

$ ./configure --prefix=/usr/local/apache

The configure script explores your operating system and creates the Makefile for it. You can then execute the following commands to start the actual compilation process, copy the files into the directory set by the --prefix option, and execute the apachectl script to start the Apache server:

$ make
# make install
# /usr/local/apache/bin/apachectl start

Though this will install and start Apache, you also need to configure your operating system to start Apache when it boots. The procedure differs from system to system on Unix platforms but is usually done by creating a symbolic link to the apachectl script for the relevant runlevel (servers typically use runlevel 3):

# cd /etc/rc3.d
# ln -s /usr/local/apache/bin/apachectl S85httpd

On Windows, Apache is configured to start automatically when you install from a binary distribution, but you can do it from a command line by calling Apache with the -k install command switch.

Testing the installation

To verify the startup has succeeded, try to access the web server using a browser as a client. If it works you will see the famous “Seeing this instead of the website you expected?” page, as shown in Figure 2-1. At the time of this writing, there are talks on the Apache developers’ list to reduce the welcome message to avoid confusing users (not administrators but those who stumble on active but unused Apache installations that are publicly available on the Internet).

Figure 2-1. Apache post-installation welcome page

As a bonus, toward the end of the page, you will find a link to the Apache reference manual. If you are near a computer while reading this book, you can use this copy of the manual to learn configuration directive specifics.

Using the ps tool, you can find out how many Apache processes there are:
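A minimal sketch of such a check follows. It assumes the server binary is named httpd, as produced by the source build above; count_httpd is a hypothetical helper introduced here for illustration, and the ps options used are the POSIX ones.

```shell
#!/bin/sh
# count_httpd: hypothetical helper that reads `ps` output on stdin and
# counts the rows whose command name (last column) is httpd. The path is
# stripped first because some systems print the full path to the binary.
count_httpd() {
    awk '{ n = split($NF, parts, "/"); if (parts[n] == "httpd") count++ }
         END { print count + 0 }'
}

# List every process with its owner, PID, and parent PID, then count.
ps -e -o user,pid,ppid,comm | count_httpd
```

With the process-based model typically used on Unix, the count would normally cover one parent process plus its children.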

Using tail, you can see what gets logged when different requests are processed. Enter a nonexistent filename in the browser location bar and send the request to the web server; then examine the access log (logs are in the /var/www/logs folder). The example below shows successful retrieval (as indicated by the 200 return status code) of a file that exists, followed by an unsuccessful attempt (404 return status code) to retrieve a file that does not exist:
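To get a quick overview of return status codes, something like the following sketch can help. It assumes the access log uses the Common Log Format and is named access_log (the file name is an assumption; only the /var/www/logs folder is given here). status_counts is a hypothetical helper, not an Apache tool.

```shell
#!/bin/sh
# status_counts: hypothetical helper that tallies HTTP status codes from
# access-log lines on stdin. Assumes the Common Log Format with a
# three-part request line, which puts the status code in the 9th
# whitespace-separated field.
status_counts() {
    awk '{ seen[$9]++ } END { for (code in seen) print code, seen[code] }' | sort
}

# Assumed log file name; adjust it to match your logging configuration.
LOG="${LOG:-/var/www/logs/access_log}"
if [ -r "$LOG" ]; then
    # Summarize the most recent 100 requests.
    tail -n 100 "$LOG" | status_counts
fi
```

To watch requests as they arrive instead, run tail -f against the same file.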

The idea is to become familiar with how Apache works. As you learn what constitutes normal behavior, you will learn how to spot unusual events.

Selecting modules to install

The theory behind module selection says that the smaller the number of modules running, the smaller the chances of a vulnerability being present in the server. Still, I do not think you will achieve much by being too strict with default Apache modules. The likelihood of a vulnerability being present in the code rises with the complexity of the module. Chances are that the really complex modules, such as mod_ssl (and the OpenSSL libraries behind it), are the dangerous ones.

Your strategy should be to identify the modules you need to have as part of an installation and not to include anything extra. Spend some time researching the modules distributed with Apache so you can correctly identify which modules are needed and which can be safely turned off. The complete module reference is available at http://httpd.apache.org/docs-2.0/mod/.

The following modules are more dangerous than the others, so you should consider whether your installation needs them:

mod_userdir

Allows each user to have her own web site area under the ~username alias. This module could be used to discover valid account usernames on the server because Apache responds differently when the attempted username does not exist (returning status 404) and when it does not have a special web area defined (returning 403).

mod_info

Exposes web server configuration as a web page.

mod_status

Provides real-time information about Apache, also as a web page.

mod_include

Provides simple scripting capabilities known under the name server-side includes (SSI). It is very powerful but often not used.

On the other hand, you should include these modules in your installation:

mod_rewrite

Allows incoming requests to be rewritten into something else. Known as the “Swiss Army Knife” of modules, it provides functionality you will need.

mod_headers

Allows request and response headers to be manipulated.

mod_setenvif

Allows environment variables to be set conditionally based on the request information. Many other modules’ conditional configuration options are based on environment variable tests.

In the configure example, I assumed acceptance of the default module list. In real situations, this should rarely happen as you will want to customize the module list to your needs. To obtain the list of modules activated by default in Apache 1, you can ask the configure script. I provide only a fragment of the output below, as the complete output is too long to reproduce in a book:

$ ./configure --help
...
[access=yes     actions=yes     alias=yes      ]
[asis=yes       auth_anon=no    auth_dbm=no    ]
[auth_db=no     auth_digest=no  auth=yes       ]
[autoindex=yes  cern_meta=no    cgi=yes        ]
[digest=no      dir=yes         env=yes        ]
[example=no     expires=no      headers=no     ]
[imap=yes       include=yes     info=no        ]
[log_agent=no   log_config=yes  log_forensic=no]
[log_referer=no mime_magic=no   mime=yes       ]
[mmap_static=no negotiation=yes proxy=no       ]
[rewrite=no     setenvif=yes    so=no          ]
[speling=no     status=yes      unique_id=no   ]
[userdir=yes    usertrack=no    vhost_alias=no ]
...

As an example of interpreting the output, userdir=yes means that the module mod_userdir will be activated by default. Use the --enable-module and --disable-module options of the configure script to adjust the list of modules to be activated:
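As a sketch of how these options combine, the script below builds a configure command line from two module lists. The module selections are illustrative only (they draw on the modules discussed above), and the ENABLE, DISABLE, and ARGS variable names are assumptions made for this example.

```shell
#!/bin/sh
# Build an Apache 1 configure command line from two module lists.
# The selections below are illustrative, not a recommendation.
ENABLE="rewrite headers"   # shown as "no" in the default listing
DISABLE="userdir"          # shown as "yes" in the default listing

ARGS="--prefix=/usr/local/apache"
for m in $ENABLE;  do ARGS="$ARGS --enable-module=$m";  done
for m in $DISABLE; do ARGS="$ARGS --disable-module=$m"; done

# Print the command for review instead of running it.
echo "./configure $ARGS"
```

Keeping the two lists in one place makes it easier to review, and later repeat, the exact module set a server was built with.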

Obtaining a list of modules activated by default in Apache 2 is more difficult. I obtained the following list by compiling Apache 2.0.49 without passing any parameters to the configure script and then asking the httpd binary to produce a list of modules:
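The httpd binary lists its compiled-in modules when called with the -l switch. The sketch below wraps that call; list_modules is a hypothetical helper, and the binary path is an assumption that should be adjusted to wherever your build installed httpd.

```shell
#!/bin/sh
# list_modules: hypothetical wrapper that asks an httpd binary for its
# compiled-in modules using the -l switch.
list_modules() {
    httpd_bin="$1"
    if [ ! -x "$httpd_bin" ]; then
        echo "httpd binary not found: $httpd_bin" >&2
        return 1
    fi
    "$httpd_bin" -l
}

# Assumed path; change it to match your installation prefix.
list_modules /usr/local/apache/bin/httpd || true
```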

Now that you know your installation works, make it more secure. Being brave, we start with an empty configuration file, and work our way up to a fully functional configuration. Starting with an empty configuration file is a good practice since it increases your understanding of how Apache works. Furthermore, the default configuration file is large, containing the directives for everything, including the modules you will never use. It is best to keep the configuration files nice, short, and tidy.

Start the configuration file (/usr/local/apache/conf/httpd.conf) with a few general-purpose directives:

# location of the web server files
ServerRoot /usr/local/apache
# location of the web server tree
DocumentRoot /var/www/htdocs
# path to the process ID (PID) file, which
# stores the PID of the main Apache process
PidFile /var/www/logs/httpd.pid
# which port to listen at
Listen 80
# do not resolve client IP addresses to names
HostnameLookups Off

Setting Up the Server User Account

Upon installation, Apache runs as the user nobody. While this is convenient (this account normally exists on all Unix operating systems), it is a good idea to create a separate account for each different task. The reasoning is that attackers who break into the server through the web server gain the privileges of the account the web server runs under. By having a separate account for the web server, we ensure the attackers do not get anything else for free.

The most commonly used username for this account is httpd, and some people use apache. We will use the former. Your operating system may come pre-configured with an account for this purpose. If you like the name, use it; otherwise, delete it from the system (e.g., using the userdel tool) to avoid confusion later. To create a new account, execute the following two commands while running as root.
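The two commands are sketched below, assuming the groupadd and useradd tools found on most Linux systems (other Unix systems may provide different tools). The helper account_commands is hypothetical; it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# account_commands: hypothetical helper that prints the two commands for
# review; run them as root once they look right for your system.
account_commands() {
    # Create the group first so the user can be assigned to it.
    echo "groupadd httpd"
    # Primary group httpd, home /dev/null, shell /sbin/nologin.
    echo "useradd -g httpd -d /dev/null -s /sbin/nologin httpd"
}

account_commands
```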

These commands create a group and a user account, assigning the account the home directory /dev/null and the shell /sbin/nologin (effectively disabling login for the account). Add the following two lines to the Apache configuration file httpd.conf:
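The two lines in question are presumably the User and Group directives naming the new account. The sketch below appends them to a configuration file; the CONF variable is an assumption for this example and defaults to a scratch file so the sketch can be tried safely.

```shell
#!/bin/sh
# Append the account directives to a configuration file. CONF defaults to
# a scratch file; point it at /usr/local/apache/conf/httpd.conf to apply
# the change to the real configuration.
CONF="${CONF:-$(mktemp)}"

cat >> "$CONF" <<'EOF'
# run the server as the dedicated account
User httpd
Group httpd
EOF

# Show the directives that are now in the file.
grep -E '^(User|Group) ' "$CONF"
```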