Building Apache the Way You Want It

Have you ever wanted to customize Apache? This article will help you get started on building a customized version of Apache to suit your own needs. It is taken from chapter three of the book Pro Apache third edition, written by Peter Wainwright (Apress, 2004; ISBN: 1590593006).

Apache comes in ready-to-install binary packages, which for many applications are perfectly adequate. Apache also comes with source code, making it possible to build the entire server from scratch with a suitable compiler.

However, even if you decide to start off by installing a binary distribution, all binary Apache distributions come with a complete copy of the source code included, so you can easily build a customized version of Apache to suit your particular needs.

In this chapter, you’ll look at the following topics:

Verifying the integrity of the Apache source archive

Building Apache from source

Customizing Apache’s default settings

Determining which modules are included

Building Apache as a dynamic server

Advanced building options

Building and installing modules with the apxs utility

Why Build Apache Yourself?

Given the additional work involved, it might seem redundant to go to the trouble of building Apache from source when perfectly good binary distributions are already available for almost any platform you might care to choose. However, there are advantages to building Apache yourself:

Changing the default settings to something more appropriate: Apache has built-in defaults for all configuration settings, including the server root, the location of the configuration file, and the document root. The only notable exception is the network port Apache listens on, which Apache 2 requires the configuration to define. By setting these at build time, you don’t have to specify them in the configuration, and you can ensure that Apache’s default settings are safe; that is, they don’t point anywhere you don’t want them to point.

The Apache installation process will also substitute your new settings into all scripts and configuration files automatically, so all the parts of the installation are internally self-consistent.

Optimizing the server for your platform: By compiling Apache on the platform where it’s to be installed, you can take advantage of the capabilities offered by the operating system or hardware that a prebuilt binary can’t take advantage of. For example, although any x86 processor will run a supplied Apache binary, you can build an optimized binary that takes full advantage of the newer processor features by using a compiler that’s aware of them. There’s no point in retaining a binary that’s built to work on a 386 processor when you can rebuild it to take advantage of a Pentium 4.

In addition, some platforms have facilities such as advanced memory mapping and shared memory capabilities that Apache can use to improve its performance if the build process detects them. Memory mapping allows data on disk to be presented as memory, making both read and write access to it much faster. Shared memory allows Apache to efficiently share information between different processes and is useful for many things, including in-memory caches of various information such as Apache’s scoreboard table of running processes.

The prebuilt Apache binaries can’t make assumptions about the availability of these features, so they have to take the lowest common denominator to maximize compatibility. The great thing about building from source is that Apache works out all of this itself, so all you have to do is start it off and sit back.

Choosing which modules to include: Apache is a modular server, with different parts of the server’s functionality provided by different sections, called modules. You can choose which of these modules you want and build a server precisely customized to your requirements with all extraneous features removed.

Apache can also build its modules statically into the server or load them dynamically when it’s run. For both static and dynamic servers, you can tell Apache exactly which modules you want to include and which you want to exclude. This allows you to add modules not normally included into the Apache binary, including third-party modules, or to remove modules that you don’t need, to make the server both faster and less prone to inadvertent misconfiguration.

Making changes to the source and applying patches: If no other module or feature supplies this ability, having the source code allows you to add to or modify Apache yourself. Some additional features for Apache don’t come as modules but as patches to the source code that must be applied before Apache is built. For example, adding SSL to older Apache distributions required this. Also, new patches for implementing or altering features that aren’t yet officially part of Apache appear all the time.

Assuming you’ve decided to build Apache, you’ll need a copy of the source, of course. Source code for both Apache 1.3 and Apache 2 is available from http://httpd.apache.org; use the Download link to retrieve a copy of either version from the closest mirror site. Depending on your needs, you can choose either the more venerable but well-tried Apache 1.3 or the newer and less-honed (but much more powerful) Apache 2. Apache 1.3 distributions have a name with the following form:

The configuration process is actually similar across the versions, even though the underlying setup has been fundamentally overhauled for Apache 2 (the devil is, of course, in the details). However, before you start the configuration process you should verify that your source is actually correct.

Verifying the Apache Source Archive

Before you proceed to customize and build your Apache source distribution, you should take the precaution to verify that what you have is what the Apache Software Foundation actually released. Although the chances may be slim, it’s possible your distribution has been modified or tampered with. Clearly this is less than ideal, so to make sure you have a correct and unmodified copy of the code, you can make use of two verification tools: md5sum and pgp/gpg.

Verifying with md5sum

md5sum (sometimes just md5) computes an MD5 checksum for a block of data—in this case, the source distribution—and writes it out as a 32-character string. The checksum algorithm is one-way, meaning you can’t derive any information about the original file from it, and although theoretically it’s possible for two different files to generate the same checksum, it’s very improbable. You can use md5sum to check that your file generates the same checksum as the official distribution:

If you retrieve the checksum from http://httpd.apache.org (not a mirror) by clicking the MD5 link next to the distribution you downloaded, you see that this is the correct value and that therefore your archive is also correct.
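The comparison can also be scripted rather than done by eye. The following is a minimal sketch, assuming the GNU md5sum tool is on the PATH; the function name verify_md5 is hypothetical, and the expected value must be the checksum you copied from http://httpd.apache.org yourself:

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED
# Succeeds only if the MD5 checksum of FILE equals EXPECTED.
# (verify_md5 is an illustrative helper name, not part of Apache.)
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<checksum>  <filename>"; keep only the first field.
    actual=$(md5sum "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file has checksum $actual" >&2
        return 1
    fi
}

# Usage: verify_md5 httpd-2.0.47.tar.gz <checksum published on httpd.apache.org>
```

On systems that ship md5 rather than md5sum, the output format differs, so the awk field selection would need adjusting.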

Verifying with PGP/GPG

Verifying the MD5 checksum is usually enough for most purposes, but to be absolutely safe, you can check not just that the archive is correct but also that it really was distributed by someone responsible. To do this, you can verify the signature of each archive using either PGP or the equivalent GPG. For more information on how these tools work and how to use them, see http://www.openpgp.org/ and http://www.gnupg.org/.

Verifying the signature of a file allows you to detect the slim but theoretically possible case where the Apache Software Foundation’s own Web site, or one of its mirrors, has been compromised or that someone has modified the information on an intermediate proxy so that the pages you’re accessing look like but actually aren’t the originals.

Assuming that you’re using GPG, you first download the public keys for the Apache developers. These are in a file called KEYS, available at http://www.apache.org/dist/httpd/KEYS. You then import these keys into GPG with this:

$ gpg --import KEYS

Having done this, you download the signature file for the archive you want to verify. This has the same name as the archive file followed by an .asc extension and can be retrieved from http://httpd.apache.org by clicking the PGP link next to the distribution you downloaded. Save the signature file into the same place you have the archive and then verify that the two match with this:

$ gpg httpd-2.0.47.tar.gz.asc

If the signature agrees with the file, you get a number of messages ending with this (in the case of the previous example):

If GPG instead reports a bad signature, you shouldn’t proceed further with this archive. Better, you should notify the foundation that you’ve found a possibly compromised archive!

Assuming you got a good signature, you’re most of the way there. However, GPG doesn’t know who the signer was. You know the signature is good, but you don’t know that the person who created the signature is trusted. To do that, you need either to establish that the public key of the signer really does belong to him or to verify the public key of any other individual who has signed it with their private key.

You can take the first step by checking the fingerprint of the signer’s public key from the ID given on the first line and then importing the signatures of that key. You can find the fingerprint with this:

$ gpg --fingerprint DE885DD3

The bottom line of the output from this command should match the Primary Key fingerprint line you got from verifying the signature file. Now you can import the signatures for this key:

$ gpg --keyserver pgpkeys.mit.edu --recv-key DE885DD3

Now all you need to do is verify that any of those signatures are actually good; traditionally, the best way to do this is to meet someone face to face. If you can’t do that, you can import more signatures for each of the public keys until you get one for someone you can actually meet. Having done this, you can edit the verified key to tell GPG you trust it (and to what degree you trust it), which in turn will influence each key signed by it. This is called the Web of Trust, and you can find more information on how it works and how to enter it in the GNU Privacy Handbook at http://www.gnupg.org/gph/en/manual.html#AEN335.
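Returning to the fingerprint check described earlier: GPG prints fingerprints in space-separated groups, so comparing two of them by eye is error-prone. The following is a small sketch of a helper that compares two fingerprint strings pasted from GPG’s output; the name same_fingerprint is hypothetical and nothing here is part of GPG itself:

```shell
#!/bin/sh
# same_fingerprint A B
# Succeeds if the two fingerprints are identical once spaces are removed
# and letters are folded to upper case. Empty input never matches.
# (same_fingerprint is an illustrative helper name.)
same_fingerprint() {
    a=$(printf '%s' "$1" | tr -d ' ' | tr 'a-f' 'A-F')
    b=$(printf '%s' "$2" | tr -d ' ' | tr 'a-f' 'A-F')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage: paste the fingerprint shown by the signature check and the one
# shown by the --fingerprint command as the two quoted arguments:
#   same_fingerprint "<first fingerprint>" "<second fingerprint>" && echo match
```

This only confirms that the two strings agree; it says nothing about whether the key itself is trustworthy, which is what the Web of Trust is for.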

Building Apache from Source

Building Apache from the source is a relatively painless process. However, you need the following:

An ANSI C compiler: You can use the freely available gcc as an alternative if a native compiler doesn’t compile Apache properly. On most Unix platforms, gcc is the preferred choice; most Linux, BSD, and MacOS X servers have it installed as standard and aliased to the normal C compiler command cc. You can find it at http://www.gnu.org/software/gcc/gcc.html as an installable package for innumerable platforms.

Dynamic linking support: For Apache to be built as a dynamic server loading modules at runtime, the platform needs to support it. Some operating systems may need patches for dynamic support to work correctly. Otherwise, Apache must be built statically.

A Perl interpreter: Some of Apache’s support scripts, including apxs and dbmmanage, are written in Perl. For these scripts to work, you need a Perl interpreter. If Perl isn’t already installed, you can find binary downloads at http://www.cpan.org/ports/ for many platforms. The source code is also available for those wanting to build it themselves. Note that mod_perl isn’t required; these are external stand-alone scripts. (For administrators who want to use mod_perl, it’s worth updating Perl to the latest stable release, particularly if you want to use threads in mod_perl 2.)

Configuring and Building Apache

To build Apache from source and install it with the default settings requires only three commands: one to configure it, one to build it, and one to install it. Both Apache 1.3 and Apache 2 provide the configuration script configure to set up the source for compilation on the server and customize it with command line options to determine the structure and makeup of the eventual installation. You can control almost every aspect of Apache at this time if you want, including experimental code and optional platform-specific features.

The following command scans the server to find out what capabilities the operating system has (or lacks), determines the best way to build Apache, and sets up the source tree to compile based on this information:

$ ./configure

A large part of what the configuration process does is to examine the features and capabilities of the operating system. This information is derived through a series of tests and checks and can take quite some time.

The remainder of the configuration process is concerned with taking your customizations, supplied as command line options, and reconfiguring the source code accordingly. This allows you to enable or disable features, include additional components, or override default server locations and limits. For a list of all available options, you can instead use this:

$ ./configure --help

Once the Apache source is configured, you need to build and install it. The following commands build Apache and then install it in the default location of /usr/local/apache (apache on Windows, /apache on NetWare, and /os2httpd on OS/2):

$ make
$ make install

For minimalists, you can also combine all three commands:

$ ./configure && make && make install

More usefully, you can change Apache’s installation root to something else, which will change both where the installation process puts it and the internal settings built into Apache. Apache then defaults to looking for the server root and its configuration files in that location. If you change nothing else, this is the one thing you might want to override because all of Apache’s default directory and file locations are based on this value. You can have Apache install into a different location by supplying a parameter to the configure command. For example:

$ ./configure --prefix=/opt/apache2

This command would cause make install to install Apache outside the /usr/local/ directory tree. On a Unix server, you’ll certainly need root permissions for the actual installation to proceed without a permissions error. However, you can still carry out the configure and build process up to the point of installation as an unprivileged user. For Apache 2, you can install Apache into a different directory than the one that it was configured with:

[2.0] $ make install DESTDIR=/opt/apache2

If you don’t have root privileges, and no friendly administrator is available to help you, you can instead use the --prefix parameter to have Apache installed under your own directory, for example:

Here you specify an installation directory where you do have write privileges. You also specify a port number above 1023; on a Unix server, ports up to this number are considered privileged and can’t be used by nonprivileged processes. A nonstandard installation root and port number are also handy for installing test versions of new releases of the server without disturbing the current server. You could also use the DESTDIR argument to make install.
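As a sketch of such an unprivileged setup, the helper below assembles the two options and refuses a privileged port before configure is ever run. The function name, the directory under $HOME, and port 8080 are example values only; --prefix and --with-port are the options discussed in this chapter:

```shell
#!/bin/sh
# unprivileged_options PREFIX PORT
# Prints configure options for a build installed by an ordinary user.
# Rejects ports up to 1023, which only root may bind on Unix.
# (unprivileged_options is an illustrative helper name.)
unprivileged_options() {
    prefix=$1
    port=$2
    case $port in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    if [ "$port" -le 1023 ]; then
        echo "port $port is privileged; choose one above 1023" >&2
        return 1
    fi
    echo "--prefix=$prefix --with-port=$port"
}

# Example values only; run from the top of the source tree:
#   ./configure $(unprivileged_options "$HOME/apache2" 8080) && make && make install
```

The unquoted command substitution relies on word splitting, so this sketch assumes the installation path contains no spaces.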

In general, almost any aspect of Apache can be configured by specifying one or more parameters to the configure command, as you shall see throughout the rest of the chapter.

Apache 2 vs. Apache 1.3 Configuration Process

One of the many significant changes between Apache 1.3 and Apache 2 is the configuration process. Although on the surface the configure script behaves much the same as it used to, there are several differences, too, including many options that have changed name or altered slightly in behavior.

Apache 2 makes use of autoconf, a general-purpose tool for deriving and creating configuration scripts. The autoconf application creates scripts that are similar in spirit to the way that the configure script of Apache 1.3 worked, but they operate on a more generic and cross-platform basis. They’re also easier to maintain and extend; given Apache’s extensible nature, this is a critical requirement. The older Apache 1.3 script mimics a lot of the behavior of autoconf-generated scripts, which is why the two configure scripts have many similarities. However, the resemblance is skin-deep only.

autoconf implements the --enable and --disable options that switch on and off different packages within a source tree and the --with and --without options to configure features both within packages and external to the source tree:

--enable: In Apache, packages translate as modules. As a result, configuration options now exist to enable and disable each module within Apache. For instance, --enable-rewrite and --enable-rewrite=shared can enable mod_rewrite statically or dynamically. Modules may now be specified in a list rather than individually with options such as --enable-modules="auth_dbm rewrite vhost_alias", which wasn’t possible in Apache 1.3. The --with-layout option has changed to an --enable option, --enable-layout.

--with and --without: Features within modules and outside the source itself are enabled or disabled with --with and --without options. This covers a range of Apache features such as the server user and group, which change from --server-uid and --server-gid to --with-server-uid and --with-server-gid. This includes the suExec options, apart from --enable-suexec itself; for example, --suexec-caller becomes --with-suexec-caller. Modules that rely on external features now enable them with options such as --with-dbm, --with-expat, --with-ldap, and --with-ssl.

Exceptions are mostly restricted to options that control the build type and base locations. As a result, many options have been renamed to fit with this scheme, and a few have also changed in how they work—for the most part, they become more flexible in the process.
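The renames listed above can be summarized as a small translation helper. This is only a sketch: translate_option is a hypothetical name, it knows just the renames mentioned in this section, and it passes every other option through unchanged, which does not mean that option is valid in Apache 2:

```shell
#!/bin/sh
# translate_option OPTION
# Prints the Apache 2 spelling of an Apache 1.3 configure option, using only
# the renames described in the text; anything else is printed unchanged.
# (translate_option is an illustrative helper name.)
translate_option() {
    case $1 in
        --with-layout=*) echo "--enable-layout=${1#--with-layout=}" ;;
        --server-uid=*)  echo "--with-server-uid=${1#--server-uid=}" ;;
        --server-gid=*)  echo "--with-server-gid=${1#--server-gid=}" ;;
        --suexec-*)      echo "--with-${1#--}" ;;   # --enable-suexec does not match
        *)               echo "$1" ;;
    esac
}

# Usage: translate each option of an old build command in turn, e.g.
#   translate_option --server-uid=httpd
```

Checking the result against the output of ./configure --help remains the safer course, because option behavior as well as spelling has changed in places.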

To make it easier to see how options differ between the old and new configuration styles, I’ll present the various ways in which you can configure Apache’s build process and give examples for both Apache 1.3 and Apache 2. As well as making it easier for those wanting to migrate to Apache 2, it’s also friendlier to administrators who want to stick with Apache 1.3 for now but have an eye to the future.

Also, Apache developers can retrieve the current development code base using CVS. Three modules are needed in total: httpd-2.0 (the server itself), apr (the Apache Portable Runtime), and apr-util (APR support utilities). The following series of commands will retrieve the complete Apache 2 source tree and place it under /home/admin/apache2:

The next step is to configure the configuration process itself. For this to work, you must have current versions of autoconf and libtool installed. Both are projects of the Free Software Foundation and are commonly available as a package for most platforms; see the autoconf home page at http://www.gnu.org/software/autoconf/ for detailed information. Assuming you do have autoconf and libtool, you now execute this:

$ cd httpd-2.0
$ ./buildconf

This should generate the configure script, which you can then use to actually configure the source code for building.

Curious administrators who don’t need to live quite so close to the cutting edge can also retrieve daily snapshots of the CVS source tree from http://cvs.apache.org/snapshots/. As with the CVS repository, you need the snapshots for httpd-2.0, apr, and apr-util.

NOTE Retrieving source via CVS is strongly discouraged for anyone who doesn’t want to actually assist with Apache development: The active source tree doesn’t pretend to even be a development release, never mind a stable one. For those who do want to help out, you can find more information at http://www.apache.org/dev/.

General Options

Before plunging into detail about various configuration parameters, it’s worth pointing out a few options that adjust the overall configuration process or provide information about the process and about configure itself. Some of these options are unique to Apache 1.3, and others are new in Apache 2 (see Table 3-1).

Table 3-1. Apache 2 configure Script Options

| Option | Description | Compatibility |
|---|---|---|
| --help | Prints out a complete list of the configuration parameters and their allowed parameters and permutations, along with their default settings. Because new options appear from time to time, it’s well worth printing out a copy of this output for reference, or even saving it: $ ./configure --help > configure.help. You can find some additional options for Apache 2 by running the configure scripts in the subdirectories of srclib: apr, apr-util, and pcre. | |
| --cache-file=FILE | Specifies an alternative name for config.cache, which stores the results of the operating system analysis performed by configure and that’s used by config.status mentioned previously. | Apache 2 only |
| --quiet, --silent | Suppresses most of configure’s output. Mostly useful for driving configure from another application such as a GUI configuration tool and for administrators who just want to know when it’s all over. | |
| --no-create | Goes through the configuration process and produces the usual messages on the screen but doesn’t actually create or modify any files. | Apache 2 only |
| --show-layout | Displays the complete list of locations that Apache will use to install and subsequently look for its various components. This includes the server root, document root, the names and locations of the configuration file and error log, where loadable modules are kept, and so on. It’s useful for checking that directives to change Apache’s locations are working as expected. | Apache 1.3 only |
| --srcdir | The location of the source code, in the event you’re using a configure script located outside of the distribution. | Apache 2 only |
| --verbose | Produces extra long output from the configuration process. | Apache 1.3 only |
| --version | Displays the version number of autoconf that was used to create the configure script itself. | Apache 2 only |

Setting the Default Server Port, User, and Group

A number of the configuration directives that you have to set in httpd.conf can be preset using a corresponding option at build time. In particular, you can set the user and group under which the server runs and the port number to which it’ll listen. What makes these different from other values that might seem to be just as important—for example, the document root—is that, for varying reasons, it may be particularly useful or even necessary to specify them in advance.

The port is a required setting of Apache 2; the server won’t start without it being explicitly configured. Specifying it at build time will cause Apache to add a corresponding line into httpd.conf. Without it, you’ll need to edit httpd.conf, so if you want to create an immediately usable Apache, you must set it here, too. You can change it with the --with-port option, for example:

$ ./configure --with-port=80 …

In Apache 1.3 the default port number in httpd.conf is set to 80 automatically, so this option needn’t be specified for a server running on the standard HTTP port. Of course, editing httpd.conf to add or change it isn’t a great burden, either.

The user and group affect the suExec wrapper (covered shortly), so you must set them at build time. In Apache 1.3 you use the --server-uid and --server-gid options to do this. The corresponding Apache 2 options are --with-server-uid and --with-server-gid. For example, to set the user to httpd and the group to the httpd group, you’d pass --server-uid=httpd --server-gid=httpd to configure in Apache 1.3 and --with-server-uid=httpd --with-server-gid=httpd in Apache 2.

You can set a number of other directives in httpd.conf that relate to the locations of things such as the document root and the default CGI directory by overriding parts of Apache’s default layout. I cover this in detail in the “Advanced Configuration” section a little later in the chapter.

Determining Which Modules to Include

Other than optimizing Apache for the platform it’s to run on, the most useful aspect of building Apache is to control which modules are included or excluded. If you want to build a dynamic server, it’s often simpler to build everything dynamically and then subsequently weed out the modules you don’t want from the server configuration. However, choosing modules at build time is essential for static servers because you can’t subsequently change your mind. It can also be useful on platforms where some modules simply won’t build and you need to suppress them to avoid a fatal error during compilation.

Apache will build a default subset of the modules available unless you tell it otherwise; you can do this explicitly by naming additional modules individually, or you can ask for bigger subsets. You can also remove modules from the list, which allows you to specify that you want most or all of the modules built, and then make exceptions for the modules you actually don’t want.

You specify module preferences using the enable and disable options. The syntax of these is one of the areas where the Apache 2 and Apache 1.3 configure scripts differ quite significantly. In fact, it’s likely to be one of the bigger problem areas for administrators looking to migrate. However, although the Apache 2 syntax is more flexible, it still offers essentially the same functionality. To simplify the task for administrators who are looking to migrate, I’ll present the various ways you can control the module list with examples for both versions of Apache.

I’ll also tackle building Apache as a fully static server first before going on to building it as a fully or partly dynamic server.

Enabling or Disabling Individual Modules

Apache 1.3 provides the generic --enable-module option, which takes a single module name as a parameter. To enable mod_auth_dbm and add it to the list of modules that will be built, use this:

[1.3] $ ./configure --enable-module=auth_dbm

Apache 2 replaces this with a more flexible syntax that provides a specific option for each available module. To fit in with the naming convention for options, module names with underscores are specified with - (minus) signs instead, so the previous command in Apache 2 becomes either one of these:

As mentioned earlier, in Apache 2 you can use the new --enable-modules option along with a list of module names. Notice the plural: this isn’t the same as Apache 1.3’s --enable-module. Unfortunately, there’s as yet no equivalent --disable-modules option. So to enable DBM authentication, URL rewriting, and all the proxy modules but disable user directories, as-is responses, and the FTP proxy, you could use this:

This adds mod_auth_dbm, mod_rewrite, and mod_proxy and then removes mod_userdir, mod_asis, and mod_proxy_ftp (which was enabled when you enabled mod_proxy).
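The selection just described can be sketched as a helper that prints an Apache 2 configure command line for review. The function name is hypothetical; the option spellings follow the conventions given above (a quoted list for --enable-modules, and one --disable option per module with underscores replaced by hyphens):

```shell
#!/bin/sh
# apache2_module_command "ENABLE LIST" "DISABLE LIST"
# Prints a configure command line: one --enable-modules list for the modules
# to add and one --disable-NAME option per module to remove, with
# underscores in NAME turned into hyphens.
# (apache2_module_command is an illustrative helper name.)
apache2_module_command() {
    cmd="./configure --enable-modules=\"$1\""
    for mod in $2; do
        cmd="$cmd --disable-$(printf '%s' "$mod" | tr '_' '-')"
    done
    echo "$cmd"
}

# Enable mod_auth_dbm, mod_rewrite, and mod_proxy; remove mod_userdir,
# mod_asis, and mod_proxy_ftp.
apache2_module_command "auth_dbm rewrite proxy" "userdir asis proxy_ftp"
```

The printed line is meant to be read and then pasted into a shell at the top of the source tree, which keeps the quoting of the module list intact.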

You can obtain the list of available modules, and their default state of inclusion or exclusion, from the configure --help command. This produces an output that includes the following section (this particular example is from an Apache 1.3 distribution because Apache 2’s configure doesn’t yet provide this information):

This tells you the names of the modules to use with configure and which modules Apache will build by default. Unfortunately, it doesn’t take account of any other parameters you might add, so you can’t use it to check the result of specifying a list of enable and disable options.

Once you’ve run the configure script and set up the source code for compilation and installation, you can build and install Apache with this:

$ make
# make install

Once this is done, you can go to the Apache installation directory, make any changes to httpd.conf that you need—such as the hostname, port number, and server administrator—and start up Apache. You could also just type make install, but doing it in two steps will allow you to perform a last-minute check on the results of the build before you actually install Apache.

Enabling or Disabling Modules in Bulk

The --enable-module option of Apache 1.3 and the --enable-modules option of Apache 2 have in common two special arguments that allow you to specify a larger group of modules without listing them explicitly:

most expands the default list to include a larger selection of the modules available.

all goes further and builds all modules that aren’t experimental or otherwise treated specially.

Tables 3-2 and 3-3 contrast the modules that are built at the default, most, and all levels, for Apache 1.3 and Apache 2, respectively. Each column adds to the previous one, so the All column contains only those modules that are additional to most. The Explicitly Enabled column details all modules that aren’t included unless explicitly enabled, along with a subheading detailing why.

Some experimental modules are likely to be added to the list of All modules at some point in the future; in Apache 2 this includes mod_file_cache, mod_proxy, and mod_cache. These modules have been listed under Additional rather than Experimental in the Explicitly Enabled column to differentiate them from truly experimental or demonstration modules such as mod_example, which should never be encountered in a production-server environment.

Table 3-2 is the list for Apache 1.3.

Table 3-2. Apache 1.3 Included Modules

| Default | Most | All | Explicitly Enabled |
|---|---|---|---|
| mod_access | mod_auth_anon | mod_mmap_static | mod_auth_digest |
| mod_actions | mod_auth_db | (mod_so) | |
| mod_alias | mod_auth_dbm | | Obsolete: |
| mod_asis | mod_cern_meta | | mod_auth_db |
| mod_auth | mod_digest | | mod_log_agent |
| mod_autoindex | mod_expires | | mod_log_referer |
| mod_cgi | mod_headers | | Example and Experimental: |
| mod_dir | mod_info | | mod_example |
| mod_env | mod_mime_magic | | |
| mod_imap | mod_proxy | | |
| mod_include | mod_rewrite | | |
| mod_log_config | mod_speling | | |
| mod_mime | mod_unique_id | | |
| mod_negotiation | mod_usertrack | | |
| mod_setenvif | mod_vhost_alias | | |
| (mod_so) | | | |
| mod_status | | | |

mod_auth_db is deprecated in favor of mod_auth_dbm, which now also handles Berkeley DB. mod_log_referer and mod_log_agent are deprecated in favor of mod_log_config. mod_so is automatically enabled if any other module is dynamic; otherwise, it must be enabled explicitly in Apache 1.3 (Apache 2 always enables it even if all modules are static). mod_mmap_static is technically experimental but stable in practice. mod_auth_digest replaces mod_digest; only one of them may be built.

Table 3-3 is the corresponding list for Apache 2.

Table 3-3. Apache 2 Included Modules

| Default | Most | All | Explicitly Enabled |
|---|---|---|---|
| mod_access | mod_auth_anon | mod_cern_meta | mod_cache |
| mod_actions | mod_auth_dbm | mod_mime_magic | mod_disk_cache |
| mod_alias | mod_auth_digest | mod_unique_id | mod_mem_cache |
| mod_asis | mod_dav | mod_usertrack | mod_charset_lite |
| mod_auth | mod_dav_fs | | mod_deflate |
| mod_autoindex | mod_expires | | mod_ext_filter |
| mod_cgi/mod_cgid | mod_headers | | mod_file_cache |
| mod_dir | mod_info | | mod_isapi (Windows only) |
| mod_env | mod_rewrite | | mod_ldap |
| mod_http | | | mod_auth_ldap |
| mod_imap | | | mod_logio |
| mod_include | | | mod_proxy |
| mod_log_config | | | mod_proxy_connect |
| mod_mime | | | mod_proxy_ftp |
| mod_negotiation | | | mod_proxy_http |
| mod_setenvif | | | mod_ssl |
| mod_so | | | mod_suexec |
| mod_status | | | Example and Experimental: |
| mod_userdir | | | mod_bucketeer |
| | | | mod_case_filter |
| | | | mod_case_filter_in |
| | | | mod_echo |
| | | | mod_example |
| | | | mod_optional_hook_export |
| | | | mod_optional_hook_import |
| | | | mod_optional_fn_import |
| | | | mod_optional_fn_export |

Either mod_cgid or mod_cgi is built automatically depending on whether the chosen MPM is threaded. Some modules depend on the module listed before them; for example, mod_mem_cache requires mod_cache. mod_logio requires mod_log_config. mod_deflate and mod_ssl require external libraries to be built; mod_file_cache, mod_cache, and mod_proxy may migrate to the all list in time. mod_http provides basic HTTP support; it may be removed by those intending to use Apache as a framework for a generic protocol server, but should be kept in general use.

Options concatenate so you can specify both general and specific options together to get the mix of modules you want with the minimum of effort. For example, in Apache 1.3, you can specify this:

Deciding what to include and exclude is made simpler if you choose to build a dynamic server because, as I mentioned earlier, you can simply choose to build all the modules and sort them out later. Once the server is installed, you can then comment out the directives in the configuration for the modules you don’t want. This only breaks down if a module can’t be built dynamically or you want to explicitly disable mod_so (because by definition it can’t be dynamic and load itself). If you change your mind later, you just uncomment the relevant directives and restart Apache.

Building Apache As a Dynamic Server

Building Apache as a purely dynamic server is almost as simple as building it as a static server, though the options again differ between Apache 1.3 and Apache 2. The approach taken also differs subtly. In Apache 1.3, you have to both enable a module and specify that it’s to be built dynamically; neither implies the other, so if you specify that a module is dynamic without also enabling it, it won’t get built. In Apache 2, if you ask for it to be built dynamically, it’ll also automatically be enabled, which is more reasonable.

The option to create modules dynamically in Apache 1.3 is --enable-shared. This takes the name of a module as an argument or the special value max, which takes the list of enabled modules and makes dynamic any of those modules that are capable of being loaded dynamically. The following is fairly typical of how the majority of Apache 1.3 servers are built:

[1.3] $ ./configure --enable-module=all --enable-shared=max

You could also have used most or left out the --enable-module option if you wanted fewer modules compiled. max operates on the result of the combination of all the --enable-module options you specify; it doesn’t imply a particular set of modules in and of itself.

Apache 2 does away with separate options and instead provides a single combined option, --enable-mods-shared, which both enables and makes dynamic the module or list of modules you specify. As with --enable-modules, you can also specify most or all. The equivalent to the previous command in Apache 2 is this:

[2.0] $ ./configure --enable-mods-shared=all

More interestingly, you can also keep Apache a primarily static server but make one or two modules dynamic, or vice versa, so you can have the best of both worlds. This lets you take advantage of the slight performance gain from binding a module statically into the server while keeping optional features external, so they can be removed to make Apache more lightweight.

For example, mod_rewrite is very powerful, but it’s large, so if you’re not sure whether you’ll actually be using it, you can make only it dynamic with this (depending on what else you’ve enabled):
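A minimal sketch of making only mod_rewrite dynamic, one form per server version. Any other options you would normally pass are left out, and DRY_RUN is a convenience of this sketch so the commands are printed instead of executed.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 1.3: enabling a module and making it shared are separate steps.
rewrite_shared_13() {
    $DRY_RUN ./configure --enable-module=rewrite --enable-shared=rewrite
}

# Apache 2: --enable-mods-shared both enables the module and makes it dynamic.
rewrite_shared_20() {
    $DRY_RUN ./configure --enable-mods-shared=rewrite
}

rewrite_shared_13
rewrite_shared_20
```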

Interestingly, you can also make a module shared by giving the --enable option the value shared, so you could also have said this:

[2.0] $ ./configure --enable-modules=all --enable-rewrite=shared …

Even if you choose to build Apache statically, you can still include the mod_so module to allow dynamically loaded modules to be added to the server later. In Apache 1.3 you do this by enabling all modules (which includes mod_so), making any module dynamic, or explicitly adding mod_so with this:

[1.3] $ ./configure --enable-module=so …

Conversely, Apache 2 automatically includes mod_so even in a fully static server. If you don’t want it, you must explicitly disable it using any of these:
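A minimal sketch of two spellings that switch mod_so off in Apache 2. They rely on the general autoconf convention that --disable-FEATURE means the same as --enable-FEATURE=no; other options are omitted, and DRY_RUN is only a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Form 1: the --disable shorthand.
disable_so_short() {
    $DRY_RUN ./configure --disable-so
}

# Form 2: the explicit --enable=no spelling of the same switch.
disable_so_long() {
    $DRY_RUN ./configure --enable-so=no
}

disable_so_short
disable_so_long
```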

Alternatively, you can build a mostly dynamic server with one or two modules built statically. It’s rare that you wouldn’t want mod_access or mod_mime built into the server, so you can make them static with this:
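For Apache 1.3, a sketch of this might look as follows. It assumes that APACI processes the options in the order given, so the two --disable-shared options undo what max has just set for those modules. DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 1.3: everything enabled and dynamic, except mod_access and mod_mime,
# which are switched back to static after "max" has been applied.
mostly_dynamic_13() {
    $DRY_RUN ./configure \
        --enable-module=all \
        --enable-shared=max \
        --disable-shared=access \
        --disable-shared=mime
}

mostly_dynamic_13
```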

To do the same for Apache 2, you make use of the last argument you can give to an --enable option: static. This is the only way to reverse the status of a module from dynamic back to built-in because there are no such options as --disable-mods-shared or --enable-mods-static in Apache 2 (at least not currently):
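A sketch of the Apache 2 form, using the static value on the two modules concerned. Other options are omitted, and DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 2: make every module shared, then pin mod_access and mod_mime
# back into the server binary with the "static" value.
mostly_dynamic_20() {
    $DRY_RUN ./configure \
        --enable-mods-shared=all \
        --enable-access=static \
        --enable-mime=static
}

mostly_dynamic_20
```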

When Apache is built from this configuration, it automatically inserts the directives to load dynamic modules, saving you the trouble of doing it yourself. It is, therefore, convenient to build modules even if you don’t really need them. You can comment out the enabling directives and uncomment them later should you change your mind.

Changing the Module Order (Apache 1.3)

One of the consistent bugbears of configuring Apache 1.3 servers is getting the loading order of the modules correct. This is because the order in which the modules are loaded into the server determines the order in which the Apache core calls them to handle requests.

Apache 2 does away with all of this by allowing modules to specify their own ordering preferences. As a result, the ordering problem no longer exists in Apache 2, and neither do the compile-time options to change it.

For administrators still using Apache 1.3, it remains a problem, so the configure script provides options to allow you to reorder modules. With a dynamic server, this step isn’t essential because you can change the order of modules just by changing the order of the LoadModule directives that load them:

However, getting the order right at compile time will result in an httpd.conf with the modules already in the correct order. A static server has no such ability, so the build process is the only chance you get to determine the default running order.

Because of this, Apache 1.3 provides the ClearModuleList and AddModule directives. These allow the running order of modules to be explicitly defined, overriding the order in a static Apache server and the LoadModule directives of a dynamic server:

The loading order is important because modules listed later in the configuration file get processed first when URLs are passed to modules for handling. If a module lower down the list completes the processing of a request, modules at the top of the list will never see the request, and any configuration defined for them will not be used. This is significant for two particular cases:

Authentication modules such as mod_auth, mod_auth_anon, and mod_auth_dbm are processed in the opposite order to the order in which they’re loaded. If you want DBM authentication to be applied first, you need to ensure that it’s loaded last. One reason for doing this is to authenticate most users from a database but allow Apache to fall back to basic authentication for one or two administrators so they can still gain access in the event the database is damaged or lost.

Aliasing and redirection modules such as mod_alias, mod_vhost_alias, mod_speling, and mod_rewrite modify URLs based on the inverse of the order in which they’re loaded. For this reason, mod_vhost_alias usually loads first, so the others have a chance to act. In addition, for mod_rewrite to be used together with mod_alias, mod_alias must be loaded first.

By looking at the configuration file, you can see the running order for dynamic servers. For static servers, httpd -l lists the modules in the server in the order they’re loaded. Servers can have both types of modules, in which case the static modules load first and the dynamic ones second. You can override this by using ClearModuleList, followed by AddModule directives to reconstruct the module order. This is also the only way to make a static module come before a dynamic one.

You can alter the running order of modules at build time with the --permute-module option. This takes a parameter of two module names, which are then swapped with each other in the loading order:

$ ./configure --permute-module=auth:auth_anon

This causes mod_auth (which is normally loaded before mod_auth_anon) to be loaded after it, so you can perform file-based authentication before anonymous authentication. Because Apache simply swaps the modules without regard to the position of any other modules, this is most useful when modules are adjacent or near each other, such as the various authentication modules. This also means that the previous is equivalent to this:

$ ./configure --permute-module=auth_anon:auth
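To see why the two spellings are equivalent, here is a small emulation of a plain swap on a list of names. It is not code from Apache's configure script, and the four-module list is a made-up excerpt used only for illustration.

```shell
# Swap two names in a list of module names, leaving all others in place.
# Usage: permute NAME1 NAME2 module...
permute() {
    a=$1
    b=$2
    shift 2
    out=
    for m in "$@"; do
        case $m in
            "$a") m=$b ;;
            "$b") m=$a ;;
        esac
        out="${out:+$out }$m"
    done
    echo "$out"
}

# Both argument orders produce the same list.
permute auth auth_anon access auth auth_anon auth_dbm
permute auth_anon auth access auth auth_anon auth_dbm
```

Both calls print `access auth_anon auth auth_dbm`: only the two named modules move, and everything else keeps its position.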

Alternatively, you can also specify two special tokens to move a module to either the start of the list or the end:
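A sketch of the two special tokens, using the APACI forms BEGIN:name and name:END. The modules chosen are mod_vhost_alias and mod_setenvif; DRY_RUN is a convenience of this sketch so the command is printed instead of executed.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 1.3: move mod_vhost_alias to the start of the module list and
# mod_setenvif to the end of it.
permute_begin_end() {
    $DRY_RUN ./configure \
        --permute-module=BEGIN:vhost_alias \
        --permute-module=setenvif:END
}

permute_begin_end
```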

Both these examples are real-world ones. mod_vhost_alias should usually be the last of the aliasing modules to be polled as outlined previously, and mod_setenvif should usually be one of the first, so it can set variables in time for other modules to take notice of them.

The END syntax is handy for ensuring that a module comes after another, irrespective of their original loading order. However, it’ll also cause the module to be processed before all other modules, which might have unforeseen side effects. Likewise, using BEGIN to move a module to the beginning of the list will cause it to be processed last, which will cause problems if it needs to operate before another module.

Fortunately, Apache’s default order is sensible and rarely needs to be modified, so use of this option is thankfully rare, and it’s eliminated entirely in Apache 2. In a case when it’s important to move a lot of modules around, it’s almost always simpler to ignore the build-time configuration and use the ClearModuleList and AddModule directives instead.

However, only modules added with an AddModule directive after ClearModuleList will be available to the server, irrespective of whether they’re present in the running executable. Although you can simply remove a dynamic module, this is essentially the only way for you to disable a statically linked module. You still incur the cost of carrying the inactive module as a dead weight within the running server, however.

Checking the Generated Configuration

You can capture the running commentary output by the configure script to a file for later analysis. The following command on a Unix platform lets you view the output while the configure process runs and also records the output in a file:

$ ./configure | tee configure.output

It’s good to check the output from the configure script before proceeding to the compilation stage—it’ll tell you which modules are available, and of those modules, which ones you’ve chosen to build and which you have chosen to ignore.

It’ll also tell you whether Apache has found system libraries for certain features or has defaulted to using an internal version or disabling the feature—this can have potentially important ramifications for the availability of features in the server that aren’t always apparent otherwise. You may not immediately realize that mod_rewrite isn’t built to use DBM database maps because it didn’t find a DBM implementation to use unless you review the output of the configuration process.
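One way to review a saved transcript is to search it for the feature you care about. The helper below is a sketch: the name configure_mentions is hypothetical, and the exact wording of configure's messages varies between versions, so treat the keywords as starting points.

```shell
# Print the lines of a saved configure transcript that mention a keyword,
# case-insensitively and with line numbers.
# Usage: configure_mentions KEYWORD [FILE]   (FILE defaults to configure.output)
configure_mentions() {
    grep -i -n -- "$1" "${2:-configure.output}"
}

# Typical use after: ./configure | tee configure.output
if [ -f configure.output ]; then
    configure_mentions dbm   || echo "no mention of dbm"
    configure_mentions expat || echo "no mention of expat"
fi
```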

Similarly, Apache requires and comes bundled with a cut-down version of the Expat XML parser. If you already have Expat installed, Apache will use it in preference to building its own copy. This can be important because some third-party modules and frameworks can sometimes behave unreliably if Apache uses a different Expat from the rest of the system. The solution is simple: If you do have Expat but in a directory Apache doesn’t expect, then you need to add the --with-expat=<directory> option.

See the section “Configuring Apache’s Library and Include Path” later in the chapter for more on this and other features that may require additional coaxing including DBM, LDAP, and SSL support.

Both Apache 1.3 and Apache 2 store their default configuration information in a template file and generate an active configuration from it according to the options you request and the capabilities of the operating system. Although you don’t need to worry about this in most cases, it can be handy to refer to the files containing the generated configuration both to check the results and also to redo the configuration if need be.

The Generated Configuration in Apache 1.3

Apache 1.3 holds most of its default configuration in a file called Configuration in the src directory, and it’s still possible, although largely deprecated in these more modern times, to configure Apache by editing it. Examining it can also be educational as a glimpse into how Apache is put together, even if you choose not to build Apache yourself.

The Apache 1.3 configuration process results in a file called src/Configuration.apaci (so as not to overwrite src/Configuration) and then uses a second script, src/Configure, to produce the files necessary for building Apache based on its contents. You use this script directly if you edit the Apache 1.3 Configuration file by hand; you can also use it to set up the Apache source without altering the configuration settings.

The Generated Configuration in Apache 2

Apache 2 has an almost completely overhauled build configuration system that’s much more easily maintained, as well as being more extensible and similar to other packages that also use autoconf. Despite this, it behaves almost exactly like the familiar APACI configuration interface of Apache 1.3. As a result, most aspects of configuring an Apache source distribution are the same for either version of the server, but there are just enough differences to be inconvenient.

Apache 2 holds most of its default configuration in a file called configure.in, which is presupplied, but it can be regenerated using the buildconf script provided. However, this should only ever be necessary for developers who are pulling the Apache source code from the code repository via CVS. This file isn’t designed for editing; Apache 2 intends to determine as much as possible from the platform and the rest from arguments to the configure script, rather than encourage hand editing of preset values.

The Apache 2 configuration process generates build instructions throughout the source tree and additionally creates some useful files that allow you to check and re-create the results:

config.nice: This file contains a formatted multiline command line for rerunning the configure script using the same arguments you passed to it the first time. This makes it an executable script that allows you to repeat your steps, as well as provide a convenient way to modify and store your configuration parameters.

config.log: This file contains any messages that were produced by the compiler during the various build checks. Usually, it’ll contain nothing except a few line number references from configure. These are intended to prefix the different points at which messages might arise, should any appear.

config.status: This file contains a script that performs the actual configuration of the source tree using the options that were specified by configure but without analyzing the operating system for the build criteria, stored in config.cache.
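As a sketch of how config.nice can be put to work: the wrapper below reruns it and passes along any extra options. It assumes the generated script forwards additional arguments to configure, so check your copy before relying on that. The name reconfigure is hypothetical.

```shell
# Re-run the previous configuration, optionally appending extra options.
reconfigure() {
    [ -x ./config.nice ] || {
        echo "no config.nice here; run ./configure first" >&2
        return 1
    }
    ./config.nice "$@"
}

# Example: same options as last time, plus mod_rewrite as a shared module.
if [ -x ./config.nice ]; then
    reconfigure --enable-rewrite=shared
fi
```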

Building Apache from Source As an RPM (Apache 2)

Recent releases of the Apache 2 source distribution contain an RPM .spec file that you can use to build Apache as an installable RPM. The httpd.spec file contains all the details needed to build a default Apache installation and can be used without even unpacking the source archive with this:

$ rpm -tb httpd-2.0.47.tar.gz

This tells the RPM tool to look inside a .tar file for the .spec file (-t) and to build only binary RPMs (-b). Configuration and building of Apache takes place automatically and should result in four RPM files being generated in /usr/src/packages/RPMS/i386 (this path may vary according to Linux distribution and processor architecture):

httpd-2.0.47-1.rpm
httpd-devel-2.0.47-1.rpm
httpd-ssl-2.0.47-1.rpm

For the build to be successful, you’ll need to have several other packages installed first, notably perl and findutils, but the build prerequisites also include pkgconfig, expat-devel, db3-devel, and openldap-devel. To get SSL support, you also need openssl-devel. Each of the -devel packages in turn needs its parent package, and these in turn may have other dependencies.

This is, however, misleading: It’s possible you might need all of these packages, but you can also build an RPM package that doesn’t. You can, and should, edit the httpd.spec file to eliminate dependencies you don’t require. For example, openldap is needed only if you want mod_ldap and mod_auth_ldap. pkgconfig is needed only on some Linux distributions. Likewise, db3 is needed only if you want mod_auth_dbm to be able to handle Berkeley DB format databases. Similarly, expat-devel is needed only if you want Apache to build with an existing Expat XML parser installation; otherwise, it’ll happily build using the cut-down version that’s included in the Apache source.

These dependencies exist because the build instructions in httpd.spec include the modules that require them, but you can remove unnecessary packages from the BuildReq: line so long as you also remove the dependent modules. At the same time, if you want to ensure that the optional httpd-ssl package is built (in other words, make it mandatory), you add openssl-devel:

Within the httpd.spec file is a configure command, which is most easily locatable by searching for a --prefix command line argument. You can customize this using all the criteria and strategies I’ve already discussed for a stand-alone configure command, with the advantage of creating an installable package as the end product. You should remove any part of the command you don’t need, particularly modules that you don’t need but also directory locations. You should also add a layout to provide the basic structure of the installed server. You can even merge the SSL package into the main server, if you want to eliminate it as a separately installable component.

Because you want to edit the .spec file, it’s easiest to extract it singly from the archive and then build it using the -b option to RPM. This requires that the original archive file is present where rpm looks for it, in the SOURCES subdirectory of the RPM system:
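The steps just described can be sketched as follows. Several details are assumptions: that httpd.spec sits at the top level of the unpacked source directory, that SOURCES lives under /usr/src/packages as in the RPMS path mentioned earlier, and that your rpm accepts -bb (on newer systems the build options belong to rpmbuild instead). DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= to run the commands for real.
DRY_RUN=${DRY_RUN-echo}

ARCHIVE=httpd-2.0.47.tar.gz
SRCDIR=${ARCHIVE%.tar.gz}              # httpd-2.0.47
SOURCES=/usr/src/packages/SOURCES      # varies by distribution

build_custom_rpm() {
    # 1. Pull just the spec file out of the archive.
    $DRY_RUN tar -xzf "$ARCHIVE" "$SRCDIR/httpd.spec"
    # 2. (Edit $SRCDIR/httpd.spec by hand at this point.)
    # 3. Put the pristine archive where rpm expects to find it.
    $DRY_RUN cp "$ARCHIVE" "$SOURCES/"
    # 4. Build binary packages only, from the edited spec file.
    $DRY_RUN rpm -bb "$SRCDIR/httpd.spec"
}

build_custom_rpm
```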

If all goes well, this should generate Apache RPMs built according to your precise specifications, including whichever modules and features you want included. You can store the httpd.spec file somewhere safe and reuse it any time to regenerate your Apache setup. As new releases of Apache are made available, you can move your changes into the new httpd.spec (assuming it has changed) with a minimum of fuss and build those according to the same criteria as before.

Advanced Configuration

The configuration options you’ve considered so far are enough for many purposes, and certainly sufficient for setting up a test server. However, there are many more advanced options at your disposal, ranging from the useful to the curious to the downright obscure. This is especially true of Apache 2, which provides many autoconf-derived options that, although available, aren’t actually that useful in Apache.

Of these options, the most immediately useful are the layout options that determine both where Apache’s files are installed and where Apache expects to find them by default. Other advanced features include the build type for cross-platform builds, platform-specific rules, and locating external packages required by some of Apache’s own features.

Configuring Apache’s Layout

You’ve already seen how --prefix defines the installation root for Apache. However, configure allows several more options to customize the location of Apache’s files in detail.

Choosing a Layout Scheme

The default layout for Apache consists of an installation path in /usr/local/apache, with the various other directories placed underneath. However, it’s possible to completely configure the entire layout. To make life simple, the configure script accepts a named layout defined in a file called config.layout that’s supplied with the Apache source distribution. This contains many alternative layouts that can be chosen by specifying their name on the configure command line:
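A sketch of selecting a named layout, one form per server version. GNU is used as the example name, other options are omitted, and DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 1.3 selects a layout with --with-layout.
layout_13() {
    $DRY_RUN ./configure --with-layout=GNU
}

# Apache 2 selects a layout with --enable-layout.
layout_20() {
    $DRY_RUN ./configure --enable-layout=GNU
}

layout_13
layout_20
```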

The definition of the default Apache layout in config.layout shows which values control which locations and how the various values depend on each other; the installbuilddir and errordir locations are new to Apache 2, but otherwise the locations understood by the two versions are identical. The default layout in Apache 1.3 differs from the Apache 2 layout only in the name of the libexec directory; it’s $exec_prefix/libexec in Apache 1.3.

There are ten other layouts defined in config.layout. Note that case is important and that GNU is a valid parameter, but gnu or Gnu aren’t. Table 3-4 details the available layouts along with their main installation prefix (though many of them adjust specific locations in addition).

Table 3-4. Layout Choices

Layout: Description

GNU: Installs files directly into subdirectories of /usr/local rather than in a separate /usr/local/apache directory. The httpd binary thus goes in /usr/local/bin and the manual pages in /usr/local/man.

Darwin: Installation paths for MacOS X (a.k.a. Darwin). This is the consumer version found on desktop machines, as opposed to the server edition, and has a significantly different layout (prefix /usr).

RedHat: Installs files in the default locations for RedHat Linux. This is typically used in the construction of RPM packages for RedHat and is also suitable for RedHat-based distributions such as Mandrake Linux (prefix /usr).

beos: Installation paths for the BeOS operating system (prefix /boot/home/apache).

SuSE: Installs files in the default locations for SuSE Linux. This is typically used in the construction of RPM packages for SuSE and is also suitable for UnitedLinux distributions (prefix …).

In addition to the standard layouts listed previously, there’s also one special layout, BinaryDistribution. This is provided to build Apache for packaging and distribution. The distribution may then be unpacked and installed on the target machine or machines, with the installation root chosen at the time of installation. All other locations are defined as relative directories. This allows you to create an archive containing your own complete custom Apache. You can then unpack and install it into the correct location on multiple machines.

Because the creation of a binary distribution is more involved than a straightforward build and install, Apache 2 provides the binbuild.sh and install-bindist.sh scripts, located in the build directory under the top source distribution directory, to help you do it. To use binbuild.sh, you first need to edit it and modify the configure options defined in CONFIGPARAM to build the Apache server you want (don’t change the layout from BinaryDistribution). Then run the following from the top directory of the source distribution:

$ ./build/binbuild.sh

This will configure and build Apache as a binary distribution and then package it into an archive named for the Apache version and target host. For example, on a Pentium III Linux server, the resulting archive would be named as follows:

httpd-2.0.46-i686-pc-linux.tar.gz

You also get a readme file explaining how the archive was built:

httpd-2.0.46-i686-pc-linux.README

These files appear next to the unpacked source distribution—that is, the directory above where you actually ran binbuild.sh. You can now transfer and unpack the archive onto any Linux server on which you want to install Apache. After unpacking it, you use the install-bindist.sh script. This takes one argument, the server root where Apache is to be installed, for example:

$ ./install-bindist.sh /usr/local/apache_dist

You can also run this script directly from the source directory where you ran binbuild.sh if you want to install the distribution on the same host.

This will copy and set up the Apache distribution so that it’s configured to run from the specified directory. Once this is done, you can dispense with the original unpacked archive. The default server root, if you don’t specify one, is /usr/local/apache2; this can be changed by editing DEFAULT_DIR at the same time as CONFIGPARAM before you run binbuild.sh. Note that install-bindist.sh is itself generated by binbuild.sh and doesn’t exist except in the build directory of the archives generated by it.

Another file that’s generated by binbuild.sh is the envvars file located adjacent to apachectl in the selected location for executables. apachectl reads envvars to determine the correct environment to start Apache with. For a binary distribution, this typically involves adding additional shared library paths to LD_LIBRARY_PATH (or a similar variable, depending on the platform) so that Apache can find dynamic modules. This is a necessary step because Apache’s installation directories weren’t known at the time you built it for distribution. For a normal undistributed installation, this file contains no active definitions. An original unmodified version of this file is also provided as envvars-std, for reference, if you change envvars.

Adding and Customizing Layouts

It’s also possible to add your own custom layouts to the file by adding a new definition with the name of your layout, for example:

# My custom Apache layout
<Layout AlphaComplex>
… locations …
</Layout>

Although it’s not used in the default Apache layout, you can also use the special suffix + on locations to indicate that the name of the server (as defined by --target or --with-program-name in Apache 1.3 and 2, respectively) should be added to the end of the path. For example, in the Darwin layout, you find this definition for the log directory:

logfiledir: ${localstatedir}/log+

As localstatedir is set to /var in the Darwin layout, this means that (with a program name of osxhttpd) Apache’s log files will, in this layout scheme, be located here:

/var/log/osxhttpd
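The effect of the + suffix can be emulated in a few lines of shell. This is a simplified illustration of the behavior described above, not code taken from Apache's configure script; the function name expand_plus is hypothetical.

```shell
# Expand a layout path: a trailing "+" is replaced by "/" plus the program
# name; a path without the suffix is returned unchanged.
# Usage: expand_plus PATH PROGRAM_NAME
expand_plus() {
    path=$1
    name=$2
    case $path in
        *+) echo "${path%+}/$name" ;;
        *)  echo "$path" ;;
    esac
}

expand_plus /var/log+ osxhttpd    # prints /var/log/osxhttpd
expand_plus /var/log  osxhttpd    # prints /var/log
```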

If you don’t want to edit the supplied config.layout file, you can instead use your own file by prefixing the filename to the layout name:
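A sketch of the file-prefixed form for Apache 1.3, where the filename and the layout name are separated by a colon. The path /home/admin/apache.layout is hypothetical, AlphaComplex is the example layout name used earlier, and DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Hypothetical location of a privately maintained layout file.
LAYOUT_FILE=/home/admin/apache.layout

# Apache 1.3: take the layout named AlphaComplex from that file rather
# than from the config.layout shipped with the source.
own_layout_13() {
    $DRY_RUN ./configure --with-layout="$LAYOUT_FILE:AlphaComplex"
}

own_layout_13
```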

You can also specify a layout file outside the Apache source distribution if you want. This makes it easy to maintain a local configuration and build successive Apache releases with it.

The alternative to defining your own layout is to specify each of the layout paths on the command line with individual options. The approach you choose depends for the most part on how many locations you want to change.

You can check the effect of a layout scheme with the --show-layout option. This causes the configure script to return a list of the configured directories and defaults instead of actually processing them, for example (using an Apache 1.3 source distribution):
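A sketch of such a check against an Apache 1.3 source tree, using the GNU layout as the example. The listing it produces depends on the layout and version, so none is reproduced here. DRY_RUN is a convenience of this sketch.

```shell
# DRY_RUN defaults to "echo"; set DRY_RUN= inside a source tree to run for real.
DRY_RUN=${DRY_RUN-echo}

# Apache 1.3: report the directories the GNU layout would use,
# without configuring anything.
show_gnu_layout() {
    $DRY_RUN ./configure --with-layout=GNU --show-layout
}

show_gnu_layout
```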

Each of Apache’s locations can also be set individually, including a few rare ones that aren’t (at least, as yet) permitted in the layout. The list of available options, with their defaults in the Apache layout, is detailed in Table 3-5.

Table 3-5. Configuration Directives Relating to Locations

Option: Description

--with-program-name=NAME (2.0), --target=NAME (1.3): Installs name-associated files using the base name of NAME. This changes the name of the Apache executable from httpd to NAME and also changes the default names of the configuration, scoreboard, process ID, and lock files: httpd.conf becomes NAME.conf, apachectl becomes NAMEctl, and so on. This can be useful, along with --runtimedir, for running a second instance of Apache with a different configuration in parallel with an existing one. The default is httpd. In Apache 1.3 this option is called --target, but in Apache 2 it has been renamed to --with-program-name to make way for the new portability options --target, --host, and --build.

--prefix=PREFIX: Installs architecture-independent files in PREFIX. This determines the primary installation location for Apache and the default value of the server root, for instance, /usr/local/apache under Unix. Most other locations default to subdirectories of this value.

--exec-prefix=EPREFIX: Installs architecture-dependent files (meaning principally compiled executables) in EPREFIX. This determines the root location of executable files, from which the --bindir, --sbindir, and --libexecdir directories are derived (if specified as relative values). It defaults to the same value as PREFIX if unspecified.

--bindir=DIR: Installs user executables and scripts in DIR. Usually located in the bin directory under EPREFIX.

--sbindir=DIR: Installs system administration executables in DIR [EPREFIX/sbin].

--libexecdir=DIR: Installs program executables in DIR. This defines the location of Apache’s dynamic modules, if any are installed. Usually located in the libexec (1.3) or modules (2.0) subdirectory under EPREFIX.

--mandir=DIR: Installs Unix manual pages for each of Apache’s executables in DIR. Usually located in the man directory under PREFIX. Not to be confused with --manualdir.

--sysconfdir=DIR: Installs configuration files such as httpd.conf and mime.types in DIR. Usually located in the conf directory under PREFIX.

--datadir=DATADIR: Installs read-only data files in DATADIR. This determines the root location of all nonlocal data files, from which the --installbuilddir, --errordir, --iconsdir, --htdocsdir, --manualdir, and --cgidir directories are derived, if specified as relative values. It defaults to the same value as PREFIX if unspecified.

--errordir=DIR: Installs custom error documents into DIR. These are referred to by ErrorDocument directives in the main configuration file and produce nicer, and multilingual, error messages than Apache’s built-in ones. Usually located in the error directory under DATADIR. New in Apache 2.

--iconsdir=DIR: Installs icons for directory indexing into DIR. AddIcon directives in the main configuration file refer to these. Usually located in the icons directory under DATADIR. New in Apache 1.3.10.

--infodir=DIR: Installs documentation in GNU “info” format into DIR. Apache itself doesn’t come with any info documentation, but third-party modules might. Not created by default, but is usually located in the info directory under DATADIR. New in Apache 2.

--htdocsdir=DIR: Installs the default Apache startup Web page into DIR. The master configuration uses this directory as the initial value for the DocumentRoot directive. New in Apache 1.3.10.

--manualdir=DIR: Installs Apache’s HTML documentation into DIR. Usually located in the manual directory under DATADIR. Not to be confused with --mandir. New in Apache 1.3.21.

--cgidir=DIR: Installs the standard Apache CGI scripts into DIR. Usually located in the cgi-bin directory under DATADIR. The master configuration file points to this directory via a ScriptAlias directive. New in Apache 1.3.10.

--includedir=DIR: Installs Apache’s header files in DIR. These are required by apxs to compile modules without the full Apache source tree available. Usually located in the include directory under PREFIX.

--libdir=DIR: Installs Apache’s nonmodule object libraries in DIR. Usually located in the lib directory under PREFIX. New in Apache 2.

--localstatedir=LOCALDIR: Installs modifiable data files in LOCALDIR. This defines where files that convey information about a particular instance of Apache are kept. It usually governs the locations of the runtime, log file, and proxy directories below. Usually the same as PREFIX.

--runtimedir=DIR: Installs run-time data in DIR. This determines the default locations of the process ID, scoreboard, and lock files. Usually located in the logs subdirectory under LOCALDIR.

--logfiledir=DIR: Installs log file data in DIR. This determines the default locations of the error and access logs. Usually located in the logs subdirectory under LOCALDIR.

--proxycachedir=DIR: Installs proxy cache data in DIR. This determines the default location of the proxy cache. Usually located in the proxy subdirectory under LOCALDIR.

--sharedstatedir=DIR: Installs shared modifiable data files in DIR. Not created by default, but usually the same as PREFIX. New in Apache 2.

When considering this list, it’s worth remembering that the directory organization is actually an artifact of the layout definitions in config.layout and not an implicit default; there’s nothing that automatically decides that the bin and sbin directories are under EPREFIX. The same is true for DATADIR and LOCALSTATEDIR—both define paths that are only used as the basis for the default values of other options in config.layout. If you create your own layout that doesn’t make use of them, they have no significance.

You can also combine a layout configuration with an individual location option. The configure script reads the layout first and then overrides it with individual options. For example, you can explicitly request the Apache layout and then override the sbin directory so it’s different from the normal bin directory:
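A minimal sketch of the kind of command pair this describes follows. The /usr/local/apache/sbin path is an assumption chosen only for illustration, and the helper name layout_commands is hypothetical; the commands are printed rather than executed so the snippet can be tried outside an Apache source tree.

```shell
#!/bin/sh
# Print (rather than run) two configure invocations for comparison.
# SBIN is a hypothetical directory chosen only for illustration.
SBIN=/usr/local/apache/sbin

layout_commands() {
    # First command: the stock Apache layout.
    echo "./configure --enable-layout=Apache"
    # Second command: the same layout with the sbin directory overridden.
    echo "./configure --enable-layout=Apache --sbindir=$1"
}

layout_commands "$SBIN"
```

Because configure reads the layout first and the individual options afterward, the --sbindir value in the second command takes precedence over whatever the layout defines.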

As a final trick, you can actually use the variables in the layout definition as part of individual location options. For example, the value of the --exec-prefix option can be accessed with $exec_prefix. You need to escape the dollar to prevent the shell from trying to evaluate it, so the second command in the previous code could be written more flexibly as this:

$ ./configure --enable-layout=Apache --sbindir=\$exec_prefix/sbin

You can do this for any of the values defined in config.layout, which allows you to customize any predefined layout without editing it.

To check that you’ve got everything right, use the --show-layout option:
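As a rough sketch, assuming you are in the root of an Apache 1.3 source tree, the check might be wrapped like this. The function name show_layout is hypothetical; the guard only stops the snippet from failing when it is run somewhere else.

```shell
#!/bin/sh
# Show the resolved paths for a named layout.
# Run from the root of an Apache 1.3 source tree; elsewhere, just say why nothing ran.
show_layout() {
    if [ -x ./configure ]; then
        ./configure --enable-layout="$1" --show-layout
    else
        echo "no ./configure here; change to the Apache source directory first"
    fi
}

show_layout Apache
```

You can append individual location options after --show-layout’s layout argument in the same way as for a real build, which makes it a cheap way to preview an override before committing to it.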

This also has the benefit of providing the names of the values you can use in your own modifications (by prefixing them with a $).

CAUTION This isn’t yet supported by the Apache 2 configure script.

Choosing a MultiProcessing Module (Apache 2)

One of the key innovations arising from the development of Apache 2 is the introduction of a fully multithreaded server core, and the ability to choose from one of several possible cores that implement different strategies for dealing with large numbers of accesses.

The available server cores, known as MPMs, vary depending on the platform you’re going to build Apache on. The greatest choice is available to Unix derivatives such as Linux and Mac OS X; the only other platform that provides a choice is OS/2. MPMs are also available for Windows, NetWare, and BeOS; if you’re on one of these platforms, then Apache will automatically choose the appropriate MPM for you.

Remarkably, considering that the MPM is the central component of the server that implements all of the core server directives, the only option you have to worry about when choosing one is the --with-mpm option. For example, to explicitly tell Apache to use the worker MPM, you would use this:

$ ./configure --with-mpm=worker …

The choice of MPM is closely related to the kind of usage pattern you expect your server to experience, so it is primarily a performance decision. Accordingly, I’ll cover it in detail in Chapter 9 and summarize the configuration directives and build-time definitions in Online Appendix J. For now, I’ll outline the options available with brief notes on their characteristics.

Unix MPMs

Five MPMs are available for Unix platforms. The prefork and worker MPMs are the two primary choices, offering 1.3 compatibility and support for threads, respectively. Three other MPMs are also available for the more adventurous administrator; leader and threadpool are variants of worker that use different strategies to manage and allocate work to different threads. The perchild MPM is more interesting; it allows different virtual hosts to run under different user and group IDs.

prefork

The prefork MPM implements a preforking Apache server with the same characteristics and behavior as Apache 1.3. It doesn’t benefit from the performance gains made possible using threads, but it’s the most compatible with Apache 1.3 and therefore may offer a more direct and convenient migration route. The following is an example of its usage:

$ ./configure --with-mpm=prefork

worker

The worker MPM implements a fully threaded Apache server and is the primary MPM for use on Unix servers. Rather than forking a process for each incoming request, it allocates a thread from an existing process instead. Because threads share code and data, they’re much more lightweight, and many more can exist concurrently. They’re also much faster to start up and shut down. The number of threads allowed to an individual process is limited; once it’s reached, a new child process is forked:

$ ./configure --with-mpm=worker

leader

The leader MPM is an experimental variant of the worker MPM that uses a different algorithm to divide work up between different threads using the leader/follower design pattern. Although it’s structured differently internally, it’s almost identical to worker from the point of view of configuration:

$ ./configure --with-mpm=leader

threadpool

Another experimental variant of the worker MPM, threadpool manages queues of threads and assigns them to incoming connections. In general usage, threadpool isn’t as efficient as worker and is primarily used as a development sandbox for testing features before they’re incorporated into other MPMs:

$ ./configure --with-mpm=threadpool

perchild

The perchild MPM implements an interesting variation of the worker MPM, where a child process is forked for each virtual server configured in the server configuration. Within each process, a variable number of threads are started to handle requests for that host. Because of the alignment of processes to virtual hosts, the child processes can run under the configured user and group for that host, permitting external handlers and filters to run with the correct permissions and obviating the need for the suExec wrapper:

$ ./configure --with-mpm=perchild

Although technically classed as experimental in Apache 2, the perchild MPM is unique in its capability to manage ownerships on a virtual host basis. For administrators who make extensive use of suExec and who want to enable the same permissions controls for mod_perl handlers, PHP pages, and other embedded scripting languages, perchild is the only choice.
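To pull the five Unix choices together, here is a small sketch that prints the configure command for a given MPM name and rejects anything that isn’t one of the Unix MPMs described above. The function name mpm_configure_cmd is hypothetical, and the command is printed rather than run.

```shell
#!/bin/sh
# Print the configure command for a Unix MPM named in this section,
# or fail for any other name.
mpm_configure_cmd() {
    case "$1" in
        prefork|worker)
            # The two primary choices.
            echo "./configure --with-mpm=$1" ;;
        leader|threadpool|perchild)
            # These three are described as experimental in the text.
            echo "./configure --with-mpm=$1  # experimental" ;;
        *)
            echo "unknown Unix MPM: $1" >&2
            return 1 ;;
    esac
}

mpm_configure_cmd worker
```

A wrapper like this is only a convenience for build scripts; configure itself remains the final judge of whether an MPM is available on your platform.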

Windows MPMs

Only one MPM is available for Windows.

winnt

The winnt MPM is the only Windows MPM supplied as standard with Apache 2. It implements a single process, multithreaded server using the threading implementation supported by Windows platforms. Despite its name, it also works fine on Windows 2000 and XP. (Multiple process MPMs aren’t practical on Windows platforms because they don’t implement anything similar to the fork system call of Unix). The following is an example of its usage:

$ ./configure --with-mpm=winnt

OS/2 MPMs

The following option is specific to OS/2.

mpmt_os2

The mpmt_os2 MPM implements a multiprocess (that is, forking), multithreaded server for OS/2. Like the worker MPM, each process contains a limited number of threads, with a new process spawned when the server becomes too busy to allocate a thread from an existing process:

$ ./configure --with-mpm=mpmt_os2

Others

The following MPMs are specific to NetWare and BeOS; each is followed by an example.

netware

The netware MPM provides a multithreaded server on NetWare platforms:

$ ./configure --with-mpm=netware

beos

The beos MPM provides a multithreaded server on BeOS platforms:

$ ./configure --with-mpm=beos

Rules (Apache 1.3)

Rules are special elements of the Apache 1.3 source that can be enabled or disabled to provide specific features that depend on the platform or other resources being available. These are more specialized options that should generally only be overridden if actually necessary. Apache 2 is quite different internally, so it does away with rules entirely in favor of a more platform-sensitive build environment.

Rules are enabled or disabled with the --enable-rule and --disable-rule options. For example, to enable SOCKS5 proxy support, you’d specify this:

$ ./configure --enable-rule=SOCKS5

The list of rules and whether they’re enabled, disabled, or default (that is, determined automatically by configure) can be extracted from the output of configure --help. Third-party modules that patch Apache’s source code generally do it by adding a rule; for instance, mod_ssl adds the EAPI rule. The following is the list of standard rules:

DEV_RANDOM: Enables access to the /dev/random device on Unix systems, which is necessary for modules that need a source of randomness. Currently, the only module that needs this is mod_auth_digest, so configure will only enable this rule if mod_auth_digest is included. This is enabled by default.

EXPAT: Incorporates the Expat XML parsing library into Apache for use by modules that process XML. There are no modules in the Apache distribution at the moment that do, but third-party modules such as mod_dav can take advantage of it if present. From Apache 1.3 onward, this rule is enabled by default, provided that configure finds a lib/expat-lite directory in the src directory. In Apache 2, you can also use --enable-expat or --disable-expat.

IRIXN32, IRIXNIS: These are specific to SGI’s IRIX operating system. IRIXN32 causes configure to link against n32 libraries if present. It’s enabled by default. IRIXNIS applies to Apache systems running on relatively old versions of NIS, also known as Yellow Pages. It isn’t enabled by default. Neither option is of interest to other platforms.

PARANOID: In Apache 1.3, modules are able to specify shell commands that can affect the operation of configure. Normally, configure just reports the event; with the PARANOID rule enabled, configure prints the actual commands executed. Administrators building in third-party modules may want to consider using this and watching the output carefully. This rule isn’t enabled by default.

SHARED_CORE: This exports the Apache core into a dynamic module and creates a small bootstrap program to load it. This is only necessary for platforms where Apache’s internal symbols aren’t exported, which dynamic modules require to load. The configure script will normally determine whether this is necessary. This rule is enabled by default.

SHARED_CHAIN: On some platforms, dynamic libraries (which include modules) won’t correctly feed the operating system information about libraries they depend on when Apache is started. For example, mod_ssl requires the SSLeay or OpenSSL library. If it’s compiled as a dynamic module, Apache isn’t always told that it needs to load the SSL libraries, too. On systems where this problem occurs, enabling the SHARED_CHAIN rule can sometimes fix the problem. On other systems, it may cause Apache to crash, so enable it only if modules are having problems resolving library symbols. By default, configure determines the setting of this rule automatically.

SOCKS4, SOCKS5: These enable support for the SOCKS4 and SOCKS5 proxy protocols, respectively. If either option is selected, then the appropriate SOCKS library may need to be added to EXTRA_LIBS if configure can’t find it. These aren’t enabled by default.

WANTHSREGEX: Apache comes with a built-in regular expression engine that’s used by directives such as AliasMatch to do regular expression matching. Some operating systems come with their own regular expression engines that can be used instead if this rule is disabled. configure uses Apache’s own regular expression engine unless the platform-specific configuration indicates otherwise. This rule is enabled by default.
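If you find yourself overriding several rules at once, a small helper can keep the command line readable. The following is only a sketch: the function name rule_flags and the NAME=yes/NAME=no convention are hypothetical, and the result is printed rather than passed to configure.

```shell
#!/bin/sh
# Turn NAME=yes / NAME=no pairs into --enable-rule / --disable-rule options.
rule_flags() {
    flags=""
    for pair in "$@"; do
        name=${pair%%=*}
        state=${pair#*=}
        case "$state" in
            yes) flags="$flags --enable-rule=$name" ;;
            no)  flags="$flags --disable-rule=$name" ;;
            *)   echo "expected NAME=yes or NAME=no, got: $pair" >&2; return 1 ;;
        esac
    done
    # Drop the leading space before printing.
    echo "${flags# }"
}

echo "./configure $(rule_flags SOCKS5=yes PARANOID=yes)"
```

Rules you don’t mention are left to configure’s own defaults, which is usually what you want.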

On the vast majority of platforms, rules such as SHARED_CHAIN shouldn’t need to be set by hand. In the event they are, an alternative and potentially preferable approach to enabling the SHARED_CHAIN rule is to use Apache’s LoadFile directive to have Apache load a library before the module that needs it, for example:
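A sketch of such a configuration fragment follows, written out with a shell here-document. LoadFile and LoadModule are standard Apache directives, but both paths are assumptions for illustration only (they will differ on your system), and the function name and output file are hypothetical.

```shell
#!/bin/sh
# Write a small configuration fragment in which the library is loaded
# before the module that depends on it. Both paths are hypothetical.
write_loadfile_fragment() {
    cat > "$1" <<'EOF'
# Load the SSL library first so its symbols are available...
LoadFile /usr/lib/libssl.so
# ...then load the module that needs them.
LoadModule ssl_module libexec/libssl.so
EOF
}

fragment=${TMPDIR:-/tmp}/loadfile-example.conf
write_loadfile_fragment "$fragment"
cat "$fragment"
```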

Note that the library on which the module depends should be loaded before the module to allow it to resolve symbols supplied by the library successfully.

You can find a little more information about some of these rules in the Apache 1.3 src/Configuration file. For detailed information on exactly what they do, look for the symbols in the source code, for example:

$ grep -r SOCKS5 apache_1.3.28 | less

Building Apache with suExec support

suExec is a security wrapper for Unix systems that runs CGI scripts under a different user and group identity than the main server. It works by inserting itself between Apache and the external script and changing the user and group of the external process to a user and group configured for the server or virtual host. This allows you to have each virtual host run scripts under its privileges and thus partition them from each other. Before you build suExec, it’s worth considering that Apache 2 provides a different solution to the same problem in the shape of the perchild multiprocessing module, which allows you to define a user and group identity per virtual host.

To get suExec support, you must tell Apache to use it at build time with the --enable-suexec option:

$ ./configure --enable-suexec

On its own, this is rarely enough, however. For suExec to actually work, it also needs to have various configuration options set. None of these are configurable from Apache’s configuration to keep the suExec wrapper secure from tampering, so if you need to set them, you must define them at build time. Some of the defaults are also slightly odd and aren’t governed by selecting an Apache layout, so if you use a different layout, you’ll probably need to set some of them.

NOTE suExec is built so that it can only be run by the user and group of the main server as configured at build time. If you want to have suExec run by any user other than the default, then it’s not enough to change the settings of the User and Group directives in the server-level configuration; they must also be defined at build time so that suExec recognizes them.

Of particular note are the minimum values of the user and group ID and the document root. The user and group IDs are lower limits that are compared to Apache’s User and Group settings. They constrain the user and group under which suExec will allow itself to be run. This prevents the root user being configured, for example. If you want to use your own user and group with suExec, then you have to ensure that the correct settings are established up front; if you change them later to something below the allowed minimum values, then you must rebuild suExec or it’ll detect that it’s being run by the wrong user or group and refuse to cooperate.

The document root setting is slightly misnamed; it determines where suExec will permit executables to be run from. For a single Web site, it should be the document root, but for virtual hosts, it should be a parent directory that’s broad enough to include all the virtual host document root directories somewhere beneath it. For example, if you have all your virtual hosts under /home/www/virtualhostname, then the document root for suExec should be /home/www.

Table 3-6 lists all the configure options that control suExec.

Table 3-6. suExec Configure Script Options

[1.3] --server-uid=UID, [2.0] --with-server-uid=UID: Sets the user ID that Apache will run under and that suExec will allow execution by. The default is nobody.

[1.3] --server-gid=GID, [2.0] --with-server-gid=GID: Sets the group ID that Apache will run under and that suExec will allow execution by. The default is nobody under Apache 1.3 and #-1 under Apache 2.

[1.3] --suexec-caller=NAME, [2.0] --with-suexec-caller=NAME: Sets the name of the user that’s allowed to call suExec. This should be set to the value of the User directive in httpd.conf. The default is www.

[1.3] --suexec-docroot=DIR, [2.0] --with-suexec-docroot=DIR: Sets the root directory of documents governed by suExec. This affects the user directory processing of suExec. The default is PREFIX/htdocs.

[1.3] --suexec-logfile=FILE, [2.0] --with-suexec-logfile=FILE: Determines the location of suExec’s log file. By default this is the master server log directory.

[1.3] --suexec-userdir=DIR, [2.0] --with-suexec-userdir=DIR: Specifies the name of the subdirectory as inserted into URLs by mod_userdir. If user directories are in use and implemented by mod_userdir (as opposed to mod_rewrite, say), suExec needs to know what the substituted path is to operate correctly. The default is public_html, which is also the default of the UserDir directive in mod_userdir.

[1.3] --suexec-uidmin=UID, [2.0] --with-suexec-uidmin=UID: Specifies the minimum allowed value of the User directive when evaluated as a numeric user ID. The default is 100, restricting access to special accounts, which are usually under 100 on Unix systems.

[1.3] --suexec-gidmin=GID, [2.0] --with-suexec-gidmin=GID: Specifies the minimum allowed value of the Group directive when evaluated as a numeric group ID. The default is 100, restricting access to special accounts, which are usually under 100 on Unix systems.

[1.3] --suexec-safepath=PATH, [2.0] --with-suexec-safepath=PATH: Defines the value of the PATH environment variable passed to CGI scripts. This should only include directories that are guaranteed to contain safe executables. The default is /usr/local/bin:/usr/bin:/bin. Paranoid administrators may want to redefine this list to remove /usr/local/bin, or redefine it to nothing.

[1.3] --suexec-umask=UMASK, [2.0] --with-suexec-umask=UMASK: Specifies the maximum permissions allowed in the user file-creation mask of the suExec executable as an octal number. By default the server’s umask is used, usually 022 (no group or other write permission). This is also the hard limit, and suExec will refuse to even compile with a more generous setting. You can make it even more restrictive, however. To have suExec refuse to execute anything that’s group-writable, world-writable, or world-readable, use a umask of 026 (new in Apache 1.3.10).
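Because every suExec option is spelled differently in Apache 1.3 and Apache 2, a tiny helper can save some typing in build scripts. This is a sketch only: the function name suexec_opt is hypothetical, it covers just the options in Table 3-6 whose names start with suexec, and the example values (a www caller, virtual hosts under /home/www, minimum IDs of 100) are taken from the defaults and examples in the text rather than from any particular server.

```shell
#!/bin/sh
# Map a suExec setting to the option spelling for the given Apache version,
# following the [1.3]/[2.0] pairs in Table 3-6.
suexec_opt() {
    version=$1 name=$2 value=$3
    case "$version" in
        1.3) echo "--suexec-$name=$value" ;;
        2.0) echo "--with-suexec-$name=$value" ;;
        *)   echo "unsupported version: $version" >&2; return 1 ;;
    esac
}

# Print (rather than run) an Apache 2 configure command using example values.
echo ./configure --enable-suexec \
    "$(suexec_opt 2.0 caller www)" \
    "$(suexec_opt 2.0 docroot /home/www)" \
    "$(suexec_opt 2.0 uidmin 100)" \
    "$(suexec_opt 2.0 gidmin 100)"
```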

This requirement of forcing suExec to be configured at compile time rather than runtime may seem more than a little inconvenient, but this deliberate inflexibility has more than a little to do with the fact that Apache has a very good reputation for security. If you compare this to the almost continual litany of exploits for some less popular proprietary Web servers that emphasize convenience over security, it becomes easier to put up with this relatively minor sort of inconvenience for the greater security it gives you.

To find out what settings an existing suExec binary has been compiled with, you can use the -V option, for example:
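A minimal sketch of such a check follows. The default path is an assumption (substitute wherever your installation put the binary), the function name is hypothetical, and suExec will typically only answer -V when run as root.

```shell
#!/bin/sh
# Print the compiled-in settings of a suExec binary, if one is present.
# The default path below is an assumption, not a fixed location.
show_suexec_settings() {
    suexec_bin=${1:-/usr/local/apache/bin/suexec}
    if [ -x "$suexec_bin" ]; then
        "$suexec_bin" -V
    else
        echo "no suexec binary at $suexec_bin"
    fi
}

show_suexec_settings
```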

suExec is installed into the bin directory (as of Apache 1.3.12 it moved here from the sbin directory; if these are set to the same place, the distinction is moot). Also, Apache 2 abstracts support for suExec within Apache itself into the new mod_suexec module and installs this into the configured libexec directory. This is potentially very handy because it allows you to disable suExec support by removing the module and add it later if you need it. Apache 1.3 doesn’t allow you this freedom as it integrates suExec support into the core.

NOTE In this chapter I’ll cover the building and installing of suExec. I’ll cover the configuration and setup of suExec in Chapter 6.

Configuring Apache’s Supporting Files and Scripts

As well as configuring the build process for Apache itself, configure also sets up the various supporting scripts and applications that come as standard with Apache. These range from shell scripts such as apachectl through Perl scripts such as dbmmanage and apxs to compiled binaries.

After the Apache executable has been built, the configuration process carries out some additional stages to clean it of unnecessary symbol information (useful for debugging but useless baggage for a production environment) and to substitute your configuration information into Apache’s supporting configuration files and scripts.

As with everything else, you can impose some control over this as well. Table 3-7 shows the options that control which of these stages are carried out by configure.

In Apache 2, you can also choose whether those supporting tools that are built from source are linked statically or dynamically, in much the same way as you can choose the static or dynamic status of modules. All of these tools are built dynamically by default, but you can make some or all of them into statically linked executables with one of the options in Table 3-8.

One of Apache’s great strengths is its ability to run on almost any platform. Apache 2 improves on this with the APR libraries, which greatly enhance Apache’s ability to adapt to a new platform by providing a portability layer of common functions and data structures. You can find detailed information on the APR on the APR project pages at http://apr.apache.org/.

Between the APR and the adoption of the autoconf system for build configuration, you now have the ability to configure and build Apache for a different platform and even a different processor, if you have a cross-platform compiler available. Many compilers are capable of cross-compilation, including gcc, but not by default; you generally need to build a cross-compiling version of the compiler first. You also need accompanying compiler tools such as ld, ar, and ranlib; on Linux platforms these are generally in the binutils package. Consult the documentation for the installed compiler for information.

Once you have a cross-compiler set up, you can use it to build Apache. To do this, specify one or more of the portability options --target, --host, and --build. All three of these options take a parameter of the form CPU-VENDOR-SYSTEM or CPU-VENDOR-OS-SYSTEM and work together as shown in Table 3-9.

Table 3-9. Configure Script Build Options

--host: The system type of the platform that’ll actually run the server.

--target: The system type of the platform for which compiler tools will produce code. This isn’t in fact the target system (that’s the host), but a definition that’s passed on to compiler tools that are themselves built during the build process. Normally this would be the same as the host, which it defaults to.

--build: The system type of the platform that will carry out the build. This defines your own server when the host is set to something different. It defaults to host.

These three values are used by the autoconf system, on which the Apache 2 build configuration is based. As a practical example of a host value, a Pentium III server running Linux would usually be defined as i686-pc-linux-gnu. This would be used for the host, if it’s the system that you’ll be running Apache on, and the build host, if it’s the system that’ll be building Apache. In most cases, all three values are the same, with the target and build types defaulting to the host type. It’s rare that all three are needed. You can find detailed information about them in the autoconf manual pages at http://www.lns.cornell.edu/public/COMP/info/autoconf/.

Normally, configure will guess the host processor, platform vendor, and operating system type automatically and then default the build and target to it for a local build. Alternatively, you can override the host using the --host option or by adding it to the end of the command line as a nonoption. This allows you to specify a different platform. For instance, you could build for a SPARC-based server running Solaris like so:

$ ./configure --build=i686-pc-linux-gnu --host=sparc-sun-solaris2

It’s necessary to specify both the host and build system types when cross-compiling. This is because the build system type defaults to the host if not set explicitly. (This is the correct behavior when not cross-compiling but where the guessed host system type is incorrect.) Another reason is that cross-compilers can’t always correctly determine the build system. If yours can, you can just use local instead of a complete system type definition:

$ ./configure --build=local --host=sparc-sun-solaris2

It might be necessary to specify the compiler explicitly if you have both a native compiler and a cross-compiler on the same build host; configure will generally find the native compiler first, and you need to tell it otherwise. To do that, you can set the name of the compiler in the environment variable CC like this:
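As a sketch, the command might look like the one printed below. The compiler path is purely hypothetical (use the real location of your cross-compiler), the function name is made up for this example, and the command is echoed rather than executed so it can be inspected first.

```shell
#!/bin/sh
# Print a cross-compiling configure command with CC set only for that command.
# The compiler path passed in below is hypothetical.
cross_configure_cmd() {
    cc_path=$1 build=$2 host=$3
    echo "CC=$cc_path ./configure --build=$build --host=$host"
}

cross_configure_cmd /usr/local/cross/bin/sparc-sun-solaris2-gcc \
    i686-pc-linux-gnu sparc-sun-solaris2
```

Setting CC as a prefix to the command, rather than exporting it, keeps the cross-compiler from leaking into later native builds in the same shell.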

Alternatively, if the cross-compiler has a distinct and different name and is on your path, you can just specify the name and leave out the path. See “Configuring the Build Environment” later in the chapter for more about how you can use environment variables to control the configuration process.

Configuring Apache for Production or Debug Builds

Normally you want to produce a server executable that has debugging information stripped from it and is compiled with any optimizations available for a more efficient server. However, Apache comes with full source code, so if you want, you can build a version of the server for debugging purposes. Several options are available to help with this, more in Apache 2 than in Apache 1.3 (see Table 3-10).

Table 3-10. Configure Script Debug Options

--without-execstrip (Apache 1.3 only): In Apache 1.3, this tells the build process to disable optimizations and not to strip the symbol tables out of the resulting Apache binary so it can be debugged. This is also helpful for analyzing core files left by a crashed Apache process after the fact.

--enable-maintainer-mode, --enable-debug (Apache 2 only): This is the Apache 2 equivalent of --without-execstrip and similarly produces a binary for debugging. It also turns on some additional compile-time debug and warning messages.

--enable-profile (Apache 2 only): Switches on profiling for the Apache Portable Runtime; for debugging only.

--enable-assert-memory (Apache 2 only): Switches on memory assertions in the Apache Portable Runtime; for debugging only.

A few other more specialized options are also available; see the bottom of the output from ./srclib/apr/configure --help for a complete (if rather terse) list of them. --enable-v4-mapped, for example, tells Apache that it can use IPv6 sockets to receive IPv4 connections. Because this is highly dependent on the operating system, it’s usually best to let configure choose these options for you. Many of these are general autoconf options rather than Apache-specific ones, and you can find more information about them in the autoconf documentation at http://www.lns.cornell.edu/public/COMP/info/autoconf/.

Configuring Apache for Binary Distribution

Apache may also be compiled as a distributable binary archive, which may be copied to other machines, unpacked, and installed. To do this, you must build it using the BinaryDistribution layout and make use of the binbuild.sh script, which is included in the build directory under the root of the source distribution. See “The Binary Distribution Layout” section earlier in the chapter where this is described in detail.

Configuring Apache’s Library and Include Paths

Apache relies on a lot of external libraries and headers to build itself. Various parts of the server require additional libraries and headers, or they’ll either disable themselves or be built with reduced functionality. As usual, options are available to help you teach Apache where to look for external libraries and their attendant headers and where to install its own so that utilities such as apxs can find them. Not all of these are listed by configure --help, but some can be found by running this:

$ srclib/package/configure --help

where package is one of apr, apr-util, or pcre.

You can subdivide these options into two loose categories: general options and module-specific options.

General Options

Table 3-11 lists the general options.

Table 3-11. Configure Script General Options

--includedir: The location of Apache’s include files. The default is PREFIX/include. This is a layout option (see Table 3-5).

--oldincludedir: The location of header files outside the Apache source distribution. The default is /usr/include.

--libdir: The location of Apache’s own libraries. The default is PREFIX/lib. This is a layout option (see Table 3-5).

You can specify external library and include paths with the -I and -L compiler options, for example:
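A sketch of one way to assemble such flags is shown below. The /opt/openssl and /opt/expat prefixes are hypothetical, the helper name add_prefix is made up for this example, and passing the flags through the CFLAGS and LDFLAGS environment variables is an assumption about how you choose to hand them to configure; the resulting command is printed rather than run.

```shell
#!/bin/sh
# Build -I and -L flags from a list of installation prefixes and print the
# resulting configure command. The prefixes used below are hypothetical.
include_flags=""
library_flags=""

add_prefix() {
    # Each prefix contributes one header directory and one library directory.
    include_flags="$include_flags -I$1/include"
    library_flags="$library_flags -L$1/lib"
}

add_prefix /opt/openssl
add_prefix /opt/expat

echo "CFLAGS=\"${include_flags# }\" LDFLAGS=\"${library_flags# }\" ./configure"
```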

Module-Specific Options

--with-dbm: Specifies the type of DBM database to use. This is used by mod_auth_dbm and the DBM map feature of mod_rewrite. The default is to use SDBM, which comes bundled with Apache but has limited functionality. A local version of a more powerful DBM such as GDBM can be specified with --with-dbm=gdbm.

--with-expat=DIR: Specifies the location of the Expat library and header files. configure will try several different permutations based on the base directory path specified by DIR; the default is to try /usr and /usr/local.

--with-ssl=DIR: Specifies the location of the OpenSSL library and header files, if mod_ssl has been enabled. Limited support for other SSL implementations is also available.

--with-z=DIR: Specifies the location of the Zlib compression library and header files, if mod_deflate has been enabled.

Configuring the Build Environment

Some of the more obscure Apache settings aren’t configurable via configure because they’re rarely needed except for very finely tuned servers or to enable experimental features that are otherwise disabled. If you need to enable one of these special options, then you have to define them—either in the environment before running configure or afterward in one of the EXTRA_ definitions contained in the following:

[1.3] src/Configuration.apaci
[2.0] config_vars.mk

The first route is by far preferable because rerunning configure will wipe out any changes you made to the files it generates.

As a practical example, one parameter you might want to change that isn’t available as a configure option is the hard process limit that forms the upper boundary of the MaxClients directive. To raise it so that configure sees and absorbs it, add it to the environment like this:

$ CFLAGS='-DHARD_SERVER_LIMIT=1024' ./configure …

This is just one of several values related to process and thread management that you can set at compile time; see the MPM discussion in Chapters 8 and 9 and Online Appendix J for more. For developers, one other value of note is GPROF, which creates a profiling Apache binary whose output can be analyzed with the gprof tool. It also enables an additional directive called GprofDir that determines the directory where the profile data file is created.

configure will take an environment variable and integrate it with the other compiler flags so that it’s active during compilation. The distinction is actually fairly arbitrary, and in fact either will work fine.

You can even override the compiler that’s used to carry out the build; you saw an example of that earlier when I discussed cross-compiling Apache.

As another example, Apache 2 provides the experimental mod_charset_lite module for on-the-fly character set conversion. This module won’t work unless you also define APACHE_XLATE, so to enable it as well as increase the hard server limit, modify the previous command to get this:

[2.0] $ CFLAGS="-DHARD_SERVER_LIMIT=1024 -DAPACHE_XLATE" ./configure …

Note the quotes, which are necessary if you want to specify more than one option this way.

You can also undefine something with -U if you want to undo a previously established setting. This works just the same as -D except, of course, you don’t supply a value. Undefining something that isn’t defined in the first place has no useful effect but is harmless.

Environment variables specified this way only last as long as the execution of the command that follows them. If you’re going to be reconfiguring Apache several times to refine the configuration, you can instead set the variable permanently (or at least for the lifetime of the shell). How you do this depends on the shell you’re using:
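A brief sketch for the two common shell families follows, reusing the HARD_SERVER_LIMIT value from the earlier example. The final line is only there to demonstrate that a child process inherits the exported value.

```shell
#!/bin/sh
# Bourne-style shells (sh, ksh, bash): assign, then export, so that every
# later ./configure run in this shell inherits the value.
CFLAGS='-DHARD_SERVER_LIMIT=1024'
export CFLAGS

# csh and tcsh use setenv instead (shown as a comment because it is not sh syntax):
#   setenv CFLAGS '-DHARD_SERVER_LIMIT=1024'

# A child process now sees the variable without it being repeated on the command line.
sh -c 'echo "CFLAGS is $CFLAGS"'
```

Remember to unset the variable (or start a fresh shell) before building unrelated software, or the extra definitions will follow you there as well.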

As I mentioned at the start, if you’ve already run configure, you can avoid rerunning it by editing the EXTRA_CFLAGS line in this:

[1.3] src/Configuration.apaci
[2.0] config_vars.mk

But keep in mind that rerunning configure will wipe out these edits.
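
A sketch of such an edit, using sed on a stand-in file so that it can be tried safely. The sample EXTRA_CFLAGS line and its existing flags are assumptions for illustration; check the actual line in your own tree before editing it.

```shell
work=$(mktemp -d)
conf="$work/config_vars.mk"

# Stand-in for config_vars.mk; the real file contains many more settings.
cat > "$conf" <<'EOF'
EXTRA_CFLAGS = -g -O2
EXTRA_LDFLAGS =
EOF

# Keep a backup, then append one define to the EXTRA_CFLAGS line only.
cp "$conf" "$conf.bak"
sed 's/^EXTRA_CFLAGS[[:space:]]*=.*/& -DHARD_SERVER_LIMIT=1024/' "$conf.bak" > "$conf"

grep '^EXTRA_CFLAGS' "$conf"
```

Keeping the .bak copy makes it easy to see what was changed by hand if configure is later rerun and the edit is lost.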

Building Modules with configure and apxs

Apache’s standard configuration script enables modules to be included or excluded in a flexible manner but only knows about modules that are supplied with Apache. To build third-party modules into Apache, you have to tell the configure script about them.

It’s tedious to have to reconfigure and rebuild Apache to add a dynamic module to it because you only actually want to build the module and not the entire server. For this reason, Apache comes with the apxs utility, a Perl script designed to configure and compile third-party modules without the need to have Apache’s source code present.

So Apache presents you with three options to add new modules to the server:

Add a new module to the Apache source tree and tell configure to use it.

Place the module source code somewhere in the file system and tell configure where to find it.

Use apxs to build the module as a dynamic loadable module independently from configure.

However, configure only works for modules that have their source code contained in a single file. More complex modules require additional steps and usually come with their own installation scripts. These tend to use apxs to build themselves because apxs is configured with the installation information for the version of Apache that created it and can handle more than one source file. In general, if a module comes with its own configuration script, you should use it rather than try to handle the module with configure.

It’s not possible to use apxs in all situations. Very occasionally, a module may require patches to be made to the Apache source code itself before it can be built, dynamically or otherwise. To use these modules, you must therefore rebuild Apache after applying the necessary patches; apxs on its own will not be enough. Luckily, this is a rare occurrence.

Adding Third-Party Modules with configure

The configure script allows extra modules to be incorporated into the build process with the use of two additional options, --activate-module and --add-module. In Apache 2, --activate-module has been replaced by the semantically similar --with-module.

For example, to include the third-party module mod_bandwidth into Apache 1.3 as a static module, you first copy the source file mod_bandwidth.c into the /src/modules/extra directory and then tell configure to use it with this:

[1.3] $ ./configure --activate-module=src/modules/extra/mod_bandwidth.c

You have to specify a relative pathname to the file that starts with src/modules in Apache 1.3; configure will not automatically realize where to find it—in this case, you have put the code in the extra directory, which exists in the Apache 1.3 source tree for just this purpose.

In Apache 2, the source distribution is organized a little differently, with a top-level modules directory under which modules are subcategorized by type: filters, generators, loggers, and so on. There’s no extra directory as standard, but you can easily create one and then include a third-party module by copying the module source and activating it with this:

[2.0] $ ./configure --with-module=extra:mod_bandwidth.c

Many third-party modules provide their own installation scripts. This is typically the case where the module involves multiple source files and can’t be built using Apache’s default module compilation rules. These scripts typically build the module and then copy it into the Apache modules directory tree, ready for Apache to link it. Accordingly, both --activate-module and --with-module will also accept an object file or a shared library object as the file parameter, for example:

[2.0] $ ./configure --with-module=extra:mod_bandwidth.o

You don’t have to stick with the name extra in Apache 2; you can just as easily create a mymodules directory if you prefer. On Unix systems, you can also use a symbolic link to point to a directory outside the distribution.
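
The symbolic link approach might look like the following sketch. The directory names httpd-src and my-modules are assumptions, created under a temporary directory so that nothing real is touched; in practice the link would live in the modules directory of the Apache 2 source tree.

```shell
work=$(mktemp -d)
mkdir -p "$work/httpd-src/modules" "$work/my-modules"

# Empty placeholder standing in for the real module source file.
: > "$work/my-modules/mod_bandwidth.c"

# Make modules/extra point at the directory outside the source tree.
ln -s "$work/my-modules" "$work/httpd-src/modules/extra"

ls "$work/httpd-src/modules/extra"
```

With the link in place, the module source is reachable under modules/extra without having been copied into the distribution.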

If configure finds the source code for the module, Apache 1.3 will print out an opening dialogue such as the following:

Rather than spending time copying module source code, you can have configure do it for you with the --add-module option. This has the same effect as --activate-module, but first copies the source code for the module from the specified location into src/modules/extra before activating it:

[1.3] $ ./configure --add-module=/path/to/mod_bandwidth.c

Once the module has been added, it can subsequently be configured with --activate-module because the source code is now within the Apache source tree. It’s not necessary to keep copying in the source code with --add-module.

Building Modules with apxs

apxs is a stand-alone utility for compiling modules dynamically without the need to use the configure script or have the Apache source code available. It does need Apache’s header files, though, which are copied to the location defined by --includedir when Apache is installed. However, it’s important to use an apxs that was built with the same configuration options as Apache; otherwise, it’ll make erroneous assumptions about where Apache’s various installation locations are.

At best, this will mean you can’t use apxs to install modules; at worst, apxs won’t be able to find the header files and simply won’t work at all.

However, apxs is totally useless for static Apache servers and isn’t even installed unless mod_so is built for dynamic module support. Administrators migrating to Apache 2 will be glad to learn that despite the substantial changes since version 1.3, the apxs command line hasn’t changed at all and is the same for Apache 2 as it was in Apache 1.3.

Platforms that offer prebuilt packages often put utilities such as apxs into a separate optional package along with the header files. If the standard Apache installation doesn’t include apxs, look for it in a package called apache-devel or similar.

apxs takes a list of C source files and libraries and compiles them into a dynamic module. To compile a simple module with only one source file, you could use something like this:

$ apxs -c mod_paranoia.c

This takes the source file and produces a dynamically loadable module called mod_paranoia.so. You can also compile stand-alone programs with apxs if you give it the -p option:

$ apxs -p -c program_that_uses_apache_libraries.c

apxs will happily accept more than one source file and will also recognize libraries and object files, adding them at the appropriate stage of the linking process:

$ apxs -c mod_paranoia.c libstayalert.a lasershandy.o

The -c option enables the use of a number of other code building options, most of which are passed on to the C compiler (see Table 3-13).

Table 3-13. apxs Command Line Options

Option: Description

-o outputfile: Sets the name of the resulting module file rather than inferring it from the name of the input files, for example, -o libparanoia.so.

-D name=value: Sets a define value for the compiler to use when compiling the source code, for example, -D DEBUG_LEVEL=3.

-I includedir: Adds a directory to the list of directories the compiler looks in for header files, for example, -I /include.

-L libdir: Adds a directory to the list of directories the linker looks in for libraries at the linking stage, for example, -L /usr/local/libs.

-l library: Adds a library to the list of libraries linked against the module, for example, -l ldap (assuming you have a libldap somewhere).

-Wc,flag: Passes an arbitrary additional flag to the compiler. The comma is important to prevent the flag being interpreted by apxs.

-Wl,flag: Passes an arbitrary flag to the linker. The comma is important to prevent the flag being interpreted by apxs, for example, -Wl,-s strips symbols from the resulting object code on some compilers. -L is shorthand for -Wl,-L.
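
Several of these options can be combined in one invocation. The sketch below assembles a command line from the example values in Table 3-13 and prints it, running it only if apxs and the source file are actually present. The particular combination is illustrative rather than a recommended build.

```shell
# Assemble the command from the table's example values.
set -- apxs -c -o libparanoia.so -D DEBUG_LEVEL=3 \
    -I /include -L /usr/local/libs -l ldap -Wl,-s mod_paranoia.c
cmd="$*"
echo "$cmd"

# Only attempt the real build if apxs and the source file are both present.
if command -v apxs >/dev/null 2>&1 && [ -f mod_paranoia.c ]; then
    "$@"
fi
```

Printing the command first is a convenient way to check the option order before committing to a build.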

Installing Modules with apxs

Once a module has been built, apxs can then install it into the place configured for modules (previously specified by the --libexecdir option), for example:

$ apxs -i mod_paranoia.so

This installs the previously built module into the configured libexec directory.

In addition, you can use the -a option to have apxs modify Apache’s configuration (that is, httpd.conf) to add the LoadModule directive (plus an AddModule directive in Apache 1.3), so Apache will load the module when it’s restarted:

$ apxs -i -a mod_paranoia.so

If the directive already exists but is commented out, apxs is smart enough to just uncomment the existing line. This means that for Apache 1.3, the module loading order is preserved. (Apache 2 doesn’t rely on the order anyway, so it isn’t bothered by this issue).

When adding a new module to Apache 1.3, it’s important to realize that apxs has no special knowledge of where the module should be in the loading order, so it simply adds the LoadModule and AddModule directives to the end of their respective lists. Thus, before restarting Apache, you should take the time to check if this is the correct order. For example, because Apache 1.3 gives priority to modules that are loaded later, it’s often necessary to move modules such as mod_setenvif to the end, so they can always act before third-party modules that might rely on the settings of environment variables.

If the module is already installed but the configuration doesn’t contain the corresponding directives to load it, you can instead use the -e option. This essentially tells apxs to recognize the -a flag but to not actually install the module:

$ apxs -e -a mod_paranoia.so

Alternatively, if you want to add the relevant lines but have them disabled currently, you can use the -A option instead, with either -i or -e. To install and configure the module in a disabled state, use this:

$ apxs -i -A mod_paranoia.so

To configure the module in a disabled state without installing it, use this:

$ apxs -e -A mod_paranoia.so

Both commands add the directives, but prefix them with a # to comment them out of the active configuration. If they’re already present, then they’re commented out in place; otherwise, they’re added to the end.
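
To enable a directive later that was added in the disabled state, it is enough to remove the leading #. The following is a minimal sketch with sed, run against a stand-in httpd.conf. The file contents and the module identifier paranoia_module are assumptions for illustration, and this is not a description of how apxs itself is implemented.

```shell
work=$(mktemp -d)
conf="$work/httpd.conf"

# Stand-in configuration with one active and one disabled LoadModule line.
cat > "$conf" <<'EOF'
LoadModule setenvif_module modules/mod_setenvif.so
#LoadModule paranoia_module modules/mod_paranoia.so
EOF

# Uncomment the paranoia_module line in place, leaving its position unchanged.
sed 's/^#\(LoadModule paranoia_module .*\)/\1/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

cat "$conf"
```

Because the line is edited where it stands, the loading order of the other modules is unaffected.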

On rare occasions, the name of the module can’t be directly inferred from the name of the source file, in which case you have to specify it explicitly with the -n option to ensure that the directives added by -a or -A are correct:

$ apxs -i -n paranoid -a mod_paranoia.so

You can combine the build and install stages into one command by specifying both the -c and -i options at the same time:

$ apxs -c -i -a mod_paranoia.c

Generating Module Templates with apxs

apxs can also generate template modules with the -g option to kick-start the development process for a new module. For this to work, you must specify the module name with the -n option:

$ apxs -g -n paranoia

This will create a directory called paranoia within which apxs will generate a makefile that has various useful targets for building and testing the module and a source file called mod_paranoia.c. When compiled, the module provides no directives, but creates a handler you can use to prove that the module works. The handler name is based on the module name, in this case paranoia_handler.

Remarkably, you can combine all the previous stages to create, build, and install a module into Apache in just three commands:

$ apxs -g -n paranoia
$ cd paranoia
$ apxs -c -i -a mod_paranoia.c

Of course, this module will do very little, but you can test it by registering the default handler somewhere in the configuration:

AddHandler paranoia_handler .par

If you test this by creating a file called index.par (or any file with a .par extension), you’ll get a test page with the message:

The sample page from mod_paranoia.c

Overriding apxs Defaults and Using apxs in makefiles

The Apache build process preconfigures apxs so that it automatically knows all the details of how Apache was built, where it was installed, and what compiler options were used. This allows apxs to build modules in the same way and to install them into the correct place automatically.

You may possibly need to add to or modify one of these presets, so apxs supplies the -S option to allow you to override any of its built-in presets. For example, to have apxs modify a configuration file that was moved to a different location after Apache was installed, you can override the SYSCONFDIR preset:

$ apxs -S SYSCONFDIR=/moved/conf/my_httpd.conf -i -a mod_paranoia.so

apxs is designed not just to build modules itself but also to provide a means for more complex modules to implement their own build processes. It enables them to use apxs to build and install themselves, automatically acquiring the correct defaults and path information, rather than having the information configured by hand. For this reason, apxs provides a query mode that allows configuration and compile-time details to be extracted with the -q option. Three groups of values can be returned by -q or set with -S.

In a configured Apache 2 source distribution, the configuration settings are stored in config_vars.mk.

There are many values here, including locations such as bindir, sbindir, and datadir; their expanded versions in exp_bindir, exp_sbindir, and exp_datadir; and their locations relative to the prefix in rel_bindir, rel_sbindir, and rel_datadir. You can also query the operating system with OS, the list of configured dynamic modules with DSO_MODULES, and the subdirectories that will be examined for modules to build with MODULE_DIRS, amongst many others.

For example, to return the flags used by Apache to build dynamic modules, you’d use this:

$ apxs -q CFLAGS_SHLIB

Modules can use these values in their own makefiles, allowing them to compile independently of apxs without having to replicate the configuration setup work previously done by configure when Apache was built. For example, to use the same compiler and compiler options used to build Apache originally, you could put these lines in the module’s makefile:

CC=`apxs -q CC`
CFLAGS=`apxs -q CFLAGS`
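
The same queries work from a shell build script. The sketch below is one way to let such a script degrade gracefully. The fallback to cc with empty flags when apxs is not installed is an assumption of this example, not something apxs provides, and source_of_settings is a hypothetical helper variable.

```shell
# Ask apxs for Apache's compiler settings when it is available.
if command -v apxs >/dev/null 2>&1; then
    CC=$(apxs -q CC)
    CFLAGS=$(apxs -q CFLAGS)
    source_of_settings=apxs
else
    # Assumed fallback for machines without apxs on the PATH.
    CC=cc
    CFLAGS=
    source_of_settings=fallback
fi

echo "compiler: $CC (from $source_of_settings)"
echo "flags: $CFLAGS"
```

Reporting where the settings came from makes it obvious when a module is about to be built with flags that do not match the installed server.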

Summary

In this chapter, you saw how to build the Web server you want by compiling the required source code components. You looked at the advantages and disadvantages of static and dynamic loading of Apache’s modules and saw how to customize Apache’s build process using the configure script for both Apache 1.3 and Apache 2. You also looked at the apxs script and saw how to build modules to add to an existing server.

There’s far more detail in this chapter than you really need to get a grip on right away; you can ignore many of the more advanced options until you encounter a situation that requires them. You can generate a simple configuration in only two commands—the rest is merely detail.

As you go on to configure Apache, you may need to come back and rebuild it—either to include some additional functionality into the server or to restructure its layout to suit some new requirement. As this chapter has shown, compiling source code is nothing to be afraid of—it gives you a great deal of control over Apache that you otherwise wouldn’t be able to get with a binary distribution, and for the most part it’s both easy and painless. Building applications from source tends to be an alien concept to the proprietary software world, but for open-source projects it’s both common and unremarkable.