Apache Web-Serving with Mac OS X, Part 4

Editor's note: Kevin Hemenway covered a lot of ground in the
first three parts of this Web-serving primer, starting with
the basics and moving on to topics such as CGI, SSI, PHP, and
access control. In his fourth article, he takes a step back from
the major features and focuses on what you, the reader, have been
asking about.

Whistle a merry ditty, trumpet a happy tune, pirouette a silly
maneuver -- something magical has happened. Your boss, that proponent of
Windows dedication and desire, was rather impressed with your Mac OS X Web
server. In fact, he commissioned the entire GatesMcFarlaneCo staff to poke
around "our glorious new intranet" and see what they thought. Naturally,
the feature requests and "maybe you should"s came rolling in.

In this, the fourth of the trilogy (Adams would be proud!), we're
going to take a step back from the major features and explore a bit
into what else you can do with a stock Apache installation. The features
below can be applied to any Apache installation, and most require stopping and
starting Apache before they become active.

Default Index Documents

In the last two articles, we talked about using Server Side Includes
(SSI) and PHP. By doing so, we instructed our beloved Apache to parse
.shtml files for SSI statements, and .php files for PHP code. We also
quickly gave some examples of a working index.shtml, as well as an
informational index.php.

Most of you (including Garrett from GatesMcFarlaneCo's Accounting)
noticed that when we changed our index.html to one of the names above
(index.shtml or index.php), Apache no longer loaded that page by
default. Instead, it produced an automatically-generated listing of all
of the files in that directory. Not only is this unfriendly for our
visitors, but it can potentially be a security hazard.

Fixing this is easy. As with all our Apache configuration changes, we
want to open the /etc/httpd/httpd.conf file in a normal text editor,
like BBEdit or pico. We're
looking for something called "DirectoryIndex," which tells Apache what
file to display when one hasn't been specified (like "http://localhost/"
or "http://127.0.0.1/~morbus/"). After searching, we should see a
line similar to:

DirectoryIndex index.html

For Mac OS X, Apache has been configured to automatically display
index.html files when only a directory has been supplied, like in the
URLs above. When we renamed our index.html to index.shtml or index.php
for testing, Apache couldn't find its DirectoryIndex, and decided to
spit out what it could find -- the contents of the directory itself.

We're not restricted to only one possible DirectoryIndex.
We could use index.html most of the time, index.php some of the time,
and perhaps insomnia caused the rather suggestive zzzdex.shtml. Apache
can be told to look for all of these, in order of preference:
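
A minimal sketch of such a line, assuming the three filenames above in the order the next paragraph walks through:

```apache
DirectoryIndex index.html index.php zzzdex.shtml
```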

In this case, we're saying "Hey, if someone doesn't request a
particular file in a URL, then look for index.html. If it's there, cool,
display that. If not, try looking for index.php. If that's not there,
try zzzdex.shtml. If that's not there, then yeah, I
suppose you can automatically generate an index."

You can add as many entries as you wish to the DirectoryIndex, but
you do want to try to keep the most common filename first. If you're
serving thousands of pages a second, a properly ordered DirectoryIndex
will save you a tiny bit of time and processing.

Of course, our trusty Garrett thinks the automatically-generated
indexes are "ugly and unbecoming of the GatesMcFarlaneCo mystique." While
we can certainly question the company's "mystique" (lemmings as a
mascot?), it's probably simpler just to turn autogeneration off.
This is a simple matter of removing the word "Indexes." If you do a
search for this in your Apache config file, you'll happen upon:

Options Includes Indexes FollowSymLinks MultiViews

You should remember this as the line that we added "Includes" to when
we were fiddling with SSI. By removing "Indexes" and restarting Apache,
you're stopping the index autogeneration for the specified directory
and its subdirectories (which, in this case, is anything in
/Library/WebServer/Documents).
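
As a sketch, and assuming your Options line matches the one shown above, the trimmed version would read:

```apache
# "Indexes" removed: no more autogenerated listings for this directory
Options Includes FollowSymLinks MultiViews
```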

With the above "Indexes" change, if Apache can't find any of the
filenames listed in the DirectoryIndex, it will complain with an error
like "You don't have permission to access / on this server." This may
not be exactly what you wanted either, so let's continue on with...

Custom Error Pages

Much like ghost sites have become a standard Internet occurrence,
custom error pages are also becoming status symbols. There's nothing
fancy in creating an error page -- it's just a plain old HTML document
that you tell Apache to display instead of its default error page.

Say we created a simple HTML page called oops.html that has a
cutesy little "I can't believe it's not butter" error message.
We save the file in /Library/WebServer/Documents/ and we want Apache to
display this for errors instead of its default. Rip open your Apache
configuration file, and do a search for "ErrorDocument." You'll see a
large blurbage of text, in which the important lines look like:
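
Assuming the stock Apache 1.3 configuration, those lines read roughly as follows. Treat this as a sketch of the default file, since your copy may differ slightly (the lone opening quote on the first line is how Apache 1.3 marks a plain-text message):

```apache
#ErrorDocument 500 "The server made a boo boo.
#ErrorDocument 404 /missing.html
#ErrorDocument 402 http://some.other_server.com/subscription_info.html
```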

These three commented lines demonstrate the three different methods
of defining an error response. In the first example, the quoted text is
passed directly to the browser (you can use HTML if you wish). The
second example tells Apache to display the missing.html file located
in the DocumentRoot (/). The final example tells Apache to
redirect the user to some.other_server.com.

The numbers you see above, like 500, 404, and 402, are also important.
These are error codes (defined in the HTTP
1.1 RFC) that represent the reasons why the error occurred. The most
common error is 404, often seen as "404 Not Found." Uncommenting
the second line above would tell Apache that you want the missing.html
file to be shown each time a 404 error is triggered. Likewise, error 500 is an
"Internal Server Error," and often occurs when CGI scripts or other
server programming goes awry.

If you recall from above, Apache will spit out a "Forbidden" error message
when index autogeneration has been turned off and no DirectoryIndex file is
found. If we look in the RFC, we can see that the error code for "Forbidden"
is 403. With this knowledge, we could configure our ErrorDocuments like so:
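
A sketch of those lines, assuming oops.html and oops-500.html both sit in the DocumentRoot (/Library/WebServer/Documents/):

```apache
ErrorDocument 500 /oops-500.html
ErrorDocument 404 /oops.html
ErrorDocument 403 /oops.html
#ErrorDocument 402 http://some.other_server.com/subscription_info.html
```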

With this configuration, we're telling Apache to display oops.html
for errors "404 Not Found" and "403 Forbidden", and oops-500.html for
any "500 Internal Server Error." We're leaving 402, "Payment Required,"
commented, since it's rarely seen in the wild.

Error documents can get pretty smart. For instance, you could send
all errors off to a CGI script that would find out what incorrect URL
was visited, and whether the user clicked on a link from another site. You
could then redirect the user to the nearest possible match, based on
where they initially tried to go.
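
As a rough sketch, pointing errors at a script is just another local ErrorDocument. The script name oops.cgi below is hypothetical, and you would still have to write the script itself:

```apache
# oops.cgi is a made-up name; any script in your cgi-bin would do.
# For a local ErrorDocument, Apache hands the script the URL that failed
# in the REDIRECT_URL environment variable; the referring page, if the
# browser sent one, arrives as HTTP_REFERER.
ErrorDocument 404 /cgi-bin/oops.cgi
ErrorDocument 403 /cgi-bin/oops.cgi
```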

User-Based Configurations

Patti. Dear, dear Patti. The cutest secretary in the world, but also
a rabid collector of fax cover sheets. Being on the boss' good side has
granted her the privilege of running a personal Web site, where she can
share the dirt on Californian headers and Alabama footers. We didn't
touch upon user-based configurations in the first three articles, but
Mac OS X approaches them a bit differently than you'll find in most
Apache installations.

In most installations, user-based Web serving like
http://127.0.0.1/~patti/ is handled generically -- for every user on the
system, be it two or two thousand, the same configuration applies. If an
administrator wanted to change the capabilities of user "mimi," he'd
usually have to create a specific <Directory> block within the
httpd.conf file.

Mac OS X makes this a lot easier by creating a config file for
each user of the system -- these files are located in /etc/httpd/users/
and take the form of username.conf. If I open patti.conf, for instance,
I see:
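
On a default Mac OS X install of this era, the generated file looks something like the following. This is a sketch from the stock setup, so the exact options in your copy may vary:

```apache
<Directory "/Users/patti/Sites/">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
```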

Because of the similarities to the <Directory> blocks in httpd.conf, everything we've learned in the previous
articles can also be applied to these user-specific directories. Take a
look at the modified patti.conf below. It allows SSIs and CGIs, and will
block access from everyone but the local machine:
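
One way that modified file might look, as a sketch in Apache 1.3 syntax. It assumes the SSI and CGI handlers from the earlier articles are already enabled in httpd.conf, and that "the local machine" means 127.0.0.1:

```apache
<Directory "/Users/patti/Sites/">
    # Includes turns on SSI; ExecCGI lets CGI scripts run here
    Options Indexes MultiViews Includes ExecCGI
    AllowOverride None
    # Deny everyone first, then let only the local machine back in
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Directory>
```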

With the above configuration, Patti can Web serve with the best of 'em,
adding message boards or discussion groups to each specimen of her
faxtastic collection. By modifying only the patti.conf file, we can turn
on or off features for only her directory, without affecting the main
GatesMcFarlaneCo configuration.