The Apache server’s mod_rewrite module gives you the ability to transparently redirect one URL to another, without the user’s knowledge. This opens up all sorts of possibilities, from simply redirecting old URLs to new addresses, to cleaning up the ‘dirty’ URLs coming from a poor publishing system — giving you URLs that are friendlier to both readers and search engines.

An Introduction to Rewriting

Readable URLs are nice. A well designed website will have a logical file system layout, with smart folder and file names, and as many implementation details left out as possible. In the most well designed sites, readers can guess at filenames with a high level of success.

However, there are some cases when the best possible information design can’t stop your site’s URLs from being nigh-on impossible to use. For instance, you may be using a Content Management System that serves out URLs that look something like

http://www.example.com/viewcatalog.asp?category=hats&prodID=53

This is a horrible URL, but it and its brethren are becoming increasingly prevalent in these days of dynamically-generated pages. There are a number of problems with an URL of this kind:

It exposes the underlying technology of the website (in this case ASP). This can give potential hackers clues as to what type of data they should send along with the query string to perform a ‘front-door’ attack on the site. Information like this shouldn’t be given away if you can help it.

Even if you’re not overly concerned with the security of your site, the technology you’re using is at best irrelevant — and at worst a source of confusion — to your readers, so it should be hidden from them if possible.

Also, if at some point in the future you decide to change the language that your site is based on (to » PHP, for instance); all your old URLs will stop working. This is a pretty serious problem, as anyone who has tackled a full-on site rewrite will attest.

The URL is littered with awkward punctuation, like the question mark and ampersand. Those & characters, in particular, are problematic because if another webmaster links to this page using that URL, the un-escaped ampersands will mess up their XHTML conformance.

Some search engines won’t index pages which they think are generated dynamically. They’ll see that question mark in the URL and just turn their asses around.

Luckily, using rewriting, we can clean up this URL to something far more manageable. For example, we could map it to

http://www.example.com/catalog/hats/53/

Much better. This URL is more logical, readable and memorable, and will be picked up by all search engines. The faux-directories are short and descriptive. Importantly, it looks more permanent.

To use mod_rewrite, you supply it with the link text you want the server to match, and the real URLs that these URLs will be redirected to. The URLs to be matched can be straight file addresses, which will match one file, or they can be regular expressions, which will match many files.

Basic Rewriting

Some servers will not have » mod_rewrite enabled by default. As long as the » module is present in the installation, you can enable it simply by starting a .htaccess file with the command

RewriteEngine on

Put this .htaccess file in your root so that rewriting is enabled throughout your site. You only need to write this line once per .htaccess file.