Making Old Links Work with mod_rewrite

So one of the big concerns with moving from an old blog installation to new software is the possibility of breaking links from around the Internet to content that I have created. At first, one might think that this kind of thing would generate a massive SEP field, but there’s more at stake than just a broken link.

What is that, you ask? PageRank. The content on my site has garnered a non-trivial score in Google’s view of the world, and such a score is valued, as it determines not only how high your own pages appear in the results for any given Google query, but also how high the people to whom you link appear. It is valued so much, in fact, that people often resort to comment spamming even mildly popular sites in order to boost their own search rankings.

So what’s the tool of choice? Well, mod_rewrite, of course. mod_rewrite is an Apache module that performs the black magic of rewriting URLs from one form into another. In my particular case, the requirement was to map a Geeklog story ID (SID) into a Drupal node ID (NID). Surprisingly, that’s relatively easy to do.

The .htaccess file in this application’s directory contains the following declarations. The first line, a RewriteCond, examines the incoming query string for a variable called story and captures its value using a regular expression. The second line, a RewriteRule, looks for a request beginning with article.php and maps it to a value pulled from a mapping called geekmap. Notice the %1, which is a backreference to the SID captured in the previous line. The question mark at the end of the target discards the original query string. Finally, the L flag stops all further mod_rewrite processing, and the R flag causes the new URL to be returned as an HTTP redirect. It has to be written as R=301 to make the redirect permanent, since a bare R sends a temporary 302.
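As a rough sketch of what such declarations could look like: the regular expression, the leading slash on the target, the map file path, and the sample map entry are all assumptions made for illustration, not a copy of the real file. One detail worth knowing is that Apache only accepts the RewriteMap directive in the server or virtual host configuration, so the geekmap definition itself cannot live in .htaccess, even though the map can be used from there.

```apache
# --- Server or virtual host config (RewriteMap is not allowed in .htaccess) ---
# Hypothetical path. Each line of the text file pairs an old Geeklog SID with
# the new path, for example:
#   20040101120000123   some_old_story_title
RewriteMap geekmap txt:/etc/apache2/geekmap.txt

# --- .htaccess in the application's directory ---
RewriteEngine On

# Line 1: find story=<SID> in the query string and capture the SID as %1.
# The (?:...) group assumes Apache 2.x, which uses PCRE.
RewriteCond %{QUERY_STRING} (?:^|&)story=([^&]+)

# Line 2: match article.php, look the SID up in geekmap, and redirect.
# The trailing ? drops the old query string; R=301 makes it permanent.
RewriteRule ^article\.php /${geekmap:%1}? [R=301,L]
```

With this sketch, a SID that is missing from the map expands to an empty string and the visitor is redirected to the site root. The lookup also accepts a fallback, written as ${geekmap:%1|some_default}, if a different landing page is preferred.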

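The two-part rewriting is required because a RewriteRule pattern is matched only against the URL path, never the query string. The query string has to be tested separately in a RewriteCond, which then hands its captures to the rule as %1, %2 and so on.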

And that’s it! I do similar mappings for the old static page links, as well as the old RSS feed, but those are all much simpler cases. Finally, there’s some other Drupal magic on the backend to convert those long underscored URLs into node ID URLs (such as node/136), as well as further mod_rewrite magic to make the Drupal URLs pretty – but that’s beyond the scope of all this.