A few months back we launched a site redesign/redevelopment project for a client, and made a simple mistake that had some interesting ramifications. It’s worth posting here so others don’t make the same mistake.

What Happened

When redeveloping the site, we moved the WordPress instance from the web root into a /wordpress/ subdirectory.

I consider this to be a best practice for WordPress-powered sites, as it makes version upgrades a bit easier (among other things). Some plugins don’t work well with symlinked “plugins” directories, but those generally have easy fixes (use ABSPATH, not dirname() in your plugin).

These changes were made and thoroughly tested on our dev server with historical data. We then made the changes in a staging environment under a beta.example.com style domain name before pushing the changes live. In short, we were pretty careful and tested pretty well.

The Result

When we finally pushed everything live, the production server was brought to its knees. Yikes!

A little poking around, some bug reports, and very shortly thereafter we figured out the problem. We had failed to set up a rewrite rule to map content from:

/wp-content/uploads/

to:

/wordpress/wp-content/uploads/

where it was now located.

An omission like this wouldn’t have had much effect on server load in some site configurations, but one of the wonderful features of WordPress – its lovely permalinks – has a hidden caveat that can be exposed if someone makes a configuration mistake (like we did).

WordPress’s permalink system basically works like this:

1. See if a file exists in the location that was requested; if the file exists, serve it. This is how images, media, non-WordPress files, etc. are served without conflicting with WordPress.

2. If no file is found at the location that was requested, then pass the URL to WordPress and see if WordPress can figure out what to show.
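The two steps above can be sketched in a few lines of Python. This is only a model of the decision, not how Apache or WordPress implement it; the function name `route_request` and the temporary directory layout are made up for illustration.

```python
from pathlib import Path
import tempfile


def route_request(doc_root: Path, url_path: str) -> str:
    """Return which handler would serve the request: 'static' or 'wordpress'.

    Hypothetical helper: models only the "does a file exist?" check
    described in the post.
    """
    candidate = doc_root / url_path.lstrip("/")
    if candidate.is_file():
        # Step 1: a real file exists, so the web server can serve it directly.
        return "static"
    # Step 2: nothing on disk, so the URL is handed to WordPress to resolve.
    return "wordpress"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        upload = root / "wordpress" / "wp-content" / "uploads" / "photo.jpg"
        upload.parent.mkdir(parents=True)
        upload.write_bytes(b"fake image bytes")

        # New location: the file exists, so no WordPress involvement.
        print(route_request(root, "/wordpress/wp-content/uploads/photo.jpg"))
        # Old location: no file, so the request falls through to WordPress.
        print(route_request(root, "/wp-content/uploads/photo.jpg"))
```

The second call is the interesting one: a request for a missing image is not cheap, because it takes the same path as a request for a post.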

It’s a really elegant system and works very well. However, it also means that 404 requests – HTTP requests that result in a “file not found” message – have a much higher impact on the server than a traditional 404 request does. For every 404, the server instantiates WordPress, does some database work to try to see if it can figure out what to serve, etc.

When we didn’t set up proper redirects for the content in “uploads”, we basically increased the server load by roughly a factor of 20. Instead of a single request going to WordPress and 20 requests for static content being served directly by Apache, all 21 requests were being sent to WordPress.
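The 1-versus-21 comparison can be written out as a small calculation. The function name and the assumption that every page view pulls exactly 20 uploaded assets are illustrative; real pages vary.

```python
def wordpress_hits(page_views: int, assets_per_page: int, assets_resolve: bool) -> int:
    """Count how many requests end up instantiating WordPress.

    Hypothetical model: each page view is always rendered by WordPress;
    its assets only reach WordPress when their URLs do not resolve to files.
    """
    hits = page_views  # the page itself
    if not assets_resolve:
        # Every broken asset URL falls through to WordPress as a 404.
        hits += page_views * assets_per_page
    return hits


if __name__ == "__main__":
    healthy = wordpress_hits(page_views=1, assets_per_page=20, assets_resolve=True)
    broken = wordpress_hits(page_views=1, assets_per_page=20, assets_resolve=False)
    print(healthy, broken, broken / healthy)
```

Counting only WordPress invocations, the ratio is 21 to 1, which is where the "factor of 20" ballpark comes from.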

The increased load on the server had some… adverse effects on performance. Yeah.

Why We Missed It

So this should be a pretty easy thing to catch, right? If the URLs to the images are broken, we’d all be seeing a bunch of broken images all over the place in our testing – right?

Not exactly, and for two different reasons:

1. When we tested on the production server using the beta.example.com hostname, the content was still pointing to example.com and the live production server was dutifully serving the images. It wasn’t until we pushed the changes live that the images were no longer at the previous URLs.

2. Even when the images were no longer in the proper place, the browsers we were testing in were showing the images properly. This was due to the longer Expires headers we had just implemented for media on the server in order to reduce overall server load. In casual testing, everything looked to us to be working properly.

Once we tested in browsers that didn’t have the images cached, we immediately saw the problem.
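The first of these blind spots, staging pages quietly loading images from the live host, is the kind of thing a small script can flag. The sketch below uses only the Python standard library; the class and function names are invented for this example, and it looks only at `<img src>` attributes, not CSS backgrounds or other media.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def off_host_images(html: str, page_url: str) -> list[str]:
    """Return absolute image URLs served by a host other than the page's own."""
    page_host = urlparse(page_url).hostname
    collector = ImageSrcCollector()
    collector.feed(html)
    absolute = [urljoin(page_url, src) for src in collector.sources]
    return [url for url in absolute if urlparse(url).hostname != page_host]


if __name__ == "__main__":
    sample = (
        '<img src="http://example.com/wp-content/uploads/a.jpg">'
        '<img src="/wordpress/wp-content/uploads/b.jpg">'
    )
    # Only the first image is flagged: it comes from the live host,
    # so the staging server is not actually being exercised for it.
    print(off_host_images(sample, "http://beta.example.com/"))
```

This does nothing about the second blind spot; cached images still need a check from a browser or client with an empty cache.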

Easy Fix

Luckily, this is a really easy problem to fix. A simple mod_rewrite rule (placed before the standard WordPress rewrite block) fixed everything right up:
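The rule itself is not reproduced here. As a stand-in, the following Python sketch shows the mapping such a rule needs to perform, using the two paths named earlier in the post. The function name is hypothetical, and whether the original rule rewrote internally or issued a redirect is not something this sketch assumes.

```python
import re
from typing import Optional

# Matches the old uploads location and captures everything after it.
# In mod_rewrite terms this corresponds roughly to a pattern like
# ^wp-content/uploads/(.*)$ with /wordpress/wp-content/uploads/$1 as the
# target (an assumption for illustration, not the rule from the post).
OLD_UPLOADS = re.compile(r"^/wp-content/uploads/(.*)$")


def map_legacy_upload(url_path: str) -> Optional[str]:
    """Map an old uploads URL path to its new location, or return None."""
    match = OLD_UPLOADS.match(url_path)
    if match is None:
        # Not an old uploads URL: leave it to the normal WordPress rules.
        return None
    return "/wordpress/wp-content/uploads/" + match.group(1)


if __name__ == "__main__":
    print(map_legacy_upload("/wp-content/uploads/2009/01/photo.jpg"))
    print(map_legacy_upload("/about/"))
```

The ordering mentioned above matters for the same reason the `None` branch exists: old uploads URLs have to be caught before the catch-all that hands unknown paths to WordPress.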

On a side note, wouldn’t it have been easier if you just modified the virtual host configuration of the webserver to make public_html/wordpress the new document root, rather than publishing the blog in a subdirectory of the main domain and having to remap everything?

If the subdirectory installation is just to ease the upgrade (I just use svn) maybe it makes sense, oh well.

I habitually skim over stray (outdated) requests in the site stats and access logs, whenever I move a site or change a site’s structure – but as WP camouflages all 404s in this case you wouldn’t even have a chance by eagle-eyeing into Apache’s error_log.

Alex, I’m bookmarking this explanation. I did *not* realize that URLs were being passed to WordPress to resolve in such a manner. I figured it was one pass, then straight to 404. (and while being far from a programmer, I’m far from a newbie too.)

This is significant, because this may explain the performance disparity that some of the more competitive code-jockeys report when comparing WordPress to Movable Type. “My server goes haywire, WordPress is teh suck, PHP is a crappy language”, blah-blah-yada-yada.

It might simply be a case of a folder configuration that is causing some DBs to get hammered, while others glide right through. I never would have looked at such a thing from an optimization standpoint.

Typically when I move a site, I do a find + replace on the SQL database (new URL for old URL), mainly to fix attachment URLs and intra-site links. I had some experiences like yours early on, and that seemed like the more sustainable solution.

That’s a good way to do it as well, but has some issues too. For example, your testing on beta.example.com would result in broken images everywhere unless you duplicated the uploads directory into the “new” location.

Alex, Adam raises a good point. Now that the site is no longer just a test, it may be a wise idea to run a quick DB query to update all the site.com/wp-content/ URLs to reflect the site.com/wordpress/wp-content/ URLs throughout the content rather than having to rely on .htaccess — it seems like the proper thing to do.

Changing the URLs is definitely a good idea, but the rewrite rule is still necessary because the old image URLs still exist on the web in links, search results and archives, and need to be redirected accordingly.
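For illustration, the find-and-replace discussed in the comments above might look like the sketch below. It uses an in-memory SQLite database as a stand-in for the site's MySQL database so it can run anywhere; the `wp_posts` table and `post_content` column follow WordPress's default schema, while the URLs and the function name are assumptions. A plain string replace like this is not safe for PHP-serialized values stored elsewhere in the database, since those embed string lengths.

```python
import sqlite3

# Assumed old and new URL prefixes, based on the paths in the post.
OLD = "http://example.com/wp-content/"
NEW = "http://example.com/wordpress/wp-content/"


def rewrite_content_urls(conn: sqlite3.Connection, old: str, new: str) -> int:
    """Replace `old` with `new` in post bodies; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE wp_posts SET post_content = REPLACE(post_content, ?, ?) "
        "WHERE post_content LIKE ?",
        (old, new, f"%{old}%"),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
    conn.executemany(
        "INSERT INTO wp_posts (post_content) VALUES (?)",
        [
            ('<img src="http://example.com/wp-content/uploads/a.jpg">',),
            ("A post with no images.",),
        ],
    )
    print(rewrite_content_urls(conn, OLD, NEW))
    print(conn.execute("SELECT post_content FROM wp_posts WHERE ID = 1").fetchone()[0])
```

Using full URLs with the hostname keeps the replacement safe to run twice, because the new prefix does not contain the old one. With a bare `/wp-content/` prefix a second run would rewrite already-updated links again.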

This fix would be very useful for our prcboardexamresults.com website. If it reaches only 10K UVs or more, a dramatic and unexpected increase of server load suddenly occurs and this causes our website to crash. I should check and recheck the points that you’ve given up there 😉

Alex
I took your advice and moved my entire site into a subdirectory called ‘blog’, but it’s broken various bits of my install.
My theme editor no longer works, it just says ‘File not found, please try again. Merci!’ and on the login page, there’s a whacking great pink box that just says object in it.
You’re the only person I’ve seen do this, so I figured I’d ask 😛

hello Alex,
I had to do it the other way round, I wanted to move my blog from a subdirectory called wordpress to the root.

I copied everything, then with phpMyAdmin replaced all occurrences of h**p://mydomain.com/wordpress with h**p://mydomain.com, and did the same for the absolute paths, replacing /var/www/web6/web/wordpress with /var/www/web6/web. Now ALMOST everything works, except that I have problems displaying some posts from certain categories AND when any page of this blog is accessed, even if I put it into maintenance mode and just browse through the backend, the CPU load spikes to 80%.

I don’t think a redirect would do any good as this happens even when I surf the backend.

Thanks for this post – it opened my eyes to WHY my server load goes to 160 (yes, 160) and the server crashes. Sadly a simple redirect won’t work because I converted a huge website to WordPress and it is too complex. Does anyone know how to stop this WordPress behaviour when permalinks are activated, so that people are sent to a 404 directly? I think this is a problem many people may have.