Monthly Archives: July 2012

Taking Web Performance Optimisation into my personal life, and partly egged on by my bro, I’ve been looking at my site’s performance over the last few weeks.

As with any performance optimisation, the starting point is the traffic profile. The site sees between 300 and 1,000 pageviews a day; the top 8 pages account for 50% of traffic, and the rest see one or two pageviews per page per day.

Given this spread of traffic, full page caching would take a long time to warm up, and would only speed up the second hit to any given page, assuming the cached copy still exists. I’d like to boost performance across the board, so I looked at using memcache. Several plugins exist which leverage WordPress’s built-in object cache and store the data in memcache so it persists between requests.
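
The idea behind those plugins can be sketched in a few lines. This is an illustrative Python sketch, not the actual plugin code: a persistent object cache sits in front of the database, so a repeat request for the same key skips the expensive query entirely.

```python
import time

class ObjectCache:
    """Stand-in for a memcache client; a real one talks over the network
    and survives between requests, unlike WordPress's default per-request cache."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value, ttl=300):
        self._store[key] = value

def slow_query(post_id):
    """Pretend database lookup."""
    time.sleep(0.01)  # simulate query latency
    return {"id": post_id, "title": "Hello world"}

cache = ObjectCache()

def get_post(post_id):
    key = f"post:{post_id}"
    post = cache.get(key)           # fast path: served from cache
    if post is None:
        post = slow_query(post_id)  # slow path: hit the database
        cache.set(key, post)
    return post
```

The first call to `get_post(1)` pays for the query; subsequent calls are served from the cache until the entry expires or is evicted.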

Couchbase

The folks behind Membase and CouchDB merged to form Couchbase. They produce Couchbase Server, which is memcache compatible out of the box (via a proxy known as moxi, I learned), with the added benefit of persistence to disk. One of my long-term goals is to store Magento sessions in a mostly persistent cache for a long time, so I was keen to experiment with Couchbase.

At first, I installed Couchbase, fired up the memcached-redux plugin, and my load times went from ~200ms to >5s. It turns out Couchbase doesn’t work out of the box; it needs to be configured via the web interface on localhost:8091. Done. Now load times were in the ~400ms range. Slower. I learned that the proxy is slow at getMulti() requests, so I installed the memcached plugin, which implements its own getMulti() in PHP. Load times improved slightly to ~350ms.
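
Why getMulti() matters so much comes down to round trips: fetching N keys one at a time costs N network round trips, while a batched multi-get costs one. The sketch below just counts round trips instead of doing real I/O, purely to illustrate the difference.

```python
# Illustrative only: a fake client that counts round trips rather than
# talking to a real memcache server.

class FakeMemcache:
    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1  # one network round trip per key
        return self.data.get(key)

    def get_multi(self, keys):
        self.round_trips += 1  # one round trip for the whole batch
        return {k: self.data[k] for k in keys if k in self.data}

client = FakeMemcache({f"key{i}": i for i in range(50)})
keys = [f"key{i}" for i in range(50)]

naive = {k: client.get(k) for k in keys}  # 50 round trips
batched = client.get_multi(keys)          # 1 more round trip
# client.round_trips is now 51: 50 singles plus 1 batch
```

A page that needs 50 options and objects per render feels that difference on every request, which is presumably why a proxy that degrades multi-gets hurts so badly.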

Memcached

I then uninstalled Couchbase, installed memcached, and tried again. The memcached-redux plugin showed load times of ~350ms, the memcached plugin ~300ms, with a couple of 4s responses thrown in for good measure.

Site was slower

Bottom line: using memcache was slower, whichever backend or plugin I tried.

On this server we have plenty of spare memory and CPU, and MySQL has been given a generous amount of memory to play with. My guess is that for reasonably simple queries served from memory, MySQL performs about the same as memcache. Some old reading suggests MySQL might even perform slightly better under the right circumstances.

Here, MySQL is connected via a unix socket while memcache is over TCP/IP. That alone might account for the performance difference.
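
The socket-vs-TCP difference is easy to measure in isolation. This is a rough, self-contained sketch (Linux/macOS; AF_UNIX isn’t available on Windows) that times a tiny request/response over a unix domain socket and over loopback TCP. Absolute numbers vary wildly by machine; it only shows how you’d measure it.

```python
import os
import socket
import tempfile
import threading
import time

def echo_server(server_sock):
    """Accept one connection and echo everything back."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

def time_round_trips(client, n=200):
    """Average seconds per request/response round trip."""
    start = time.perf_counter()
    for _ in range(n):
        client.sendall(b"ping")
        client.recv(64)
    return (time.perf_counter() - start) / n

# Unix domain socket pair
path = os.path.join(tempfile.mkdtemp(), "bench.sock")
usrv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
usrv.bind(path)
usrv.listen(1)
threading.Thread(target=echo_server, args=(usrv,), daemon=True).start()
ucli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
ucli.connect(path)

# Loopback TCP pair
tsrv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tsrv.bind(("127.0.0.1", 0))
tsrv.listen(1)
threading.Thread(target=echo_server, args=(tsrv,), daemon=True).start()
tcli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcli.connect(tsrv.getsockname())

unix_us = time_round_trips(ucli) * 1e6
tcp_us = time_round_trips(tcli) * 1e6
print(f"unix socket: {unix_us:.1f}us/rt  tcp loopback: {tcp_us:.1f}us/rt")
```

Even a few extra microseconds per round trip adds up when a page render does dozens of cache lookups.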

Memcache has its place

Memcache has a whole lot of properties that make it useful in a wide range of circumstances. WordPress.com, for example, serves its cached pages from memcache via batcache. In an environment without a shared filesystem, memcache provides a distributed cache, which is the key to its success at WordPress.com. The batcache literature specifically says that file-based caching is faster on a single node.

Conclusion

On a single server with plenty of capacity, memcache is the wrong tool. I’m seriously considering Varnish for sidebar and/or full page caching; it could really help with the busiest pages, and I have some experience with it. But I think the next step will be to test APC. It’s a single-machine, in-memory cache, so it could work well in this situation. Plus, the bytecode caching might have a positive impact.

A couple of weeks ago I trialled Papertrail. Simply put, these guys rock. The application itself is great, simple and functional. But what sets them apart, above and beyond, is the service. Simply outstanding. I’ve been impressed with every interaction on their live chat, and more impressed by their seemingly non-stop presence in there, night or day.

The pitch is simple: amalgamate the logs from multiple servers and applications into one place, then provide a simple, easy to use web interface to monitor and search them.
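
Getting logs there is typically just remote syslog. Here’s a hedged Python sketch of the shipping side; the collector here is a throwaway local UDP listener standing in for the remote service, and the hostname/app-name format is illustrative. A real setup would point rsyslog or a SysLogHandler at the endpoint your log service gives you.

```python
import logging
import logging.handlers
import socket

# Throwaway local UDP listener standing in for the remote log collector.
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))
collector.settimeout(5)

# Ship application logs over plain syslog/UDP to the collector.
handler = logging.handlers.SysLogHandler(address=collector.getsockname())
handler.setFormatter(logging.Formatter("web01 myapp: %(message)s"))

log = logging.getLogger("myapp")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("GET /index.html 200 123ms")

datagram, _ = collector.recvfrom(2048)
print(datagram)  # a syslog-framed message containing the log line
```

Run the same thing on every box, pointed at one collector, and all your logs land in a single searchable stream.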

It seems like a “so what” kinda product, but the effect is like going from dial-up to broadband. The real difference there wasn’t the speed, it was the always-on nature: when the net is only a click away, it becomes exponentially more useful. Papertrail is the same.

Before I had tried the service, I had difficulty seeing the value beyond my natural desire to be organised and prepared. But once I had all that data in one easy interface, subtle changes happened. I no longer use `tail -f` to keep an eye on things. I can watch the logs for the same service on multiple servers, colour coded (required a little CSS greasemonkey on my part), in a single flow.

Our logs are now closer. As in more close, not more closing! That has a profound impact.

Loggly

There’s another seemingly similar service called Loggly. At our current estimated usage (2.5GB/month) they’re free, while Papertrail would be $18 or $35 depending on how you price it. To bring the cost down, I’ll likely exclude our static assets from the data we push to Papertrail; it’s probably noise anyway. That brings us down to $7/month.
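
Excluding static assets is just a filter applied before shipping. This is a hypothetical sketch of the idea; the extensions and log lines are illustrative, and in practice you’d more likely do this in rsyslog or the web server config than in application code.

```python
import re

# Hypothetical filter: drop access-log lines for static assets,
# ship everything else. Extension list is illustrative.
STATIC_ASSET = re.compile(r"\.(css|js|png|jpe?g|gif|ico|woff2?)(\?|\s|$)", re.I)

def should_ship(line):
    """True for lines worth sending to the log service."""
    return not STATIC_ASSET.search(line)

lines = [
    "GET /index.php 200",
    "GET /wp-content/themes/style.css 200",
    "GET /logo.png?v=3 304",
    "GET /checkout 200",
]
shipped = [l for l in lines if should_ship(l)]
# shipped keeps only the two dynamic requests
```

Since static hits usually dwarf dynamic ones in volume, a filter like this can cut the billable data dramatically without losing anything you’d actually search for.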

Just to see what it’s like, I created an account on Loggly today. I don’t quite get it. The logs are not real-time as far as I can tell. The focus seems to be on analysis: trending, graphing, reporting. The interface is complicated and heavy. Maybe I’m missing something. If you’re a fan of Loggly and can extol its virtues, please do so in the comments.

I might experiment further with Loggly. It’s possible we’ll leverage its graphing capabilities at some point, maybe even in addition to Papertrail. But for now, even at $7 instead of free, I like Papertrail. I feel good about being a customer. That’s precious.