julik live

Brutal fragment cache

Recently I've hit that painful mark, y'know... A page on a Rails site I've been developing crossed the dreaded "1 second" request time. This can't be happening, thought I - it's just a list of objects! One SQL request, bona fide, no associations fetched, all indexes in place... and the most aggravating part was that the SQL request itself took a measly 0.1 seconds.

Something is wrong. After a short investigation the problem turned out to be of the following nature:

I've hit the "repeating render" (or "slow helpers") problem - one of the most stupid Rails aggravations in the performance department. See tha' slide about the app spending 30 percent of time in textilize? That's about us. The beautiful thumbnail render entails not just doing ERB though - it's doing captures, calculating a hashed URL via signed_params and doing a bunch of other stuff. Done 100 times in a row, it was quite enough to escalate a simple image listing into a performance disaster. So, caching to the rescue - and, more specifically, the Rails fragment cache. After all, I've taken care to render the thumbnails in much the same way almost everywhere (with the same recognizable widgets and sizes), but which ones get rendered - that varies per page, search result and so forth.

So, we read about the Rails caching. As it turns out, it recommends that we cache stuff by key, but... the key has to be a String! Or, more specifically, it can be a String or a hash of options (which most likely comes from url_for). So I'm encouraged to specify from which controller I am caching the fragment, from which action, and it's probably also expected that I will introduce some value that will help me identify what exactly I am caching. Like a model primary key value, for instance.

Meh. Such a hassle. Let's examine the problem here: I have an object (an Image model). I know that there were no changes to this Image as long as none of its attribute values changed. There are no associations to track on it (this can vary, but still). As long as the Image stays the same, the thumbnail will be the same as well - and the cache fragment too! Why should I bother to pass the key of the Image to the cache key method if I know that the state of all the fields in the model can tell me about the freshness of the cache? But bear with me here. Let's say we do this:

@users_images = User.images

We know that a @users_images variable also holds a certain state - it's an Array with Image objects in it. If the images change, then @users_images will change too, right? It will just contain different objects.

So, let's see again - we are actually not limited to some arbitrary string key for an object's cache key, we can use the object itself as a state indicator. Tobi Lütke said that the most useful thing is to be specific about what you want to get from the cache. So a simple realisation dawned on me:

If we know what we want to cache and based on what the cache key can change, we can transform our objects themselves into cache keys.

And swiftly so. Let's say we have, uhm, well, an array of something. Of models, strings, hashes and so forth. This array is marshalable (it does not contain any handles to database connections or files, IOs from CGI processing or Procs). We can use the following trick to reliably get a caching key for this array:
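A minimal sketch of that trick, under assumptions: the method name key_for_anything is borrowed from the text further down, but its body here is my guess at the idea (an MD5 over the marshaled state), and the Image Struct is just a hypothetical stand-in for the real model.

```ruby
require 'openssl'

# Hypothetical stand-in for the Image model: a plain Struct is
# marshalable as long as all of its members are.
Image = Struct.new(:id, :title, :width, :height)

# Checksum over the marshaled state of whatever we are handed.
# (Assumed body - a sketch of the idea, not the original code.)
def key_for_anything(*whatever)
  OpenSSL::Digest::MD5.hexdigest(Marshal.dump(whatever))
end

a = Image.new(1, "Sunset", 640, 480)
b = Image.new(1, "Sunset", 640, 480)  # the "same record", loaded again

puts key_for_anything([a]) == key_for_anything([b])  # same state, same key

b.title = "Sunrise"
puts key_for_anything([a]) == key_for_anything([b])  # state changed, key changed
```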

When we change the value of one of the attributes the marshal checksum changes, but it stays the same for the same object fetched from the database over and over. Bwilliant.

Now let's see.. paths they said. Right. The spread of these checksums can be huge, and by huge I mean huge. If we just dump them all into one hashmap or into one directory (likely with Rails fragment cache) the filesystem will cringe and burn. Let's use the nice Jamis Buck trick for splitting the burden (the same is used in attachment_fu):
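Roughly, the splitting could look like this. The helper name segmented_path_for is made up for illustration; the "megacache" root is the one used in the helper later in this post.

```ruby
require 'openssl'

# Split a hex checksum into two-character path segments.
# (segmented_path_for is a hypothetical name, used only in this sketch.)
def segmented_path_for(checksum, root = "megacache")
  File.join(root, *checksum.scan(/../))
end

checksum = OpenSSL::Digest::MD5.hexdigest("any marshaled state")
puts segmented_path_for(checksum)
```

Each segment is two hex digits, so it can only take 256 distinct values, and the last segment is the cached file itself.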

Excellent. This will give us a path that will create no more than 256 entries per directory, with one neat cached file in the last one down there. And the filesystem will be searching fast.

We are also using a special treat of hashing algorithms like MD5 here. While Ruby's Object#hash would give us values which are not spread in any particular way, a cryptographic hash pretty much guarantees that even objects that look like each other get wildly different hashes (a good spread):
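To see the spread for yourself, here is a small sketch that hashes two nearly identical strings and counts how many hex digits differ. The inputs and both helper names are arbitrary, picked only for this illustration.

```ruby
require 'openssl'

def md5(string)
  OpenSSL::Digest::MD5.hexdigest(string)
end

# Count the positions at which two equally long digests differ.
def differing_digits(a, b)
  a.chars.zip(b.chars).count { |x, y| x != y }
end

first  = md5("image-1")
second = md5("image-2")

puts first
puts second
puts differing_digits(first, second)
```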

The second question is - how we can determine where we've cached from? Opinions are divided - we can bring the controller into the mix, and just pass it to key_for_anything just like most anything else, but this is too specific to be attacked right now. What we must do is not pollute the cache directory with our "automagical" subdirectories but make a subdir

Expiring

Now on to expiry: as my most fantastic Flame teacher once told me, on n'a pas de temps pour ça! (we don't have time for that!). Never. Expiring caches is a burden, we have to skip it altogether if we can. In our case expiring is simply irrelevant because when the data changes its cache key will change all by itself. memcached will actually expire for you if the cached bit is not too hot (not used often), which, for my case, is fine. You do want to clean your cache from old items every now and then though.

This was the part about expiry and it's over (a bit... short, wasn't it?). Now on to actual business. First, let's set up a helper. Also let's include my lovely trick - using OpenSSL's functions for speeding up hashing.

require 'openssl'
module ExtremistCacheHelper
  def lazy_cache_key(*whatever)
    calling_method = caller(2)[0..1]
    # OpenSSL's MD5 is much faster than the Ruby one - like ten times
    checksum = OpenSSL::Digest::MD5.hexdigest(Marshal.dump(calling_method + whatever))
    # Splitting an MD5 on 2 symbols will give us good rainbow spread across
    # directories with 256 subdirectories max, in each given directory
    segmented_path = "megacache/" + checksum.scan(/(.{2})/).join('/')
  end
end

I am using the calling method here to somehow differentiate where the cache has been requested from, and I start at level 2 to see the outer method of this one (we will use lazy_cache_key only as an assistant).

Due to the idiosyncrasies of ERB, let's start with the easy bit - if you want to cache some ERB from a helper method:

# This one should be used from controllers and helpers
def cached_based_on(*whatever)
  segmented_path = lazy_cache_key(whatever)
  controller.read_fragment(segmented_path) || controller.write_fragment(segmented_path, yield)
end
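The read-or-write shape of that helper can be tried outside Rails. In the sketch below, FakeController is a hypothetical in-memory stand-in that only mimics read_fragment and write_fragment, and fetch_fragment is a made-up name for the same one-liner with the controller and key passed in explicitly.

```ruby
# Hypothetical in-memory stand-in for the controller's fragment store.
class FakeController
  def initialize
    @fragments = {}
  end

  def read_fragment(key)
    @fragments[key]
  end

  # Returns the content it stored, which is what the || chain relies on.
  def write_fragment(key, content)
    @fragments[key] = content
  end
end

# Same shape as cached_based_on: the block only runs on a cache miss.
# A block that returns nil or false is never cached and runs every time.
def fetch_fragment(controller, key)
  controller.read_fragment(key) || controller.write_fragment(key, yield)
end

renders = 0
controller = FakeController.new
2.times do
  fetch_fragment(controller, "megacache/ab/cd") { renders += 1; "<img>" }
end
puts renders
```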

Now to the more difficult version - when we want to capture from ERB. ActionController actually provides us with a wonderful shortcut just for this specific case (this is what the standard cache do... does):

# This one - from ERB
def erb_cache_based_on(*whatever, &block)
  @controller.cache_erb_fragment(block, lazy_cache_key(whatever))
end

And we're done. Now the only thing remaining is piggybacking that into our controller.

Excellent. No expiry, no invention of keys and it's coupled to the version of the model - when the model data changes it will be cached anew.

Keep the place clean

There is a concern though: if we are caching into something that does not expire by itself, we will undoubtedly pollute it completely at some point. Let's be smart and do the same thing memcached does: mark how hot a cache is. For this we will override a useful method in Rails' very own file fragment store (which, much helpfully, is #:nodoc:).

This will change the fragment modification time on every read access. When we have time, we glob for all the files that have been modified before a certain date and trash them, no questions asked - a perfect task for a cron job. It has a performance penalty (I presume that changing a modification date on a file is not instantaneous).
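The touch-on-read and sweep idea can be sketched outside Rails as well. TouchingFileStore below is a hypothetical class written for this illustration, not Rails' own file store and not the override mentioned above. It only uses FileUtils.touch, Dir.glob and File.mtime.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical file store: reading a fragment bumps its modification time.
class TouchingFileStore
  def initialize(root)
    @root = root
  end

  def write(key, content)
    path = File.join(@root, key)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, content)
    content
  end

  def read(key)
    path = File.join(@root, key)
    return nil unless File.file?(path)
    FileUtils.touch(path)  # mark the fragment as recently used
    File.read(path)
  end

  # Delete every fragment that was last read or written before +cutoff+.
  # Emptied directories are left in place.
  def sweep(cutoff)
    Dir.glob(File.join(@root, "**", "*")).each do |path|
      File.delete(path) if File.file?(path) && File.mtime(path) < cutoff
    end
  end
end

Dir.mktmpdir do |dir|
  store = TouchingFileStore.new(dir)
  store.write("ab/cd", "hot")
  store.write("ef/01", "cold")

  # Pretend both fragments were written an hour ago.
  old = Time.now - 3600
  File.utime(old, old, File.join(dir, "ab/cd"), File.join(dir, "ef/01"))

  store.read("ab/cd")          # only this one gets touched
  store.sweep(Time.now - 60)   # what a cron job would do

  puts File.exist?(File.join(dir, "ab/cd"))
  puts File.exist?(File.join(dir, "ef/01"))
end
```

The extra touch on each read is where the performance penalty comes from, so it is a trade-off between read speed and knowing which fragments are safe to throw away.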

Caveats

If you want to include associations in the mix, you'll want to rework this somewhat. And of course, beware of using blank? values (like nil or []) as keys.

Much betta. In comparison to what I've been experiencing with all pages costing 1 second it's a substantial gain (and a cure to the slow helper problem) with minimal manual intervention. And when the cache is hot enough (all thumbnails have been browsed through) all the listings will be sped up considerably (including the dynamic ones, like searches). And it works with pagination without extra magic (if you want to bother to actually load the objects from the database).