Cache Invalidation Complexity: Rails 5.2 and Dalli Cache Store

Rails applications that use ActiveRecord objects in their cache may experience an issue where the entries cannot be invalidated if all of these conditions are true:

They are using Rails 5.2+

They have configured config.active_record.cache_versioning = true

They are using a cache that is not maintained by Rails, such as dalli_store

In this post, we discuss the background to a change in the way that cache keys work with Rails, why this change introduced an API incompatibility with third-party cache stores, and finally how you can find out if your app is at risk and how to fix it.

Even if you’re not at Rails 5.2 yet, you’ll likely get there one day. It’s important to read and potentially mitigate this issue before you run into it in production.

However, this causes unnecessary cache invalidations. For example, let’s say that you have three objects and three slots in your cache. Letters A, B, and C differentiate the objects, while a number indicates each object’s version. To start, all three are at version one:

[A(1), B(1), C(1)]

When object A changes, the cache doesn’t evict the entry for object A. Instead, the new A(2) entry is added to the front and the last cache entry, which is C, is evicted. Now the cache looks like this:

[A(2), A(1), B(1)]

The next time that C is requested it won’t be found, and it will be re-calculated and get added to the front of the cache. This addition pushes out the copy of B:

[C(1), A(2), A(1)]

Now the next time that B is requested it won’t be found, and it will be re-calculated and get added to the front of the cache:

[B(1), C(1), A(2)]

While we only made one change to object A, it resulted in clearing and resetting the values for both B(1) and C(1) even though they never changed. This method of cache invalidation adds unnecessary time spent recalculating already valid cache entries. Cache versioning’s goal is to fix this unneeded cache invalidation.
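The walk-through above can be reproduced with a toy cache. This is only a sketch: TinyCache is a hypothetical three-slot store invented for illustration (it is not how memcached or Rails manage memory), and the "A/1" style keys simply stand in for keys that embed the object's version.

```ruby
# TinyCache is a hypothetical fixed-size cache used only for illustration.
# New entries go to the front; when the cache is full the last slot is evicted.
class TinyCache
  def initialize(size)
    @size = size
    @slots = [] # [key, value] pairs, newest first
  end

  # Return the cached value for key, or compute it with the block and store it.
  def fetch(key)
    hit = @slots.find { |k, _| k == key }
    return hit[1] if hit

    value = yield
    @slots.unshift([key, value])
    @slots.pop if @slots.size > @size # evict whatever sits in the last slot
    value
  end

  def keys
    @slots.map(&:first)
  end
end

cache = TinyCache.new(3)
recalculated = []
render = lambda do |key|
  cache.fetch(key) do
    recalculated << key # record every time we had to redo the work
    "rendered #{key}"
  end
end

# Warm the cache so it matches the starting state [A(1), B(1), C(1)].
%w[C/1 B/1 A/1].each { |key| render.call(key) }
p cache.keys # => ["A/1", "B/1", "C/1"]

recalculated.clear
render.call("A/2") # A changed, so its key changed; C falls out of the last slot
render.call("C/1") # miss, pushes out B
render.call("B/1") # miss, pushes out the stale A/1
p cache.keys    # => ["B/1", "C/1", "A/2"]
p recalculated  # => ["A/2", "C/1", "B/1"]
```

Only the A/2 recalculation was actually needed. The other two are the collateral damage described above.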

Cache invalidation with cache versioning (recyclable cache keys)

With the new method of cache versioning, the keys stay consistent, but the cache_version is stored inside the cache entry and manually checked when pulling an entry from the cache.

When object A changes to version two, Rails will pull the A(1) entry from the cache, see that it has a different version, and replace the entry in the same slot using the same cache key:

[A(2), B(1), C(1)]

Now future calls to retrieve the A object will show that the version is correct, and the cached value can be used.

With this new scheme, changing one object does not have a cascade effect on other cached values. In this way, we’re able to keep valid items in our cache longer and do less work.
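Here is a rough sketch of the same three-slot cache using recyclable keys. VersionedTinyCache and CacheEntry are hypothetical names made up for this example. The real Rails stores are more involved, but the idea of keeping the version inside the entry and comparing it on read is the same.

```ruby
# CacheEntry pairs a cached value with the version it was computed for.
CacheEntry = Struct.new(:value, :version)

# VersionedTinyCache is a hypothetical fixed-size cache used only for illustration.
class VersionedTinyCache
  def initialize(size)
    @size = size
    @slots = [] # [key, CacheEntry] pairs, newest first
  end

  # The key stays the same across versions; the version is checked on read.
  def fetch(key, version:)
    slot = @slots.find { |k, _| k == key }
    if slot
      entry = slot[1]
      return entry.value if entry.version == version

      # Stale entry: recompute and overwrite it in the same slot.
      slot[1] = CacheEntry.new(yield, version)
      return slot[1].value
    end

    value = yield
    @slots.unshift([key, CacheEntry.new(value, version)])
    @slots.pop if @slots.size > @size
    value
  end

  def contents
    @slots.map { |key, entry| "#{key}(#{entry.version})" }
  end
end

cache = VersionedTinyCache.new(3)
recalculated = []
render = lambda do |key, version|
  cache.fetch(key, version: version) do
    recalculated << key
    "rendered #{key} v#{version}"
  end
end

%w[C B A].each { |key| render.call(key, 1) }
p cache.contents # => ["A(1)", "B(1)", "C(1)"]

recalculated.clear
render.call("A", 2) # version mismatch, entry replaced in place
render.call("B", 1) # still a hit
render.call("C", 1) # still a hit
p cache.contents # => ["A(2)", "B(1)", "C(1)"]
p recalculated   # => ["A"]
```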

How well does it work? DHH at Basecamp had this to say:

We went from only being able to keep 18 hours of caching to, I believe, 3 weeks. It was the single biggest performance boost that Basecamp 3 has ever seen.

By enabling recyclable cache key versioning (config.active_record.cache_versioning = true), instead of effectively having to recalculate every cache entry every 18 hours, the churn was spread out over 3 weeks, which is very impressive.

What’s the issue?

Now that you know what recyclable cache keys are and how Rails implements them, you should know that the client that talks to the cache provider needs to be aware of this new scheme. Rails ships with a few cache stores:

:memory_store

:file_store

:mem_cache_store

:redis_cache_store

If you’re using one of these stores then you get a cache client that supports this feature flag. However, you can also provide a custom cache store, and other gems ship with their own store. Most notably:

:dalli_store (not maintained by Rails)

If you’re using a custom cache store then it’s up to that library to implement this new scheme.

If you’re using :dalli_store right now and have config.active_record.cache_versioning = true then you are quietly running in production without the ability to invalidate caches. For example, CodeTriage, an app that helps people contribute to open source, did not change the view when the underlying database entry was modified.

Why is this happening? Remember how we showed that the cache key is the same no matter if the model changes? The Dalli gem (as of version 2.7.8) only understands the cache_key, but does not understand how to insert and use cache versions. When you’re using :dalli_store with recyclable cache keys enabled, the cache_key doesn’t change, so the store will always grab the same value from the cache.
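To make the failure mode concrete, here is a minimal sketch of a store that only understands keys. KeyOnlyStore is a hypothetical class written for this example, not the actual Dalli code. It accepts a version: option and then never looks at it, which is a simplified model of the behavior described above.

```ruby
# KeyOnlyStore is a hypothetical store that looks entries up by key alone.
class KeyOnlyStore
  def initialize
    @data = {}
  end

  # The version: option is accepted but silently ignored, so a version
  # change can never cause a miss.
  def fetch(key, version: nil)
    @data.fetch(key) { @data[key] = yield }
  end
end

store = KeyOnlyStore.new
first  = store.fetch("issues/1", version: 1) { "Original title" }
second = store.fetch("issues/1", version: 2) { "Edited title" }

puts first  # Original title
puts second # Original title (stale: the new version was never noticed)
```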

How to detect if you’re affected

First, confirm which cache store you’re using. Make sure to check this in a production environment, since you might be using different cache stores in different environments.
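One way to sketch that check is below. The helper cache_versioning_risk? and the RAILS_MAINTAINED_STORES list are assumptions made for this example (the list is simply the stores named earlier in this post), so treat a positive result as a prompt to look closer rather than a definitive answer. The two settings it expects are the values of Rails.application.config.cache_store and Rails.application.config.active_record.cache_versioning.

```ruby
# Stores this post lists as shipping with Rails (an assumption of this sketch,
# not an exhaustive list).
RAILS_MAINTAINED_STORES = %i[memory_store file_store mem_cache_store redis_cache_store].freeze

# Hypothetical helper: returns true when cache versioning is on and the
# configured store is not one of the Rails-maintained stores listed above.
# cache_store may be a symbol or an array such as [:dalli_store, "host:11211"].
def cache_versioning_risk?(cache_store, cache_versioning)
  store_name = Array(cache_store).first
  cache_versioning == true && !RAILS_MAINTAINED_STORES.include?(store_name)
end

# In a production Rails console you could pass in the real settings:
#   cache_versioning_risk?(
#     Rails.application.config.cache_store,
#     Rails.application.config.active_record.cache_versioning
#   )

puts cache_versioning_risk?(:dalli_store, true)      # true
puts cache_versioning_risk?(:mem_cache_store, true)  # false
puts cache_versioning_risk?(:dalli_store, false)     # false
```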

Pros: With :mem_cache_store you get cache key recycling. You also get cache compression, which helps significantly with the time spent transferring bytes over a network to your memcache service. To achieve these features this cache store does more work than the raw :dalli_store, but in preliminary benchmarks on CodeTriage, while connecting to an external memcache server, the performance is roughly equivalent (within 1% of original performance). With the decreased space from compression and the extra time that cache keys can “live” before being evicted with key recycling, this store is a net positive.

Cons: The cache keys for :mem_cache_store are identical to the ones generated via :dalli_store, but existing entries do not yet have the version information stored in them. When :mem_cache_store sees such an entry, it falls back to the old behavior of not validating the “freshness” of the entry. This means that to get the updated behavior, where changing an Active Record object actually invalidates its cached entry, you’ll need to invalidate old entries. The “easiest” way to do this is to flush the whole cache. The problem is that this will significantly slow your service, as your entire application is then functioning with a cold cache.

Disable recyclable cache keys (cache versioning)

If you don’t want to replace your cache store, disabling the cache versioning feature will also fix the issue of changing Active Record objects not invalidating the cache. You can disable this feature like this:

config.active_record.cache_versioning = false

If you’re wondering about the config naming, as I was, it’s cache_versioning because the version of the object lives in the cache rather than in the key. It’s effectively the same thing as enabling or disabling recyclable caching.

Pros: You don’t have to switch your cache store. It doesn’t require a manual cache flush (the old keys are instead invalidated automatically because the cache key format changes). You can use this to slowly roll out the cache key change if you’re able to do blue/green deploys and roll out to a percentage of your fleet. You’ll still get some instances operating under a cold cache, but by the time 100% of instances are running the new version, the cache should be fairly “warm”.

Cons: You won’t have to flush your old cache, BUT the cache key format will change, which effectively does the same thing. When you change this config, your whole app will not be able to use any cache keys from before and will effectively be working with a cold cache while you’re rebuilding old keys. You do not get recyclable keys. You do not get cache compression. Disabling cache versioning also means that dalli must do more work to build cache keys, which actually makes caching go slightly slower.

Overall I would recommend switching to :mem_cache_store and then flushing the cache.

Next steps

At Heroku we’ve taken efforts to update all of our documentation to suggest using :mem_cache_store instead of directly using dalli. That being said, there are still a ton of historical references to using the older store. If you see one in the wild, please make a comment and point at this post.

Since the issue is deeper than :dalli_store and potentially affects any custom cache, we need a way to systematically let people know when they’re at risk of running in a bad configuration.

My proposal is to add a predicate method to all maintained and supported Rails cache stores, for example ActiveSupport::Cache::MemCacheStore.supports_in_cache_versioning? (method name TBD). If the app specifies config.active_record.cache_versioning = true without using a cache that responds affirmatively to supports_in_cache_versioning? then we can raise a helpful error that explains the issue.
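A rough sketch of how such a check could look is below. Everything here is hypothetical: the store classes, the error class, and verify_cache_versioning! are made up for illustration, and the predicate name is the placeholder from the proposal above rather than an existing Rails API.

```ruby
# Hypothetical store that opts in to the proposed predicate.
class VersionAwareStore
  def self.supports_in_cache_versioning?
    true
  end
end

# Hypothetical third-party store that predates the proposal and has no predicate.
class LegacyStore
end

# Hypothetical error raised when the configuration is unsafe.
class UnsupportedCacheVersioningError < StandardError
end

# Raise a descriptive error when cache versioning is enabled but the store
# does not affirmatively report support for it.
def verify_cache_versioning!(store_class, cache_versioning:)
  return unless cache_versioning

  supported = store_class.respond_to?(:supports_in_cache_versioning?) &&
              store_class.supports_in_cache_versioning?
  return if supported

  raise UnsupportedCacheVersioningError,
        "#{store_class} does not support cache versioning. Set " \
        "config.active_record.cache_versioning = false or switch cache stores."
end

verify_cache_versioning!(VersionAwareStore, cache_versioning: true) # passes quietly

begin
  verify_cache_versioning!(LegacyStore, cache_versioning: true)
rescue UnsupportedCacheVersioningError => e
  puts e.message
end
```

Checking respond_to? first means stores that have never heard of the predicate are treated as unsupported, which is the safe default for this kind of check.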

There’s also work being done on dalli both for adding a limited form of support and for adding documentation.

As the saying goes, there are two truly hard problems in computer science: cache invalidation, naming, and off-by-one errors. While this incompatibility is unfortunate, it’s hard to make a breaking API change that fully anticipates how all external consumers will work. I’ve spent a lot of time in the Rails contributor chat talking with DHH and Rafael, and there’s a really good thread on some of the perils of changes that touch cache keys in one of my performance PRs. We realize the sensitive nature of changes anywhere near caching. In addition to bringing more scrutiny and awareness to these types of changes, we’re working towards making more concrete policies.
