Little key-value store still works great, keeps most of the WWW running.

This week, memcached, a piece of software that prevents much of the Internet from melting down, turns 10 years old. Despite its age, memcached is still the go-to solution for many programmers and sysadmins managing heavy workloads. Without memcached, Ars Technica would likely be unable to serve this article to you at all.

Brad Fitzpatrick wrote memcached for LiveJournal way back in 2003 (check out the initial CVS commit here). While waiting for new hardware to help save the site from being overloaded, Fitzpatrick realized that he had plenty of unused RAM spread across LiveJournal's existing servers. He wrote memcached to take advantage of this spare memory and lighten the load on the site.

memcached is a distributed in-memory key-value store that uses a very simple protocol for storing and retrieving arbitrary data from memory instead of from a filesystem. To store a value, a program connects to the memcached server on the default port of 11211 and issues a series of basic commands. (Note: a binary protocol is also supported.)
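To make the text protocol concrete, here is a minimal Python sketch of the bytes a client sends for one set and one get, and of how a get reply can be read back. The helper names (build_set_command, build_get_command, parse_get_response) are invented for this illustration and are not part of any memcached client library. The sketch only formats and parses messages; it does not open a network connection.

```python
def build_set_command(key: str, value: bytes, exptime: int = 0, flags: int = 0) -> bytes:
    """Format a text-protocol request: set <key> <flags> <exptime> <bytes>, then the data."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get_command(key: str) -> bytes:
    """Format a text-protocol request: get <key>."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(response: bytes) -> dict:
    """Parse a complete get reply into {key: value}.

    Assumes the whole reply (up to and including the END line) is already
    in `response`; a truncated reply raises ValueError.
    """
    values = {}
    pos = 0
    while True:
        line_end = response.index(b"\r\n", pos)
        line = response[pos:line_end]
        pos = line_end + 2
        if line == b"END":
            return values
        parts = line.split(b" ")
        if parts[0] != b"VALUE" or len(parts) < 4:
            raise ValueError(f"unexpected reply line: {line!r}")
        # Item header: VALUE <key> <flags> <bytes>
        key = parts[1].decode("ascii")
        length = int(parts[3])
        values[key] = response[pos:pos + length]
        pos += length + 2  # skip the data block and its trailing \r\n
```

Sending the output of build_set_command("hello", b"world", exptime=60) to a server on port 11211 should produce a STORED reply; a following get for the same key returns a VALUE line, the data, and a closing END line, which is the shape parse_get_response expects.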

A basic session connects to the server, sets a key named hello (which expires in 60 seconds), and then fetches it back again. Simple, right? It is, but this becomes very useful when trying to avoid hitting an already overloaded database or filesystem. Additionally, through consistent hashing of keys, memcached clients are able to distribute values across multiple memcached servers.
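The consistent hashing mentioned above happens in the client, not the server. As a rough sketch of the idea (not the algorithm of any particular client library), the Python below places each server at many points on a hash ring and maps a key to the first server point at or after the key's own hash. The class name HashRing, the replicas parameter, the choice of SHA-256, and the server addresses in the usage note are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring that maps keys to server names."""

    def __init__(self, servers, replicas=100):
        if not servers:
            raise ValueError("at least one server is required")
        ring = []
        for server in servers:
            # Each server gets `replicas` points so keys spread evenly.
            for i in range(replicas):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._servers = [server for _, server in ring]

    @staticmethod
    def _hash(text):
        # Any stable hash works; SHA-256 is an arbitrary choice here.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def server_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._servers[idx]
```

With a ring such as HashRing(["cache1:11211", "cache2:11211", "cache3:11211"]), dropping one server only remaps the keys that lived on it; keys assigned to the remaining servers stay where they were, which is the property that keeps most of the cache warm when the pool changes.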

For example, let's imagine that we have an author's information stored in a database. We could query the database on every pageview for the information, or we could run the query once and store the result in memcached for future requests. The basic logic looks like this:
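A minimal sketch of that check-the-cache-first logic in Python, under stated assumptions: FakeCache is an in-process stand-in for a real memcached client, and get_author, query_db, the "author:" key prefix, and the 300-second expiry are hypothetical names and values chosen for the example.

```python
import time


class FakeCache:
    """In-process stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # expired: behave like a cache miss
            return None
        return value

    def set(self, key, value, ttl=0):
        # ttl of 0 means "never expire" in this sketch.
        expires_at = time.monotonic() + ttl if ttl else None
        self._store[key] = (value, expires_at)


def get_author(author_id, cache, query_db, ttl=300):
    """Return author info, querying the database only on a cache miss."""
    key = f"author:{author_id}"
    author = cache.get(key)
    if author is None:
        author = query_db(author_id)   # the expensive call
        cache.set(key, author, ttl)    # store the result for later requests
    return author
```

The first call for a given author runs the query and stores the result; later calls within the expiry window are served from the cache and never touch the database.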

Today, Ars Technica uses multiple memcached servers to store values such as comment counts, author info, and large chunks of HTML on every page. A single page load can pull hundreds or thousands of values from memcached, or even the entire page if it's a popular URL.

Ask any seasoned programmer about memcached and they'll likely have a story or two about how it saved their ass on launch day. So let's take a moment to thank all the memcached contributors for creating and maintaining one of the most important pieces of software used on the Web today!

i saw mention of support in drupal and some wikis for memcached, but i wonder if wordpress will take advantage of it? wordpress has some caching options already, and they're pretty good, but not as good as storing entire posts in memory until they're updated or expired.

Ah, I thought Redis was more NoSQL storage a la MongoDB or CouchDB or Cassandra. Is Redis just more flexible and easier to use as a cache?

As far as I know redis lacks the distributed nature of memcached. I believe it can do replication, but if you want sharding (like memcached) you have to implement it yourself. The other key difference is that redis is persistent and flushes to disk, whereas memcached will lose everything when restarted.

Quote:

i saw mention of support in drupal and some wikis for memcached, but i wonder if wordpress will take advantage of it? wordpress has some caching options already, and they're pretty good, but not as good as storing entire posts in memory until they're updated or expired.

WordPress caching is actually better done with a PHP bytecode cache like APC, along with an object cache (which could be memcache, but which you can also easily do by using an add-on like Batcache to use APC space as an object cache).

And I see that Redis devs are working hard on clustering too. Not the same as a 10-year-proven system like memcached, but fun to see all these different approaches to the problem. Thanks for the interesting article!

I feel like any present day discussion of memcached on a major stage like Ars that doesn't mention Redis is doing a disservice to the reader.

Redis can already do almost everything memcached can do, and it does it just as fast (if not faster).

Quote:

I thought Redis was more NoSQL storage a la MongoDB or CouchDB

Redis is "NoSQL", but the similarities with MongoDB and CouchDB pretty much end there. Those are great tools too, but they are disk-based, schema-less document stores, great for storing vast amounts of unstructured data that you may need to search through in some structured way. Redis is not disk-based (though it is usually disk-"backed") and doesn't provide the tools for querying vast amounts of unstructured data.

Redis is a key/value store. Like memcached, you can set keys to simple string values. Like memcached, Redis is designed to hold the entire dataset in memory. Redis, like memcached, works great as a cache.

Unlike memcached, you can set keys to contain complex data types. Redis offers hashes, lists, sets, sorted sets, and tons of powerful commands to leverage those complex data types. Redis comes with built-in persistence (snapshotting or appending your data to disk), so you can use it as a real data store instead of just a volatile cache. Atomic operations, transactions (with optimistic locking), and Lua scripting also help make Redis much more than a cache. Redis comes with built-in publish/subscribe commands, which work great for communication between processes, apps, or servers. Also included "in the box" are master-slave replication and high-availability monitoring/failover tools (Redis Sentinel).

Redis already fills pretty much every use case for memcached as well as memcached does, plus it can do things memcached could never do. Redis Cluster is just around the corner too, which will bring sharding and scaling abilities that will make memcached admins jealous. If you are starting a new project and aren't already deeply invested in memcached, it makes very little sense to choose memcached over Redis anymore.

In corporate environments, I use the TimesTen database from Oracle for a similar purpose. It provides data to me at memory speeds but is backed up to disk based on rules. The TimesTen memory is also battery backed up - wouldn't want to lose that trade request.

I wonder if a similar facility in front of MySQL would work. My guess is that some things that Memcached is used for are never written to the database. So, an in memory database might be overkill.

I used to run memcached on ~100 production servers. I didn't understand the "cash" reference at first, as in Australia we pronounce "cache" in the context of computing totally differently to how we correctly pronounce cache (in original context; meaning stash) for some reason. In the computing context, we pronounce it as if it were not of French origin.

Most people pronounce it exactly like "cash", but for some reason I always mentally read it as "caesh" (rhymes with "lace").

We have lots of Aussies in our place (in the UK), and I've always wondered where the disconnect came from over that particular word. Usually you can translate regional accents fairly consistently in a generalized way, so that if you understand which vowels to switch (or to swap v for w, etc) you can essentially transpose words to another accent. But the Aussie pronunciation of cache (kaysh) has always stumped me.

Similarly, the American pronunciation of "twat". It just doesn't transpose correctly.

@imgx64: yes, and that's the correct way (according to all dictionaries) but then mainstream has diverged (I think that makes it a homograph?) from the dictionary in Australian English, making Australian dictionaries wrong, if you believe a dictionary is to be passive and document a language as it exists in common usage rather than trying to control it. Language in evolution!

Another divergence in Australian IT culture is "router", as due to our closer English ties we pronounce "route" like they do (sounds exactly like root, and they hilariously pronounce router "rooter"). But we pronounce router the American way, which to me makes perfect logical sense, being an American invention.

It's never correct to pronounce "cache" as "cash-ay" when referring to computer-style cache. Regional variants like "kaysh" are OK.

I spent years working on a system called InterSystems Caché, which is, oddly, not a cache but a complete database system... so to avoid confusion I entirely agree you should never pronounce 'cache' like 'caché'.

@Lee Hutchinson & @arkizzle: Yep it's "kaysh" and I believe it was born from ignorance. But I kind of like the idea of having less ambiguous, context dependent language. I tried pronouncing it the "correct" way at work but this was equivalent to inventing my own time zone and insisting everyone else was late; I was met with blank faces.

Quote:

Another divergence in Australian IT culture is "router", as due to our closer English ties we pronounce "route" like they do (sounds exactly like root, and they hilariously pronounce router "rooter"). But we pronounce router the American way, which to me makes perfect logical sense, being an American invention.

It's only hilarious to Aussies though, because no one else really uses the word "root" for that purpose.

Quote:

in Australia we pronounce "cache" in the context of computing totally differently to how we correctly pronounce cache (in original context; meaning stash) for some reason. In the computing context, we pronounce it as if it were not of French origin.

Computer terminology in Australia can be a bit schizophrenic at times: What does the routing table on a router (both pronounced /ɹaʊt/) store? /ɹuːts/, of course.

Livejournal wasn't bought by Six Apart until 2006. I recall Brad talking about memcached when he first wrote it back in 2003 though. They'd kept running into some significant bottlenecks and decided to try memcaching, only to learn there wasn't really any code out there to handle it, so he wrote something up and gave it a shot.

Good point! I guess it would be Danga in 2003. I'll update the article.

Quote:

Computer terminology in Australia can be a bit schizophrenic at times: What does the routing table on a router (both pronounced /ɹaʊt/) store? /ɹuːts/, of course.

LOL

I'm not fluent with the pronunciation symbols, but my personal pronunciation would be that "I took 'Root' 9 to the Best Buy to get a new 'rowter'." This is the first time I've ever thought about how inconsistent that is.

So while I find Redis a pretty interesting development, Redis is not a replacement for memcached, for two reasons. One, you cannot cluster Redis like you can memcached. Now, the clustering effect on memcached is not like a load balancer. Basically, when one node fills up or the memory is exceeded, another node will be called in. However, memcached does not load balance data across the network. You should not use memcached as a memory-only database (nor Redis).

The second reason is important and somewhat overlaps the first: Redis is single-threaded. Redis will only attach to one CPU, and when you're maxed out you need to create a new instance, which you will have to load balance, or figure something out code-wise to run on top of multiple instances. These two reasons are why you do not (at this point) replace memcached with Redis.

Hope this clears some stuff up, I've been interested in Redis for some time and have developed a few services that utilize memcached.

Quote:

So while I find Redis a pretty interesting development, Redis is not a replacement for memcached. Two reasons, one, you can not cluster Redis like you can memcached. Now the clustering effect on memcache is not like a load-balancer. Basically, when one node fills up or the memory is exceeded, another node will be called in. However, memcached does not load balance data across the network.

Your first reason makes total sense and I agree 100%. Redis Cluster will fix this shortcoming, but for now the behavior you describe would take a lot of work to set up and administer on top of Redis. Memcached makes more sense if you require this behavior right now.

Quote:

You should not use memcached as a memory only database (nor Redis).

Not entirely sure what you mean. Redis makes a great permanent data store as the included persistence to disk is very robust and configurable. Clearly, it wouldn't make a very good permanent data store if you turned off persistence (memory only).

Quote:

Second reason is important and some what over laps the first, Redis is single thread only. Redis will only attach to one CPU and when you're maxed out, you need to create a new instance, which you will have to load balance or figure something out code wise to run ontop of multiple instances. These two reasons are the reason you do not (at this point) replace memcached with Redis.

This point at first glance seems like a big problem. However, under any memcached-like use case (simple key writes and reads), Redis will exhaust available memory bandwidth long before maxing out even a single core of the CPU, at least on most systems. Since Redis is rarely CPU-bound, the decision to be single-threaded made it a lot simpler to ensure all commands are completely atomic, without really risking hitting CPU limits. It's possible that by using complex data types in Redis, combined with Lua scripting, you could hit CPU limits. For most use cases, though, the single-threaded nature of Redis will never be a problem, so while it may be a shortcoming, it's a very small and qualified shortcoming.