Redis Zen

May 29, 2013

You can work with Redis using the usual database mindset, but approaching
it with a different one can reap significant rewards.


The fundamental approach to most data storage during development is “OK,
so how do I fit this into an SQL DB?”. Then you spend time figuring out
how to munge the data you get out when you run queries. When you are
using Redis, you need to turn this around.

Instead you first approach how you want to retrieve the data. Since
Redis is more of a Data Structure Server than a database, how you intend
to consume the data should drive which data structure(s) you use.

For example, consider a common use case: statistics.

Let us say we want to track traffic in log-time. That is, we have
something that processes access logs as they come in via Syslog.
Specifically, we want to track every page’s request rate. With the
standard mindset we would assume we want a table with the page-id,
perhaps a timestamp, and an integer. Or perhaps we design a pair of
tables: one detailing the page, and one storing a single hit per record.
Then we write a bunch of code to tease the information out of the
database.

Now we take the Redis approach of what I call “Access Based Design”. We
know we want to show the top 10 most requested pages on our site for the
last day and the last 7 days. This leads to a far different data
structure design.

Instead we might wind up with a sorted set key for each window we want
to track and report on, using zincrby to increment the counter each time
we see a page-id. This reflects the access-driven design.
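The idea can be sketched in pure Python, with a dict standing in for the Redis server; the per-window key names are illustrative, not canonical, and each increment corresponds to what would be a zincrby call against the real server.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Stand-in for Redis: each key maps to a sorted set (member -> score).
store = defaultdict(lambda: defaultdict(float))

def window_keys(now):
    # One key per reporting window; these names are hypothetical examples.
    return [
        now.strftime("sitename:hourly:%Y:%m:%d:%H"),
        now.strftime("sitename:daily:%Y:%m:%d"),
        now.strftime("sitename:weekly:%Y:%W"),
    ]

def record_hit(page_id, now):
    # In real Redis, each iteration would be: zincrby <key> 1 <page_id>
    for key in window_keys(now):
        store[key][page_id] += 1

when = datetime(2013, 5, 29, 13, 0, tzinfo=timezone.utc)
record_hit("page:42", when)
record_hit("page:42", when)
record_hit("page:7", when)
```

One call site, several windows updated: the write path never reshapes the data, it just bumps counters in each window's sorted set.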

To pull the data out we can retrieve precisely the window we want. We
use the zrevrangebyscore command to pull the ten most accessed
page-ids.
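The read side can be simulated the same way; the scores below are made-up, and against a real server this function would be a single zrevrangebyscore (or zrevrange) call on the window's key.

```python
# Stand-in for one window's sorted set: member -> score, as Redis keeps it.
hourly = {"page:1": 503.0, "page:2": 1287.0, "page:3": 96.0, "page:4": 811.0}

def top_n(zset, n):
    # Emulates: zrevrange <key> 0 n-1 withscores -- highest scores first.
    return sorted(zset.items(), key=lambda kv: kv[1], reverse=True)[:n]

top_two = top_n(hourly, 2)
```

Note there is no post-processing step: the sorted set already keeps members ordered by score, so the query returns exactly the report you want to display.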

Storing the data is simple, though it does require a break from the “do
it all in one DB update” mentality. You issue a zincrby command for
each window you track. Thus, you zincrby the key
“sitename:hourly:YYYY:MM:DD:13”, adding one to the page-id stored in the
sorted set, to register a view at 1PM of the day in question. If you want
to keep that data for only 24 hours, use expireat to have the key
automatically purged 24 hours later. By issuing this with every update you
ensure the data stays around for 24 hours after the last update.
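A minimal sketch of that update-plus-expiry pattern, again simulating the server with dicts; the key name and timestamps are illustrative, and the two mutations stand in for a zincrby followed by an expireat.

```python
import time

store = {}     # key -> sorted set (member -> score)
expiries = {}  # key -> unix timestamp at which Redis would drop the key

def hit(key, page_id, now=None):
    # Emulates: zincrby <key> 1 <page_id>, then expireat <key> now+86400.
    # Every update pushes the key's expiry another 24 hours into the future.
    now = time.time() if now is None else now
    store.setdefault(key, {}).setdefault(page_id, 0)
    store[key][page_id] += 1
    expiries[key] = now + 24 * 3600

hit("sitename:hourly:2013:05:29:13", "page:42", now=1369832400)
```

Because the expiry is reset on every hit, a key only disappears once it has gone a full 24 hours without an update, and no cleanup code of your own is needed.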

You would do this for each window on each hit. Redis is fast enough to
handle running multiple commands in quick succession and you can
pipeline them as well. Why munge data going in when you can simply
increment each counter? Especially when you’d need to rework the data
coming back out.
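To see why pipelining keeps the multi-window writes cheap, here is a toy model of a pipeline: commands queue locally and go out in one round trip when executed. It loosely mirrors the shape of a client-library pipeline object, but is not any real client's API.

```python
class FakePipeline:
    # Toy model of Redis pipelining: commands are buffered client-side,
    # then sent to the server in a single round trip on execute().
    def __init__(self):
        self.queued = []

    def zincrby(self, key, amount, member):
        self.queued.append(("zincrby", key, amount, member))
        return self

    def execute(self):
        sent, self.queued = self.queued, []
        return sent

pipe = FakePipeline()
for key in ("sitename:hourly:2013:05:29:13", "sitename:daily:2013:05:29"):
    pipe.zincrby(key, 1, "page:42")
sent = pipe.execute()  # both increments leave in one network round trip
```

One hit, several windows, one round trip: the per-hit cost stays flat no matter how many windows you decide to track.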

By implementing it this way you eliminate a lot of overhead. You don’t
have to manage the data expiration, you don’t have to do calculations in
code, you make one call to retrieve precisely the data you are after,
and you store data in an extensible fashion which makes adding new
counters trivial.

Choose the structure for your data based on how you will access it, and
you can eliminate a bunch of code and logic. While not everything is as
obvious as this common example, the mindset it teaches is what is
important. Once you get the hang of it you’ll know immediately whether
the data fits a traditional relational DB model or a Data Structure
Server model. When it fits the Redis way, the design will be quick and
easy for you to see, saving you significant design and coding time.