Model classes

Do not just “use redis” in random classes, any more than you would write SQL
queries in, say, a controller. Instead, wrap your Redis usage in a model class.
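A minimal sketch of such a model class, using the `DriverStatus` example from the key-naming section below. The method names and the injected client are illustrative, not an existing API:

```ruby
# Illustrative sketch: a model class that owns all Redis access for
# driver statuses. The injected client only needs #set and #get.
class DriverStatus
  def initialize(redis)
    @redis = redis
  end

  def update(driver_id, status)
    @redis.set(key(driver_id), status)
  end

  def fetch(driver_id)
    @redis.get(key(driver_id))
  end

  private

  # Key construction lives here, not scattered across callers.
  def key(driver_id)
    "driver_status:#{driver_id}"
  end
end
```

Callers never see key names or Redis commands; they only talk to the model.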

Key naming

Separate keys with :

The first component should be the underscored name of the model class

For example:

# DriverStatus
driver_status:{driver_id}

High-cardinality key components should be at the end (so we can look at the
“tree of keys” meaningfully). So, for a set of order IDs…

Good:

users:order_ids:{user_id}

Bad:

users:{user_id}:order_ids
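A key-building helper that follows both rules (underscored model name first, high-cardinality component last) might look like this; the method name is illustrative:

```ruby
# Illustrative helper: model name first, ":" separator, user_id last.
def order_ids_key(user_id)
  "users:order_ids:#{user_id}"
end
```

Keeping the variable component last means a pattern like `SCAN 0 MATCH users:order_ids:*` groups the whole “tree” of keys for this model.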

Data modeling

It is acceptable, but not required, to have exactly one hash key per record.
The one-hash-per-record approach mimics ActiveRecord more closely, but can
defeat the purpose of using Redis, which offers faster, more advanced data
structures.

Like other non-relational stores, Redis is often best used by storing data in a
format that’s friendly to the heaviest queries: the example below illustrates a
case where each record has only an ID and an enumerated field, and uses sets
instead. Another typical approach is to store one hash per record, but also keep
“index” keys; for instance, one could speed up geographical lookup of
restaurants by…

storing each restaurant’s data in a hash key;

using a sorted set of IDs scored by geohashes for extremely fast bounding-box
searches
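A crude sketch of the sorted-set scoring: interleave quantized latitude/longitude bits into one integer (a binary geohash), so nearby restaurants get nearby scores and a `ZRANGEBYSCORE` window approximates a bounding box. The function name and bit width are assumptions; newer Redis versions also ship GEOADD, which is built on the same idea.

```ruby
# Illustrative: interleave quantized lat/lng bits into a Z-order score.
# Nearby coordinates tend to produce nearby scores.
def geo_score(lat, lng, bits = 16)
  y = (((lat + 90.0) / 180.0) * ((1 << bits) - 1)).round
  x = (((lng + 180.0) / 360.0) * ((1 << bits) - 1)).round
  score = 0
  bits.times do |i|
    score |= ((x >> i) & 1) << (2 * i)      # even bits: longitude
    score |= ((y >> i) & 1) << (2 * i + 1)  # odd bits: latitude
  end
  score
end
```

Usage would then be along the lines of `redis.zadd("restaurants:by_geo", geo_score(lat, lng), restaurant_id)`.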

Memory usage

When designing storage for a Redis-backed model, be aware of how efficient each
Redis data structure is if you’re going to store a lot of data.

Key tidbits:

One-hash-per-record can be very inefficient, as the hash keys are repeated
for each record. Columnar storage might be preferred, or (if the whole hash
will always be needed) packing record data with MessagePack and replacing long
keys with shorter ones.

While querying large sets, sorted sets, or hashes is normally faster than
querying the root keyspace (e.g. for membership checks), be aware that those
data structures have a memory overhead.
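For the “replacing long keys with shorter ones” point, a sketch of mapping verbose hash field names to one-byte ones before writing; `FIELD_MAP` and the field names are made up:

```ruby
# Illustrative mapping from verbose hash field names to one-byte ones,
# so the field name isn't repeated at full length for every record.
FIELD_MAP = { "restaurant_name" => "n", "street_address" => "a" }.freeze

def shorten_fields(attrs)
  attrs.transform_keys { |k| FIELD_MAP.fetch(k, k) }
end

def expand_fields(attrs)
  reverse = FIELD_MAP.invert
  attrs.transform_keys { |k| reverse.fetch(k, k) }
end
```

The model class would call `shorten_fields` before HSET and `expand_fields` after HGETALL, so callers never see the short names.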

Scalability and sharding

Do not hold data for all records of a given model in a single key; this breaks
the shardability of Redis.

Sharding is the practice of scaling Redis horizontally by deterministically
reading and writing certain keys from a given server, based on a hash of the
key. This is built into Redis clients and has even better
support in Redis 3. Splitting large
datasets into multiple keys (partitions) means they can easily be sharded across a cluster,
when we need to, without major refactoring.

Partitioning should be considered for any data structure that exceeds a few
thousand entries (lists, sets, etc), and is likely to grow as time passes or the
business grows.
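A sketch of deterministic partitioning: hash the member to pick one of N sub-keys, so a large set becomes N independently shardable keys. The partition count and helper name are illustrative:

```ruby
require "zlib"

# Illustrative: choose 1 of PARTITIONS sub-keys deterministically from
# the member value, so every client agrees on where a member lives.
PARTITIONS = 256

def partition_key(base, member)
  bucket = Zlib.crc32(member.to_s) % PARTITIONS
  "#{base}:#{bucket}"
end
```

A membership check then becomes `redis.sismember(partition_key("users:order_ids", id), id)`, and the 256 keys can be spread across servers by the client's key hashing.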

Example

I have a set of payment method fingerprints. I want to be able to rapidly
determine whether a fingerprint is marked as fraudulent.

The backing store for this is 2×256 Redis sets: 256 for each of the “good” and
“bad” fingerprint statuses. Having 256 buckets per status lets us easily shard
the data when we need to.

Note that the number of partitions can be tuned; in this particular case, the
primary purpose is to allow for clustering.

The class exposes .find_by(id:), #save, and #save! as any Rails user would
expect.
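A sketch of that class; only `.find_by(id:)`, `#save`, and `#save!` come from the text above, so the key scheme, bucket hashing, and client wiring here are all assumptions:

```ruby
require "zlib"

# Illustrative sketch of the fingerprint store: 256 buckets per status,
# bucket chosen by CRC32 of the fingerprint.
class PaymentMethodFingerprint
  BUCKETS = 256
  STATUSES = %i[good bad].freeze

  class << self
    attr_writer :redis  # inject any Redis-compatible client

    def redis
      @redis ||= Redis.new  # assumed default connection
    end

    def key(status, id)
      "payment_method_fingerprint:#{status}:#{Zlib.crc32(id.to_s) % BUCKETS}"
    end

    def find_by(id:)
      STATUSES.each do |status|
        return new(id: id, status: status) if redis.sismember(key(status, id), id)
      end
      nil
    end
  end

  attr_reader :id, :status

  def initialize(id:, status:)
    @id = id
    @status = status
  end

  def save
    self.class.redis.sadd(self.class.key(status, id), id)
    true
  end
  alias save! save  # sketch only; a real save! would raise on failure
end
```

Checking whether a fingerprint is fraudulent is then `PaymentMethodFingerprint.find_by(id: fp)&.status == :bad`, costing at most one SISMEMBER per status.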