I think I should write more about Redis development... lately I have been so focused on writing the code and the Redis Book that finding the time to blog about Redis was really hard, but I'll try to improve in the coming weeks. However, today I want to provide some fresh news to Redis users: some insight into the near future of a project can be very interesting for developers planning to start a new project with Redis.

Currently we have three development branches: 2.2, 2.4, and Redis Cluster (the unstable branch).

2.2 is a bugfix only development line, so we'll continue to ship 2.2.x versions only to fix bugs.

2.4 is our new branch, and it is just a few days old. Our old development model, with a stable branch and an unstable branch, did not work well: we needed something in the middle. There is simply a lot of stuff that can be backported from the unstable branch.

The unstable branch, where Redis Cluster development is happening, will take time to reach stability as the cluster is a big project (our idea is to release a first stable version of Redis Cluster later this summer). 2.4 is a way to put something better than 2.2 in the hands of our users ASAP.

We hope to ship 2.4 in an estimated time frame of 6 weeks. It will include the following changes compared to 2.2:

Memory optimized sorted sets. This means that small sorted sets will use very little memory, as small hashes, lists, and sets composed of integers already do.

Variadic versions of [LR]PUSH, SADD, ZADD, ... so you can, for instance, push multiple values into a list with a single command. I measured this with a few benchmarks and the difference is really dramatic compared to pipelining many LPUSH commands.
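To make the comparison a bit more concrete, here is a minimal sketch that encodes the same 100 pushes both ways in the Redis multi-bulk request format and compares what goes over the wire. It does not reproduce the benchmarks mentioned above; the key name mylist and the item:N values are made up for illustration, and encode_command is a hypothetical helper, not part of any client library.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in the Redis multi-bulk request format."""
    out = [b"*%d\r\n" % len(args)]          # number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n" % len(data))  # byte length of this argument
        out.append(data + b"\r\n")
    return b"".join(out)


# Hypothetical payload: 100 small values pushed to a list called "mylist".
values = [f"item:{i}" for i in range(100)]

# Pipelining: 100 separate LPUSH commands, each one gets its own reply.
pipelined = b"".join(encode_command("LPUSH", "mylist", v) for v in values)

# Variadic: a single LPUSH carrying all the values, with a single reply.
variadic = encode_command("LPUSH", "mylist", *values)

print(len(pipelined), len(variadic))
```

The variadic request is smaller because the command name and the key are sent once instead of 100 times, and the server parses one command and writes one reply instead of 100.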

Big improvements in .rdb persistence. Now specially encoded types are saved directly as they are. Just to give you an example, if you have a dataset composed of lists with an average of 100 elements you can expect 50x faster .rdb persistence.

All the above stuff is already inside Redis unstable of course, but with 2.4 it will be readily available to all users in a short time. The current 2.4 branch only includes the first two changes; I'm working on merging the last one.

How to play with Redis Cluster

We also have some news about Redis Cluster. You can try out what we already have with your own hands.
The following is a howto about testing Redis Cluster. Note: Redis Cluster is not complete, it is currently an alpha with a lot of missing features, and it is not stable. The goal here is just to provide a preview.

To play with Redis Cluster, fire up three instances with the following configuration:

port 6379
cluster-enabled yes
cluster-config-file nodes-1.conf

Use port 6379 for the first instance, 6380 and 6381 for the other two.
Also make sure to use a different cluster-config-file name for each instance: nodes-1.conf, nodes-2.conf, nodes-3.conf. The cluster config file is not something you should change by hand: it is a file where a cluster node saves its current configuration, in order to reload the state at restart.
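If you prefer not to write the three files by hand, a small script can generate them. This is just a sketch built from the three configuration lines shown above; the output file names (redis-6379.conf and so on) and the helper names are my own choice for illustration, nothing in Redis requires them.

```python
from pathlib import Path
from typing import List


def node_config(index: int, base_port: int = 6379) -> str:
    """Return the config text for the node at the given zero-based index."""
    return (
        f"port {base_port + index}\n"
        "cluster-enabled yes\n"
        f"cluster-config-file nodes-{index + 1}.conf\n"
    )


def write_configs(directory: str, count: int = 3) -> List[Path]:
    """Write one config file per node; the file names are hypothetical."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(count):
        path = target / f"redis-{6379 + index}.conf"
        path.write_text(node_config(index))
        paths.append(path)
    return paths
```

Calling write_configs("cluster-test") would create the three files in a cluster-test directory, and each instance can then be started with redis-server followed by the path of its config file.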

Now that you have three instances running you can start running some commands, for example CLUSTER NODES and CLUSTER INFO.

At this point the node only knows about a single node, that is, itself: you can tell from the "myself" flag in the cluster nodes output. The cluster info output instead shows that, out of the 4096 hash slots into which the key space is divided, none is assigned. This is why this node will not be happy to reply to queries:

redis> get foo
(error) ERR The cluster is down. Check with CLUSTER INFO for more information

So the first thing to do is to join the cluster, that is, make nodes aware that there are other nodes around, as this is a completely new cluster.

As a first step we join the instance running at 6379 with the instance running at 6380, using the CLUSTER MEET command.

Every node has an ID that will be used for the whole life of the node. All this information is saved in the nodes.conf file. The format of this file is exactly the same as the cluster nodes output, as I was too lazy to invent something new, but this actually turned out to be an advantage (less code, more descriptive node info).

Now Redis Cluster nodes are like bored old ladies, they gossip a lot about other nodes. But the good thing is that at least cluster nodes are very well informed, and only report information they are pretty sure about ;)

Every second, every node sends a PING packet to another node. The target is not actually selected at random: it is picked among the nodes that are believed to be OK, choosing the one with the oldest pong_received field in the node structure, so we tend to ping the nodes we have not chatted with for the longest time.
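The selection rule can be sketched in a few lines. This is only an illustration of the idea described above, not the actual cluster code: apart from pong_received, the Node class and its fields are hypothetical, and the real implementation may well consider only a subset of the known nodes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    name: str             # hypothetical label, e.g. "127.0.0.1:6380"
    ok: bool              # True if we currently believe the node is OK
    pong_received: float  # time at which we last got a PONG from it


def pick_ping_target(nodes: Iterable[Node]) -> Optional[Node]:
    """Pick the node believed to be OK that we heard from least recently."""
    candidates = [node for node in nodes if node.ok]
    if not candidates:
        return None
    return min(candidates, key=lambda node: node.pong_received)


known = [
    Node("127.0.0.1:6380", ok=True, pong_received=100.0),
    Node("127.0.0.1:6381", ok=True, pong_received=50.0),
    Node("127.0.0.1:6382", ok=False, pong_received=10.0),
]
target = pick_ping_target(known)
print(target.name if target else None)
```

Here the node on 6382 has the oldest pong but is skipped because it is not believed to be OK, so the node on 6381 gets the next PING.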

In every PING packet, and in the PONG reply, there is a gossip section where we pass the other node information about other nodes. Also, when a node pings or pongs another node, the packet carries a lot of detailed information about the node sending it.

For a node to be marked as failing we need both to detect that it has not replied to our pings for some time, AND to hear from another node, thanks to the gossip section, that it is having trouble with this node as well. When this happens the node marks this other node as failing, and sends a "mark-as-failed" message to all the other known nodes.
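The two conditions can be written down as a tiny predicate. Again this is a sketch of the rule as described, under my own assumptions: the NodeView class, its field names, and the timeout value are hypothetical, and the broadcast of the "mark-as-failed" message is not modeled.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class NodeView:
    """What we locally know about one remote node (hypothetical fields)."""
    last_pong: float                                     # time of its last PONG
    reported_by: Set[str] = field(default_factory=set)   # gossip complaints


def should_mark_failing(view: NodeView, now: float, timeout: float) -> bool:
    """Both conditions must hold: our own timeout AND a report via gossip."""
    timed_out = (now - view.last_pong) > timeout
    confirmed_by_gossip = len(view.reported_by) > 0
    return timed_out and confirmed_by_gossip


TIMEOUT = 15.0  # seconds, an arbitrary value chosen for this example

silent = NodeView(last_pong=0.0)
print(should_mark_failing(silent, now=60.0, timeout=TIMEOUT))  # no gossip yet

silent.reported_by.add("127.0.0.1:6381")
print(should_mark_failing(silent, now=60.0, timeout=TIMEOUT))  # now confirmed
```

Requiring a second opinion means a node that merely has a bad link to one of its peers is not declared failing by that peer alone.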

Let's test gossip in practice. Now we have 6379 joined with 6380. What happens if we join 6380 with 6381 is that 6379 and 6381 will also meet. But Redis nodes are like girls from good families: they only trust and meet with other nodes that are either already trusted (in their nodes table) or trusted by their friends. The only way to make a Redis node talk with another node that is neither in its known nodes list nor in the known nodes of another trusted node is via the CLUSTER MEET command.

Now all three nodes are connected and aware of their friends... however the nodes are still not able to reply to queries, as hash slots are not assigned at all. To assign hash slots we need to send "CLUSTER ADDSLOTS" commands. We assign a part of the 4096 slots to each node, so that all the slots are covered:
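Typing thousands of slot numbers by hand is not fun, so here is a sketch that builds the commands. It assumes the simplest possible division, one contiguous range per node, which is my assumption and not necessarily the exact split used in this walkthrough; split_slots and addslots_command are hypothetical helper names.

```python
from typing import List


def split_slots(total_slots: int, node_count: int) -> List[range]:
    """Divide the slots into contiguous ranges, one per node."""
    base, extra = divmod(total_slots, node_count)
    ranges = []
    start = 0
    for index in range(node_count):
        # The first `extra` nodes take one more slot so every slot is covered.
        size = base + (1 if index < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def addslots_command(slots: range) -> str:
    """Build the CLUSTER ADDSLOTS command text for one node."""
    return "CLUSTER ADDSLOTS " + " ".join(str(slot) for slot in slots)


for port, slots in zip((6379, 6380, 6381), split_slots(4096, 3)):
    # Print a summary only; the full command lists more than a thousand slots.
    print(port, slots[0], slots[-1], len(slots))
```

Each generated command has to be sent to the node that should own those slots, for instance by piping the output of addslots_command into redis-cli with the -p option set to that node's port.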

Yes! Now our cluster state is OK. As you can see, near every line of the cluster nodes output there is the list of assigned slots. This information is all propagated thanks to the gossip section of PING/PONG packets. We are ready to try some actual queries:

redis> get foo
(error) MOVED 3990 127.0.0.1:6381
redis> get bar
(nil)

Now the nodes finally accept our requests. The first request was about hash slot 3990, as the key 'foo' hashes to that slot, so we got routed to the right node. A good client will remember this and will directly hit the right node the next time.
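What "remembering" could look like on the client side is sketched below. The assumptions are stated loudly: the hash function is taken to be a CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant), reduced modulo 4096, which does reproduce slot 3990 for 'foo'; the error is parsed purely from the text shown in the session above; and SlotCache, key_slot and parse_moved are hypothetical names, not the API of any real client.

```python
from typing import Dict, Optional, Tuple

HASH_SLOTS = 4096  # number of hash slots at the time of this post


def crc16(data: bytes) -> int:
    """Bitwise CRC16, polynomial 0x1021, initial value 0 (assumed variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to its hash slot."""
    return crc16(key.encode("utf-8")) % HASH_SLOTS


def parse_moved(error: str) -> Optional[Tuple[int, str, int]]:
    """Parse 'MOVED <slot> <host>:<port>' into (slot, host, port)."""
    parts = error.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


class SlotCache:
    """Remember which node owns each slot we were redirected for."""

    def __init__(self) -> None:
        self._owners: Dict[int, Tuple[str, int]] = {}

    def learn(self, error: str) -> None:
        moved = parse_moved(error)
        if moved is not None:
            slot, host, port = moved
            self._owners[slot] = (host, port)

    def node_for(self, key: str) -> Optional[Tuple[str, int]]:
        return self._owners.get(key_slot(key))


cache = SlotCache()
cache.learn("MOVED 3990 127.0.0.1:6381")
print(key_slot("foo"), cache.node_for("foo"))
```

After the first redirection the cache knows that slot 3990 lives on 127.0.0.1:6381, so the next query for 'foo', or for any other key hashing to the same slot, can go straight there.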

Ok, that's all for now. I hope that, while I can't show a full solution for now, this tour of the status of Redis Cluster was more interesting than just reading my tweets about "I'm working at cluster".

Also note that to operate on a cluster you'll actually never do this kind of stuff by hand. The redis-trib program will do all this for you, but my thought was that it is a lot less instructive to just type 'redis-trib create ...'. I wanted to show a bit more of the inner workings.

@Matthew: not in 2.4, probably not even in 3.0 (the Redis Cluster stable release number), as we consider cluster a higher priority. Basically diskstore is just an experimental project, it will hit a stable release only if/when we think it rocks. I'm a bit skeptical about mixing Redis and disk as primary storage (not just for persistence) but we'll keep trying new solutions.

I am very excited to start testing with Cluster. Thank you for your time working on it, and for posting an update on your progress! Since Hiredis is the "official" client library for Redis, are there plans to evolve it from being a 'naïve' client to a 'full-featured' client as far as Cluster support is concerned? (Referencing previous Cluster terminology where a naïve client requires two round trips for a lookup (using the MOVED response to find the key) and a full-featured client will maintain (and update) a map of keys to hash slots.)

One small improvement in assigning hash buckets would be to round-robin the hash buckets across the nodes in your cluster. Using the sample hash bucket division of hash slots will end up with a pretty uneven distribution of keys.

The distribution is not very uniform for values of crc16() % 4096 on strings that only differ by one or two ascii values. Strings that are generated sequentially will tend to have similar values. If instead the hash slots are initially assigned in round-robin style (give hashslot mod TotalNodes == NodeNumber to node numbered as NodeNumber), then there will be better distribution of keys across nodes. If your keys end in the same string and differ more on the leftmost characters, then the distribution is a lot better. Even better distribution would of course be a random shuffle, though harder to manage/recover. I know, it's still early days for Redis Cluster, and I am hoping for the best!

So, 1 slot went to node 0, 4 slots to node 1, and 5 slots to node 2. Better to have (1,4,5) than (0, 8, 2).
