Category Archives: Redis

A few folks have asked me for a quick tutorial on setting up Redis under systemd on Ubuntu Linux 16.04.

I have blogged quite a bit about Redis in general – https://gennadny.wordpress.com/category/redis/ – but here is a quick line on Redis itself. Redis is an in-memory key-value store known for its flexibility, performance, and wide language support, which makes it one of the most popular key-value data stores in existence today. Below are the steps to install and configure it to run under systemd on Ubuntu 16.04 and above.

After the binaries are compiled, run the test suite to make sure everything was built correctly. You can do this by typing:

$ make test

This will typically take a few minutes to run. Once it is complete, you can install the binaries onto the system by typing:

$ sudo make install

Now we need to configure Redis to run under systemd. Systemd is an init system used in Linux distributions to bootstrap the user space and manage all processes subsequently, instead of the UNIX System V or Berkeley Software Distribution (BSD) init systems. As of 2016, most Linux distributions have adopted systemd as their default init system.

To start off, we need to create a configuration directory. We will use the conventional /etc/redis directory, which can be created by typing:

$ sudo mkdir /etc/redis

Now, copy over the sample Redis configuration file included in the Redis source archive:

$ sudo cp /tmp/redis-stable/redis.conf /etc/redis

Next, we can open the file to adjust a few items in the configuration:

$ sudo nano /etc/redis/redis.conf

In the file, find the supervised directive. Currently, this is set to no. Since we are running an operating system that uses the systemd init system, we can change this to systemd:
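After the change, the relevant excerpt of redis.conf looks like this:

. . .
# If you run Redis from upstart or systemd, Redis can interact with your
# supervision tree.
supervised systemd
. . .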

Next, find the dir directive. This option specifies the directory that Redis will use to dump persistent data. We need to pick a location that Redis has write permission to and that isn't viewable by normal users.
We will use the /var/lib/redis directory for this, which we will create:

$ sudo mkdir /var/lib/redis

. . .
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir /var/lib/redis
. . .

Save and close the file when you are finished.

Next, we can create a systemd unit file so that the init system can manage the Redis process.
Create and open the /etc/systemd/system/redis.service file to get started:
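A minimal unit file along these lines works; this sketch assumes the binaries were installed to /usr/local/bin by make install and that a dedicated redis user and group exist:

[Unit]
Description=Redis In-Memory Data Store
After=network.target

[Service]
User=redis
Group=redis
ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf
ExecStop=/usr/local/bin/redis-cli shutdown
Restart=always

[Install]
WantedBy=multi-user.target

After saving it, reload systemd with sudo systemctl daemon-reload and start the service with sudo systemctl start redis.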

Redis Sentinel provides high availability for Redis. If you have ever run SQL Server mirroring or Oracle GoldenGate, the concept should be somewhat familiar. To start, you need Redis replication configured with a master and N slaves. From there, you run Sentinel daemons, be it on your application servers or on the servers Redis runs on; these keep track of the master's health.

Say we have a master “A” replicating to slaves “B” and “C”. We have three Sentinels (s1, s2, s3) running on our application servers, which write to Redis. At this point “A”, our current master, goes offline. Our sentinels all see “A” as offline, and send SDOWN messages to each other. Then they all agree that “A” is down, so “A” is set to be in ODOWN status. From here, an election happens to see who is most ahead, and in this case “B” is chosen as the new master.

The config file for “B” is set so that it is no longer the slave of anyone. Meanwhile, the config file for “C” is rewritten so that it is no longer the slave of “A” but rather “B.” From here, everything continues on as normal. Should “A” come back online, the Sentinels will recognize this, and rewrite the configuration file for “A” to be the slave of “B,” since “B” is the current master.

The current version of Sentinel is called Sentinel 2. It is a rewrite of the initial Sentinel implementation using stronger and simpler to predict algorithms (that are explained in this documentation).

A stable release of Redis Sentinel has shipped since Redis 2.8. Redis Sentinel version 1, shipped with Redis 2.6, is deprecated and should not be used.

When configuring Sentinel you need to take time and decide where you want to run the Sentinel processes. Many folks recommend running them on your application servers. Presumably, if you're setting this up, you're concerned about write availability to your master; as such, Sentinels provide insight into whether or not your application server can talk to the master. However, a lot of folks decide to run Sentinel processes on their Redis instance servers, and that makes sense as well.

If you are using the redis-sentinel executable (or if you have a symbolic link with that name to the redis-server executable) you can run Sentinel with the following command line:

redis-sentinel /path/to/sentinel.conf

Otherwise you can use the redis-server executable directly, starting it in Sentinel mode:

redis-server /path/to/sentinel.conf --sentinel

You have to use a configuration file when running Sentinel (sentinel.conf), separate from the Redis configuration file (redis.conf). This file will be used by the system to save the current state, which is reloaded in case of restarts. Sentinel will simply refuse to start if no configuration file is given or if the configuration file path is not writable.

By default, Sentinel listens on TCP port 26379, so for Sentinels to work, port 26379 of your servers must be open to receive connections from the IP addresses of the other Sentinel instances. Otherwise Sentinels can't talk and can't agree about what to do, so failover will never be performed.
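A minimal sentinel.conf along these lines covers the common case. The master name mymaster and the address 127.0.0.1 6379 are placeholders; the final 2 on the monitor line is the quorum – how many Sentinels must agree the master is down before a failover starts:

port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1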

Some important items to remember on Sentinel

1. You need at least three Sentinel instances for a robust deployment.

2. As per the Redis docs, the three Sentinel instances should be placed on computers or virtual machines that are believed to fail independently – for example, different physical servers, or virtual machines running in different availability zones or application fault domains.

3. Sentinel + Redis distributed system does not guarantee that acknowledged writes are retained during failures, since Redis uses asynchronous replication. However there are ways to deploy Sentinel that make the window to lose writes limited to certain moments, while there are other less secure ways to deploy it.

4. You need Sentinel support in your clients. Popular client libraries have Sentinel support, but not all.

5. Test your setup so you know it works. Otherwise you cannot be sure it will behave as expected.

Basically, the initial setup expects all nodes running as masters, with replication enabled by manually issuing slaveof ip port in redis-cli on the future Redis slaves. Then run Sentinel and it does the rest.

Start all of your Redis nodes with the Redis config and choose a master. Then open a Redis console and set all other nodes as slaves of the given master, using the command slaveof <master ip> 6379. Then you can connect to your master and verify that all of your slave nodes are connected and syncing – run the info command in your master Redis console. The output should show you something like this:
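The replication section of the info output looks roughly like the following (illustrative values; exact fields vary by Redis version):

# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=201,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=201,lag=1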

In this case the arguments passed to the service instance will be "sentinel.1.conf --sentinel".

Make sure of following

1. The configuration file must be the last parameter of the command line. If another parameter was last, such as --service-name, it would run fine when invoked from the command line but would consistently fail when started as a service.

2. Since the service runs as Network Service by default, ensure that it has access to the directory where the log file will be written.

The new Premium tier includes all Standard-tier features, plus better performance, bigger workloads, disaster recovery, and enhanced security. Additional features include Redis persistence, which allows you to persist data stored in Redis; Redis Cluster, which automatically shards data across multiple Redis nodes, so you can create workloads using increased memory (more than 53 GB) for better performance; and Azure Virtual Network deployment, which provides enhanced security and isolation for your Azure Redis Cache, as well as subnets, access control policies, and other features to help you further restrict access.

To me, a huge disappointment with Redis on Windows (MsOpenTech Redis) and Azure has been the inability to scale out across nodes, so the news of Azure Redis Cluster is particularly welcome.

Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.

Redis Cluster also provides some degree of availability during partitions – in practical terms, the ability to continue operations when some nodes fail or are not able to communicate. However, the cluster stops operating in the event of larger failures (for example, when the majority of masters are unavailable).

So in practical terms, what do you get with Redis Cluster?

The ability to automatically split your dataset among multiple nodes

The ability to continue operations when a subset of the nodes are experiencing failures or are unable to communicate with the rest of the cluster.

Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what is called a hash slot. There are 16384 hash slots in total, and every node in a Redis Cluster is responsible for a subset of them, so for example you may have a cluster with 3 nodes, where:

Node A contains hash slots from 0 to 5500.

Node B contains hash slots from 5501 to 11000.

Node C contains hash slots from 11001 to 16383.

This makes it easy to add and remove nodes in the cluster. For example, if I want to add a new node D, I need to move some hash slots from nodes A, B, and C to D. Similarly, if I want to remove node A from the cluster I can just move the hash slots served by A to B and C. When node A is empty I can remove it from the cluster completely. Because moving hash slots from one node to another does not require stopping operations, adding and removing nodes, or changing the percentage of hash slots held by nodes, requires no downtime.
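The key-to-slot mapping itself is simple: slot = CRC16(key) mod 16384. Here is a minimal sketch in Python, using the CRC16-CCITT (XMODEM) variant the Redis Cluster spec describes; note that it ignores hash tags (the {…} syntax), which real Redis also honors:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 cluster hash slots."""
    return crc16(key) % 16384

print(hash_slot(b"foobar"))  # always the same slot, regardless of which node you ask
```

Every node knows the full slot-to-node map, so a client can be redirected to the right node from its key alone.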

Note of caution.

Redis Cluster is not able to guarantee strong consistency. In practical terms this means that under certain conditions it is possible that Redis Cluster will lose writes that were acknowledged by the system to the client.

The first reason why Redis Cluster can lose writes is because it uses asynchronous replication. This means that during writes the following happens:

Your client writes to the master A.

The master A replies OK to your client.

The master A propagates the write to its slaves A1, A2 and A3.

As you can see, A does not wait for an acknowledgement from A1, A2, A3 before replying to the client, since this would be a prohibitive latency penalty for Redis. So if your client writes something, A acknowledges the write but crashes before being able to send the write to its slaves, and one of the slaves (which did not receive the write) can be promoted to master, losing the write forever.

Still, this is really exciting news for many of us in the Azure NoSQL and distributed in-memory cache world. So I logged into the new Azure Portal and yes, creating a new Redis Cache I saw a Premium option:

As you create your Redis Premium cache you can specify the number of cluster nodes\shards, as well as the persistence model, for the first time!

A few minutes later and I have myself a working 3-node cluster:

Now I can access this cluster just like I accessed single Redis instance previously.

My next step is to dig deeper into Azure Redis Cluster, so stay tuned for updates.

In my previous posts on Redis I went through basic tutorial for MSOpenTech Redis fork, master\slave setup, configuration, Azure PaaS version and finally monitoring with INFO command. In this post I want to touch on some questions that I ran into while working with Redis at some scale.

Redis 10000 concurrent client limit.

In Redis 2.6 and above, and so in the MSOpenTech Redis fork, there is a default 10000 client limit set in the configuration file (.conf). Here is my setting in the MSOpenTech redis.windows.conf:

# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
maxclients 10000

However, Redis checks with the kernel what the maximum number of file descriptors we are able to open is (the soft limit is checked). If that limit is smaller than the maximum number of clients we want to handle plus 32 (the number of file descriptors Redis reserves for internal uses), then the maximum number of clients is modified by Redis to match the number of clients we are really able to handle under the current operating system limit.

Can I raise that limit? Yes, but as above, both the maximum number of file descriptors and the maxmemory configuration setting then become throttling factors.

The settings in this section cannot be changed using the StackExchange.Redis.IServer.ConfigSet method. If this method is called with one of the commands in this section, an exception similar to the following is thrown: StackExchange.Redis.RedisServerException: ERR unknown command 'CONFIG'.

Any values that are configurable, such as max-memory-policy, are configurable through the portal.

The Redis CLIENT command allows you to inspect the state of every connected client, to kill a specific client, and to set names on connections. It is a very powerful debugging tool if you use Redis at scale. Example:

CLIENT LIST

Consider setting a timeout appropriate to your typical client activity. By default, recent versions of Redis don't close the connection if the client is idle for many seconds: the connection will remain open forever.
However, if you don't like this behavior, you can configure a timeout, so that if the client is idle for more than the specified number of seconds, the client connection will be closed.
You can configure this limit in redis.conf or simply by using CONFIG SET timeout <seconds>.

Redis includes the redis-benchmark utility that simulates running commands done by N clients at the same time sending M total queries (it is similar to Apache's ab utility). The MSOpenTech fork retains this utility, and here I will launch it against my local Redis on Windows using the -n parameter for 100,000 requests.

I piped the output into a log, and here is what I get. I guess I am doing great, with the vast majority of test requests running at or under 1 ms:

I briefly touched on the difficulties of scaling Redis out in my previous post, and until the promised Redis Cluster is available in earnest (and how soon will it be available on Windows and Azure?), the best way to scale Redis out remains partitioning, aka sharding. Partitioning is the process of splitting your data across multiple Redis instances, so that every instance contains only a subset of your keys.

Partitioning allows for the following:

Much larger databases\Redis stores, using the sum of the memory of many computers. Without partitioning you are limited to the amount of memory a single instance can support.

It allows scaling the computational power to multiple cores and multiple computers, and the network bandwidth to multiple computers and network adapters.

Twitter, Instagram, and other heavy Redis users have implemented custom partitioning, which allowed these companies to scale Redis to their needs. As I started thinking of how to do this, a couple of methods came to mind:

Classic range partitioning. This is accomplished by mapping ranges of objects to specific Redis instances. For example, I could say users from ID 0 to ID 10000 go to instance R0, while users from ID 10001 to ID 20000 go to instance R1, and so forth. This system works and is actually used in practice; however, it has the disadvantage of requiring a table that maps ranges to instances. This table needs to be managed, and a table is needed for every kind of object, so range partitioning in Redis is often undesirable because it is much less efficient than alternative partitioning approaches.

Hash partitioning. Let's say we take the key name and use a hash function (e.g., the crc32 hash function) to turn it into a number. For example, if the key is foobar, crc32(foobar) will output something like 93024922. Then we can use a modulo operation on this number to turn it into a number between 0 and 3, so that it can be mapped to one of our Redis instances. A few Redis clients implement consistent hashing out of the box, but unfortunately some of the most popular do not.
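A sketch of that scheme in Python, using the standard library's crc32 (the instance count of 4 here is arbitrary):

```python
import zlib

def pick_instance(key: str, num_instances: int) -> int:
    """Hash partitioning: CRC32 of the key name, modulo the instance count."""
    return zlib.crc32(key.encode("utf-8")) % num_instances

# The same key always lands on the same Redis instance:
for key in ("foobar", "user:1001", "user:1002"):
    print(key, "-> R%d" % pick_instance(key, 4))
```

Plain modulo hashing reshuffles most keys when the instance count changes; consistent hashing avoids that, which is why client support for it matters.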

As you may know, many features of Redis, such as operations and transactions involving multiple or intersecting keys, will not work, and adding or removing capacity from Redis will be tricky to say the least. When partitioning is used, data handling is more complex; for instance, you have to handle multiple RDB/AOF files, and to make a backup of your data you need to aggregate the persistence files from multiple instances and hosts. Moreover, what is upsetting is that with the C# StackExchange client connecting to MSOpenTech Redis on Windows or Azure, there isn't anything already built in for you, so you will have to build your own. Also, partitioning may be fine for Redis used as a cache store, but for a data store it may be an issue. For more see http://redis.io/topics/partitioning. Some interesting examples are here – http://petrohi.me/post/6323289515/scaling-redis, and a Twitter proxy implementation – http://antirez.com/news/44

In my previous series on Redis I showed a basic Redis tutorial, its abilities to work with complex data types, persist, and scale out. In this post the idea is to show basic Redis monitoring facilities through the Redis CLI. To analyze Redis metrics you will need to access the actual data; Redis metrics are accessible through the Redis command line interface (redis-cli). So first I will start my MSOpenTech Redis on Windows Server. As I ran into an issue with the default memory mapped file being too large for the disk space on my laptop, I will feed it on startup a configuration file (conf) which has the maxmemory parameter cut to 256 MB. Otherwise I would get an error like:

The Windows version of Redis allocates a large memory mapped file for sharing the heap with the forked process used in persistence operations. This file will be created in the current working directory or the directory specified by the 'heapdir' directive in the .conf file. Windows is reporting that there is insufficient disk space available for this file (Windows error 0x70).

Since I changed the maxmemory parameter but not the maxheap parameter, which stayed at its default of 1.5 times maxmemory, my maxheap will be 384 MB (256*1.5). I do have that much disk space on my laptop for the memory mapped file. As Redis starts I see the now familiar screen:

Now I can navigate to CLI.

We will use the info command to print important information and metrics on our Redis server. You can use the info command to get information on the following:

server

clients

memory

persistence

stats

replication

cpu

commandstats

cluster

keyspace

So let's start by getting general information by running info server

With Redis, info memory will probably be one of the most useful commands. Here is me running it:

The used_memory metric reports the total number of bytes allocated by Redis. The used_memory_human metric gives the same value in a more readable format.

These metrics reflect the amount of memory that Redis has requested to store your data and the supporting metadata it needs to run. Due to the way memory allocators interact with the underlying OS, these metrics don't account for memory "lost" to fragmentation, and the amount of memory reported here may always differ from what is reported by the OS.

Memory is critical to Redis performance. If the amount of memory used exceeds available memory (used_memory > total available memory), the OS will begin swapping, and older\unused memory pages will be written to disk to make room for newer\more used pages. Writing to or reading from disk is of course much slower than reading from or writing to memory, and this will have a profound effect on Redis performance. By looking at the used_memory metric together with OS metrics we can tell if the instance is at risk of swapping or swapping has begun.

Next we can get some useful statistics by running info stats

The total_commands_processed metric gives the number of commands processed by the Redis server. These commands come from clients connected to the server. Each time Redis completes one of 140 possible commands, this metric (total_commands_processed) is incremented. It can be used for rough throughput measurement and queuing discovery: if by repeatedly querying this metric (via an automated batch or script, for example) you see slowdowns and spikes in total_commands_processed, this may indicate queuing.

Note that none of these commands really measure latency to the server. I found out that there is a way to measure it in the Redis CLI. If you open a separate command window, navigate to your Redis directory, and run redis-cli.exe --latency -h <server> -p <port>, you can get that metric:

The times will depend on your actual setup, but I have read that on a typical 1 Gbit/sec network it should average well under 200 ms. Anything above that probably points to an issue.

Back in stats, another important metric is evicted_keys. This is similar to other such systems, for example AppFabric Cache, where I profiled a similar metric before. The evicted_keys metric gives the number of keys removed by Redis due to hitting the maxmemory limit. Interestingly, if you don't set that limit, evictions do not occur, but instead you may see something worse, such as swapping and running out of memory. Think of evictions, therefore, as a protection mechanism. Also interesting is that, when encountering memory pressure and electing to evict, Redis doesn't necessarily evict the oldest item; instead it relies on either the LRU (Least Recently Used) or TTL (Time To Live) cache policies. You have the option to select between the LRU and TTL eviction policies in the config file by setting maxmemory-policy to "volatile-lru" or "volatile-ttl" respectively. If you are using Redis as an in-memory cache server with expiring keys, TTL makes more sense; otherwise, if you are using it with non-expiring keys, you will probably choose LRU.
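In redis.conf the pair of directives looks like this (256mb matches the limit used earlier in this post; the policy value is the part you choose):

maxmemory 256mb
maxmemory-policy volatile-lru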

In my previous posts I introduced Redis, in particular the Microsoft port of that open source technology to Windows by MsOpenTech. In this post I want to show how you can use the Azure Cache version of Redis based on that port, which went to general availability around the time of the Microsoft TechEd 2014 conference in May 2014.

Creating Azure Redis Cache:

First you will log in with your credentials to the new Azure Portal at https://portal.azure.com. Pick Browse on the home page and the New button in the lower left corner, then pick Redis Cache.

First enter the DNS name; that will be the subdomain name to use for the cache endpoint. The endpoint must be a string between six and twenty characters, contain only lowercase letters and numbers, and must start with a letter.

Use Pricing Tier to select the desired cache size and features. Azure Redis Cache is available in the following two tiers.

For Subscription, select the Azure subscription that you want to use for the cache. In Resource group, select or create a resource group for your cache.

Use Geolocation to specify the geographic location in which your cache is hosted. For the best performance, Microsoft strongly recommends that you create the cache in the same region as the cache client application.

At this time I will hit Create and my cache will be created. After a bit, here is what you see on the portal:

Using Azure Redis Cache

We start by getting the connection details; this requires two steps:

1. Get the URI, which is easy to get using the properties window, as shown below

2. Once you have copied the host name URL, we need to copy the password, which you can grab from the Keys area

Now that we have these two important bits that we need to connect, let's start using our cache service.

Just like in my on-premises demo previously, I will use the StackExchange.Redis package from NuGet in my client. Below is a very simple, basic way to put\retrieve strings from Azure Redis:

And yet the above demo is a bit boring, as you usually will not store cache items like this in real life. Therefore the snippet below will use complex classes with properties and the magic of JSON serialization to put these in Redis Cache:
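The pattern is language-agnostic: serialize the object to a JSON string, store it under a key, and deserialize it on the way out. Here is a sketch in Python using a plain dict as a stand-in for the Redis server, so the round trip is illustrated without a live connection; with StackExchange.Redis you would call StringSet/StringGet on the database instead:

```python
import json

store = {}  # stand-in for the Redis server

def cache_object(key, obj):
    # Serialize the object to a JSON string and store it under the key.
    store[key] = json.dumps(obj)

def get_object(key):
    # Fetch the JSON string and deserialize it back to an object.
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None

cache_object("user:1001", {"Name": "Jane", "Score": 42})
print(get_object("user:1001")["Name"])  # Jane
```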

In my previous posts I introduced Redis and attempted to show how it can work with advanced data structures, as well as its persistence options. Another important Redis feature is master-slave asynchronous replication. Data from any Redis server can replicate to any number of slaves. A slave may be a master to another slave, which allows Redis to implement a single-rooted replication tree. Redis slaves can be configured to accept writes, permitting intentional and unintentional inconsistency between instances. Replication is useful for read (but not write) scalability or data redundancy.

How does Redis replication work? According to the Redis docs, this is the workflow for Redis asynchronous replication:

If you set up a slave, upon connection it sends a SYNC command. It doesn’t matter if it’s the first time it has connected or if it’s a reconnection.

The master then starts background saving, and starts to buffer all new commands received that will modify the dataset. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send to the slave all buffered commands. This is done as a stream of commands and is in the same format of the Redis protocol itself.

Slaves are able to automatically reconnect when the master <-> slave link goes down for some reason. If the master receives multiple concurrent slave synchronization requests, it performs a single background save in order to serve all of them.

When a master and a slave reconnect after the link went down, a full resync is always performed. However, starting with Redis 2.8, a partial resynchronization is also possible.

So Redis master-slave replication can be useful in a number of scenarios:

Scaling performance by using the replicas for intensive read operations.

Data redundancy in multiple locations

Offloading data persistence costs, in terms of expensive disk IO (covered in the last post), from the master by delegating it to the slaves

So, if replication is pretty useful as far as read-only scale-out goes, how do I configure it? Configuring replication is trivial: just add the following line to the slave configuration file (slave instance redis.conf):

slaveof <masterip> <masterport>

Example:

slaveof 10.84.16.18 6379

More importantly, you can use the SLAVEOF command in the Redis CLI to switch replication on the fly – http://redis.io/commands/slaveof. If a Redis server is already acting as a slave, the command SLAVEOF NO ONE will turn off replication, turning the Redis server into a master. In the proper form, SLAVEOF hostname port will make the server a slave of another server listening at the specified hostname and port.

Since Redis 2.6, slaves support a read-only mode that is enabled by default. This behavior is controlled by the slave-read-only option in the redis.conf file, and can be enabled and disabled at runtime using CONFIG SET.

That's great, but what if for HA purposes I need automated failover from master to slave? Enter Redis Sentinel – a system designed to help manage Redis instances. It does the following:

Sentinel constantly checks if your master and slave instances are working as expected

Sentinel can notify the system administrator, or another computer program, via an API, that something is wrong with one of the monitored Redis instances.

If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.

Sentinel acts as a source of authority for client service discovery: clients connect to Sentinels to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.

For more on Redis Sentinel see http://redis.io/topics/sentinel. Unfortunately the MSOpenTech port of Redis on Windows doesn't support this feature, so I couldn't easily test it here; I hope that in a future blog entry testing Redis on a Linux flavor I can show you Sentinel configuration and failover.

However, even though the above are great features, there is one item missing here that was present, for example, in AppFabric Cache – a distributed cluster capable of linear scale-out for write traffic. Yes, theoretically I can have multiple masters in Redis as well, but you would have to build some sort of sharding mechanism, as multiple folks in Silicon Valley did (Instagram and Facebook, I believe), to scale out. Fortunately, there is the new Redis Cluster project. Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.

Commands dealing with multiple keys are not supported by the cluster, because this would require moving data between Redis nodes, making Redis Cluster unable to provide Redis-like performance and predictable behavior under load. Redis Cluster also provides some degree of availability during partitions – in practical terms, the ability to continue operations when some nodes fail or are not able to communicate. So here is what you get with Redis Cluster:

The ability to continue operations when a subset of the nodes are experiencing failures or are unable to communicate with the rest of the cluster.

Every Redis Cluster node requires two open TCP connections: the normal Redis TCP port used to serve clients, for example 6379, plus the port obtained by adding 10000 to the data port, so 16379 in the example. This second, higher port is used for the cluster bus, a node-to-node communication channel using a binary protocol. The cluster bus is used by nodes for failure detection, configuration updates, failover authorization, and so forth. Clients should never try to communicate with the cluster bus port, but always with the normal Redis command port; however, make sure you open both ports in your firewall, otherwise Redis Cluster nodes will not be able to communicate.

To create a cluster, the first thing we need is a few empty Redis instances running in cluster mode. Clusters are not created using normal Redis instances; a special mode needs to be configured so that the Redis instance enables the cluster-specific features and commands. Therefore we will add the following to the configuration (redis.conf):
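A minimal cluster-mode configuration, per the Redis Cluster tutorial, looks like this (port 7000 is just an example; each instance gets its own port and its own nodes.conf):

port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes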

As you can see, what enables cluster mode is simply the cluster-enabled directive. Every instance also contains the path of a file where the configuration for this node is stored, which by default is nodes.conf. This file is never touched by humans; it is simply generated at startup by the Redis Cluster instances and updated every time it is needed.

Note that the minimal cluster that works as expected must contain at least three master nodes.

When the instances are set up and running in cluster mode, next you need to create a cluster using the Redis Cluster command line utility, redis-trib. The redis-trib utility is in the src directory of the Redis source code distribution. An example of its use would be something like:
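For example, to create a 6-node cluster with 3 masters and 3 slaves (one replica per master), with instances listening on ports 7000 to 7005, the Redis Cluster tutorial uses a command like:

$ ./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005

redis-trib then proposes a master/slave layout, asks for confirmation, and writes the cluster configuration to each node.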