Clustering overview

A RabbitMQ broker is a logical grouping of one or several Erlang nodes, each running the RabbitMQ application and sharing users, virtual hosts, queues, exchanges, etc. Sometimes we refer to the collection of nodes as a cluster.

All data/state required for the operation of a RabbitMQ broker is replicated across all nodes, for reliability and scaling, with full ACID properties. An exception to this is message queues, which currently reside only on the node that created them, though they are visible and reachable from all nodes. Future releases of RabbitMQ will introduce migration and replication of message queues.

The easiest way to set up a cluster is by auto-configuration using a default cluster config file. See the clustering transcripts for an example.

The composition of a cluster can be altered dynamically. All RabbitMQ brokers start out as running on a single node. These nodes can be joined into clusters, and subsequently turned back into individual brokers again.

RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will.

A node can be a RAM node or a disk node. RAM nodes keep their state only in memory (with the exception of the persistent contents of durable queues which are still stored safely on disk). Disk nodes keep state in memory and on disk. As RAM nodes don’t have to write to disk as much as disk nodes, they can perform better. Because state is replicated across all nodes in the cluster, it is sufficient to have just one disk node within a cluster, to store the state of the cluster safely. Beware, however, that RabbitMQ will not stop you from creating a cluster with only RAM nodes. Should you do this, and suffer a power failure to the entire cluster, the entire state of the cluster, including all messages, will be lost.

Clustering transcript

The following is a transcript of setting up and manipulating a RabbitMQ cluster across three machines - rabbit1, rabbit2, rabbit3 - with two of the machines replicating data in RAM and on disk, and the third replicating data in RAM only.

We assume that the user is logged into all three machines, that RabbitMQ has been installed on the machines, and that the rabbitmq-server and rabbitmqctl scripts are in the user’s PATH.

Initial setup

Erlang nodes use a cookie to determine whether they are allowed to communicate with each other – for two nodes to be able to communicate they must have the same cookie.

The cookie is just a string of alphanumeric characters. It can be as long or short as you like.

Erlang will automatically create a random cookie file when the RabbitMQ server starts up. This is typically located in /var/lib/rabbitmq/.erlang.cookie on Unix systems and C:\Documents and Settings\Current User\Application Data\RabbitMQ\.erlang.cookie on Windows systems. The easiest way to proceed is to allow one node to create the file, and then copy it to all the other nodes in the cluster.

As an alternative, you can insert the option “-setcookie cookie” in the erl call in the rabbitmq-server and rabbitmqctl scripts.

Starting independent nodes

Clusters are set up by re-configuring existing RabbitMQ nodes into a cluster configuration. Hence the first step is to start RabbitMQ on all nodes in the normal way:
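For example, on the three machines of this transcript (the rabbit1$ etc. prompts indicate which machine each command runs on):

```shell
rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached
```

This gives us three independent brokers, one on each machine; running rabbitmqctl status on each machine can be used to confirm that each broker is up.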

The node name of a RabbitMQ broker started from the rabbitmq-server shell script is rabbit@shorthostname, where the short node name is lower-case (as in rabbit@rabbit1). If you use the rabbitmq-server.bat batch file on Windows, the short node name is upper-case (as in rabbit@RABBIT1). When you type node names, case matters, and these strings must match exactly.

Creating the cluster

In order to link up our three nodes in a cluster, we tell two of the nodes, say rabbit@rabbit2 and rabbit@rabbit3, to join the cluster of the third, say rabbit@rabbit1.

We first join rabbit@rabbit2 as a ram node in a cluster with rabbit@rabbit1. To do that, on rabbit@rabbit2 we stop the RabbitMQ application, reset the node, join the rabbit@rabbit1 cluster, and restart the RabbitMQ application.
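A sketch of those steps using the rabbitmqctl commands named above (rabbit@rabbit2 becomes a ram node because it does not list itself in the cluster command):

```shell
rabbit2$ rabbitmqctl stop_app
rabbit2$ rabbitmqctl reset
rabbit2$ rabbitmqctl cluster rabbit@rabbit1
rabbit2$ rabbitmqctl start_app
```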

Now we join rabbit@rabbit3 as a disk node to the same cluster. The steps are identical to the ones above, except that we list rabbit@rabbit3 itself as a node in the cluster command in order to turn it into a disk node rather than a ram node.
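The corresponding commands on rabbit@rabbit3, which lists itself in order to become a disk node, might look like this:

```shell
rabbit3$ rabbitmqctl stop_app
rabbit3$ rabbitmqctl reset
rabbit3$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit3
rabbit3$ rabbitmqctl start_app
```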

By following the above steps we can add new nodes to the cluster at any time, while the cluster is running.

Note: for the rabbitmqctl cluster command to succeed, the target nodes need to be active. It is possible to cluster with offline nodes; for this purpose, use the rabbitmqctl force_cluster command.

Changing node types

We can change the type of a node from ram to disk and vice versa. Say we wanted to reverse the types of rabbit@rabbit2 and rabbit@rabbit3, turning the former from a ram node into a disk node and the latter from a disk node into a ram node. To do that we simply stop the RabbitMQ application, change the type with an appropriate cluster command, and restart the application.
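For example (whether a node ends up as a ram or disk node is determined by whether it lists itself in its own cluster command):

```shell
# rabbit@rabbit2 lists itself, so it becomes a disk node.
rabbit2$ rabbitmqctl stop_app
rabbit2$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit2
rabbit2$ rabbitmqctl start_app

# rabbit@rabbit3 omits itself, so it becomes a ram node.
rabbit3$ rabbitmqctl stop_app
rabbit3$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit2
rabbit3$ rabbitmqctl start_app
```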

The significance of specifying both rabbit@rabbit1 and rabbit@rabbit2 as the cluster nodes for rabbit@rabbit3 is that in case of failure of either of them, rabbit@rabbit3 can still connect to the cluster when it starts, and operate normally. This is only important for ram nodes; disk nodes automatically keep track of the cluster configuration.

Restarting cluster nodes

Nodes that have been joined to a cluster can be stopped at any time. It is also ok for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically “catch up” with the other cluster nodes when they start up again.

We shut down the nodes rabbit@rabbit1 and rabbit@rabbit3 and check on the cluster status at each step:
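For example (the exact status output varies by version; in the versions this guide covers it includes nodes and running_nodes lists, and the running_nodes list shrinks as nodes stop):

```shell
rabbit1$ rabbitmqctl stop
rabbit2$ rabbitmqctl status
rabbit3$ rabbitmqctl stop
rabbit2$ rabbitmqctl status
```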

All disk nodes must be running for certain operations, most notably leaving a cluster, to succeed.

At least one disk node should be running at all times.

When all nodes in a cluster have been shut down, restarting any node will wait for up to 30 seconds and then fail if the last disk node that was shut down has not been restarted yet. Since the nodes do not know what happened to that last node, they have to assume that it holds a more up-to-date version of the broker state. Hence, in order to preserve data integrity, they cannot resume operation until that node is restarted.

Breaking up a cluster

Nodes need to be removed explicitly from a cluster when they are no longer meant to be part of it. This is particularly important in case of disk nodes since, as noted above, certain operations require all disk nodes to be up.

We first remove rabbit@rabbit3 from the cluster, returning it to independent operation. To do that, on rabbit@rabbit3 we stop the RabbitMQ application, reset the node, and restart the RabbitMQ application.
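A sketch of those steps:

```shell
rabbit3$ rabbitmqctl stop_app
rabbit3$ rabbitmqctl reset
rabbit3$ rabbitmqctl start_app
```

rabbit@rabbit1 can be removed from the cluster in exactly the same way.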

Note that rabbit@rabbit2 retains the residual state of the cluster, whereas rabbit@rabbit1 and rabbit@rabbit3 are freshly initialised RabbitMQ brokers. If we want to re-initialise rabbit@rabbit2 we follow the same steps as for the other nodes:
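Since rabbit@rabbit2 is by now the last remaining member of the cluster, the steps use force_reset rather than reset:

```shell
rabbit2$ rabbitmqctl stop_app
rabbit2$ rabbitmqctl force_reset
rabbit2$ rabbitmqctl start_app
```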

Note that we used force_reset here. The reason is that removing a node from a cluster updates only the node-local cluster configuration, and calling reset makes the node connect to the other nodes it believes are in the cluster, to perform some house-keeping that is necessary when leaving a cluster. At this point, however, there are no other nodes in the cluster, but rabbit@rabbit2 doesn’t know this. Calling reset would therefore fail, since it cannot connect to rabbit@rabbit1 or rabbit@rabbit3; hence the use of force_reset, with which rabbit@rabbit2 does not attempt to contact any other nodes in the cluster. This situation only arises when resetting the last remaining node of a cluster.

Auto-configuration of a cluster

Instead of configuring clusters “on the fly” using the cluster command, clusters can also be set up via the RabbitMQ .config file; see the configuration guide for details. The file should set the cluster_nodes field in the rabbit application to a list of cluster nodes.

Listing cluster nodes this way has the same effect as using the cluster command. However, the latter takes precedence over the former, i.e. the default cluster configuration is ignored subsequent to any successful invocation of the cluster command, until the node is reset.

A common use of cluster configuration via the RabbitMQ config file is to automatically configure nodes to join a common cluster. For this purpose the same configuration can be set on all nodes, thus specifying the potential disk nodes for the cluster.

Say we want to join the three separate nodes of our running example back into a single cluster, with rabbit@rabbit1 and rabbit@rabbit2 being the disk nodes of the cluster. First we reset and stop all nodes; NB: this step would not be necessary if this were a fresh installation of RabbitMQ.
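A sketch of the corresponding configuration, identical on all three nodes (the quoting of node names follows Erlang term syntax; the file's name and location depend on your platform, see the configuration guide):

```erlang
[{rabbit, [{cluster_nodes, ['rabbit@rabbit1', 'rabbit@rabbit2']}]}].
```

After resetting and stopping all three brokers, placing this in each node's config file and starting the servers again should join them into a single cluster, with rabbit@rabbit1 and rabbit@rabbit2 as the disk nodes and rabbit@rabbit3 as a ram node.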

Note that, in order to remove a node from an auto-configured cluster, it must first be removed from the rabbitmq.config files of the other nodes in the cluster. Only then can it be reset safely.

Upgrading clusters

When upgrading from one version of RabbitMQ to another, RabbitMQ will automatically update its persistent data structures if necessary. In a cluster, this task is performed by the first disk node to be started (the “upgrader” node). Therefore when upgrading a RabbitMQ cluster, you should not attempt to start any RAM nodes first; any RAM nodes started will emit an error message and fail to start up.

All nodes in a cluster must be running the same versions of Erlang and RabbitMQ, although they may have different plugins installed. It is therefore necessary to stop all nodes in the cluster and then start them all when performing an upgrade.

While not strictly necessary, it is a good idea to decide ahead of time which disk node will be the upgrader, stop that node last, and start it first. Otherwise changes to the cluster configuration that were made between the upgrader node stopping and the last node stopping will be lost.

Automatic upgrades are only possible from RabbitMQ versions 2.1.1 and later. If you have an earlier cluster, you will need to rebuild it to upgrade.

A cluster on a single machine

Under some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine. This would typically be useful for experimenting with clustering on a desktop or laptop without the overhead of starting several virtual machines for the cluster. The two main requirements for running more than one node on a single machine are that each node should have a unique name and bind to a unique port / IP address combination for each protocol in use.

You can start multiple nodes on the same host manually by repeated invocation of rabbitmq-server (rabbitmq-server.bat on Windows). For each invocation you must set the environment variables RABBITMQ_NODENAME and RABBITMQ_NODE_PORT to suitable values; two such invocations will start two nodes, which can then be clustered. When the management plugin is installed, each node's management listener port must also be made unique.
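For example, a sketch of two such invocations (the second node name, hare, and the port numbers are illustrative):

```shell
RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=hare rabbitmq-server -detached
```

If the management plugin is installed, its listener port must additionally be set to a distinct value for each node.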

Firewalled nodes

Clustered nodes may need to communicate through firewalls even when they are in a data center or on an otherwise reliable network. Clustering is not recommended over a WAN or when the network links between nodes are unreliable; the shovel plugin is a better solution in that case.

If different nodes of a cluster are in the same data center but behind firewalls, additional configuration is necessary to ensure inter-node communication. Erlang makes use of a Port Mapper Daemon (epmd) for resolution of node names in a cluster. Nodes must be able to reach each other and the port mapper daemon for clustering to work.

The default epmd port is 4369, but this can be changed using the ERL_EPMD_PORT environment variable. All nodes must use the same port. Firewalls must permit traffic on this port to pass between clustered nodes. For further details see the Erlang epmd manpage.

Once a distributed Erlang node address has been resolved via epmd, other nodes will attempt to communicate directly with that address using the Erlang distributed node protocol. The port range for this communication can be configured with two parameters for the Erlang kernel application:

inet_dist_listen_min

inet_dist_listen_max

Firewalls must permit traffic in this range to pass between clustered nodes (assuming all nodes use the same port range). The default port range is unrestricted.
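For example, a sketch of a kernel configuration that restricts distribution traffic to ports 9100-9105 (an arbitrary choice), placed in each node's config file:

```erlang
[{kernel, [{inet_dist_listen_min, 9100},
           {inet_dist_listen_max, 9105}]}].
```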

The Erlang kernel_app manpage contains more details on the port range that distributed Erlang nodes listen on. See the configuration page for information on how to create and edit a configuration file.