Scaling, sharding and replication

RethinkDB allows you to shard and replicate your cluster on a per-table basis. Settings can be controlled easily from the web administration console. In addition, ReQL commands for table configuration allow both scripting capability and more fine-grained control over replication, distributing replicas for individual tables across user-defined groups of servers using server tags.

Multi-datacenter setup

To group servers together in data centers, RethinkDB uses Server tags. Servers can be “tagged” with one or more group names on startup:

rethinkdb --server-tag data_center_1

Once a server has been given a tag, the tags may be used to assign table replicas to servers with the same tags using the reconfigure command. Read the section of this document on Server tags for more details.

Running a proxy node

Once you have several machines in a RethinkDB cluster, you can improve your cluster’s efficiency by running a proxy node on each application server and having the client application connect to the proxy on localhost.

A proxy node doesn’t store any data; instead it acts as a query router. This offers some performance advantages:

The proxy will send queries directly to the correct machines, reducing intracluster traffic.

If you’re using changefeeds, the proxy will de-duplicate changefeed messages sent from other cluster nodes, further reducing traffic.

The proxy node can do some query processing itself, reducing CPU load on database servers.

To run a proxy node, simply use the proxy command line option on startup.

rethinkdb proxy --join hostname:29015

Sharding and replication via the web console

When using the web UI, simply specify the number of shards you want, and based on the data available RethinkDB will determine the best split points to maintain balanced shards. To shard your data:

Go to the table view (Tables → table name).

Click on the Reconfigure button.

Set the number of shards and replicas you would like.

Click on the Apply Configuration button.

A table may have up to 64 shards.

Sharding and replication via ReQL

There are three primary commands for changing sharding and replication in ReQL. In addition, there are lower-level values that can be changed by manipulating system tables.

If no tags are specified on startup, the server will be started with one tag, default. Changing the sharding/replica information from the web UI or from ReQL commands that do not specify server tags will affect all servers with the default tag.

The web UI only affects servers with the default tag. If you remove the default tag from a server or start it without that tag, it will not be used for tables configured through the web UI.

When servers are tagged, you can use the tags in the reconfigure command. To assign 3 replicas of the users table to us_west and 2 to us_east:

Note that tables are configured on creation and when the reconfigure command is called, but the configurations are not stored by the server otherwise. To reconfigure tables consistently—especially if your configuration uses server tags—you should save the configuration in a script. Read more about this in Administration tools.

Write acks and durability

Two settings for tables, write acknowledgements and write durability, cannot be set through either the web interface or the reconfigure command. They must be set by modifying the table_config table for individual tables.

The write acknowledgement setting for a table controls when the cluster acknowledges a write request as fulfilled. There are two possible settings:

majority: The cluster sends the acknowledgement when the majority of replicas have acknowledged it. This is the default.

single: The cluster sends the acknowledgement when any replica has acknowledged it.

The durability setting for a table controls when writes are committed. In hard durability mode, writes are committed to disk before acknowledgements are sent; in soft mode, writes are acknowledged immediately upon receipt. The soft mode is faster but slightly less resilient to failure.