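The order in which you chain ReQL commands can affect performance. For example, imagine combining the previous two queries to return an ordered list of names of admin users. The filter operation can be distributed across shards, but the orderBy operation cannot. A query that orders first and filters afterwards therefore runs the filter on a single server, while a query that filters first lets each shard discard non-admin users in parallel before the results are ordered.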
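The two orderings can be sketched with the snake_case command names of the Python driver. This is a minimal sketch under assumptions: a `users` table whose documents have `name` and `role` fields (both names are illustrative), and an `r` argument that is the ReQL namespace object from the RethinkDB Python driver, supplied by the caller. The functions only build the queries; executing them would still need `.run(conn)` on an open connection.

```python
def admin_names_slow(r):
    # order_by comes first, so the filter that follows it cannot be
    # distributed across shards.
    return r.table('users').order_by('name').filter({'role': 'admin'})


def admin_names_fast(r):
    # filter comes first and can run on every shard in parallel;
    # order_by then combines the smaller result set.
    return r.table('users').filter({'role': 'admin'}).order_by('name')
```

Both functions describe the same result; only the position of the command that cannot be distributed differs.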

Commands that stop subsequent commands from being parallelized include:

order_by (with or without indexes)

distinct

eq_join

reduce, fold

limit, skip, slice

max, min, avg

Any command that requires the results from the shards to be combined on the server executing the query will finish executing on that server rather than being distributed. Optimize your queries by putting commands that can execute in parallel before commands that combine the result set whenever possible.
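The effect of command order can be illustrated with a toy model. This is not how RethinkDB is implemented; it is a simplified sketch that treats each shard as a plain Python list and counts how many rows must reach the combining step. The function names, the sample rows, and the `rows_shipped` counter are all assumptions made for illustration.

```python
def filter_then_order(shards, predicate, key):
    """Filter on each shard, then combine and order on one server."""
    # Parallelizable step: each shard discards its own non-matching rows.
    partials = [[row for row in shard if predicate(row)] for shard in shards]
    rows_shipped = sum(len(partial) for partial in partials)
    # Combining step: only the surviving rows reach the combining server.
    ordered = sorted((row for partial in partials for row in partial), key=key)
    return ordered, rows_shipped


def order_then_filter(shards, predicate, key):
    """Combine and order everything first, then filter on one server."""
    # Ordering needs the whole result set, so every row is shipped.
    rows_shipped = sum(len(shard) for shard in shards)
    ordered = sorted((row for shard in shards for row in shard), key=key)
    # The filter now runs on the combining server only.
    return [row for row in ordered if predicate(row)], rows_shipped


# Hypothetical sample data: two shards of a users table.
shards = [
    [{'name': 'dana', 'role': 'admin'}, {'name': 'bo', 'role': 'user'}],
    [{'name': 'al', 'role': 'admin'}, {'name': 'cy', 'role': 'user'}],
]
is_admin = lambda row: row['role'] == 'admin'
by_name = lambda row: row['name']

fast_rows, fast_shipped = filter_then_order(shards, is_admin, by_name)
slow_rows, slow_shipped = order_then_filter(shards, is_admin, by_name)
```

Both functions return the same rows, but in this model the filter-first version sends only the matching rows to the combining step, while the order-first version sends every row.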

Replication

RethinkDB’s defaults tend to prioritize safety over performance. One of those defaults is that queries will be sent to the primary replicas for shards, which will always have current data (although that data may be returned to a query before it’s been committed to disk).

You can increase the performance of a query by using the outdated read mode, which allows the cluster to return values from memory on arbitrarily selected replicas.

While outdated reads are faster, they offer the weakest consistency of the three read modes (single, majority, and outdated). For more information on this option, read “Balancing safety and performance” in the Consistency guarantees documentation.
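In the Python driver, the read mode is chosen with the `read_mode` optional argument of `table`. A minimal sketch, assuming a table named `users` (an illustrative name) and an `r` argument that is the ReQL namespace object supplied by the caller:

```python
def outdated_users(r):
    # read_mode='outdated' lets any replica answer from memory;
    # the default, 'single', reads from the primary replica.
    return r.table('users', read_mode='outdated')
```

Further commands such as filter can be chained onto the returned table selection as usual; the read mode applies to the whole query.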

Proxy nodes

Starting RethinkDB with the proxy command turns a server into a proxy node, which acts as a query router. This increases cluster performance by reducing intracluster traffic and, if you’re using changefeeds, de-duplicating feed messages.
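A proxy node is started from the command line by pointing it at an existing cluster member with `--join`. The sketch below only assembles that command as an argument list. It assumes the cluster member listens on the default intracluster port 29015, and the helper name and host name are hypothetical.

```python
def proxy_command(cluster_host, cluster_port=29015):
    # Equivalent to running: rethinkdb proxy --join <host>:<port>
    # The proxy joins the cluster through the given server.
    return ['rethinkdb', 'proxy', '--join', f'{cluster_host}:{cluster_port}']


# Hypothetical host name of a server already in the cluster.
command = proxy_command('db1.example.com')
```

The resulting list could be passed to `subprocess.Popen` on the application host, so that clients connect to the local proxy rather than directly to a database server.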