Replica placement strategy for SolrCloud

Objective

Most cloud-based systems allow users to specify rules for how the replicas/nodes of a cluster are allocated. Solr should have a flexible mechanism through which we can control the allocation of replicas, and later change it to suit the needs of the system.

All configuration is on a per-collection basis. The rules are applied whenever a replica is created in any of the shards of a given collection during:

collection creation

shard splitting

add replica

CREATESHARD

There are two aspects to how replicas are placed: the snitch and the rules.

Snitch

A snitch identifies the tags of nodes. Snitches are configured through the collection create command with the snitch parameter, e.g. snitch=EC2Snitch or snitch=class:EC2Snitch

ImplicitSnitch

This is shipped by default with Solr; the user does not need to specify ImplicitSnitch in the configuration. If tags known to ImplicitSnitch are present in the rules, it is used automatically.

Tags provided by ImplicitSnitch:

cores : number of cores on the node

disk : disk space available on the node

host : host name of the node

node : node name

D.* : values available from system properties. D.key refers to a value passed to the node as -Dkey=keyValue during node startup. It is possible to use rules like D.key:expectedVal,shard:*
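To make the tag set concrete, here is a small Python sketch of how an ImplicitSnitch-style provider could assemble these tags for the local node. The port, the disk path, and the sysprops dict are illustrative assumptions; Solr's actual snitch is implemented in Java.

```python
import shutil
import socket
from multiprocessing import cpu_count

def implicit_tags(sysprops=None):
    """Assemble the tag map an ImplicitSnitch-style provider could report
    for the local node (illustrative sketch, not Solr's implementation)."""
    sysprops = sysprops or {}
    host = socket.gethostname()
    tags = {
        "cores": cpu_count(),                          # number of cores on the node
        "disk": shutil.disk_usage("/").free // 2**30,  # free disk space in GB
        "host": host,                                  # host name of the node
        "node": f"{host}:8983_solr",                   # Solr-style node name (port assumed)
    }
    # D.* tags: values passed to the node as -Dkey=value at startup
    for key, val in sysprops.items():
        tags[f"D.{key}"] = val
    return tags

print(implicit_tags({"rack": "738"}))
```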

Rules

A rule tells how many replicas of a given shard need to be assigned to nodes with the given key-value pairs. The rules are passed to the collection CREATE API as a multi-valued parameter "rule", and the values are saved in the state of the collection.

A rule is specified in a pseudo-JSON syntax: a map of keys and values. Each collection can have any number of rules; as long as the rules do not conflict with each other they are accepted, otherwise an error is thrown.

In each rule, shard and replica can be omitted.

The default value of replica is *, which means ANY; alternatively, you can specify a count with an operand such as < (less than) or > (greater than).

The value of shard can be a shard name, or * meaning EACH, or ** meaning ANY. The default value is ** (ANY).

A rule should have exactly one condition other than shard and replica.

All keys other than shard and replica are called tags. Tags are values provided by the snitch for each node.

By default, certain tags such as node, host, and port are provided by the system implicitly.
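The structure above can be sketched as a small parser that applies the documented defaults. This is a hedged Python illustration, not Solr's actual implementation:

```python
def parse_rule(rule):
    """Parse a rule like 'rack:*,shard:*,replica:<2' into a dict,
    applying the documented defaults (illustrative sketch)."""
    conditions = {}
    for pair in rule.split(","):
        key, _, value = pair.partition(":")  # split on the first ':' only
        conditions[key.strip()] = value.strip()
    conditions.setdefault("replica", "*")    # default: ANY number of replicas
    conditions.setdefault("shard", "**")     # default: ANY shard
    tags = [k for k in conditions if k not in ("shard", "replica")]
    if len(tags) != 1:
        raise ValueError("a rule needs exactly one tag condition besides shard and replica")
    return conditions

print(parse_rule("rack:*,replica:<3"))
# {'rack': '*', 'replica': '<3', 'shard': '**'}
```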

How are nodes picked up?

Nodes are not picked at random. The rules are used to first sort the nodes according to affinity. For example, if there is a rule that says disk:100+, nodes with more disk space are given higher preference; if the rule is disk:100-, nodes with less disk space are given priority. If everything else is equal, nodes with fewer cores are given higher priority.
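The affinity sort can be illustrated with a short Python sketch for a disk:100+ style rule (the node dicts and field names are hypothetical):

```python
def sort_nodes(nodes):
    """Sort candidate nodes by affinity: more free disk first,
    then fewer cores as the tie-breaker (sketch of the preference above)."""
    return sorted(nodes, key=lambda n: (-n["disk"], n["cores"]))

nodes = [
    {"node": "n1", "disk": 80,  "cores": 4},
    {"node": "n2", "disk": 120, "cores": 8},
    {"node": "n3", "disk": 120, "cores": 2},
]
# n2 and n3 tie on disk, so n3 wins on fewer cores
print([n["node"] for n in sort_nodes(nodes)])  # ['n3', 'n2', 'n1']
```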

Fuzzy match

Fuzzy matching is applied when strict matches fail. A value can be suffixed with ~ to specify fuzziness.

example rule

#Example requirement: "use at most one replica of a shard in a rack if possible; if no matches are found, relax the rule".
rack:*,shard:*,replica:<2~
#Another example: assign all replicas to nodes with disk space of 100GB or more, or relax the rule if that is not possible. If no node with a 100GB disk exists, nodes are picked in order of disk size, e.g. an 85GB node would be picked over an 80GB node.
disk:>100~
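A sketch of how fuzzy relaxation might behave for a disk:>100~ style rule (the function and field names are illustrative, not Solr's API):

```python
def pick_nodes(nodes, min_disk, fuzzy=True):
    """Prefer nodes that strictly satisfy disk > min_disk; if none do and
    the rule is fuzzy (~), relax it and fall back to the largest disks."""
    matching = [n for n in nodes if n["disk"] > min_disk]
    if matching or not fuzzy:
        return matching
    # relaxed: no node meets the threshold, so take them in order of disk size
    return sorted(nodes, key=lambda n: -n["disk"])

nodes = [{"node": "a", "disk": 85}, {"node": "b", "disk": 80}]
# no node has >100GB, so the fuzzy rule relaxes to the 85GB node first
print([n["node"] for n in pick_nodes(nodes, 100)])  # ['a', 'b']
```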

Examples:

#in each rack there can be a max of two replicas of EACH shard
rack:*,shard:*,replica:<3
#in each rack there can be a max of two replicas of ANY shard
rack:*,shard:**,replica:<3
rack:*,replica:<3
#in each node there should be a max of one replica of EACH shard
node:*,shard:*,replica:<2
#in each node there should be a max of one replica of ANY shard
node:*,shard:**,replica:<2
node:*,replica:<2
#In rack 738, there can be no replica of shard1 (a max of zero)
rack:738,shard:shard1,replica:<1
#All replicas of shard1 should go to rack 730
shard:shard1,replica:*,rack:730
shard:shard1,rack:730
#all replicas must be created on a node with more than 20GB of disk
replica:*,shard:*,disk:>20
replica:*,disk:>20
disk:>20
#All replicas should be created on nodes with fewer than 5 cores
#(here ANY and EACH have the same meaning for shard)
replica:*,shard:**,cores:<5
replica:*,cores:<5
cores:<5
#one replica of shard1 must go to node 192.168.1.2:8080_solr
node:"192.168.1.2:8080_solr",shard:shard1,replica:1
#No replica of shard1 should go to rack 738
rack:!738,shard:shard1,replica:*
rack:!738,shard:shard1
#No replica of ANY shard should go to rack 738
rack:!738,shard:**,replica:*
rack:!738,shard:*
rack:!738
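The operators used in the examples above (<, >, !, wildcards, and exact match) can be sketched as a single predicate. This is an illustrative Python version, not Solr's rule engine:

```python
def matches(cond, value):
    """Check one tag condition from a rule against a node's tag value.
    Supports the operators used in the examples above."""
    if cond in ("*", "**"):      # wildcard: any value matches
        return True
    if cond.startswith("!"):     # negation: value must differ
        return str(value) != cond[1:]
    if cond.startswith("<"):     # numeric less-than
        return float(value) < float(cond[1:])
    if cond.startswith(">"):     # numeric greater-than
        return float(value) > float(cond[1:])
    return str(value) == cond    # exact match

print(matches(">20", 50), matches("!738", "738"), matches("738", "738"))
# True False True
```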

In the collection create API, all placement rules are provided as a multi-valued parameter called rule.
example:
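For instance, a create call might look like the following. The collection name, shard/replica counts, and the rules themselves are illustrative, and rule values would need URL-encoding in a real request:

```
http://localhost:8983/solr/admin/collections?action=CREATE&name=coll1&numShards=2&replicationFactor=2&rule=shard:*,replica:<2,node:*&rule=disk:>20
```

Each occurrence of the rule parameter adds one rule to the collection's configuration.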