Node repair is the process that the database uses to make sure data in every replica is consistent with other replicas.

Data distribution overview

In DataStax Enterprise (DSE), the total amount of data managed by the cluster is
represented as a ring with nodes.

In DataStax Enterprise (DSE), the total amount of data managed by the cluster is represented
as a ring. The ring is divided into ranges equal to the number of nodes, with each node being
responsible for one or more ranges of the data. Before a node can join the ring, it must be
assigned a token. The token value determines the node's position in the ring and its range of
data. Table data is partitioned across the nodes based on the row key.

To determine the node where the first replica of a row will live, DSE locates the node with a
token value greater than that of the row key. Each node is responsible for the region of the
ring between the node itself and its predecessor, excluding the predecessor's token value. With
the nodes sorted in token order, the last node is considered the predecessor of the first
node.

For example, consider a simple four-node cluster where all of the row keys managed by the
cluster were numbers in the range of 0 to 100. Each node is assigned a token that represents a
point in this range. In this simple example, the token values are 0, 25, 50, and 75. The first
node with token 0 is responsible for the wrapping range (76-0). The node with the lowest token
also accepts row keys less than the lowest token and more than the highest token.

Figure 1: Four-node cluster

Token number

DSE provides several types of partitioners, which
should be assigned to your cluster. An initial_token value
is assigned to each node so that each node is responsible for roughly an equal amount of
data.