Configuring data consistency

Consistency levels in Cassandra can be configured to manage availability versus data accuracy.
You can configure consistency on a cluster, datacenter, or individual I/O operation basis.
Consistency among participating nodes can be set globally and also controlled on a per-operation
basis (for example insert or update) using Cassandra’s drivers and client libraries.

Write consistency levels

The following write consistency levels are described from strongest to weakest.

EACH_QUORUM

A write must be written to the commit log and memtable on a quorum of replica nodes in each datacenter.

Used in multiple-datacenter clusters to strictly maintain consistency at the same
level in each datacenter. For example, choose this level if you want a write to fail when a
datacenter is down and the QUORUM cannot be reached on that datacenter.

QUORUM

A write must be written to the commit log and memtable on a quorum of replica nodes across all datacenters.

Provides strong consistency if you can tolerate some level of failure.

LOCAL_QUORUM

Strong consistency. A write must be written to the commit log and
memtable on a quorum of replica nodes in the same datacenter as the coordinator. Avoids latency of inter-datacenter
communication.

Used in multiple datacenter clusters with a rack-aware replica placement
strategy, such as NetworkTopologyStrategy, and a properly configured snitch. Use to maintain
consistency locally (within the single datacenter). Can be used with SimpleStrategy.

LOCAL_ONE

A write must be sent to, and successfully acknowledged by, at least one replica node
in the local datacenter.

In multiple-datacenter clusters, a consistency level of ONE is
often desirable, but cross-datacenter traffic is not. LOCAL_ONE accomplishes this.
For security and quality reasons, you can use this consistency level in an offline
datacenter to prevent automatic connection to online nodes in other datacenters if an
offline node goes down.

ANY

A write must be written to at least one node. If all replica nodes for the given
partition key are down, the write can still succeed after a hinted handoff has been written. If all replica nodes are down at write time, an
ANY write is not readable until the replica nodes for that partition have
recovered.

Provides low latency and a guarantee that a write never fails. Delivers the lowest
consistency and highest availability.

SERIAL

Achieves linearizable consistency for lightweight transactions by preventing unconditional updates.

You cannot configure this level as a normal consistency level (set at the
driver level using the consistency level field). Instead, you configure it using the
serial consistency field as part of the native protocol operation. See the failure scenarios below.
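For example, with the DataStax Python driver the serial consistency field is set on a statement separately from the normal consistency level (a sketch; the table, column names, and condition are hypothetical):

```python
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# A lightweight transaction: the IF clause makes the update conditional.
# serial_consistency_level governs the Paxos phase of the operation;
# consistency_level governs the commit of the resulting write.
stmt = SimpleStatement(
    "UPDATE users SET name = %s WHERE user_id = %s IF name = %s",
    consistency_level=ConsistencyLevel.QUORUM,
    serial_consistency_level=ConsistencyLevel.SERIAL,
)
```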

LOCAL_SERIAL

Same as SERIAL but confined to the datacenter. A write must be written conditionally
to the commit log and
memtable on a quorum of replica nodes in the same datacenter.

Same as SERIAL. Used for disaster recovery. See failure scenarios.

SERIAL and LOCAL_SERIAL write failure scenarios

If one of three nodes is down, the Paxos commit fails under the following conditions:

- CQL query-configured consistency level of ALL
- Driver-configured serial consistency level of SERIAL
- Replication factor of 3

A WriteTimeout with a WriteType of CAS occurs and further reads do not see the write. If the
node goes down in the middle of the operation instead of before the operation started, the write
is committed, the value is written to the live nodes, and a WriteTimeout with a WriteType of
SIMPLE occurs.

Under the same conditions, if two of the nodes are down at the beginning of the operation, the
Paxos commit fails and nothing is committed. If the two nodes go down after the Paxos proposal
is accepted, the write is committed to the remaining live nodes and written there, but a
WriteTimeout with WriteType SIMPLE is returned.

Read consistency levels

ALL

Returns the record after all replicas have responded. The read operation fails if
a replica does not respond.

Provides the highest consistency of all levels and the lowest availability of all
levels.

EACH_QUORUM

Not supported for reads.

QUORUM

Returns the record after a quorum of replicas has responded from any datacenter.

Ensures strong consistency if you can tolerate some level of failure.

LOCAL_QUORUM

Returns the record after a quorum of replicas in the same datacenter as the coordinator
node has responded. Avoids latency of inter-datacenter communication.

Used in multiple datacenter clusters with a rack-aware replica placement strategy (
NetworkTopologyStrategy) and a properly configured snitch. Fails when
using SimpleStrategy.

ONE

Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other
replicas consistent.

Provides the highest availability of all the levels if you can tolerate a
comparatively high probability of stale data being read. The replicas contacted for reads
may not always have the most recent write.

TWO

Returns the most recent data from two of the closest replicas.

Similar to ONE.

THREE

Returns the most recent data from three of the closest replicas.

Similar to TWO.

LOCAL_ONE

Returns a response from the closest replica in the local datacenter.

Same usage as described for the write consistency levels.

SERIAL

Allows reading the current (and possibly uncommitted) state of
data without proposing a new addition or update. If a SERIAL read finds an
uncommitted transaction in progress, it will commit the transaction as part of the read.
Similar to QUORUM.

To read the latest value of a column after a user has invoked a lightweight transaction to write to the column, use
SERIAL. Cassandra then checks the inflight lightweight transaction for
updates and, if found, returns the latest data.

LOCAL_SERIAL

Same as SERIAL, but confined to the datacenter. Similar to
LOCAL_QUORUM.

Using a replication factor of 3, a quorum is 2 nodes. The cluster can tolerate 1 replica
down.

Using a replication factor of 6, a quorum is 4. The cluster can tolerate 2 replicas
down.

In a two datacenter cluster where each datacenter has a replication factor of 3, a quorum
is 4 nodes. The cluster can tolerate 2 replica nodes down.

In a five datacenter cluster where two datacenters have a replication factor of 3 and
three datacenters have a replication factor of 2, a quorum is 7 nodes.

The more datacenters a cluster has, the more replica nodes must respond for a successful
operation.
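The examples above all follow the standard quorum formula: quorum is the sum of the replication factors across datacenters, divided by two and rounded down, plus one. A minimal sketch:

```python
def quorum(replication_factors):
    """quorum = (sum of replication factors across datacenters) // 2 + 1."""
    return sum(replication_factors) // 2 + 1

# Single datacenter, RF 3: quorum is 2 (tolerates 1 replica down).
print(quorum([3]))              # 2
# Single datacenter, RF 6: quorum is 4 (tolerates 2 replicas down).
print(quorum([6]))              # 4
# Two datacenters, RF 3 each: quorum is 4.
print(quorum([3, 3]))           # 4
# Five datacenters: two with RF 3, three with RF 2: quorum is 7.
print(quorum([3, 3, 2, 2, 2]))  # 7
```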

If consistency is a top priority, you can ensure that a read always reflects the most recent
write by using the following formula:

(nodes_written + nodes_read) > replication_factor

For example, if your application is using the QUORUM consistency level for
both write and read operations and you are using a replication factor of 3, then this ensures
that 2 nodes are always written and 2 nodes are always read. The combination of nodes written
and read (4) being greater than the replication factor (3) ensures strong read consistency.
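The formula can be checked mechanically; a small sketch (the function name is my own):

```python
def strong_read_consistency(nodes_written, nodes_read, replication_factor):
    """A read sees the most recent write when the write and read replica
    sets are forced to overlap: written + read > replication factor."""
    return nodes_written + nodes_read > replication_factor

# QUORUM writes and reads with RF 3: 2 written + 2 read = 4 > 3, so strong.
print(strong_read_consistency(2, 2, 3))  # True
# ONE writes and ONE reads with RF 3: 1 + 1 = 2, not > 3, so stale reads are possible.
print(strong_read_consistency(1, 1, 3))  # False
```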

Similar to QUORUM, the LOCAL_QUORUM level is calculated
based on the replication factor of the same datacenter as the coordinator node. That is, even
if the cluster has more than one datacenter, the quorum is calculated only with local replica
nodes.

In EACH_QUORUM, every datacenter in the cluster must reach a quorum based on
that datacenter's replication factor in order for the read or write request to succeed. That
is, for every datacenter in the cluster a quorum of replica nodes must respond to the
coordinator node in order for the read or write request to succeed.
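In other words, EACH_QUORUM requires a local quorum in every datacenter independently, rather than one quorum computed across the whole cluster. A sketch (the function name is my own):

```python
def each_quorum_responders(rf_by_dc):
    """Replica responses required per datacenter under EACH_QUORUM:
    a local quorum (rf // 2 + 1) in every datacenter."""
    return {dc: rf // 2 + 1 for dc, rf in rf_by_dc.items()}

# Two datacenters with RF 3 each: every datacenter needs 2 responders.
print(each_quorum_responders({"dc1": 3, "dc2": 3}))  # {'dc1': 2, 'dc2': 2}
```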

Configuring client consistency levels

Use the cqlsh CONSISTENCY command to set the consistency level for queries from the
current cqlsh session. The WITH CONSISTENCY clause has been removed from
CQL commands. Alternatively, set the consistency level programmatically at the driver level. For
example, with the Java driver, call QueryBuilder.insertInto with a
setConsistencyLevel argument. The consistency level defaults to
ONE for all write and read operations.
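As an illustration, the same per-operation override can be sketched with the DataStax Python driver (the contact point, keyspace, table, and values below are hypothetical):

```python
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# Contact point and keyspace are placeholders for a real deployment.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")

# Override the default consistency level (ONE) for this write only.
insert = SimpleStatement(
    "INSERT INTO users (user_id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
session.execute(insert, (1, "alice"))
```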