Apache Cassandra™ 1.2

About hinted handoff writes

How hinted handoff works and how it optimizes the cluster.

Hinted handoff is a Cassandra feature that ensures high write availability when consistency
is not required. Hinted handoff dramatically improves response consistency after temporary
outages such as network failures. You enable or disable hinted handoff using the
hinted_handoff_enabled property in the cassandra.yaml file.

When a write is performed and a replica node for the row is either known to be down ahead
of time, or does not respond to the write request, the coordinator will store a hint locally
in the system.hints table. This hint indicates that the write needs to be replayed to the
unavailable node(s).

The hint consists of:

The location of the replica that is down

The row that requires a replay

The actual data being written
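
The three parts of a hint can be pictured as a small record. The following Python sketch is
illustrative only: the class name, field names, and sample values are hypothetical and do
not reflect the actual layout of the system.hints table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hint:
    """Illustrative model of a hint; all names here are hypothetical."""
    target_replica: str  # location of the replica that is down
    row_key: bytes       # the row that requires a replay
    mutation: bytes      # the actual data being written


# A coordinator would record one such hint per unavailable replica.
hint = Hint(target_replica="10.0.0.2", row_key=b"K", mutation=b"col=value")
```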

By default, hints are saved for three hours after a replica fails because if the replica
is down longer than that, it is likely permanently dead. In this case, run a repair to
re-replicate the data that the replica held before the failure occurred. You can configure
this interval of time using the max_hint_window_in_ms property in the cassandra.yaml file.
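
As a rough illustration of the hint window, the following Python sketch converts the
three-hour default to milliseconds and decides whether a hint would still be saved. The
function name is hypothetical, and the comparison is a simplified model rather than
Cassandra's actual logic.

```python
# Three hours expressed in milliseconds, the unit used by max_hint_window_in_ms.
DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def should_store_hint(down_for_ms, max_hint_window_ms=DEFAULT_MAX_HINT_WINDOW_MS):
    """Return True while the replica has been down no longer than the window.

    Simplified model: once the window is exceeded, no new hints are saved
    and a repair is needed to bring the replica up to date.
    """
    return down_for_ms <= max_hint_window_ms


ten_minutes_ms = 10 * 60 * 1000
four_hours_ms = 4 * 60 * 60 * 1000
still_hinting = should_store_hint(ten_minutes_ms)   # within the window
past_window = should_store_hint(four_hours_ms)      # beyond the window
```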

After a node discovers from gossip that a node for which it holds hints has recovered, the
node sends the data row corresponding to each hint to the target. Additionally, the node
checks every ten minutes for any hints for writes that timed out during an outage too brief
for the failure detector to notice through gossip.

A hinted write does not count toward consistency
level requirements of ONE, QUORUM, or ALL. The coordinator node stores hints for
dead replicas regardless of consistency level unless hinted handoff is disabled. If
insufficient replica targets are alive to satisfy a requested consistency level, an
UnavailableException is thrown with or without hinted handoff. This is an important
difference from Dynamo’s replication model; Cassandra does not
default to sloppy quorum.
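
The availability rule above can be sketched in Python as follows. This is a simplified
model under stated assumptions: the function names are hypothetical, the exception class is
a local stand-in, and only live replicas are counted, never hints.

```python
class UnavailableException(Exception):
    """Local stand-in for the exception Cassandra raises (illustrative)."""


def required_replicas(consistency_level, replication_factor):
    """Number of replicas that must acknowledge a write at this level."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + consistency_level)


def check_write_availability(consistency_level, replication_factor, live_replicas):
    """Raise if too few replicas are alive; stored hints do not count."""
    needed = required_replicas(consistency_level, replication_factor)
    if live_replicas < needed:
        raise UnavailableException(
            "need %d live replicas, have %d" % (needed, live_replicas))
    return needed
```

For example, with a replication factor of 3 and only one live replica, a write at ONE
passes this check while a write at QUORUM raises the exception, whether or not hinted
handoff is enabled.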

For example, in a cluster of two nodes, A and B, having a replication factor (RF) of 1,
each row is stored on only one node. Suppose node A is down while we write row K to it with
a consistency level of ONE. The write fails because reads always reflect the most recent
write when:

W + R > replication factor

where W is the number of nodes to block for writes and R is the number of nodes to block
for reads. Cassandra does not write a hint to B and call the write good, because the data
cannot be read at any consistency level until A comes back up and B forwards the data
to A. For more information about how hinted handoff works, see "Modern hinted handoff" by Jonathan Ellis.
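
The inequality can be checked with a few lines of Python. This is a minimal sketch with a
hypothetical function name. Treating a hint stored on B as zero acknowledging replicas
(W = 0) is an interpretation of the example above, not a statement about Cassandra
internals.

```python
def reads_see_latest_write(w, r, replication_factor):
    """True when the write set and read set must overlap in at least one replica."""
    return w + r > replication_factor


# RF = 3 with QUORUM writes and QUORUM reads: 2 + 2 > 3.
quorum_case = reads_see_latest_write(2, 2, 3)

# RF = 3 with ONE for both writes and reads: 1 + 1 is not greater than 3.
one_case = reads_see_latest_write(1, 1, 3)

# The two-node example: RF = 1 and the only replica (A) is down, so a hint
# held by B means no real replica acknowledged the write (W = 0).
hint_only_case = reads_see_latest_write(0, 1, 1)
```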

For applications that want Cassandra to accept writes even when all the normal replicas are
down, when not even consistency level ONE can be satisfied, Cassandra provides consistency
level ANY. ANY guarantees that the write is durable and will be readable after an
appropriate replica target becomes available and receives the hint replay.
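
The difference between ONE and ANY when every replica is down can be sketched as follows.
The function and the outcome strings are hypothetical and only model the behavior described
above.

```python
def write_outcome(consistency_level, live_replicas, hint_stored):
    """Simplified outcome of a write requested at ONE or ANY."""
    if live_replicas >= 1:
        # At least one real replica can acknowledge the write.
        return "acknowledged by replica"
    if consistency_level == "ANY" and hint_stored:
        # Durable, but not readable until the hint is replayed to a replica.
        return "accepted via hint"
    return "unavailable"


all_down_at_one = write_outcome("ONE", live_replicas=0, hint_stored=True)
all_down_at_any = write_outcome("ANY", live_replicas=0, hint_stored=True)
```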

By design, hinted handoff inherently forces Cassandra to continue performing the same
number of writes even when the cluster is operating at reduced capacity. Pushing your
cluster to maximum capacity with no allowance for failures is a bad idea. Hinted handoff is
designed to minimize the extra load on the cluster.

All hints for a given replica are stored under a single partition key, so replaying hints
is a simple sequential read with minimal performance impact.
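
To illustrate why grouping hints by target keeps replay cheap, the following Python sketch
stores hints in a dictionary keyed by the target replica and replays them in order. The
class and method names are hypothetical, and an in-memory dictionary stands in for the
system.hints table.

```python
from collections import defaultdict


class HintStore:
    """Illustrative in-memory store with one list of hints per target replica."""

    def __init__(self):
        self._by_target = defaultdict(list)

    def add(self, target, row_key, mutation):
        """Append a hint under the key of the replica it is destined for."""
        self._by_target[target].append((row_key, mutation))

    def replay(self, target, send):
        """Send every hint for one target in stored order, then discard them."""
        hints = self._by_target.pop(target, [])
        for row_key, mutation in hints:
            send(target, row_key, mutation)
        return len(hints)


store = HintStore()
store.add("node-a", b"K1", b"v1")
store.add("node-a", b"K2", b"v2")
store.add("node-b", b"K3", b"v3")

sent = []
replayed = store.replay("node-a", lambda t, k, m: sent.append((t, k, m)))
```

Replaying for node-a touches only the entries stored under that key and leaves the hints
for node-b untouched.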

If a replica node is overloaded or unavailable, and the failure detector has not yet
marked it down, then expect most or all writes to that node to fail after the timeout
triggered by write_request_timeout_in_ms, which defaults to 10 seconds. Cassandra writes
the hint when the timeout is reached.

If this happens on many nodes at once, it can put substantial memory pressure on the
coordinator. The coordinator therefore tracks how many hints it is currently writing, and
if this number gets too high, it temporarily refuses writes (with UnavailableException)
whose replicas include the misbehaving nodes.
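
The back-pressure idea can be sketched as a simple counter. This is a minimal illustration,
not Cassandra's implementation: the class, its methods, and the max_in_flight threshold are
all hypothetical.

```python
class HintBackpressure:
    """Illustrative counter of hints a coordinator is currently writing."""

    def __init__(self, max_in_flight):
        # max_in_flight is a hypothetical threshold, not a Cassandra setting.
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_begin_hint(self):
        """Return False (refuse the write) when too many hints are in progress."""
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def end_hint(self):
        """Mark one hint write as finished, never dropping below zero."""
        self.in_flight = max(0, self.in_flight - 1)


limiter = HintBackpressure(max_in_flight=2)
first = limiter.try_begin_hint()
second = limiter.try_begin_hint()
third = limiter.try_begin_hint()   # refused: the limit of 2 is reached
limiter.end_hint()
fourth = limiter.try_begin_hint()  # accepted again after one hint finished
```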

When removing a node from the cluster by decommissioning the node or by using the nodetool removenode command, Cassandra
automatically removes hints targeting the node that no longer exists. Cassandra also removes
hints for dropped tables.

At first glance, it may appear that hinted handoff lets you safely get away without
needing repair. This is only true if you never have hardware failure. Hardware failure has
the following ramifications:

Loss of historical data about which writes have finished. There is no information about
what data has gone missing to convey to the rest of the cluster.

Loss of hints-not-yet-replayed from requests that the failed node coordinated.