Notes on 'Chord: A Scalable Peer-to-peer Lookup Service'

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications was
written by Ion Stoica, Robert Morris (of Morris worm, YC and xv6 fame), et al at
PDOS MIT in 2001. That department is also responsible for the distributed
systems class at MIT.

Brief notes taken using free recall and are subject to errors.

The paper

Chord is a protocol for looking up nodes by a key in a decentralized peer to
peer system. Each node can then have some arbitrary value, but that is outside
of the scope of the protocol. It provides a single key operation: lookup(k)
which provides the location of a node responsible for that key k. It can thus be
a key component of a distributed hash table.

It does this without using any special nodes, unlike previous peer-to-peer
systems such as Napster or DNS. DNS, for example, comes with a set of special
root servers pre-programmed. This makes the Chord protocol more decentralized
and resiliant to failure.

Chord uses consistent hashing, which means the values of keys are evenly
distributed despite nodes joining and leaving the system. One key innovation is
that the consistent hashing in a distributed hash table doesn’t require a node
to know about all other nodes. Instead, each node only knows about a select set
of other nodes. This way of implementing routing means that it only takes on the
order of O(log n) oeprations to find the value you are looking for, despite
nodes joining and leaving.

The network topology can be thought of as a ring of nodes. If we lookup a key
at a node and it doesn’t have the value, the node can point us in the right
direction. It does this using a finger table which includes the address of other
nodes. By using some nifty modular artihmetic this ensures we will always get
closer to the actual value.

Questions

How can you convince yourself that “inconsistent” hashing doesn’t distribute
values evenly as the size of the hash table changes?

How do we guarantee that values don’t get lost?

As nodes silently leave the network, how do other nodes update their routing tables?