Red-black trees

Red-black trees are self-balancing binary search trees in which every node has
one of two colors: red or black.

Red-black trees obey two additional invariants:

Any path from the root to a leaf has the same number of black nodes.

All red nodes have two black children.

Leaf nodes, which do not carry values, are considered black for the purposes
of both height and coloring.

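In any tree that obeys these conditions, the longest root-to-leaf
path is no more than double the shortest root-to-leaf path:
the shortest possible path is all black, while the longest alternates red and
black nodes and contains the same number of black nodes.
This bound on path length guarantees logarithmic-time reads, insertions and deletions.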

Examples

The following is a valid red-black tree:

Both of the following are invalid red-black representations of the set {1,2,3}:

The following are valid representations of the set {1,2,3}:
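Examples like these can be checked mechanically. The following is a minimal sketch, not the article's own code: the structures T (color, left, key, right) and L, and the procedures black-height, red? and valid?, are names assumed for illustration. It checks only the two color invariants, not the search-tree ordering, and the sample trees are illustrative choices for the set {1,2,3}.

```racket
#lang racket

;; Assumed representation: (T color left key right) for internal nodes,
;; (L) for the valueless black leaves. Colors are the symbols 'R and 'B.
(struct T (color left key right) #:transparent)
(struct L () #:transparent)

(define (red? t) (and (T? t) (eq? (T-color t) 'R)))

;; Returns the black height of t if both invariants hold, otherwise #f.
(define (black-height t)
  (match t
    [(L) 1]                                   ; leaves count as black
    [(T c l _ r)
     (let ([hl (black-height l)]
           [hr (black-height r)])
       (cond
         [(not (and hl hr (= hl hr))) #f]     ; unequal black counts below
         [(and (eq? c 'R) (or (red? l) (red? r))) #f] ; red node, red child
         [else (+ hl (if (eq? c 'B) 1 0))]))]))

(define (valid? t) (and (black-height t) #t))

;; Two valid shapes for {1,2,3}.
(define valid-red-children
  (T 'B (T 'R (L) 1 (L)) 2 (T 'R (L) 3 (L))))
(define valid-black-children
  (T 'B (T 'B (L) 1 (L)) 2 (T 'B (L) 3 (L))))

;; Two invalid shapes: a black chain (unequal black counts) and a
;; red node with a red child.
(define invalid-chain
  (T 'B (L) 1 (T 'B (L) 2 (T 'B (L) 3 (L)))))
(define invalid-red-red
  (T 'B (L) 1 (T 'R (L) 2 (T 'R (L) 3 (L)))))
```

Calling (valid? valid-red-children) accepts the first tree, while (valid? invalid-chain) and (valid? invalid-red-red) reject theirs, each for a different invariant.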

Delete: A high-level summary

There are many easy cases in red-black deletion: cases where the change is
local and doesn't require rebalancing or (much) recoloring.

The only hard case ends up being the removal of a black node with no
children, since it alters the black height of every path through that node.

Fortunately, it's easy to break apart this case into three
phases, each of which is conceptually simple and straightforward to implement.

The trick is to add two temporary colors: double black and negative black.

The three phases are then removing, bubbling and balancing:

Removing becomes simple once the color double-black is available: the hard
case reduces to changing the target node into a double-black leaf.
A double-black node counts twice for black height, which allows the
black-height invariant to be preserved.

Bubbling tries to eliminate the double black just created by a removal.
Sometimes, it's possible to eliminate a double-black by recoloring its parent and its sibling.
If that's not possible, then the double-black gets "bubbled up" to its parent.
To do so, it might be necessary to recolor the double black's (red) sibling to negative black.

Balancing eliminates double blacks and negative blacks at the same time.
Okasaki's red-black algorithms use a rebalancing procedure. It's possible to
generalize this rebalancing procedure with two new cases so that it can
reliably eliminate double blacks and negative blacks.

Red-black trees in Racket

My implementation of red-black trees is actually an implementation
of red-black maps:

Every sorted-map has a comparison function on keys.
Each internal node (T) has
a color, a left sub-tree, a key, a value and a right sub-tree.
There are also black leaf nodes (L) and double-black leaf nodes (LBB).
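As a rough sketch of the data layout just described (an assumption about its shape, not a copy of the original definitions), the node types can be written as sub-structures of sorted-map so that each one carries the comparison function. The convention that the comparison returns a negative, zero or positive number, and the helpers number-compare and lookup, are hypothetical and exist only to exercise the structures.

```racket
#lang racket

;; Assumed layout: every map carries its key-comparison function;
;; T, L and LBB are sub-structures of sorted-map.
(struct sorted-map (compare))
(struct T sorted-map (color left key value right)) ; internal node
(struct L sorted-map ())                           ; black leaf
(struct LBB sorted-map ())                         ; double-black leaf

;; Hypothetical comparison convention: negative, zero or positive.
(define (number-compare a b) (- a b))

;; Look up a key; returns #f when the key is absent.
(define (lookup m key)
  (cond
    [(T? m)
     (let ([order ((sorted-map-compare m) key (T-key m))])
       (cond
         [(negative? order) (lookup (T-left m) key)]
         [(positive? order) (lookup (T-right m) key)]
         [else (T-value m)]))]
    [else #f]))                        ; L or LBB: nothing is stored here

;; A small map {1 -> "one", 2 -> "two", 3 -> "three"}.
(define leaf (L number-compare))
(define example
  (T number-compare 'B
     (T number-compare 'R leaf 1 "one" leaf)
     2 "two"
     (T number-compare 'R leaf 3 "three" leaf)))
```

Storing the comparison function in the parent structure means any node, including a leaf, can be handed to an operation without passing the ordering separately.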

To further condense cases,
the implementation also uses color arithmetic.
For instance, adding a black to a black yields a double-black.
Subtracting a black from a black yields a red.
Subtracting a black from a red yields a negative black.
In Racket:
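A minimal sketch of this arithmetic, assuming colors are encoded as the symbols '-B (negative black), 'R, 'B and 'BB (double black). The names black+1 and black-1 are chosen here for illustration.

```racket
#lang racket

;; Add one unit of black to a color.
(define/match (black+1 color)
  [('-B) 'R]
  [('R)  'B]
  [('B)  'BB])

;; Subtract one unit of black from a color.
(define/match (black-1 color)
  [('BB) 'B]
  [('B)  'R]
  [('R)  '-B])
```

In this sketch, going past either end of the scale (adding to 'BB or subtracting from '-B) is deliberately left undefined and raises a match error.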

(define/match and switch-compare are macros to make the code more compact and readable.)

Because deletion could produce a double-black node, the
procedure bubble gets invoked to move it upward.
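The bubbling step can be sketched with the color arithmetic: if either child is double black, subtract a black from both children and add one to the parent. This is a simplified illustration that omits the comparison function and the balancing pass that would normally follow, so its result may still contain a negative black or a red node with a red child. The helper names double-black? and lighten are assumptions.

```racket
#lang racket

(struct T (color left key value right) #:transparent)
(struct L () #:transparent)     ; black leaf
(struct LBB () #:transparent)   ; double-black leaf

(define/match (black+1 color)
  [('-B) 'R] [('R) 'B] [('B) 'BB])
(define/match (black-1 color)
  [('BB) 'B] [('B) 'R] [('R) '-B])

(define (double-black? t)
  (or (LBB? t) (and (T? t) (eq? (T-color t) 'BB))))

;; Subtract one black from the root of a sub-tree (hypothetical helper).
;; A plain black leaf cannot be lightened; it never appears as the
;; sibling of a double black in a tree with consistent black heights.
(define (lighten t)
  (match t
    [(LBB) (L)]
    [(T c l k v r) (T (black-1 c) l k v r)]))

;; Build a node, pushing a double black from a child up into the parent.
;; The black count along every path through the node stays the same.
(define (bubble c l k v r)
  (if (or (double-black? l) (double-black? r))
      (T (black+1 c) (lighten l) k v (lighten r))
      (T c l k v r)))

;; Red parent, black sibling: the double black disappears.
(define absorbed
  (bubble 'R (LBB) 2 "b" (T 'B (L) 3 "c" (L))))

;; Black parent, red sibling: the parent becomes double black and the
;; sibling becomes negative black.
(define pushed-up
  (bubble 'B (LBB) 2 "b"
          (T 'R (T 'B (L) 3 "c" (L)) 4 "d" (T 'B (L) 5 "e" (L)))))
```

In the first example the parent turns black and the sibling red, so no temporary color remains; in the second, both temporary colors appear and are left for balancing to remove.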

Removal

The remove procedure breaks removal into several cases:

The cases group according to how many children the target node has.
If the target node has two sub-trees, remove reduces it to the
case where there is at most one sub-tree.

It's easy to turn removal of a node with two children into removal of a
node with at most one child: find the maximum (rightmost) element in its left
(less-than) sub-tree; remove that node instead, and place its value into the
node to be removed.
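The reduction works because the rightmost node of a sub-tree has a leaf as its right child, so it has at most one child. The following sketch shows only the first half of the step, finding the predecessor and overwriting the target with it; the predecessor must still be removed from the left sub-tree using the simpler cases. The procedure names tree-max and overwrite-with-predecessor are assumptions, and the comparison function is omitted for brevity.

```racket
#lang racket

(struct T (color left key value right) #:transparent)
(struct L () #:transparent)

;; Key and value of the maximum (rightmost) node, as a pair.
;; Raises a match error when given a leaf.
(define (tree-max t)
  (match t
    [(T _ _ k v (L)) (cons k v)]       ; no right child: this is the maximum
    [(T _ _ _ _ r)   (tree-max r)]))

;; Overwrite the key and value of a two-child node with those of its
;; in-order predecessor. The color and both sub-trees are unchanged, so
;; the predecessor still has to be removed from the left sub-tree.
(define (overwrite-with-predecessor t)
  (match t
    [(T c l _ _ r)
     (match-let ([(cons k v) (tree-max l)])
       (T c l k v r))]))

(define left-subtree
  (T 'B (L) 1 "a" (T 'R (L) 2 "b" (L))))
(define target
  (T 'B left-subtree 3 "c" (T 'B (L) 4 "d" (L))))
(define overwritten (overwrite-with-predecessor target))
```

Here the node with key 3 takes on key 2 and value "b"; what remains is removing the red node 2, which has only leaves for children.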

For example, removing the blue node (with two children) reduces to removing
the green node (with one) and then overwriting the blue with the green:

If the target node has leaves for children, removal is straightforward:
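A sketch of this case, under the same simplified structures as above (the procedure name remove-childless is hypothetical): a red node with two leaves becomes a plain black leaf, since no black is lost, while a black node with two leaves becomes a double-black leaf so that the black count on that path is preserved.

```racket
#lang racket

(struct T (color left key value right) #:transparent)
(struct L () #:transparent)     ; black leaf
(struct LBB () #:transparent)   ; double-black leaf

;; Remove a node whose children are both leaves.
(define (remove-childless node)
  (match node
    [(T 'R (L) _ _ (L)) (L)]      ; red: no black is lost
    [(T 'B (L) _ _ (L)) (LBB)]))  ; black: keep the count with a double black
```

Any other node shape raises a match error here; those shapes belong to the remaining cases of removal.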