Splay Trees

Extremely elegant and simple -- only one non-trivial operation needs to be implemented.

Delivers $O(\log n)$ time per operation performance too. But only in amortized (as opposed to worst-case) sense!

Is actually optimal in a much more meaningful way (will explain later).

Key design principle: Be lazy! That is, don't work until you must.

Don't try to keep the tree always balanced.

Re-balance only when convenient

$\Rightarrow$ Use the work of searches to re-balance “along the way”.
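Concretely, the one non-trivial operation is splaying: moving an accessed node to the root via (double) rotations, so the work of a search re-balances the tree along the way. A minimal bottom-up sketch in Python (the class and helper names here are ours, purely illustrative):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def _rotate(x):
    """Single rotation: lift x above its parent."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                       # right rotation
        p.left, x.right = x.right, p
        if p.left: p.left.parent = p
    else:                                 # left rotation
        p.right, x.left = x.left, p
        if p.right: p.right.parent = p
    x.parent, p.parent = g, x
    if g is not None:
        if g.left is p: g.left = x
        else: g.right = x

def splay(x):
    """Bring x to the root using zig / zig-zig / zig-zag steps."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            _rotate(x)                    # zig: single rotation at the root
        elif (g.left is p) == (p.left is x):
            _rotate(p); _rotate(x)        # zig-zig: rotate the parent first
        else:
            _rotate(x); _rotate(x)        # zig-zag: rotate x up twice
    return x
```

A `Search` would walk down as in any BST and then call `splay` on the node where it stopped, whether or not the key was found.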

Need a potential function $\Phi$ to facilitate the analysis.

$\Phi$ aims to keep track of the amount of work we “saved” by procrastinating.

Will charge this “saved” work against the “bursts” of work we actually will need to do from time to time. (As we are mostly procrastinating, when we finally have to do some work, it will be a lot of work!).

Cool feature: our data structure does not need to actually keep track of the potential. This potential is implicitly captured by the complexity/”messiness” of our data structure (resulting from our procrastination).

Two important intuitions: path shortening and re-balancing.

Path shortening:

If executing a Search took too long -- fix it.

$\Rightarrow$ When you go down a long path, make it shorter on your way back!

Intuition: shortening of the path releases potential (= makes the data structure more “orderly”). This pays for
long traversal (and for work done to shorten it afterwards).

Analysis

Assign each node $x$ a positive weight $w_x$. These weights are just for the analysis -- they don't affect the implementation.

For now: can think of these weights being all equal to $1$. But later different choices of weights will give us very interesting conclusions.

Define the size $s(x)$ of a node $x$ to be the total weight of the nodes in its subtree, i.e., the “number of nodes” in the subtree (if all weights are equal to $1$).

Define rank $r(x)=\lg s(x)$.

Intuitively: $r(x) \approx$ “best height of subtree at $x$”.

Remember that this is log base $2$ -- will be important to get the constants right.

Our potential function: $\Phi=\sum r(x)$.
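With unit weights, the potential of a concrete tree can be computed straight from these definitions. A small sketch (trees as nested `(left, key, right)` tuples; all names are ours, illustrative only):

```python
import math

# Potential of a BST given as nested (left, key, right) tuples, with None
# for empty subtrees; all weights w_x = 1, so s(x) = #nodes in x's subtree.

def size(t):
    return 0 if t is None else size(t[0]) + 1 + size(t[2])

def rank(t):
    return math.log2(size(t))            # r(x) = lg s(x)

def phi(t):
    if t is None:
        return 0.0
    return rank(t) + phi(t[0]) + phi(t[2])

# A left path on 3 nodes has Phi = lg 3 + lg 2 + lg 1, while the balanced
# 3-node tree has Phi = lg 3 only -- messier trees store more potential.
path = (((None, 1, None), 2, None), 3, None)
balanced = ((None, 1, None), 2, (None, 3, None))
```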

Some intuition:

Recall: Having a fat subtree deep down is bad---it makes us search deep.

$\Rightarrow$ This should be reflected by large potential.

$\Rightarrow$ And it is: because that fat subtree contributes to all ranks
above it.

Also: we should make that large potential of fat subtrees vanish to pay for the search that encountered them.

$\Rightarrow$ Want to bring that fat subtree up the tree

(Note: that for a single fat subtree even single rotations would achieve this, but we want to do it for all fat subtrees that we encounter -- and these can be nested (think of a tree being a path). Hence the need for more sophisticated rotations.)

Note: $x$ is the root of the tree after the splay operation, and the rank of the root is always the same: $\log W$, where $W=\sum_x w_x$ is the total weight of the nodes.

$\Rightarrow$ So, rank $r'(x)$ after the splay is equal to $r(t)$.

$\Rightarrow$ $r(t)-r(x)=r'(x)-r(x)$ -- the amortized cost of a splay is within a constant factor of its change in rank.

Intuition:

If all nodes are in “right place” then $r(t)-r(x)$ is the “ideal” depth of $x$.

If $x$ has high rank (i.e., is fat), we expect it to be high in the tree.

If this is not the case, i.e., if $x$ is deep in the tree, then:

$x$ has to contribute a lot to the potential.

$\Rightarrow$ $\Phi$ is large.

Splaying $x$ will bring it to the root.

$\Rightarrow$ The large amount of potential released as a result will make
the amortized cost be still small. (That is, real cost will be large but $\Delta \Phi$ will be very negative and thus account for this overhead.)

One immediate consequence of Access Lemma:

Let all weights $w_x$ be equal to $1$.

$\Rightarrow$ Total weight $W$ is equal to $n$.

$\Rightarrow$ Rank of the root is $\log W=\log n$ and the rank of every node is at least $0$.

By Access Lemma, the amortized time to (find and) splay a node $x$ is at most
\[
O(\log (W/w_x)) = O(\log n).
\]

Important detail: To make the above analysis work also for unsuccessful Search(x) operations, we need to splay the last node we visited in that search.

Proof of the Access Lemma

Will analyze one double rotation at a time.

Specifically: if $s'$ and $r'$ are the sizes and ranks after that rotation then the amortized cost of it is at most $3(r'(x)-r(x))$.

Summing up across all rotations during a splay and telescoping them will give the lemma. (Remember: $r(t)$ is equal to the final rank of $x$ after the splay.) Also, $+1$ corresponds to the extra single rotation that might be needed at the end.

Will analyze only zig-zig (the harder case) -- zig-zag is in the paper.

  ranks r:                     ranks r':

        z                           x
       / \                         / \
      y   D                       A   y
     / \          ====>              / \
    x   C                           B   z
   / \                                 / \
  A   B                               C   D
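The claimed bound can be checked numerically on this picture. Taking $A,B,C,D$ to be single leaves with unit weights, the sketch below computes $\Delta\Phi$ for the zig-zig step and compares the amortized cost $2+\Delta\Phi$ with $3(r'(x)-r(x))$ (variable names are ours):

```python
from math import log2

# Sizes in the zig-zig picture when A, B, C, D are single leaves (weight 1):
sA = sB = sC = sD = 1
s_x = sA + sB + 1            # x with children A, B (before)
s_y = s_x + sC + 1           # y with children x, C (before)
s_z = s_y + sD + 1           # z with children y, D (before)
sp_z = sC + sD + 1           # z with children C, D (after)
sp_y = sB + sp_z + 1         # y with children B, z (after)
sp_x = s_z                   # x now roots the same (whole) subtree

delta_phi = (log2(sp_x) + log2(sp_y) + log2(sp_z)
             - log2(s_x) - log2(s_y) - log2(s_z))
amortized = 2 + delta_phi              # real cost of a double rotation is 2
bound = 3 * (log2(sp_x) - log2(s_x))   # the claimed 3(r'(x) - r(x))
```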

Observe:

Only nodes $x,y,z$ change ranks.

Real cost is equal to $2$ (two single rotations).

$\Delta \Phi = r'(x)+r'(y)+r'(z)-r(x)-r(y)-r(z)$.

Rank of $x$ can only increase and the rank of $z$ can only decrease. I.e., $r'(x)\geq r(x)$ and $r'(z)\leq r(z)$.

Also: $r'(x)=r(z)$ (after the double rotation, $x$ roots exactly the subtree that $z$ rooted before), $r'(y)\leq r'(x)$ and $r(y)\geq r(x)$.

$\Rightarrow$ Amortized cost $=2+\Delta\Phi\leq 2+r'(x)+r'(z)-2r(x)$, so it suffices to show that $r(x)+r'(z)-2r'(x)\leq -2$.

Let $a=s(x)/s'(x)$ and $b=s'(z)/s'(x)$. Then $r(x)+r'(z)-2r'(x)=\lg a+\lg b$, and $a+b\leq 1$, since $x$'s old subtree and $z$'s new subtree are disjoint parts of $x$'s new subtree.

$\Rightarrow$ The function $\lg a+\lg(1-a)$ is concave and thus maximized for $a=\frac{1}{2}$.

$\Rightarrow$ This yields $r(x)+r'(z)-2r'(x)\leq\lg\frac{1}{2}+\lg\frac{1}{2}=-2$, as desired. (Recall we are using $\log_2$.)
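That inequality is easy to sanity-check numerically: $f(a)=\lg a+\lg(1-a)$ never exceeds $-2$ on $(0,1)$. A quick sketch:

```python
from math import log2

# f(a) = lg(a) + lg(1 - a) is concave on (0, 1) with its maximum at
# a = 1/2, where f(1/2) = (-1) + (-1) = -2.  Check on a fine grid:
vals = [log2(a) + log2(1 - a) for a in (i / 1000 for i in range(1, 1000))]
best = max(vals)             # attained at a = 0.5
```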

Bounding Overall Change of Potential

Important Caveat: The Access Lemma bounds only the amortized cost of each operation.

$\Rightarrow$ If there is a sequence of $m$ operations, its total real cost is at most
\[
O(m \log n) - \left(\Phi_m - \Phi_0\right)=O(m \log n) + \Phi_0 - \Phi_m,
\]
where $\Phi_m$ is the potential of our data structure at the end of processing the whole sequence and $\Phi_0$ is its initial potential.

Key question: How large can $\Phi_0-\Phi_m$ be?

Observe: $\Phi_0 \leq n\log W$ (corresponds to “everyone being the root of the whole tree”.)

Also, $\Phi_m \geq \sum_x \log w_x$ (corresponds to the fact that each node has itself in its subtree).

Even if we do only one operation we incur this large (but fixed) additive cost.

This additive cost has a natural interpretation though: it's (an upper bound on) the cost of splaying each item once. (Recall: after splaying, $x$ is the root, so by the Access Lemma the total amortized cost is $O(r'(x)-r(x))=O(\log (W/w_x))$.)

$\Rightarrow$ Our amortized cost bound is valid if each item is splayed/requested at least once in the request sequence.

Also: Note that potential of empty tree is $0$.

$\Rightarrow$ If we start with empty tree, no problem (but for that we need inserts...)

Applications of Access Lemma

By applying the above analysis with different weights we can derive a lot of interesting facts about performance of splay trees. (Key fact to keep in mind: the behavior of the splay tree is completely independent of our choice of the weights!)

Balance theorem: Total cost for $m$ operations is $O((m+n)\log n)$ (as good as any
balanced tree!)

Set $w_x=1/n$, for each node $x$.

Overall potential drop $\leq \sum_x \log (W/w_x) = n\log n$.

Amortized cost of $Search(x)$: $1+3\log n=O(\log n)$.

However, the performance guarantees that splay trees offer go way beyond the above mere “$O(\log n)$ time per operation” guarantee that “traditional” balanced BSTs give. Some important examples below!

Static Optimality: Total cost for $m$ operations is as good as of any fixed tree. (Including trees that were designed while knowing what the request sequence will be!)

Static Finger Optimality: Total cost for $m$ operations is as good as of any fixed tree in which additionally we start our search from any fixed node $f$ (instead of the root).

Set $w_x=1/(1+|x-f|)^2$, for each node $x$, where $|x-f|$ is the distance from $x$ to the finger $f$.

$W=\sum_x w_x \le 2 \sum_{k\geq 1} 1/k^2 = O(1)$.

Overall potential drop $O(n\log n)$.

Amortized cost of $Search(x)$: $O(\log (W/w_x))=O(\log (1+|x-f|))$.
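A quick numeric sanity check of the $W=O(1)$ claim (finger in the middle of $n$ keys; note $2\sum_{k\ge1}1/k^2=\pi^2/3$):

```python
from math import pi

# Finger weights w_x = 1/(1 + |x - f|)^2 for n keys with the finger f in
# the middle; the total weight W stays below pi^2/3 no matter how large n is.
n, f = 1001, 500
W = sum(1 / (1 + abs(x - f)) ** 2 for x in range(n))
```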

In fact: Can also show that splay trees are optimal for dynamic finger too, i.e., when the next search is started from the previously found node (instead of the root). But the proof is very hard (and requires much more than just the Access Lemma).

Working Set/Cache Optimality:

At access $j$ to node $x_j$, let $t_j$ be the number of distinct nodes accessed since the previous access to $x_j$. Then the total time is:
\[
O\Big(n\log n+\sum_j \log t_j\Big).
\]
Note: $O(\log t_j)$ is the time we would expect it takes to find element $x_j$ even if our tree contained only the $t_j$ nodes that were accessed since previous access to $x_j$.

Best-of-the-Above Optimality: Cost is at most as good as the best one of the choices stemming from the above optimality conditions.

Dynamic Optimality Conjecture: One of the most outstanding open problems in data structures is the Dynamic Optimality Conjecture. This conjecture states that the performance of splay trees matches (up to a constant factor) the performance of the best dynamic BST. That is, the performance of a BST that can perform rotations between requests (each rotation having a cost of $1$), with these rotations chosen optimally with full knowledge of the whole request sequence.

This conjecture is still open. The best result on this front is due to Demaine et al. (FOCS 2004) and presents a different (much more complicated) BST, called Tango trees, whose performance is within an $O(\log \log n)$ factor of that of the optimal dynamic BST.

Still, it is generally believed that the Dynamic Optimality Conjecture is true and that splay trees are indeed the ultimate BST construction. See the survey “In Pursuit of the Dynamic Optimality Conjecture” by John Iacono about this fascinating conjecture.

Update Operations

We have two remaining operations we need to implement: $Insert(x)$ and $Delete(x)$. There are two ways to implement them:

Insert/delete the item as in a standard BST and then just splay the inserted element / the parent of the deleted element.

Splaying ensures that our work is “paid for” by the resulting potential change.

But showing that this is the case requires a little bit of calculations.

Alternatively, consider the following operations:

$Split(x)$ splits the BST into two BSTs, one containing all the elements smaller than or equal to $x$ and one containing all the elements larger than $x$.

$\Rightarrow$ Can be achieved by simply splaying $x$ and thus takes amortized $O(\log n)$ time (corresponds to taking all weights be equal to $1$ in the analysis).

$Join$ takes two BSTs $S$ and $T$ such that all elements in $S$ are not larger than any element in $T$, and creates a single BST containing the elements of both $S$ and $T$.

$\Rightarrow$ Can be achieved by splaying first the largest element $x$ of $S$ (note that then this element becomes a root with no right child) and adding the root of $T$ as the right child of $x$.

$\Rightarrow$ Joining increases only the potential of the new root, and only by at most $O(\log n)$.

$\Rightarrow$ Thus, again, the whole operation takes $O(\log n)$ amortized time (if we take all $w_x$ equal to $1$ in the analysis).

Note: For most balanced trees such $Split(x)/Join$ operations do not take only $O(\log n)$ time!

To implement $Insert(x)$:

Perform $Split(x')$, where $x'$ is the largest element not larger than $x$. (Finding $x'$ can be easily amortized into the subsequent splay.)

Create a new BST with $x$ as the root and the two BSTs obtained from the split as its subtrees.

$\Rightarrow$ Total potential increased by $O(\log n)$, i.e., the increase of potential of the root. (Remember: we are performing analysis with all weights being $1$ here.)

$\Rightarrow$ Total amortized cost is $O(\log n)$.

To implement $Delete(x)$:

Perform $Split(x)$

$\Rightarrow$ $x$ will be the root of one of the resulting BSTs and will have no right child.

Remove $x$ and perform $Join$ on the resulting two BSTs.

$O(\log n)$ amortized cost, as desired.
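The Split/Join-based updates above can be sketched in code. The toy Python sketch below uses made-up names, and for brevity its `splay` is a simplified recursive “move-to-root” using single rotations only -- enough to show the Split/Join mechanics, though the real amortized bounds need zig-zig/zig-zag double rotations as in the analysis. `delete` splays $x$ itself and then joins its two subtrees, which matches the Split-then-Join description:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def splay(t, key):
    """Simplified splay: rotate key (or the last node on its search
    path) all the way to the root via single rotations."""
    if t is None:
        return None
    if key < t.key and t.left is not None:
        l = splay(t.left, key)           # bring it up in the left subtree,
        t.left, l.right = l.right, t     # then rotate it above t
        return l
    if key > t.key and t.right is not None:
        r = splay(t.right, key)
        t.right, r.left = r.left, t
        return r
    return t

def split(t, key):
    """Split into (keys <= key, keys > key); assumes t is non-empty."""
    t = splay(t, key)
    if t.key <= key:
        right, t.right = t.right, None   # detach everything larger
        return t, right
    left, t.left = t.left, None
    return left, t

def join(s, t):
    """All keys in s must be <= all keys in t."""
    if s is None:
        return t
    s = splay(s, float("inf"))           # max of s becomes a root with
    s.right = t                          # no right child; attach t there
    return s

def insert(t, key):                      # assumes key is not present
    if t is None:
        return Node(key)
    left, right = split(t, key)
    return Node(key, left, right)

def delete(t, key):
    t = splay(t, key)                    # bring key to the root ...
    return join(t.left, t.right) if t.key == key else t
```

Usage: building a tree by repeated `insert` keeps the last-inserted key at the root, and `delete` removes a key while preserving the in-order (sorted) sequence of the rest.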

Possible Heuristics

Top down splaying.

Choose to splay only when the path is “long”. (Real cost is too large, so we need to amortize.) Drawback: must know the weights.

Can choose to stop splaying after a while. Good for random
access frequencies.