AVL tree

Named after its inventors, Adelson-Velskii and Landis, the AVL tree was conceived in 1962. It was the first self-balancing binary search tree to be proposed. Like any binary tree, an AVL tree consists of nodes with no more than two children each. The tree rebalances itself by rotating nodes whenever an insertion or deletion leaves one subtree more than one level taller than its sibling.

For example:

        5                5
       / \              / \
      3   7            2   7
     /         ->     / \
    2                1   3
   /
  1

The 2 replaces the 3 as the 5's lesser child, and the 3 becomes the 2's right child (insert orphanage joke). After every such rotation, the two subtrees of any node differ in height by at most 1, which keeps search at O(log n) time. Insertion and deletion also take O(log n) time.
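The rotation in the diagram can be sketched in a few lines. This is a minimal illustration rather than the implementation presented further down: the `SketchNode` struct and the free function `rotate_right` are hypothetical names, and height bookkeeping is left out so the pointer moves stay visible.

```cpp
// Hypothetical bare-bones node, for illustration only.
struct SketchNode {
    int key;
    SketchNode* left;
    SketchNode* right;
};

// Rotate the subtree rooted at `root` to the right and return the new root.
// Assumes root->left is not null. The left child moves up, and its former
// right subtree becomes the old root's left subtree.
SketchNode* rotate_right(SketchNode* root) {
    SketchNode* pivot = root->left;   // the 2 in the diagram
    root->left = pivot->right;        // keys between 2 and 3 stay left of 3
    pivot->right = root;              // the 3 becomes the 2's right child
    return pivot;                     // caller re-attaches this under the 5
}
```

In the diagram, the call would be `five.left = rotate_right(five.left);`, which lifts the 2 into the 3's old position.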

I present here a highly recursive implementation of the AVL tree algorithm. This is a generic implementation using template classes that orders a binary tree on a user-defined sort key and allows for arbitrary payloads.

The AVL tree algorithm keeps the binary tree in balance so that searching never degrades into a linear scan.
Each time you insert or remove a node from the tree, the tree is rebalanced. Thus, you pay a small price
when maintaining the tree in exchange for O(log n) execution on each
search.

By default, T::operator < (with its traditional meaning of less than) is used to create order in the tree, but
by reversing the meaning of T::operator < to actually mean greater than, you can create a tree that is reverse
sorted. If you do this, please comment it carefully and fully; otherwise some poor sucker is going to be
hopelessly confused when he comes along and tries to modify your code.
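As a sketch of what such a key type might look like, here are two hypothetical payload structs (`Employee` and `EmployeeDescending` are invented for this example and are not part of the implementation below). Both define `operator <`, one with the usual meaning and one deliberately reversed, with the kind of warning comment recommended above.

```cpp
#include <string>

// Hypothetical payload: ordered by `id`, with `name` along for the ride.
struct Employee {
    int id;
    std::string name;

    // Traditional meaning: smaller ids sort first.
    bool operator<(const Employee& other) const { return id < other.id; }
};

// Same payload with the ordering deliberately reversed.
// WARNING: operator< here really means "greater than", so a tree built on
// this type comes out in descending id order. Do not "fix" this.
struct EmployeeDescending {
    int id;
    std::string name;

    bool operator<(const EmployeeDescending& other) const { return id > other.id; }
};
```

Keeping the reversed comparison in its own clearly named type, rather than flipping the operator on the original struct, is one way to make the intent hard to miss.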

//
// Use "namespace" to make sure the class names don't conflict with other code.
//

namespace AVL {

//
// The basic unit of currency in a tree is the node.
//
template <class T>
class Node
{
public :

// This is where we keep the data we want to store in each node.
// It is const because if you change it while it is in the tree structure
// you compromise the integrity of the tree.
// It is public because the Tree class must have access to it in order
// to return it after being found with the found_node function.

const T data;

private :

// Each node has two children: left and right. If they are both NULL then
// the node is a leaf node. Otherwise, it's an interior node.

Node<T> * left, * right;

// The height is computed to be: 0 if NULL, 1 for leaf nodes, and the maximum
// height of the two children plus 1 for interior nodes.
// This is used to keep the tree balanced.

// Recursively search the tree for some data and if found remove (delete) it.
// When you remove an interior node the right child must be placed to the right of
// the right most child in the left sub-tree.
// Remember to balance the tree on the way up after removing a node.

//
// Balancing a tree (or sub-tree) requires the AVL algorithm.
//
// If the tree is out of balance left-left, we rotate the node to the right.
// If the tree is out of balance left-right, we rotate the left child to the
// left and then rotate the current node right.
// If the tree is out of balance right-left, we rotate the right child to the
// right and then rotate the current node left.
// If the tree is out of balance right-right, we rotate the node to the left.
//

Node<T> * balance ()
{
int d = difference_in_height ();

// only rotate if out of balance
if (d < -1 || d > 1)
{
// too heavy on the right
if (d < 0)
{
// if right child is too heavy on the left,
// rotate right child to the right
if (right -> difference_in_height () > 0)
right = right -> rotate_right ();

// rotate current node to the left
return rotate_left ();
}
// too heavy on the left
else
{
// if left child is too heavy on the right,
// rotate left child to the left
if (left -> difference_in_height () < 0)
left = left -> rotate_left ();

// rotate current node to the right
return rotate_right ();
}
}

// recompute the height of each node on the way up
compute_height ();

// otherwise, the node is balanced and we simply return it
return this;
}

//
// Cover class for maintaining the tree.
//
// Since Node<T> is self allocating and self deleting, the Tree<T> class
// ensures that only qualified calls are made.
//
// Tree<T> is the public interface to the AVL Tree code.
// Node<T> is not meant to be used by the public.
//
// This code makes use of the somewhat dubious practice of calling a member
// function with a NULL "this" pointer. The C++ standard treats this as
// undefined behaviour; it tends to work in practice only because
// we have no virtual member functions in Node<T>.
//
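Because standard C++ does not define the result of calling a member function through a null pointer, one alternative is to move the null check into free functions that take the node as an argument. The sketch below is an assumption about how that could look, not part of the code above: `HNode` is a hypothetical stand-in for Node<T>, its `height` field is assumed, and the helpers borrow the names `difference_in_height` and `compute_height` from `balance ()`. The sign convention (left minus right, so negative means heavy on the right) is inferred from the comments in `balance ()`.

```cpp
#include <algorithm>

// Hypothetical stand-in for Node<T>; only the fields needed here.
struct HNode {
    HNode* left = nullptr;
    HNode* right = nullptr;
    int height = 1;   // 1 for a leaf, matching the convention described above
};

// Height of a possibly-null subtree: 0 if NULL, the stored height otherwise.
inline int height_of(const HNode* n) { return n ? n->height : 0; }

// Left height minus right height; negative means heavy on the right.
inline int difference_in_height(const HNode* n) {
    return n ? height_of(n->left) - height_of(n->right) : 0;
}

// Recompute a node's stored height from its children's stored heights.
inline void compute_height(HNode* n) {
    n->height = 1 + std::max(height_of(n->left), height_of(n->right));
}
```

With helpers like these, a null child is handled by `height_of` instead of by a member function that has to test `this`.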

AVL trees (also called "Red-Green" trees (no relation to the show)1, and "Height Balanced" trees) are really, really cool. The name AVL comes from the creators, Adelson-Velskii and Landis (Source: vera). I will go into a bit more detail on the theory than the previous posters.

The general premise behind AVL trees is "The height of the left and right subtrees differ by no more than one." What this means is that it's time to whip out the ASCII Art

Figure 1 is unbalanced because leaf node 3 is of height 7 whereas leaf node 4 is of height 3. The difference between these heights is greater than one, and so the tree is in an unbalanced state.

Unbalanced trees are bad because they increase search time: in the worst case a lopsided tree degenerates into a linked list and a search takes O(n) steps. The whole point of using an AVL tree is the O(log n) speed.
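To make the cost concrete, here is a small sketch of a plain binary search tree with no rebalancing at all. Every name in it (`PlainNode`, `bst_insert`, `bst_height`, `bst_destroy`, `height_after_sorted_inserts`) is made up for this illustration. Feeding it keys in ascending order produces a tree that is really just a linked list.

```cpp
#include <algorithm>

// Hypothetical plain (non-balancing) binary search tree node.
struct PlainNode {
    int key;
    PlainNode* left = nullptr;
    PlainNode* right = nullptr;
    explicit PlainNode(int k) : key(k) {}
};

// Ordinary BST insert with no rebalancing.
PlainNode* bst_insert(PlainNode* root, int key) {
    if (!root) return new PlainNode(key);
    if (key < root->key) root->left = bst_insert(root->left, key);
    else                 root->right = bst_insert(root->right, key);
    return root;
}

// Number of nodes on the longest root-to-leaf path (0 for an empty tree).
int bst_height(const PlainNode* n) {
    return n ? 1 + std::max(bst_height(n->left), bst_height(n->right)) : 0;
}

// Free every node in the subtree.
void bst_destroy(PlainNode* n) {
    if (!n) return;
    bst_destroy(n->left);
    bst_destroy(n->right);
    delete n;
}

// Insert the keys 1..n in ascending order and report the resulting height.
int height_after_sorted_inserts(int n) {
    PlainNode* root = nullptr;
    for (int k = 1; k <= n; ++k) root = bst_insert(root, k);
    int h = bst_height(root);
    bst_destroy(root);
    return h;
}
```

Here `height_after_sorted_inserts(100)` gives a height of 100, so a search may have to visit every node. An AVL tree holding the same 100 keys can never be more than 9 levels tall.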

Once we know how to rotate nodes and balance AVL trees, we have to decide when to balance. We can either write one function that balances the whole tree at once and call it periodically, or we can have each node keep a depth identifier (see fig 1 & 2) so that an imbalance is spotted as soon as it appears. There are two ways to do the latter.

The first is to assign height variables to each node. These store how many levels of descendants lie below the node on its left and right sides. Take figure 7, for example. Node A has no children, so its left and right heights are both 0. The same goes for Node C. Node B, however, has two children, so its height is 1 on each side. These numbers are computed recursively from the leaves up, so the values at the root node will always be correct. We know a node is unbalanced when its left and right heights differ by 2.
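A minimal sketch of the height-variable approach, assuming each node stores one height per side as described above. The names `TwoHeightNode`, `update_heights` and `is_unbalanced` are hypothetical, and recomputing the whole subtree on every call is done only for clarity; a real tree would update heights along the insertion path.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical node that stores the height of each side.
struct TwoHeightNode {
    TwoHeightNode* left = nullptr;
    TwoHeightNode* right = nullptr;
    int left_height = 0;    // levels below this node on the left
    int right_height = 0;   // levels below this node on the right
};

// Recompute both stored heights bottom-up and return the number of levels
// this subtree contributes to its parent (0 for an empty subtree).
int update_heights(TwoHeightNode* n) {
    if (!n) return 0;
    n->left_height = update_heights(n->left);
    n->right_height = update_heights(n->right);
    return 1 + std::max(n->left_height, n->right_height);
}

// A node is out of balance once its two sides differ by 2 or more.
bool is_unbalanced(const TwoHeightNode* n) {
    return std::abs(n->left_height - n->right_height) >= 2;
}
```

For a node B with leaf children A and C, both of B's heights come out as 1 and nothing is flagged. For a chain where C's left child is B and B's left child is A, C ends up with a left height of 2 and a right height of 0, so `is_unbalanced` reports it.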

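The other method is a spinoff of the above. It uses a balance factor (BF). In a balanced tree this integer can have one of three values: -1 (the left side is one level taller), 0 (both sides are the same height), or +1 (the right side is one level taller).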

Let's look at figure 3 again. Node A has a BF of 0; node B has a BF of -1, meaning it has a left child but no right; and node C (the root node) has a BF of -2, which means its left side is two levels taller than its right. This value is also computed recursively.
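The balance factors quoted for nodes A, B and C can be reproduced with a short recursive sketch. It assumes BF is the right height minus the left height, which is the sign convention the numbers above imply. `BFNode` and `update_balance_factors` are hypothetical names, and the full recomputation is for illustration only.

```cpp
#include <algorithm>

// Hypothetical node carrying a balance factor instead of explicit heights.
struct BFNode {
    BFNode* left = nullptr;
    BFNode* right = nullptr;
    int bf = 0;   // right height minus left height; negative means left-heavy
};

// Recompute every balance factor in the subtree and return the subtree's
// height (0 for an empty subtree, 1 for a single node).
int update_balance_factors(BFNode* n) {
    if (!n) return 0;
    int lh = update_balance_factors(n->left);
    int rh = update_balance_factors(n->right);
    n->bf = rh - lh;
    return 1 + std::max(lh, rh);
}
```

Building the chain where C's left child is B and B's left child is A, then calling `update_balance_factors` on C, leaves A with 0, B with -1 and C with -2. The -2 is the signal that C needs a rotation.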

All of this is really pointless unless the reader can answer the question "why am I using an AVL tree and rotation code?" The answer is that an AVL tree is faster to search than a binary search tree that is not balanced (an unbalanced tree has O(n) worst-case search time). What makes an AVL tree stand out from a regular binary search tree is that an AVL tree is self-balancing. Remember: speed is the name of the game. While there are data structures that can be speedier, such as B-Trees or M-way trees, an AVL tree is easier to implement.