Binary search tree (C)

This article implements a simple binary search tree data structure in C with search, insert, and delete operations. A binary search tree is a tree where each node contains a value, and for each node, the left subtree only contains values less than the value of the node, and the right subtree only contains greater values. This allows to find the node for any value in O(logN) average time.

The binary search tree shall work with any data type, therefore the code takes pointers to void for the data, and relies on an user-supplied comparison function for sorting.

We represent an entire tree using a pointer to its root node; thus this structure doubles as a tree representation. Constructing and destroying nodes is straightforward. Newly constructed nodes initially are leaf nodes, i.e. they have no left or right child:

On destroying nodes, it is assumed that the node has already been removed from the tree. Especially there are no subtrees to destroy. Moreover, this search tree implementation doesn't take over ownership of the data values, therefore the data pointer doesn't have to be destroyed either.

The most important operation on the tree is searching, which uses the properties of the binary search tree to traverse from the root down towards a given node. In the end it will either find the requested node or prove that it is not in the tree, in which case it returns a pointer to NULL. We return the address of the node pointer instead of just the node pointer to facilitate in-place updates.

Inserting is now simple. We search for the node, and if it is not found, we create a new node and put it in the place the search function expected to find it, which is where it belongs in the tree. This particular insertion function does not allow duplicate values in the tree (set semantics), but one that does would be similar.

Deletion is the most complicated operation. We focus on deleting nodes rather than values, since search() can be used to find a node based on a value if necessary. Deleting a node with one child or less is simple; we just replace it with its child:

<<declarations>>=voiddelete(structbst_node**node);<<delete operation>>=voiddelete(structbst_node**node){structbst_node*old_node=*node;if((*node)->left==NULL){*node=(*node)->right;free_node(old_node);}elseif((*node)->right==NULL){*node=(*node)->left;free_node(old_node);}else{delete node with two children}}

A node's inorder predecessor is the node whose value immediately precedes the node's value in sorted order. We can find this node by following right child pointers repeatedly starting at the node's left child. Similarly, the inorder successor immediately follows it in sorted order, and can be found by following left child pointers starting at the node's right child.

To delete a node with two children, as shown below, we simply exchange the value of the node with either its inorder predecessor or successor, then delete the node that we exchanged with. This preserves the order properties of the tree, and the predecessor/successor nodes always have at most one child, so we already know how to delete them. In this implementation, we choose to consistently use the predecessor.