Wednesday, March 10, 2010

Linear data structures are part of every programmer's daily life. The .NET BCL contains many of them, such as List and LinkedList (which is in fact a doubly linked list). In this post we will try to create a structure, based on 2-3 trees, with the following characteristics:

Immutable (a modification returns a new instance of the structure with the changes applied)

This definition is very simple, and that is one of its biggest advantages. However, the structure itself cannot enforce the main tree invariant: all leaves must be located on the same level. The rule therefore has to live in the code of the modification operations, which is what makes that code complex and hard to read. This is the trade-off: complexity removed in one place reappears in another. Things are not so bad, though: we can bring more order to the code by modifying the structure of the 2-3 tree so that all data is removed from the interim nodes and the leaves become the only data containers.
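To make the trade-off concrete, here is a minimal sketch of a leaf-only 2-3 tree. The names LeafTree, Value, Branch2, Branch3, leafDepths and isBalanced are assumptions made for this illustration and are not taken from the original listing. The type accepts an unbalanced tree without complaint, so the invariant has to be checked or maintained by code:

```fsharp
// Hypothetical leaf-only 2-3 tree: interim nodes carry no data.
type LeafTree<'T> =
    | Value of 'T
    | Branch2 of LeafTree<'T> * LeafTree<'T>
    | Branch3 of LeafTree<'T> * LeafTree<'T> * LeafTree<'T>

// Collects the depth of every leaf, counted from the root.
let rec leafDepths (tree: LeafTree<'T>) : int list =
    match tree with
    | Value _ -> [ 0 ]
    | Branch2(a, b) -> List.map ((+) 1) (leafDepths a @ leafDepths b)
    | Branch3(a, b, c) -> List.map ((+) 1) (leafDepths a @ leafDepths b @ leafDepths c)

// The invariant holds when all leaves share a single depth.
let isBalanced (tree: LeafTree<'T>) : bool =
    List.length (List.distinct (leafDepths tree)) = 1

// Both values type-check, although only the first one is a valid 2-3 tree.
let balanced = Branch2(Value 1, Value 2)
let unbalanced = Branch2(Value 1, Branch3(Value 2, Value 3, Value 4))
```

Nothing in the type stops us from building the second value, which is exactly why the check ends up in the operations.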

Notice that the structure of the tree is not regular: the first level is Tree<T>, the second is Tree<Node<T>>, the third is Tree<Node<Node<T>>>, and so on. Operations on such a tree usually take O(log N) time, where N is the number of elements. However, we would like to perform enqueue and dequeue in (amortized) constant time... It is time to explain the mystery of the name "Finger tree".
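One possible way to picture such a non-regular type is sketched below. The case names Zero and Succ and the helper TreeOps.Size are hypothetical, chosen only for this example. Every Succ wraps the element type in one more Node, so a tree whose leaves sit on different levels simply does not type-check:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

// Hypothetical nested tree: every Succ adds exactly one level of Node.
type Tree<'T> =
    | Zero of 'T
    | Succ of Tree<Node<'T>>

// Two levels deep: the payload must be a Node<Node<int>>, so all
// elements are forced onto the same level by the type itself.
let twoLevels : Tree<int> =
    Succ(Succ(Zero(Node2(Node3(1, 2, 3), Node2(4, 5)))))

type TreeOps =
    // Counts elements; sizeOf tells how many elements one 'T stands for.
    static member Size<'T>(tree: Tree<'T>, sizeOf: 'T -> int) : int =
        match tree with
        | Zero x -> sizeOf x
        | Succ deeper ->
            // One level down, each payload is a Node of the current payloads.
            let nodeSize (node: Node<'T>) =
                match node with
                | Node2(a, b) -> sizeOf a + sizeOf b
                | Node3(a, b, c) -> sizeOf a + sizeOf b + sizeOf c
            TreeOps.Size<Node<'T>>(deeper, nodeSize)

let count = TreeOps.Size<int>(twoLevels, fun _ -> 1)
```

Size is written as a static member because it calls itself at a different type on each level.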

A finger is a structure that provides efficient access to the nodes of a tree near a specified location. We take the end nodes from the left and right sides and treat them differently: they behave like buffers holding the end elements. A 2-3 tree with fingers applied then consists of a left finger, a middle tree and a right finger.
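A minimal sketch of how these types might be declared follows. The names Finger, Four, FingerTree, Empty, Single, Multi and Node appear in this post; the remaining case names and the peekLeft/peekRight helpers are assumptions added for illustration:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

// A finger buffers between one and four end elements.
type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

// Left finger, middle tree of nodes, right finger.
type FingerTree<'T> =
    | Empty
    | Single of 'T
    | Multi of Finger<'T> * FingerTree<Node<'T>> * Finger<'T>

// The sequence 1..8: both ends are reachable without walking the middle.
let sample : FingerTree<int> =
    Multi(Two(1, 2), Single(Node3(3, 4, 5)), Three(6, 7, 8))

// Reads the leftmost element without touching the middle tree.
let peekLeft (tree: FingerTree<'T>) : 'T option =
    match tree with
    | Empty -> None
    | Single a -> Some a
    | Multi(One a, _, _)
    | Multi(Two(a, _), _, _)
    | Multi(Three(a, _, _), _, _)
    | Multi(Four(a, _, _, _), _, _) -> Some a

// Reads the rightmost element without touching the middle tree.
let peekRight (tree: FingerTree<'T>) : 'T option =
    match tree with
    | Empty -> None
    | Single a -> Some a
    | Multi(_, _, One a)
    | Multi(_, _, Two(_, a))
    | Multi(_, _, Three(_, _, a))
    | Multi(_, _, Four(_, _, _, a)) -> Some a
```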

As we can see, the overall shape of the structure remains the same. To make further progress we need to extend the types with the ability to enumerate themselves; this feature will be used later, e.g. for fold operations.
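As a rough sketch of what enumeration could look like, the example below flattens each type into a list from left to right. The helper names nodeToList, fingerToList and Enumeration.ToList are assumptions; the original code may well expose sequences or implement an enumerable interface instead:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

type FingerTree<'T> =
    | Empty
    | Single of 'T
    | Multi of Finger<'T> * FingerTree<Node<'T>> * Finger<'T>

let nodeToList (node: Node<'T>) : 'T list =
    match node with
    | Node2(a, b) -> [ a; b ]
    | Node3(a, b, c) -> [ a; b; c ]

let fingerToList (finger: Finger<'T>) : 'T list =
    match finger with
    | One a -> [ a ]
    | Two(a, b) -> [ a; b ]
    | Three(a, b, c) -> [ a; b; c ]
    | Four(a, b, c, d) -> [ a; b; c; d ]

type Enumeration =
    // Left-to-right listing; the middle tree yields nodes, which are flattened.
    static member ToList<'T>(tree: FingerTree<'T>) : 'T list =
        match tree with
        | Empty -> []
        | Single a -> [ a ]
        | Multi(l, m, r) ->
            let middle = List.collect nodeToList (Enumeration.ToList<Node<'T>>(m))
            fingerToList l @ middle @ fingerToList r

let sample : FingerTree<int> =
    Multi(Two(1, 2), Single(Node3(3, 4, 5)), Three(6, 7, 8))

// A fold over the enumerated elements.
let total = List.fold (+) 0 (Enumeration.ToList<int>(sample))
```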

The Finger module contains operations to push and pop values from finger buffers. All operations are trivial except one corner case: pushing to a full buffer. That situation cannot occur because the top-level push/pop functions prevent it. A detailed description is given later in the text; for now, this should be accepted as an axiom.
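A sketch of such a module is shown below. The module name Fingers and the function pushLeft are mentioned in this post; pushRight, popLeft, popRight and the choice to return the remaining finger as an option are assumptions made for this example:

```fsharp
type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

module Fingers =
    let pushLeft (a: 'T) (finger: Finger<'T>) : Finger<'T> =
        match finger with
        | One b -> Two(a, b)
        | Two(b, c) -> Three(a, b, c)
        | Three(b, c, d) -> Four(a, b, c, d)
        // Callers are expected to rule this case out beforehand.
        | Four _ -> invalidOp "cannot push into a full finger"

    let pushRight (finger: Finger<'T>) (a: 'T) : Finger<'T> =
        match finger with
        | One b -> Two(b, a)
        | Two(b, c) -> Three(b, c, a)
        | Three(b, c, d) -> Four(b, c, d, a)
        | Four _ -> invalidOp "cannot push into a full finger"

    // Returns the leftmost element and what is left of the finger, if anything.
    let popLeft (finger: Finger<'T>) : 'T * Finger<'T> option =
        match finger with
        | One a -> a, None
        | Two(a, b) -> a, Some(One b)
        | Three(a, b, c) -> a, Some(Two(b, c))
        | Four(a, b, c, d) -> a, Some(Three(b, c, d))

    // Returns the rightmost element and what is left of the finger, if anything.
    let popRight (finger: Finger<'T>) : 'T * Finger<'T> option =
        match finger with
        | One a -> a, None
        | Two(a, b) -> b, Some(One a)
        | Three(a, b, c) -> c, Some(Two(a, b))
        | Four(a, b, c, d) -> d, Some(Three(a, b, c))
```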

The “push-to-tree” operation is rather trivial, except for the situation when the finger already contains four elements. In that case we push three of its elements into the middle tree as a single node and leave the finger with two items. This means that Fingers.pushLeft should never be called for Finger.Four.

It is worth noting that the pushLeft function calls itself with a different type parameter inside its body. This is called polymorphic recursion, and it cannot be inferred for let-bound functions in F#. Instead, it can be declared as a method of a class with full type annotations. This trick is widely used in the CommonOperations type. The pushRight function is a mirror image of pushLeft, so it is omitted.
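The following is a sketch of how pushLeft might look as a fully annotated static member. The type name CommonOperations comes from this post, but the member name PushLeft, its tupled argument shape and the explicit type argument on the recursive call are assumptions for illustration:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

type FingerTree<'T> =
    | Empty
    | Single of 'T
    | Multi of Finger<'T> * FingerTree<Node<'T>> * Finger<'T>

module Fingers =
    let pushLeft (a: 'T) (finger: Finger<'T>) : Finger<'T> =
        match finger with
        | One b -> Two(a, b)
        | Two(b, c) -> Three(a, b, c)
        | Three(b, c, d) -> Four(a, b, c, d)
        | Four _ -> invalidOp "cannot push into a full finger"

type CommonOperations =
    static member PushLeft<'T>(a: 'T, tree: FingerTree<'T>) : FingerTree<'T> =
        match tree with
        | Empty -> Single a
        | Single b -> Multi(One a, Empty, One b)
        // Full finger: three elements move to the middle tree as one node.
        // The recursive call works on FingerTree<Node<'T>>, not FingerTree<'T>.
        | Multi(Four(b, c, d, e), m, r) ->
            Multi(Two(a, b), CommonOperations.PushLeft<Node<'T>>(Node3(c, d, e), m), r)
        // Any other finger has room, so Fingers.pushLeft never sees Four.
        | Multi(l, m, r) -> Multi(Fingers.pushLeft a l, m, r)

// Pushes 6, then 5, ... then 1, so the tree reads 1..6 from left to right.
let pushed : FingerTree<int> =
    List.foldBack (fun x tree -> CommonOperations.PushLeft<int>(x, tree)) [ 1 .. 6 ] Empty
```

The last push finds a full left finger, so the elements 3, 4 and 5 end up in the middle tree as a single node.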

The implementation of the popLeft function is not complex either, but it has its own special case: only one element remains in the finger buffer.
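One way to handle that special case is sketched below, under the assumption that an exhausted left finger is refilled with a node taken from the middle tree, or with the right finger when the middle tree is empty. The names PopLeft, nodeToFinger, fingerToTree and drain are hypothetical:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

type FingerTree<'T> =
    | Empty
    | Single of 'T
    | Multi of Finger<'T> * FingerTree<Node<'T>> * Finger<'T>

let nodeToFinger (node: Node<'T>) : Finger<'T> =
    match node with
    | Node2(a, b) -> Two(a, b)
    | Node3(a, b, c) -> Three(a, b, c)

// Rebuilds a tree from a lone finger when nothing else is left.
let fingerToTree (finger: Finger<'T>) : FingerTree<'T> =
    match finger with
    | One a -> Single a
    | Two(a, b) -> Multi(One a, Empty, One b)
    | Three(a, b, c) -> Multi(Two(a, b), Empty, One c)
    | Four(a, b, c, d) -> Multi(Two(a, b), Empty, Two(c, d))

type CommonOperations =
    static member PopLeft<'T>(tree: FingerTree<'T>) : ('T * FingerTree<'T>) option =
        match tree with
        | Empty -> None
        | Single a -> Some(a, Empty)
        | Multi(Two(a, b), m, r) -> Some(a, Multi(One b, m, r))
        | Multi(Three(a, b, c), m, r) -> Some(a, Multi(Two(b, c), m, r))
        | Multi(Four(a, b, c, d), m, r) -> Some(a, Multi(Three(b, c, d), m, r))
        // Special case: the finger would become empty, so it is refilled.
        | Multi(One a, m, r) ->
            match CommonOperations.PopLeft<Node<'T>>(m) with
            | Some(node, rest) -> Some(a, Multi(nodeToFinger node, rest, r))
            | None -> Some(a, fingerToTree r)

// Pops until the tree is empty, collecting elements in order.
let rec drain (tree: FingerTree<'T>) : 'T list =
    match CommonOperations.PopLeft(tree) with
    | None -> []
    | Some(x, rest) -> x :: drain rest

let sample : FingerTree<int> = Multi(One 1, Single(Node2(2, 3)), One 4)
let drained = drain sample
```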

We have examined enqueue/dequeue and traversal, and now proceed to concatenation. This operation is rather primitive when one of the arguments is an Empty or Single tree; the only difficult case is the concatenation of two Multi trees. The left end of the first tree and the right end of the second tree are passed to the result tree without changes, and the middle part is the result of applying a merge function with the following signature: (tree1-middle) FingerTree<Node<'T>> -> (tree1-right-end) Finger<'T> -> (tree2-left-end) Finger<'T> -> (tree2-middle) FingerTree<Node<'T>> -> (result) FingerTree<Node<'T>>. It is easy to define a function that transforms two fingers into a list of nodes and to use it to create a generalized concatenation function.

Two finger trees can be concatenated by calling concat and passing an empty list as the ts argument.
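A sketch of such a generalized concatenation is given below. The parameter name ts comes from this post; the member names, the nodes helper and the use of plain lists for the in-between elements are assumptions made to keep the example short:

```fsharp
type Node<'T> =
    | Node2 of 'T * 'T
    | Node3 of 'T * 'T * 'T

type Finger<'T> =
    | One of 'T
    | Two of 'T * 'T
    | Three of 'T * 'T * 'T
    | Four of 'T * 'T * 'T * 'T

type FingerTree<'T> =
    | Empty
    | Single of 'T
    | Multi of Finger<'T> * FingerTree<Node<'T>> * Finger<'T>

let nodeToList (node: Node<'T>) : 'T list =
    match node with
    | Node2(a, b) -> [ a; b ]
    | Node3(a, b, c) -> [ a; b; c ]

let fingerToList (finger: Finger<'T>) : 'T list =
    match finger with
    | One a -> [ a ]
    | Two(a, b) -> [ a; b ]
    | Three(a, b, c) -> [ a; b; c ]
    | Four(a, b, c, d) -> [ a; b; c; d ]

// Groups two or more elements into nodes of size two or three.
let rec nodes (xs: 'T list) : Node<'T> list =
    match xs with
    | [ a; b ] -> [ Node2(a, b) ]
    | [ a; b; c ] -> [ Node3(a, b, c) ]
    | [ a; b; c; d ] -> [ Node2(a, b); Node2(c, d) ]
    | a :: b :: c :: rest -> Node3(a, b, c) :: nodes rest
    | _ -> invalidArg "xs" "at least two elements are required"

type CommonOperations =
    static member PushLeft<'T>(a: 'T, tree: FingerTree<'T>) : FingerTree<'T> =
        match tree with
        | Empty -> Single a
        | Single b -> Multi(One a, Empty, One b)
        | Multi(One b, m, r) -> Multi(Two(a, b), m, r)
        | Multi(Two(b, c), m, r) -> Multi(Three(a, b, c), m, r)
        | Multi(Three(b, c, d), m, r) -> Multi(Four(a, b, c, d), m, r)
        | Multi(Four(b, c, d, e), m, r) ->
            Multi(Two(a, b), CommonOperations.PushLeft<Node<'T>>(Node3(c, d, e), m), r)

    static member PushRight<'T>(tree: FingerTree<'T>, a: 'T) : FingerTree<'T> =
        match tree with
        | Empty -> Single a
        | Single b -> Multi(One b, Empty, One a)
        | Multi(l, m, One b) -> Multi(l, m, Two(b, a))
        | Multi(l, m, Two(b, c)) -> Multi(l, m, Three(b, c, a))
        | Multi(l, m, Three(b, c, d)) -> Multi(l, m, Four(b, c, d, a))
        | Multi(l, m, Four(b, c, d, e)) ->
            Multi(l, CommonOperations.PushRight<Node<'T>>(m, Node3(b, c, d)), Two(e, a))

    // Joins t1, the in-between elements ts, and t2, in that order.
    static member Concat<'T>(t1: FingerTree<'T>, ts: 'T list, t2: FingerTree<'T>) : FingerTree<'T> =
        let prepend (t: FingerTree<'T>) =
            List.foldBack (fun x acc -> CommonOperations.PushLeft<'T>(x, acc)) ts t
        let append (t: FingerTree<'T>) =
            List.fold (fun acc x -> CommonOperations.PushRight<'T>(acc, x)) t ts
        match t1, t2 with
        | Empty, _ -> prepend t2
        | _, Empty -> append t1
        | Single a, _ -> CommonOperations.PushLeft<'T>(a, prepend t2)
        | _, Single b -> CommonOperations.PushRight<'T>(append t1, b)
        | Multi(l1, m1, r1), Multi(l2, m2, r2) ->
            // The inner fingers and ts become nodes for the merged middle tree.
            let middle = nodes (fingerToList r1 @ ts @ fingerToList l2)
            Multi(l1, CommonOperations.Concat<Node<'T>>(m1, middle, m2), r2)

    static member ToList<'T>(tree: FingerTree<'T>) : 'T list =
        match tree with
        | Empty -> []
        | Single a -> [ a ]
        | Multi(l, m, r) ->
            fingerToList l @ List.collect nodeToList (CommonOperations.ToList<Node<'T>>(m)) @ fingerToList r

let ofList (xs: int list) : FingerTree<int> =
    List.fold (fun tree x -> CommonOperations.PushRight<int>(tree, x)) Empty xs

// Plain concatenation: ts is the empty list.
let joined = CommonOperations.Concat<int>(ofList [ 1 .. 10 ], [], ofList [ 11 .. 25 ])
let flattened = CommonOperations.ToList<int>(joined)
```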

In this post we have explored the possibilities of a simple finger tree. It can basically be used as an immutable deque. The next post will illustrate how to enhance the current implementation with efficient search for an element with a particular property.