Storing Tree-like Structures With MongoDB

Background

In real life, almost any project deals with tree structures. Different kinds of taxonomies, site structures, and so on require modelling of hierarchical relations. In this article I illustrate five typical approaches, plus one combination, to working with hierarchical data, using MongoDB as the example database. Those approaches are:

Model Tree Structures with Child References

Model Tree Structures with Parent References

Model Tree Structures with an Array of Ancestors

Model Tree Structures with Materialized Paths

Model Tree Structures with Nested Sets

Note: this article is inspired by another article, ‘Model Tree Structures in MongoDB’ by MongoDB. It does not copy that article, but provides additional examples of typical tree management operations. Please refer to the 10gen (MongoDB) article for a more solid understanding of each approach.

Challenges to address

Get the path to a node (for example, in order to build the breadcrumb section)

Get all descendants of a node (for example, to select goods from a more general category such as ‘Cell Phones and Accessories’, which should include goods from all of its subcategories)

In each of the examples below we:

Add a new node called ‘LG’ under the Electronics node

Move the ‘LG’ node under the Cell_Phones_and_Smartphones node

Remove the ‘LG’ node from the tree

Get the child nodes of the Electronics node

Get the path to the ‘Nokia’ node

Get all descendants of the ‘Cell_Phones_and_Accessories’ node

Please refer to the image above for a visual representation.

Tree structure with parent reference

This is the most commonly used approach. For each node we store (ID, ParentReference, Order).
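To make the record shape concrete, here is a minimal sketch of the parent reference model using a plain in-memory array in place of a collection. The tree layout, the order values, and the helper names (childrenOf, pathTo, descendantsOf, and so on) are assumptions made for illustration; the comments note roughly how each lookup would be expressed against a real collection.

```javascript
// In-memory stand-in for the collection; each record is (ID, ParentReference, Order).
// The tree layout and the order values are assumptions made for this sketch.
const nodes = [
  { _id: "Electronics", parent: null, order: 10 },
  { _id: "shop_top_products", parent: "Electronics", order: 10 },
  { _id: "Cell_Phones_and_Accessories", parent: "Electronics", order: 20 },
  { _id: "Cell_Phones_and_Smartphones", parent: "Cell_Phones_and_Accessories", order: 10 },
  { _id: "Nokia", parent: "Cell_Phones_and_Smartphones", order: 10 },
];

const findNode = (id) => nodes.find((n) => n._id === id);

// Adding, moving and removing only touch the node's own record.
function addNode(id, parent, order) {
  nodes.push({ _id: id, parent, order });
}
function moveNode(id, newParent, newOrder) {
  Object.assign(findNode(id), { parent: newParent, order: newOrder });
}
function removeNode(id) {
  nodes.splice(nodes.findIndex((n) => n._id === id), 1);
}

// Direct children sorted by order. Against a real collection this would be
// roughly: db.categoriesPCO.find({ parent: id }).sort({ order: 1 })
function childrenOf(id) {
  return nodes
    .filter((n) => n.parent === id)
    .sort((a, b) => a.order - b.order)
    .map((n) => n._id);
}

// Ancestors of a node, root first (breadcrumb); one lookup per tree level.
function pathTo(id) {
  const path = [];
  let node = findNode(id);
  while (node && node.parent !== null) {
    path.unshift(node.parent);
    node = findNode(node.parent);
  }
  return path;
}

// All descendants, collected level by level; one children lookup per visited node.
function descendantsOf(id) {
  const result = [];
  const queue = [id];
  while (queue.length > 0) {
    for (const child of childrenOf(queue.shift())) {
      result.push(child);
      queue.push(child);
    }
  }
  return result;
}
```

Note that the path and descendant lookups need one query per level or per node, which is the main cost of this model.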

Operating with tree

Working with the tree is pretty simple, but changing the position of a node within its siblings requires additional calculations. You might want to use large numbers for order, such as item position * 10^6, so that a node placed between two siblings can take the midpoint: lower sibling order + trunc((higher sibling order - lower sibling order) / 2). This gives you enough room for many insertions before you need to traverse the whole tree and reset the order values to large, evenly spaced numbers again.
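The gap-based ordering described above can be sketched as follows. The names GAP, orderBetween and renumber are hypothetical, and the null return value for an exhausted gap is a design choice of this sketch rather than part of the approach itself.

```javascript
// Spacing between freshly numbered siblings (position * 10^6, as suggested in the text).
const GAP = 1e6;

// Order value for a node placed between two existing siblings.
// Returns null when no integer fits between them, i.e. the siblings must be renumbered.
function orderBetween(lowerOrder, higherOrder) {
  const mid = lowerOrder + Math.trunc((higherOrder - lowerOrder) / 2);
  return mid > lowerOrder && mid < higherOrder ? mid : null;
}

// Reassign evenly spaced order values to a sibling list that is already sorted.
function renumber(siblings) {
  siblings.forEach((node, index) => {
    node.order = (index + 1) * GAP;
  });
  return siblings;
}
```

With a gap of 10^6, roughly twenty consecutive insertions into the same slot are possible before orderBetween returns null, since each insertion halves the remaining gap.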

Indexes

The recommended index is on the fields parent and order (createIndex replaces the deprecated ensureIndex):

db.categoriesPCO.createIndex( { parent: 1, order: 1 } )

Tree structure with child references

For each node we store (ID, ChildReferences).

Please note that in this case we do not need an order field, because the childs array already provides this information. Most languages respect array order. If this is not the case for your language, you might consider additional coding to preserve the order, but this will make things more complicated.
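As a rough illustration of the child references model, the sketch below keeps the records in a plain array. The field name childs follows the wording above; the tree layout and the helper names are assumptions. Finding a parent means searching for the record whose childs array contains the node, which against a real collection would be a query of the form find({ childs: id }).

```javascript
// In-memory stand-in for the collection; each record is (ID, ChildReferences).
// The tree layout is an assumption made for this sketch.
const nodes = [
  { _id: "Electronics", childs: ["shop_top_products", "Cell_Phones_and_Accessories"] },
  { _id: "shop_top_products", childs: [] },
  { _id: "Cell_Phones_and_Accessories", childs: ["Cell_Phones_and_Smartphones"] },
  { _id: "Cell_Phones_and_Smartphones", childs: ["Nokia"] },
  { _id: "Nokia", childs: [] },
];

const findNode = (id) => nodes.find((n) => n._id === id);

// The parent is the record whose childs array contains the id.
const parentOf = (id) => nodes.find((n) => n.childs.includes(id)) || null;

// Children come back in array order, so no separate order field is needed.
const childrenOf = (id) => [...findNode(id).childs];

// Add a new leaf as the last child of parentId.
function addNode(id, parentId) {
  nodes.push({ _id: id, childs: [] });
  findNode(parentId).childs.push(id);
}

// Detach the id from its current parent's childs array.
function detach(id) {
  const parent = parentOf(id);
  if (parent) parent.childs = parent.childs.filter((c) => c !== id);
}

// Move a node (with its subtree) to the end of another parent's childs array.
function moveNode(id, newParentId) {
  detach(id);
  findNode(newParentId).childs.push(id);
}

// Remove a leaf node from the tree.
function removeNode(id) {
  detach(id);
  nodes.splice(nodes.findIndex((n) => n._id === id), 1);
}

// Ancestors of a node, root first; one parent lookup per tree level.
function pathTo(id) {
  const path = [];
  for (let p = parentOf(id); p; p = parentOf(p._id)) path.unshift(p._id);
  return path;
}
```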

Indexes

Tree structure using Nested Sets

For each node we store (ID, left, right).

The left field can also be treated as an order field.
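The main benefit of storing (left, right) is that descendants and ancestors become simple range comparisons. The following sketch shows this on an in-memory array; the left/right numbers belong to a small tree assumed for illustration (they are not the numbers from the image), and the helper names are hypothetical. Against a real collection the descendants filter would be a query of the form find({ left: { $gt: l }, right: { $lt: r } }).sort({ left: 1 }).

```javascript
// In-memory stand-in for the collection; each record is (ID, left, right).
// The numbering is an assumption made for this sketch.
const nodes = [
  { _id: "Electronics", left: 1, right: 10 },
  { _id: "shop_top_products", left: 2, right: 3 },
  { _id: "Cell_Phones_and_Accessories", left: 4, right: 9 },
  { _id: "Cell_Phones_and_Smartphones", left: 5, right: 8 },
  { _id: "Nokia", left: 6, right: 7 },
];

const findNode = (id) => nodes.find((n) => n._id === id);
const byLeft = (a, b) => a.left - b.left;

// Descendants lie strictly inside the node's (left, right) interval.
function descendantsOf(id) {
  const node = findNode(id);
  return nodes
    .filter((n) => n.left > node.left && n.right < node.right)
    .sort(byLeft)
    .map((n) => n._id);
}

// Ancestors are the nodes whose interval encloses the node's interval;
// sorting by left returns them root first.
function pathTo(id) {
  const node = findNode(id);
  return nodes
    .filter((n) => n.left < node.left && n.right > node.right)
    .sort(byLeft)
    .map((n) => n._id);
}
```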

Adding new node

Please refer to the image above. Assume we want to insert the LG node after shop_top_products (14,23). The new node would have a left value of 24, which shifts all subsequent left values according to the traversal rules, and a right value of 25, which shifts all subsequent right values, including the root's.

Steps:

Take the next node in the tree traversal (the following sibling).

The new node takes the following sibling's left value as its left value, and that value plus one as its right value (24 and 25 in the example above).

Now we have to make room for the new node by increasing the affected values by two. The update affects the right values of all ancestor nodes, and both values of all nodes that remain in the traversal.
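The steps above can be sketched on an in-memory array as follows. The function name insertBefore and the sample numbering are assumptions for illustration. Against a real collection, the two shifts would typically be multi-document updates using $inc with a value of 2.

```javascript
// Sample tree with an assumed numbering (not the numbers from the image).
const nodes = [
  { _id: "Electronics", left: 1, right: 10 },
  { _id: "shop_top_products", left: 2, right: 3 },
  { _id: "Cell_Phones_and_Accessories", left: 4, right: 9 },
  { _id: "Cell_Phones_and_Smartphones", left: 5, right: 8 },
  { _id: "Nokia", left: 6, right: 7 },
];

// Insert a new leaf immediately before the given following sibling.
function insertBefore(nodes, newId, followingSiblingId) {
  const following = nodes.find((n) => n._id === followingSiblingId);
  const left = following.left; // the new node takes over this position

  // Make room: every value at or after the insertion point moves up by two.
  // Ancestors keep their left value but get a larger right value.
  for (const n of nodes) {
    if (n.left >= left) n.left += 2;
    if (n.right >= left) n.right += 2;
  }

  nodes.push({ _id: newId, left, right: left + 1 });
}

// Insert LG after shop_top_products, i.e. before Cell_Phones_and_Accessories.
insertBefore(nodes, "LG", "Cell_Phones_and_Accessories");
```

After the call, LG occupies (4, 5), Cell_Phones_and_Accessories has moved to (6, 11), and the root's right value has grown from 10 to 12.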

Node removal

While rearranging the order of nodes within the same parent can amount to exchanging the nodes' left and right values, the formal way of moving a node is to first remove it from the tree and then insert it at the new location. Note: removing a node without removing its children is out of scope for this article. For now, we assume that the node to remove has no children, i.e. right - left = 1.

The steps mirror those for adding a node: we remove the original node and close the gap by decreasing the affected left/right values by two.
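A minimal sketch of leaf removal, again on an in-memory array with an assumed numbering and a hypothetical function name removeLeaf:

```javascript
// Sample tree with an assumed numbering; LG is a leaf under Electronics.
const nodes = [
  { _id: "Electronics", left: 1, right: 12 },
  { _id: "shop_top_products", left: 2, right: 3 },
  { _id: "LG", left: 4, right: 5 },
  { _id: "Cell_Phones_and_Accessories", left: 6, right: 11 },
  { _id: "Cell_Phones_and_Smartphones", left: 7, right: 10 },
  { _id: "Nokia", left: 8, right: 9 },
];

// Remove a node that has no children and close the gap it leaves behind.
function removeLeaf(nodes, id) {
  const index = nodes.findIndex((n) => n._id === id);
  const node = nodes[index];
  if (node.right - node.left !== 1) {
    throw new Error("only a node without children can be removed");
  }
  nodes.splice(index, 1);

  // Every value after the removed node moves down by two.
  for (const n of nodes) {
    if (n.left > node.right) n.left -= 2;
    if (n.right > node.right) n.right -= 2;
  }
}

removeLeaf(nodes, "LG");
```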

Updating/moving the single node

A node can be moved within the same parent or to another parent. If the parent stays the same and the nodes have no children, then you only need to exchange the nodes' (left, right) pairs.
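For the same-parent case, exchanging the pairs can be sketched as below. The function name swapLeafSiblings and the sample numbering are assumptions, and the sketch checks only that both nodes are leaves; it trusts the caller that they share a parent.

```javascript
// Sample tree with an assumed numbering; shop_top_products and LG are leaf siblings.
const nodes = [
  { _id: "Electronics", left: 1, right: 8 },
  { _id: "shop_top_products", left: 2, right: 3 },
  { _id: "LG", left: 4, right: 5 },
  { _id: "Cell_Phones_and_Accessories", left: 6, right: 7 },
];

// Reorder two childless siblings by exchanging their (left, right) pairs.
function swapLeafSiblings(nodes, idA, idB) {
  const a = nodes.find((n) => n._id === idA);
  const b = nodes.find((n) => n._id === idB);
  if (a.right - a.left !== 1 || b.right - b.left !== 1) {
    throw new Error("both nodes must be without children");
  }
  const pairOfA = { left: a.left, right: a.right };
  Object.assign(a, { left: b.left, right: b.right });
  Object.assign(b, pairOfA);
}

swapLeafSiblings(nodes, "shop_top_products", "LG");
```

No other node is touched, because both intervals have the same width.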

The formal way is to remove the node and insert it at the new destination, so the same restriction applies: only a node without children can be moved. If you need to move a subtree, consider creating a mirror of the existing parent under the new location and moving the nodes under the new parent one by one. Once all nodes have been moved, remove the obsolete old parent.

As an example, let's move the LG node from the insertion example under the Cell_Phones_and_Smartphones node as the last sibling (i.e. there is no following sibling node, unlike in the insertion example).

Step 1 is to remove the LG node from the tree using the node removal procedure described above.

Step 2 is to take the right value of the new parent. The new node takes the parent's right value as its left value, and that value plus one as its right value.

Now we have to make room for the new node: the update affects the parent's own right value and all left and right values further along the traversal path.
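Put together, the move can be sketched as a removal followed by an append under the new parent. The function names and the sample numbering are assumptions for illustration. Note that the new parent's right value must be read after the removal, because the removal may already have shifted it.

```javascript
// Sample tree with an assumed numbering; LG starts as a leaf under Electronics.
const nodes = [
  { _id: "Electronics", left: 1, right: 12 },
  { _id: "shop_top_products", left: 2, right: 3 },
  { _id: "LG", left: 4, right: 5 },
  { _id: "Cell_Phones_and_Accessories", left: 6, right: 11 },
  { _id: "Cell_Phones_and_Smartphones", left: 7, right: 10 },
  { _id: "Nokia", left: 8, right: 9 },
];

// Step 1: remove a childless node and close the gap.
function removeLeaf(nodes, id) {
  const index = nodes.findIndex((n) => n._id === id);
  const node = nodes[index];
  if (node.right - node.left !== 1) throw new Error("node has children");
  nodes.splice(index, 1);
  for (const n of nodes) {
    if (n.left > node.right) n.left -= 2;
    if (n.right > node.right) n.right -= 2;
  }
}

// Step 2: append a new leaf as the last child of the given parent.
function appendAsLastChild(nodes, id, parentId) {
  const parent = nodes.find((n) => n._id === parentId);
  const left = parent.right; // the slot currently held by the parent's right value

  // Make room: the parent's right value and everything after it moves up by two.
  for (const n of nodes) {
    if (n.left >= left) n.left += 2;
    if (n.right >= left) n.right += 2;
  }
  nodes.push({ _id: id, left, right: left + 1 });
}

// Move = remove, then insert at the new destination.
function moveLeaf(nodes, id, newParentId) {
  removeLeaf(nodes, id);
  appendAsLastChild(nodes, id, newParentId);
}

moveLeaf(nodes, "LG", "Cell_Phones_and_Smartphones");
```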

Indexes

Tree structure using combination of Nested Sets and classic Parent reference with order approach

For each node we store (ID, Parent, Order, left, right).

The left field can also be treated as an order field, so we could omit the order field. On the other hand, we can keep it, so that the Parent Reference with order data can be used to reconstruct the left/right values in case of accidental corruption or, for example, during an initial import.
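Reconstructing left/right from the parent and order fields amounts to a depth-first walk that numbers each node on the way down and again on the way up. The sketch below assumes a single in-memory array, a null parent for root nodes, and a hypothetical function name rebuildNestedSets; the sample values are illustrative.

```javascript
// Records carry (ID, Parent, Order); left/right are missing or untrusted.
const nodes = [
  { _id: "Nokia", parent: "Cell_Phones_and_Smartphones", order: 10 },
  { _id: "Cell_Phones_and_Accessories", parent: "Electronics", order: 20 },
  { _id: "Electronics", parent: null, order: 10 },
  { _id: "shop_top_products", parent: "Electronics", order: 10 },
  { _id: "Cell_Phones_and_Smartphones", parent: "Cell_Phones_and_Accessories", order: 10 },
];

// Recompute left/right for every node from the parent and order fields.
function rebuildNestedSets(nodes) {
  let counter = 0;
  const childrenOf = (parentId) =>
    nodes
      .filter((n) => n.parent === parentId)
      .sort((a, b) => a.order - b.order);

  const visit = (node) => {
    node.left = ++counter; // numbered when the walk enters the node
    childrenOf(node._id).forEach(visit);
    node.right = ++counter; // numbered again when the walk leaves it
  };

  childrenOf(null).forEach(visit);
  return nodes;
}

rebuildNestedSets(nodes);
```

The repeated filter makes this quadratic in the number of nodes, which is acceptable for a sketch; for a large import, grouping the records by parent first would avoid the repeated scans.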

The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple parents.

The Array of Ancestors pattern offers no specific advantages unless you constantly need to get the path to a node.

You are free to mix patterns (by introducing an order field, etc.) to match the data operations required by your application.