Tag: sorting

In my previous post, I described a schema and set of associated queries to persist and and update arbitrarily ordered items in a SQL database (using a linked list). This approach can scale to very large lists, without degrading performance when adding or rearranging items. But having stored a list, how can it be reproduced in the correct order? This post describes an approach to efficiently sort linked lists from SQL in client-side code. While the below examples are written in JavaScript, you could use the same basic technique in almost any modern language.

Suppose that you select the following (unordered) linked list from a database:

The naiveSort function re-loops through the list from the beginning each time it finds the next item and adds it to a sorted copy of the array. The function will return the correctly sorted list, but the number of required iterations exponentially increases as the list lengthens, following the equation size + size * ((size - 1) / 2). For example, a list containing 100 items would require 5,050 iterations, while a list containing 1,000 items would require 500,500! With this approach, any advantage of the linked list’s efficient insertion and reordering would be lost in lengthy sort times.

An efficient sorting algorithm

The mapSort function starts by looping through the linked list a single time, adding the item array indexes to a map with a key of the item’s previous_item_id property. It then follows the chain of item_id references through the map to build the complete sorted list. This approach requires (size * 2) - 1 iterations, allowing it to scale linearly with list length.

Testing with Node.js on my Core i5 desktop PC, the mapSort function was able to sort a 5,000 item list in an average of 2.3 ms, compared to 68.4 ms for naiveSort. With larger lists, the discrepancy grew even greater. Sorting 100,000 items took an average of over 40 seconds with naiveSort, but just 61.7 ms with mapSort!

There are likely other optimizations that could be implemented to further increase performance, but for most practical purposes this technique should prove sufficient.