An Experiment in Sorting in Linear Time

Yes, you can sort in linear time, as long as you avoid comparisons.

Radix sort and counting sort are two examples. Both avoid comparisons: instead of comparing whole keys, they group items by one component of the key at a time, where that component is a letter or digit (in mathematical terms, a symbol from a limited alphabet).

Comparison-based Bounds

Comparison-based sorting will never be faster than O(n log n), because any such algorithm can be modeled as a binary decision tree: each internal node compares two elements, and each branch is one outcome of that comparison. The leaves are the possible final orderings, and since n elements have n! permutations, the tree needs at least n! leaves. A binary tree with n! leaves has height at least log₂(n!), which by Stirling's approximation grows like n log n — so some input always forces that many comparisons.
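To make the bound concrete, here's a small sketch (my own, not from the post) that computes the decision-tree lower bound ⌈log₂(n!)⌉ alongside n·log₂(n), showing how closely the two track each other:

```python
import math

def min_comparisons(n):
    # Any comparison sort is a binary decision tree with at least n!
    # leaves, so its worst-case depth is at least ceil(log2(n!)).
    return math.ceil(math.log2(math.factorial(n)))

# log2(n!) tracks n*log2(n) (Stirling), which is why the bound is
# Omega(n log n) and not something smaller.
for n in (5, 10, 100):
    print(n, min_comparisons(n), round(n * math.log2(n)))
```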

Linear Sorting Given Special Restrictions

My example of linear sorting, taken from Programming Pearls by Jon Bentley, is a sorting problem limited not by runtime but by space.

Task:

Sort the numbers in a given file.

Output the sorted numbers to another file.

Here are the conditions:

There are at most 10,000,000 7-digit integers in the file, one on each line.

There are no duplicates.

You have only 1 MB of memory.

You can see some problems right away. If we have 10,000,000 integers, and each is a 32-bit integer, you're dealing with 40 MB of data to sort. Just 1 MB? What is this? 1985? In this case, I think it was. The system's overhead and other programs only allowed for 1 MB of space to sort the data. This was part of desktop software. Good luck.
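To experiment with the problem, you need an input file matching the conditions. Here's a hypothetical generator (the post doesn't show one; the function name and small default are mine) that writes n distinct integers below 10,000,000 — i.e., at most 7 digits — one per line:

```python
import random

def write_input(path, n=1000):
    # random.sample guarantees no duplicates, matching the problem's
    # conditions; crank n up toward 10,000,000 for the real experiment.
    nums = random.sample(range(10_000_000), n)
    with open(path, "w") as f:
        for x in nums:
            f.write(f"{x}\n")
```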

The Solution

The solution is creative, and takes advantage of the fact there are no duplicates.

The solution the author comes up with is: store the integers in a bit vector. For each number in the file, turn on its corresponding bit in the bitmap. So if the number is 4762, set bit 4,762 + 1 (reserving bit 0 in case the number 0 shows up). Once you've set the bit for every number in the file, iterate through the bitmap and, for each bit that is on, write the number it represents to the output file.
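Here's a minimal sketch of the whole algorithm — my own code, not the author's, and mapping number n directly to bit n rather than using the +1 offset described above:

```python
BITS = 32                      # bits packed per list entry
MAX_VALUE = 10_000_000         # one bit per possible input value

def bitmap_sort(infile, outfile):
    # Pass 1: set bit n for each number n in the input file.
    bitmap = [0] * (MAX_VALUE // BITS + 1)
    with open(infile) as f:
        for line in f:
            n = int(line)
            bitmap[n // BITS] |= 1 << (n % BITS)
    # Pass 2: scan the bitmap in order; each set bit's index
    # is an input number, so the output comes out sorted.
    with open(outfile, "w") as out:
        for i, word in enumerate(bitmap):
            if word == 0:
                continue       # skip fully-empty words quickly
            for b in range(BITS):
                if word >> b & 1:
                    out.write(f"{i * BITS + b}\n")
```

Note there isn't a single comparison between input elements anywhere — the bitmap's index order does the sorting.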

Normally you wouldn't think of this because it looks like data loss: you're turning a 32-bit integer into a single bit. But no information is actually lost, because the bit's position encodes the value — the index itself reconstructs the number, at no extra storage cost.

Analysis

n: the number of items to sort

Initializing a large bit vector to all zeroes: ~1.1*n operations (one bit for every possible integer, whether or not it appears in the input; with 9,999,999 possible values and 9,000,000 unique input numbers, that's a factor of roughly 1.1)

Reading the file: n operations

Setting the bits: n operations

Reading the bits from the bitmap: ~1.1*n operations (see above)

Writing numbers to file: n operations

This gives us 5.2 * n operations, or O(n). Somewhere around 52 million operations.

Note, there are no comparisons involved in this algorithm.

If this were comparison-based, the best we could hope for is O(n log n), where log₂ 10,000,000 ≈ 23.25. n log n would make it approximately 232 million operations.

Python list storage analysis:

In Python, everything is an object, so values use several times more memory than the underlying primitive data types would need.

In order to make it easy to index into the right place to set/get bits, I used a Python list and indexed into a "bucket" where I could set the correct bit (from 0 to 31). The number of buckets: (max possible number // bits per integer) + 1.

That's 312,500 buckets. I'm using a list to hold 312,500 integers.
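The bucket indexing described above can be sketched as a pair of helpers (my own names, not from the post) — integer-divide to pick the bucket, then take the remainder to pick the bit within it:

```python
BITS = 32  # bits stored per list "bucket"

def make_buckets(max_value=9_999_999):
    # (max possible number // bits per integer) + 1 buckets
    return [0] * (max_value // BITS + 1)

def set_bit(buckets, n):
    # n // BITS picks the bucket; n % BITS picks bit 0-31 inside it
    buckets[n // BITS] |= 1 << (n % BITS)

def get_bit(buckets, n):
    return buckets[n // BITS] >> (n % BITS) & 1
```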

Sizes:

Empty Python list: 64 bytes

Python list of 1 integer: 72 bytes (64 + 8)

Python list of 2 integers: 80 bytes (64 + 8 * 2)

Python list of 312,500 integers: 2,500,064 bytes (64 + 8 * 312,500)
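These figures can be reproduced with `sys.getsizeof`. A caveat: the exact list-header size depends on the CPython build (64 bytes in the build measured above; newer 64-bit builds report 56), but each slot costs one 8-byte pointer either way:

```python
import sys

# Header size varies by CPython version, but the per-slot cost on a
# 64-bit build is a fixed 8 bytes: size(k slots) = size(empty) + 8*k.
for k in (0, 1, 2, 312_500):
    print(k, sys.getsizeof([0] * k))
```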

So we're already way over our 1 MB memory allotment — there's just too much object-size overhead. However, it's still much less than the 40 MB it would take as a plain C-style array of 32-bit integers. We're using only about 6% of the memory we'd normally use!

Profiling Memory

I used memory-profiler to see what kind of memory this sorting application was taking overall.

It seems as though this maxes out at 3.296. My output labeled that KB, which can't be right for a 2.5 MB list; memory-profiler normally reports in MiB, and ~3.3 MiB would line up with the bucket list plus interpreter overhead. I may not be reading the output correctly. Sorry for the inconclusiveness.
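As a cross-check using only the standard library (my own addition, not part of the original experiment), `tracemalloc` can report peak Python-level allocations directly. Allocating just the bucket list:

```python
import tracemalloc

tracemalloc.start()
buckets = [0] * 312_500                    # the bit-vector buckets
_, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
tracemalloc.stop()
print(f"peak: {peak / 1_000_000:.2f} MB")  # close to the ~2.5 MB getsizeof figure
```

If tracemalloc's peak agrees with `sys.getsizeof`, the discrepancy is in how the profiler's output is being read, not in the allocation itself.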

Call for Comment

Any thoughts or improvements? I like that I can wrangle an array in C with far fewer resources than in Python. I'd love to be able to have this kind of control in Python. Did I not use the bitarray module correctly?