Introduction

The most important thing to understand first is what a sorting algorithm is. According to Wikipedia, a sorting algorithm is an algorithm that puts the elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output.

In this article, I will describe some sorting algorithms. All of the algorithms presented here are written in C#, and many of the ideas are based on algorithms you can find on Wikipedia.

I have decided to create a GUI visualization for the sorting algorithms. The project also lets users save the output as an animated GIF and set the sorting speed.

Using the code

The solution consists of two projects. The first project, Components, provides classes for creating animated GIF images. It is based on the NGIF project; more information can be found on that project's page.

The second project, SortComparison, is the main part of the solution. It includes a form called frmMain where you can choose the sorting algorithms, set the number of samples you want to sort, set the sorting speed, and select whether you want to create an animated GIF file. The form contains two panels, pnlSort1 and pnlSort2, where the sorting visualizations are rendered.

Every algorithm has its own method, named after the sorting algorithm, which accepts an IList parameter and returns an IList object.

The method DrawSamples draws all the samples on the panel. This method is called after the random samples are generated. The samples are generated by clicking on the Shuffle button. The samples are stored in the array variable.

During sorting, when the Create animation check box is checked, an image is generated after each swap of two items in the samples array. These images are indexed from 0 to n, where n is the current number of swaps.

Sorting algorithms

Bubble Sort

Bubble sort, also known as sinking sort, is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items, and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort. The equally simple insertion sort has better performance than bubble sort, so some have suggested no longer teaching the bubble sort.
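A minimal sketch of the passes described above. The class and method names are illustrative and not taken from the article's source, and the generic IList<int> parameter is an assumption made to keep the example short.

```csharp
using System.Collections.Generic;

public static class BubbleSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> BubbleSort(IList<int> list)
    {
        bool swapped = true;
        // After each pass the largest remaining item sits at index end - 1,
        // so the next pass can stop one position earlier.
        for (int end = list.Count; swapped && end > 1; end--)
        {
            swapped = false;
            for (int i = 1; i < end; i++)
            {
                if (list[i - 1] > list[i])
                {
                    Swap(list, i - 1, i);
                    swapped = true;
                }
            }
        }
        return list;
    }

    private static void Swap(IList<int> list, int a, int b)
    {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }
}
```

The loop stops as soon as a full pass makes no swap, which is the "no swaps are needed" condition from the description.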

Bidirectional Bubble Sort

Cocktail sort, also known as bidirectional bubble sort, cocktail shaker sort, shaker sort (which can also refer to a variant of selection sort), ripple sort, shuttle sort, or happy hour sort, is a variation of bubble sort that is both a stable sorting algorithm and a comparison sort. The algorithm differs from bubble sort in that it sorts in both directions on each pass through the list. This sorting algorithm is only marginally more difficult to implement than bubble sort, and solves the problem with so-called turtles in bubble sort: small values near the end of the list, which bubble sort moves towards the front by only one position per pass.

The first rightward pass will shift the largest element to its correct place at the end, and the following leftward pass will shift the smallest element to its correct place at the beginning. The second complete pass will shift the second largest and second smallest elements to their correct places, and so on. After i passes, the first i and the last i elements in the list are in their correct positions, and do not need to be checked. By shortening the part of the list that is sorted each time, the number of operations can be halved.
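The two alternating passes and the shrinking unsorted range could look roughly like this. The names are hypothetical and IList<int> is an assumption; this is a sketch, not the article's implementation.

```csharp
using System.Collections.Generic;

public static class CocktailSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> CocktailSort(IList<int> list)
    {
        int left = 0;
        int right = list.Count - 1;
        bool swapped = true;
        while (swapped && left < right)
        {
            swapped = false;
            // Rightward pass: carries the largest remaining item to index right.
            for (int i = left; i < right; i++)
            {
                if (list[i] > list[i + 1]) { Swap(list, i, i + 1); swapped = true; }
            }
            right--;
            // Leftward pass: carries the smallest remaining item to index left.
            for (int i = right; i > left; i--)
            {
                if (list[i - 1] > list[i]) { Swap(list, i - 1, i); swapped = true; }
            }
            left++;
        }
        return list;
    }

    private static void Swap(IList<int> list, int a, int b)
    {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }
}
```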

Bucket Sort

Bucket sort, or bin sort, is a sorting algorithm that works by partitioning an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm. It is a distribution sort, and is a cousin of radix sort in the most to least significant digit flavour. Bucket sort is a generalization of pigeonhole sort. Since bucket sort is not a comparison sort, the O(n log n) lower bound is inapplicable. The computational complexity estimates involve the number of buckets.

Bucket sort works as follows:

Set up an array of initially empty "buckets".

Scatter: Go over the original array, putting each object in its bucket.

Sort each non-empty bucket.

Gather: Visit the buckets in order and put all elements back into the original array.
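The four steps above can be sketched as follows for integer samples. The bucketCount parameter, the way values are mapped to buckets, and the use of List<int>.Sort for the individual buckets are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class BucketSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> BucketSort(IList<int> list, int bucketCount)
    {
        if (bucketCount < 1) throw new ArgumentOutOfRangeException("bucketCount");
        if (list.Count < 2) return list;

        int min = list[0], max = list[0];
        foreach (int v in list)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        // Set up an array of initially empty buckets.
        var buckets = new List<int>[bucketCount];
        for (int b = 0; b < bucketCount; b++) buckets[b] = new List<int>();

        // Scatter: map the value range [min, max] evenly onto the buckets.
        long range = (long)max - min + 1;
        foreach (int v in list)
        {
            int index = (int)(((long)v - min) * bucketCount / range);
            buckets[index].Add(v);
        }

        // Sort each bucket, then gather the buckets back in order.
        int pos = 0;
        foreach (List<int> bucket in buckets)
        {
            bucket.Sort();
            foreach (int v in bucket) list[pos++] = v;
        }
        return list;
    }
}
```

Because lower buckets only ever hold smaller values than higher ones, concatenating the sorted buckets yields a sorted list.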

Comb Sort

Comb sort is a relatively simple sorting algorithm originally designed by Wlodzimierz Dobosiewicz in 1980. Later it was rediscovered and popularized by Stephen Lacey and Richard Box with a Byte Magazine article published in April 1991. Comb sort improves on bubble sort and, on many inputs, comes close to algorithms like quick sort in practice. The basic idea is to eliminate turtles, or small values near the end of the list, since in a bubble sort these slow the sorting down tremendously. (Rabbits, large values around the beginning of the list, do not pose a problem in bubble sort.)

In bubble sort, when any two elements are compared, they always have a gap (distance from each other) of 1. The basic idea of comb sort is that the gap can be much more than one.

The gap starts out as the length of the list being sorted divided by the shrink factor (generally 1.3), and the list is sorted with that value (rounded down to an integer if needed) for the gap. Then the gap is divided by the shrink factor again, the list is sorted with this new gap, and the process repeats until the gap is 1. At this point, comb sort continues using a gap of 1 until the list is fully sorted. The final stage of the sort is thus equivalent to a bubble sort, but by this time most turtles have been dealt with, so a bubble sort will be efficient.
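A short sketch of the shrinking gap, using the shrink factor of 1.3 mentioned above. The names are illustrative and IList<int> is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class CombSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> CombSort(IList<int> list)
    {
        const double shrink = 1.3;
        int gap = list.Count;
        bool swapped = true;
        // Keep going until the gap is 1 and a whole pass made no swap.
        while (gap > 1 || swapped)
        {
            gap = Math.Max(1, (int)(gap / shrink));
            swapped = false;
            for (int i = 0; i + gap < list.Count; i++)
            {
                if (list[i] > list[i + gap])
                {
                    int tmp = list[i];
                    list[i] = list[i + gap];
                    list[i + gap] = tmp;
                    swapped = true;
                }
            }
        }
        return list;
    }
}
```

Once the gap reaches 1, the remaining passes behave like bubble sort passes, matching the final stage described above.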

Cycle Sort

Cycle sort is an in-place, unstable sorting algorithm, a comparison sort that is theoretically optimal in terms of the total number of writes to the original array, unlike any other in-place sorting algorithm. It is based on the idea that the permutation to be sorted can be factored into cycles, which can individually be rotated to give a sorted result.

Unlike nearly every other sort, items are never written elsewhere in the array simply to push them out of the way of the action. Each value is either written zero times, if it's already in its correct position, or written one time to its correct position. This matches the minimal number of overwrites required for a completed in-place sort.
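To make the write-counting argument concrete, here is a sketch that sorts in place and returns the number of writes to the list. Returning the write count instead of the list is a choice made for this illustration only; the names are hypothetical.

```csharp
using System.Collections.Generic;

public static class CycleSortSketch
{
    // Sorts the list in place and returns how many times an element was written.
    public static int CycleSort(IList<int> list)
    {
        int writes = 0;
        for (int start = 0; start < list.Count - 1; start++)
        {
            int item = list[start];
            int pos = FindPosition(list, start, item);
            if (pos == start) continue; // already in its final position

            // Place the item, then keep rotating the displaced values
            // until the cycle closes at the starting index.
            while (true)
            {
                while (item == list[pos]) pos++; // step past equal values
                int displaced = list[pos];
                list[pos] = item;
                item = displaced;
                writes++;
                if (pos == start) break;
                pos = FindPosition(list, start, item);
            }
        }
        return writes;
    }

    // Final index of item: start plus the number of smaller values after start.
    private static int FindPosition(IList<int> list, int start, int item)
    {
        int pos = start;
        for (int i = start + 1; i < list.Count; i++)
        {
            if (list[i] < item) pos++;
        }
        return pos;
    }
}
```

For the input 5, 1, 4, 2, 8 only three values are out of place, and the sketch performs exactly three writes.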

Gnome Sort

Gnome sort, originally proposed by Hamid Sarbazi-Azad in 2000 and called Stupid sort, and then later on described by Dick Grune and named "Gnome sort", is a sorting algorithm which is similar to insertion sort, except that moving an element to its proper place is accomplished by a series of swaps, as in bubble sort.

It is conceptually simple, requiring no nested loops. The worst-case and average running times are both O(n²), but the running time tends towards O(n) if the list is initially almost sorted. In practice, the algorithm can run as fast as insertion sort.

The algorithm always finds the first place where two adjacent elements are in the wrong order, and swaps them. It takes advantage of the fact that performing a swap can introduce a new out-of-order adjacent pair only right before or after the two swapped elements. It does not assume that elements forward of the current position are sorted, so it only needs to check the position directly before the swapped elements.
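The single loop with a position that steps forwards or backwards might be sketched like this (hypothetical names, IList<int> assumed):

```csharp
using System.Collections.Generic;

public static class GnomeSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> GnomeSort(IList<int> list)
    {
        int pos = 0;
        while (pos < list.Count)
        {
            if (pos == 0 || list[pos - 1] <= list[pos])
            {
                pos++; // the pair is in order: step forward
            }
            else
            {
                // The pair is out of order: swap it and step back to
                // re-check the position directly before the swapped items.
                int tmp = list[pos - 1];
                list[pos - 1] = list[pos];
                list[pos] = tmp;
                pos--;
            }
        }
        return list;
    }
}
```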

Heap Sort

Heap sort begins by building a heap out of the data set and then removing the largest item and placing it at the end of the partially sorted array. After removing the largest item, it reconstructs the heap, removes the largest remaining item, and places it in the next open position from the end of the partially sorted array. This is repeated until there are no items left in the heap and the sorted array is full. Elementary implementations require two arrays - one to hold the heap and the other to hold the sorted elements.

Heap sort inserts the input list elements into a binary heap data structure. The largest value (in a max-heap) or the smallest value (in a min-heap) is extracted repeatedly until none remain, so the values are extracted in sorted order. The heap's invariant is preserved after each extraction, so the only cost is that of extraction.

During extraction, the only space required is that needed to store the heap. To achieve constant space overhead, the heap is stored in the part of the input array not yet sorted.

Heap sort uses two heap operations: insertion and root deletion. Each extraction places an element in the last empty location of the array. The remaining prefix of the array stores the unsorted elements.
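A sketch of the constant-space variant, where the max-heap lives in the unsorted prefix of the list. Note one deliberate difference from the description: the heap is built bottom-up with a sift-down helper rather than by repeated insertion. All names are illustrative.

```csharp
using System.Collections.Generic;

public static class HeapSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> HeapSort(IList<int> list)
    {
        int n = list.Count;
        // Build a max-heap over the whole list, starting at the last parent.
        for (int root = n / 2 - 1; root >= 0; root--) SiftDown(list, root, n);

        // Root deletion: move the maximum to the last free slot,
        // then restore the heap on the shortened prefix.
        for (int end = n - 1; end > 0; end--)
        {
            Swap(list, 0, end);
            SiftDown(list, 0, end);
        }
        return list;
    }

    // Pushes list[root] down until both children are no larger than it.
    private static void SiftDown(IList<int> list, int root, int size)
    {
        while (true)
        {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && list[child + 1] > list[child]) child++;
            if (list[root] >= list[child]) return;
            Swap(list, root, child);
            root = child;
        }
    }

    private static void Swap(IList<int> list, int a, int b)
    {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }
}
```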

Insertion Sort

If the first few objects are already sorted, an unsorted object can be inserted in the sorted set in the proper place. This is called insertion sort. The algorithm considers the elements one at a time, inserting each in its suitable place among those already considered (keeping them sorted). Insertion sort is an example of an incremental algorithm. It builds the sorted sequence one number at a time. This is perhaps the simplest example of the incremental insertion technique, where we build up a complicated structure on n items by first building it on n - 1 items and then making the necessary changes to fix things in adding the last item. The given sequences are typically stored in arrays. We also refer to the numbers as keys. Along with each key may be additional information, known as satellite data.
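The incremental idea, growing a sorted prefix one key at a time, can be sketched as follows (names are hypothetical, IList<int> is an assumption):

```csharp
using System.Collections.Generic;

public static class InsertionSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> InsertionSort(IList<int> list)
    {
        // Items before index i are already sorted; insert list[i] among them.
        for (int i = 1; i < list.Count; i++)
        {
            int key = list[i];
            int j = i - 1;
            // Shift larger items one slot to the right to open a gap for key.
            while (j >= 0 && list[j] > key)
            {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = key;
        }
        return list;
    }
}
```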

Merge Sort

If the list is of length 0 or 1, then it is already sorted. Otherwise:

Divide the unsorted list into two sublists of about half the size.

Sort each sublist recursively by re-applying merge sort.

Merge the two sublists back into one sorted list.

Merge sort incorporates two main ideas to improve its runtime:

A small list will take fewer steps to sort than a large list.

Fewer steps are required to construct a sorted list from two sorted lists than from two unsorted lists. For example, you only have to traverse each list once if they're already sorted.
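A sketch of the divide, recurse, and merge steps. Unlike an in-place sort, this version returns a new list for inputs longer than one element; that choice, like the names, is an assumption made for the example.

```csharp
using System.Collections.Generic;

public static class MergeSortSketch
{
    // Returns a sorted list; inputs of length 0 or 1 are returned unchanged.
    public static IList<int> MergeSort(IList<int> list)
    {
        if (list.Count <= 1) return list;

        // Divide into two sublists of about half the size.
        int mid = list.Count / 2;
        var left = new List<int>();
        var right = new List<int>();
        for (int i = 0; i < list.Count; i++)
        {
            if (i < mid) left.Add(list[i]);
            else right.Add(list[i]);
        }
        return Merge(MergeSort(left), MergeSort(right));
    }

    // Merges two sorted lists by traversing each of them exactly once.
    private static IList<int> Merge(IList<int> left, IList<int> right)
    {
        var result = new List<int>(left.Count + right.Count);
        int i = 0, j = 0;
        while (i < left.Count && j < right.Count)
        {
            // Taking from the left on ties keeps the sort stable.
            if (left[i] <= right[j]) result.Add(left[i++]);
            else result.Add(right[j++]);
        }
        while (i < left.Count) result.Add(left[i++]);
        while (j < right.Count) result.Add(right[j++]);
        return result;
    }
}
```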

Odd-Even Sort

Odd-even sort is a relatively simple sorting algorithm. It is a comparison sort based on bubble sort with which it shares many characteristics. It functions by comparing all (odd, even)-indexed pairs of adjacent elements in the list and, if a pair is in the wrong order (the first is larger than the second) the elements are switched. The next step repeats this for (even, odd)-indexed pairs (of adjacent elements). Then it alternates between (odd, even) and (even, odd) steps until the list is sorted. It can be thought of as using parallel processors, each using bubble sort but starting at different points in the list (all odd indices for the first step). This sorting algorithm is only marginally more difficult than bubble sort to implement.
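A sequential sketch of the alternating steps. It uses zero-based indices, so the "(odd, even)" step compares the pairs starting at index 1; the names are illustrative.

```csharp
using System.Collections.Generic;

public static class OddEvenSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> OddEvenSort(IList<int> list)
    {
        bool sorted = false;
        while (!sorted)
        {
            sorted = true;
            // start = 1 handles the (odd, even) pairs, start = 0 the (even, odd) pairs.
            for (int start = 1; start >= 0; start--)
            {
                for (int i = start; i + 1 < list.Count; i += 2)
                {
                    if (list[i] > list[i + 1])
                    {
                        int tmp = list[i];
                        list[i] = list[i + 1];
                        list[i + 1] = tmp;
                        sorted = false;
                    }
                }
            }
        }
        return list;
    }
}
```

Within one step the compared pairs do not overlap, which is why each step could in principle be spread across parallel processors.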

Pigeonhole Sort

Pigeonhole sorting, also known as count sort (not to be confused with counting sort), is a sorting algorithm that is suitable for sorting lists of elements where the number of elements (n) and the number of possible key values (N) are approximately the same. It requires O(n + N) time.

The pigeonhole algorithm works as follows:

Given an array of values to be sorted, set up an auxiliary array of initially empty "pigeonholes", one pigeonhole for each key through the range of the original array.

Going over the original array, put each value into the pigeonhole corresponding to its key, such that each pigeonhole eventually contains a list of all values with that key.

Iterate over the pigeonhole array in order, and put elements from non-empty pigeonholes back into the original array.
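The three steps above, sketched for plain integers where each value is its own key. The sketch assumes the key range (max - min + 1) is small enough to allocate one pigeonhole per key; the names are hypothetical.

```csharp
using System.Collections.Generic;

public static class PigeonholeSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> PigeonholeSort(IList<int> list)
    {
        if (list.Count < 2) return list;

        int min = list[0], max = list[0];
        foreach (int v in list)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        // One pigeonhole per possible key; holes stay null until first used.
        var holes = new List<int>[max - min + 1];
        foreach (int v in list)
        {
            int key = v - min;
            if (holes[key] == null) holes[key] = new List<int>();
            holes[key].Add(v);
        }

        // Walk the pigeonholes in key order and write the values back.
        int pos = 0;
        foreach (List<int> hole in holes)
        {
            if (hole == null) continue;
            foreach (int v in hole) list[pos++] = v;
        }
        return list;
    }
}
```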

Quick Sort

Quick sort sorts by employing a divide and conquer strategy to divide a list into two sub-lists. The steps are:

Pick an element, called a pivot, from the list.

Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.

Recursively sort the sub-list of lesser elements and the sub-list of greater elements.

The base cases of the recursion are lists of size zero or one, which never need to be sorted.
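A sketch of the pick, partition, and recurse steps. The description leaves the pivot choice open; this example assumes the last element of each sub-list as the pivot, which is simple but degrades on already sorted input. Names are illustrative.

```csharp
using System.Collections.Generic;

public static class QuickSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> QuickSort(IList<int> list)
    {
        Sort(list, 0, list.Count - 1);
        return list;
    }

    private static void Sort(IList<int> list, int low, int high)
    {
        if (low >= high) return; // base case: zero or one element
        int pivotIndex = Partition(list, low, high);
        Sort(list, low, pivotIndex - 1);
        Sort(list, pivotIndex + 1, high);
    }

    // Moves items smaller than the pivot in front of it and
    // returns the pivot's final position.
    private static int Partition(IList<int> list, int low, int high)
    {
        int pivot = list[high];
        int store = low;
        for (int i = low; i < high; i++)
        {
            if (list[i] < pivot)
            {
                Swap(list, i, store);
                store++;
            }
        }
        Swap(list, store, high);
        return store;
    }

    private static void Swap(IList<int> list, int a, int b)
    {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }
}
```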

Selection Sort

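The algorithm works as follows:

Find the minimum value in the list.

Swap it with the value in the first position.

Repeat the steps above for the remainder of the list (starting at the second position and advancing each time).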

Effectively, the list is divided into two parts: the sublist of items already sorted, which is built up from left to right and is found at the beginning, and the sublist of items remaining to be sorted, occupying the remainder of the array.
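Selection sort, which repeatedly moves the minimum of the unsorted remainder to the end of the sorted part, can be sketched as follows (hypothetical names, IList<int> assumed):

```csharp
using System.Collections.Generic;

public static class SelectionSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> SelectionSort(IList<int> list)
    {
        // Items before index first are sorted and in their final positions.
        for (int first = 0; first < list.Count - 1; first++)
        {
            // Find the minimum of the unsorted remainder.
            int min = first;
            for (int i = first + 1; i < list.Count; i++)
            {
                if (list[i] < list[min]) min = i;
            }
            // Swap it into the first unsorted position.
            if (min != first)
            {
                int tmp = list[first];
                list[first] = list[min];
                list[min] = tmp;
            }
        }
        return list;
    }
}
```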

Shell Sort

The principle of shell sort is to rearrange the file so that looking at every hth element yields a sorted file. We call such a file h-sorted. If the file is then k-sorted for some other integer k, then the file remains h-sorted. For instance, if a list was 5-sorted and then 3-sorted, the list is now not only 3-sorted, but both 5- and 3-sorted. If this were not true, the algorithm would undo work that it had done in previous iterations, and would not achieve such a low running time.

The algorithm draws upon a sequence of positive integers known as the increment sequence. Any sequence will do, as long as it ends with 1, but some sequences perform better than others. The algorithm begins by performing a gap insertion sort, with the gap being the first number in the increment sequence. It continues to perform a gap insertion sort for each number in the sequence, until it finishes with a gap of 1. When the increment reaches 1, the gap insertion sort is simply an ordinary insertion sort, guaranteeing that the final list is sorted. Beginning with large increments allows elements in the file to move quickly towards their final positions, and makes it easier to subsequently sort for smaller increments.
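A sketch of the gap insertion sort. The increment sequence used here (Count / 2, then halving down to 1) is only one simple choice assumed for illustration; as noted above, other sequences perform better.

```csharp
using System.Collections.Generic;

public static class ShellSortSketch
{
    // Sorts the list in place in ascending order and returns it.
    public static IList<int> ShellSort(IList<int> list)
    {
        for (int gap = list.Count / 2; gap > 0; gap /= 2)
        {
            // Gap insertion sort: afterwards every gap-th element forms a sorted run.
            for (int i = gap; i < list.Count; i++)
            {
                int key = list[i];
                int j = i;
                while (j >= gap && list[j - gap] > key)
                {
                    list[j] = list[j - gap];
                    j -= gap;
                }
                list[j] = key;
            }
        }
        return list;
    }
}
```

The last iteration runs with a gap of 1 and is an ordinary insertion sort, which guarantees the final result is sorted.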

By the way, it would be nice to create a comparison of the sorting algorithms from the perspective of possible parallelization. In one of our projects we realized that although quick sort was faster than merge sort on smaller datasets, when we tried to rewrite the algorithms to utilize multiple cores (in order to be able to sort datasets with millions of items), it turned out that our quick sort could not be parallelized as easily as merge sort. So, at the end of the day, we decided to use a slower, but parallel algorithm, which scales better with multicore hardware.

Does anyone in the community have any information on this aspect of the sorting algorithms?

Hi, great article.
And I have a fun fact for you:
For about 5 years I have wanted to write exactly the same application!
It seems I'm too lazy to get things done.
Fortunately you did it instead of me, and you surely won the prize, which you deserve.

I'm very glad that you like this article. I worked on it for so long for the same reason :P laziness. Now I want to continue this work and parallelize some of the algorithms.

That is a really good point. If you are writing a generic sorting provider, you can always have it check the data types being compared for the availability of the xor operator (if the language supports such queries), or you can check the data types against known types that support xor before beginning the sort and call appropriate swap logic as a result of that evaluation.