Sorting Algorithms In VB

Tuesday Jan 8th 2008 by Richard Newcombe


Cover a few of the better-known sorting algorithms with sample code for VB6.

All too often in an application, there's a need to sort a list of items. In this article, I will cover some of the more common sorting algorithms and give a few tips on choosing the right one for the job. I will cover the classic "Bubble sort" and "Insertion sort," right through to the more difficult "Exchange sort" and "Stack sort." Then, you will look at a "Hybrid sort" that mixes the "Insertion" and "Stack sort" algorithms. You also will look at the advantages and disadvantages of each sorting algorithm.

First, look at a list of different sorting algorithms with a few details on each. 'Stable' refers to the ability of the sort algorithm to keep identical items in their original order.

Sort Name            Method        Stable?  Short Description
Bubble Sort          Exchanging    Yes      Basic two-loop sort
Cocktail Sort        Exchanging    Yes      Bi-directional Bubble sort
Comb Sort            Exchanging    No       Faster variation of the Bubble sort
Gnome Sort           Exchanging    Yes      Hybrid of the Bubble and Insertion sorts
Selection Sort       Selection     No       In-place comparison sort
Insertion Sort       Insertion     Yes      Simple comparison sort
Shell Sort           Insertion     No       Generalization of the Insertion sort
Binary Tree Sort     Insertion     Yes      Builds a binary tree of the data and uses the tree to sort
Library Sort         Insertion     Yes      Insertion sort that leaves gaps in the list for subsequent elements
Merge Sort           Merging       Yes      A divide, sort, and merge algorithm
In-Place Merge Sort  Merging       Yes      Space-optimised version of the Merge sort
Heap Sort            Selection     No       Two-stage comparison sort
Smooth Sort          Selection     No       Small variation of a Heap sort
Quick Sort           Partitioning  No       Recursive partition and sort
Intro Sort           Hybrid        No       Hybrid of the Quick sort and Heap sort
Patience Sorting     Insertion     No       Statistical ordering of elements
Strand Sort          Selection     Yes      Useful for data which is stored in a linked list

There are still more sorting algorithms and methods, but some require specific hardware, like the "Bead sort" and "Network Sort," and others are so impractical that they exist solely for demonstration purposes, like the "Bogo Sort" and "Stooge Sort" (named after the Three Stooges).

Bubble Sort

The bubble sort is the oldest of the sorting algorithms still in use. The basic code sets up two loops that simply step through the list of items one at a time, compare an item to an item after it in the list, and place the smaller (or larger) item higher up in the list.

There are two methods of using this algorithm; both are correct and return much the same results in terms of time and passes. One method, the "backward bubble sort," sets up the outer loop to step up through the list and the inner loop to step down to the outer loop value, and on each pass compares elements adjacent to each other, swapping them if they are in the wrong order.

The other method, the "forward bubble sort," steps both the inner and outer loops down the list, compares the element at the outer loop index to the element at the inner loop index, and swaps them if they are in the wrong order.

The main difference between forward and backward bubble sorting is the direction in which the list is sorted, as the names suggest. The code length (in almost any language) will be identical, with only a few differences in variable placement. Execution time will be very similar across the two algorithms, and will depend mostly on how close to sorted the original list already is.

With Bubble sorts (Forward or Backward), it's worth noting that after the nth pass, either the first or last n items are sorted into their correct positions respectively.

There aren't many advantages to using the bubble sort, except maybe the compiled code length in some languages, like Assembly, or the ease of setting up and debugging the code. One major disadvantage is execution time. The execution time for the algorithm grows quadratically as the size of the list increases: doubling the list roughly quadruples the running time.

Now, look at the code. Listing 1 is the backward bubble sort, and Listing 2 is the forward bubble sort. In Listing 3, you make a few changes to the code to get maximum speed out of it. You add a Boolean check so that if there is a pass that does no swaps, you know that the list is sorted and there is no need to carry on. In a semi-sorted list, the code in Listing 3 would be the fastest of the three.
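As a minimal sketch of the optimised approach just described — a bubble sort with a Boolean check that stops as soon as a pass makes no swaps — consider the following VB6 routine. The routine and variable names are illustrative, not taken from the original listings.

```vb
' Bubble sort with an early-exit flag: stop when a full pass
' makes no swaps, because the list must then be sorted.
Sub BubbleSort(List() As Long)
    Dim Loop1 As Long, Loop2 As Long
    Dim Tmp As Long
    Dim Swapped As Boolean
    For Loop1 = UBound(List) To LBound(List) + 1 Step -1
        Swapped = False
        For Loop2 = LBound(List) To Loop1 - 1
            ' Adjacent elements in the wrong order are swapped
            If List(Loop2) > List(Loop2 + 1) Then
                Tmp = List(Loop2)
                List(Loop2) = List(Loop2 + 1)
                List(Loop2 + 1) = Tmp
                Swapped = True
            End If
        Next Loop2
        ' No swaps on this pass: the list is sorted, so stop early
        If Not Swapped Then Exit Sub
    Next Loop1
End Sub
```

On a semi-sorted list the early exit saves most of the remaining passes, which is why this variant is the fastest of the three described.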

Insertion Sort

The Insertion sort is one step better than the bubble sort and works on a similar principle. You still loop over the full list from first to last, but instead of swapping elements over, you insert each element into its correct position within the already-sorted part of the list.

If the list to be sorted were in reverse order, both Bubble and Insertion sorts would take about the same time to sort; however, if the list was semi-sorted, the insertion sort would run much faster.

With Insertion sort, it's interesting to note that after the nth pass the first n items are sorted; however, they are not necessarily in their correct and final positions.

Insertion sort has a further advantage over other sorting algorithms in that it can be used to sort items in an online fashion — as in a user typing in a list, or input arriving from an outside device — a lot quicker than most other sorts.

Listing 4
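The insertion sort described above can be sketched in VB6 as follows; the routine and variable names are illustrative rather than the article's own listing.

```vb
' Insertion sort: grow a sorted region at the front of the list,
' inserting each new item into its correct place within it.
Sub InsertionSort(List() As Long)
    Dim Loop1 As Long, Pos As Long
    Dim Tmp As Long
    For Loop1 = LBound(List) + 1 To UBound(List)
        Tmp = List(Loop1)
        Pos = Loop1 - 1
        ' Shift larger sorted items up one slot to make room for Tmp
        Do While Pos >= LBound(List)
            If List(Pos) <= Tmp Then Exit Do
            List(Pos + 1) = List(Pos)
            Pos = Pos - 1
        Loop
        List(Pos + 1) = Tmp
    Next Loop1
End Sub
```

Note that because items only shift, never leapfrog, equal items keep their original order, which is why the insertion sort is stable.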

Selection Sort

The Selection sort is somewhere between the insertion sort and bubble sort; it scans over the remaining unsorted list to find the smallest item and places it at the beginning. Selection sort is also often referred to as the Exchange sort because items are not shifted up and down; rather, two items exchange positions to place an item in its correct place.

There is not much that can be said for the selection sort except that, on media where writing is more expensive than reading (in memory access times; EEPROMs or Flash, for example), a selection sort will run faster than most other sorts because of the small number of item swaps that take place.

With Selection sort, it's worthwhile to note that after the nth pass, only n item swaps have taken place and the first n items are sorted into their correct position.

Listing 5 is an example of the Selection sort.

Listing 5

For Loop1 = 0 To Total
    ' Assume the item at Loop1 is the smallest of the unsorted part
    Pos = Loop1
    ' Scan the rest of the list for anything smaller
    For Loop2 = Loop1 + 1 To Total
        If List(Loop2) < List(Pos) Then
            Pos = Loop2
        End If
    Next Loop2
    ' Exchange positions only when a smaller item was found
    If Pos <> Loop1 Then
        Tmp = List(Pos)
        List(Pos) = List(Loop1)
        List(Loop1) = Tmp
    End If
Next Loop1

Heap Sort

The Heap sort uses a two-step method of sorting: first, the list is semi-sorted to build a parent-child tree within the list; then, the parent-child tree is used to sort the list.

There are several methods to implement the heap sort, with some algorithms using secondary arrays to perform the sort. The listing given performs the heap sort within the original list, and has been optimised.

In the first step, the list will appear to be in a worse order than before starting; however, the parent-child tree has been built with the largest item in the list on top. During the second step, after every nth pass, the last n items are sorted into their correct positions. The parent-child tree ensures that with each pass the next largest item is always shifted to the top of the list.

The heap sort has one disadvantage. Given its code size, it's not practical to use in projects where limited memory is available for code; for example, on micro-controllers and programmable PICs.

However large the code may be, the heap sort did have the advantage of being the fastest non-recursive sort available for many years, at least on systems where memory read and write access times were similar.
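The two steps described above can be sketched as an in-place VB6 heap sort. This is not the article's own listing; it assumes a zero-based array, and the routine and variable names are illustrative.

```vb
' Restore the parent-child (max-heap) property below Root,
' considering only items up to index Last.
Sub SiftDown(List() As Long, ByVal Root As Long, ByVal Last As Long)
    Dim Child As Long, Tmp As Long
    Do While Root * 2 + 1 <= Last
        Child = Root * 2 + 1
        ' Pick the larger of the two children
        If Child < Last Then
            If List(Child + 1) > List(Child) Then Child = Child + 1
        End If
        If List(Root) >= List(Child) Then Exit Do
        Tmp = List(Root)
        List(Root) = List(Child)
        List(Child) = Tmp
        Root = Child
    Loop
End Sub

Sub HeapSort(List() As Long)
    Dim I As Long, Tmp As Long
    ' Step 1: build the parent-child tree with the largest item on top
    For I = (UBound(List) - 1) \ 2 To 0 Step -1
        SiftDown List, I, UBound(List)
    Next I
    ' Step 2: swap the top (largest) item to the end, shrink the
    ' tree by one, and re-sift; the sorted region grows from the back
    For I = UBound(List) To 1 Step -1
        Tmp = List(0)
        List(0) = List(I)
        List(I) = Tmp
        SiftDown List, 0, I - 1
    Next I
End Sub
```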

Shell Sort

The Shell sort improves the Insertion sort by doing comparisons separated by a gap of several positions, with each pass progressively reducing the gap. It was developed by Donald Shell and was published in 1959 in "Communications of the ACM."

Donald Shell's original recommendation was to use n/2 as the gap for the first pass and to halve it on each pass thereafter, until the gap is 1, at which point the Shell sort is the same as an Insertion sort. The best-known gap sequence for use in the Shell sort is 1, 4, 10, 23, 57, 132, 301, 701. For any lists that are larger, you can extend the sequence geometrically using "NextGap = Gap * 2.3".

Using the above-mentioned gap sequence will improve the speed of the Shell sort, but there are a few drawbacks. The sort code is larger, mostly for the additional instructions required to find and load the next gap sequence. Also, memory usage is increased slightly to store that Gap sequence table.

The Shell sort is one of the fastest non-recursive sorts on systems where memory read and write access times are similar.
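A minimal VB6 sketch of the Shell sort, using Donald Shell's original n/2 gap sequence described above (a zero-based array and illustrative names are assumed):

```vb
' Shell sort: a gapped insertion sort whose gap halves each pass,
' finishing with an ordinary insertion sort when the gap reaches 1.
Sub ShellSort(List() As Long)
    Dim Gap As Long, I As Long, Pos As Long
    Dim Tmp As Long
    Gap = (UBound(List) + 1) \ 2
    Do While Gap >= 1
        ' Insertion sort over items Gap positions apart
        For I = Gap To UBound(List)
            Tmp = List(I)
            Pos = I - Gap
            Do While Pos >= 0
                If List(Pos) <= Tmp Then Exit Do
                List(Pos + Gap) = List(Pos)
                Pos = Pos - Gap
            Loop
            List(Pos + Gap) = Tmp
        Next I
        Gap = Gap \ 2
    Loop
End Sub
```

Swapping in the 1, 4, 10, 23, ... table mentioned above only changes how `Gap` is loaded at the top of each pass; the inner gapped insertion sort is identical.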

Quick Sort

The Quick sort, also known as a Stack sort, partitions the list on a random pivot point, placing all smaller items on one side of the pivot and all larger items on the other, and then recursively calls itself over the two split lists. The recursive calls continue until the algorithm receives a list of only one or two items, at which point that part of the list is sorted into its correct position. Charles Hoare developed it in 1960 for use with the ALGOL-based Elliott systems.

Because of the recursive nature of the algorithm, the Quick sort does require a rather large processor stack, and does use a relatively large amount of memory to store many of the values used in each recursive call. However, in situations where speed is of utmost essence and memory is not a factor, Quick sort does a tremendous job of sorting lists on the fly.

The only disadvantage of using the Quick sort algorithm is that if you have a system with a limited stack size, or have added it to a program that makes heavy use of the stack, you may end up with "Out of Stack" or "Stack Overflow" errors when trying to sort relatively large lists. In the absolute worst case, the Quick sort will make n-2 recursive calls, and in the best case it will make log2(n) recursive calls; however, the generally accepted maximum recursive call depth is 4 log2 n. This needs to be taken into consideration, and a non-recursive sorting algorithm may need to be used as an alternative if there is insufficient stack space.

An advantage of Quick sort is that, with the divide-and-sort method, the algorithm can be written so that each recursive call runs in a new thread, or even on a separate processor. Because the list is split and each part is not dependent on the other, parallel processing can take place without any need for cross-thread communication, further speeding up the sorting.

Listing 8 shows a Quick sort based on Charles Hoare's original algorithm, using a randomly selected pivot point.
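A recursive VB6 Quick sort in that spirit — Hoare-style partitioning around a randomly selected pivot — might look like the following; the routine and variable names are illustrative, not the article's actual listing.

```vb
' Recursive Quick sort: partition around a random pivot value,
' then sort the two halves independently.
Sub QuickSort(List() As Long, ByVal First As Long, ByVal Last As Long)
    Dim Low As Long, High As Long
    Dim Pivot As Long, Tmp As Long
    If First >= Last Then Exit Sub   ' one item (or none) is already sorted
    ' Pick a random pivot between First and Last
    Pivot = List(First + Int(Rnd * (Last - First + 1)))
    Low = First
    High = Last
    Do While Low <= High
        ' Skip items already on the correct side of the pivot
        Do While List(Low) < Pivot
            Low = Low + 1
        Loop
        Do While List(High) > Pivot
            High = High - 1
        Loop
        ' Exchange the out-of-place pair and move both pointers on
        If Low <= High Then
            Tmp = List(Low)
            List(Low) = List(High)
            List(High) = Tmp
            Low = Low + 1
            High = High - 1
        End If
    Loop
    ' Recursively sort the two partitions
    QuickSort List, First, High
    QuickSort List, Low, Last
End Sub
```

A typical call would be `QuickSort MyList, LBound(MyList), UBound(MyList)` (after a `Randomize` statement so `Rnd` varies between runs).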

As with almost everything else, there is always room for improvement. Over the years, many variations of the Quick sort have cropped up, many giving only a slight improvement in speed. However, there is a variant that almost halves the sorting time.

Listing 9 shows this variant. The algorithm makes a single tail-recursive call during processing, which unfortunately does not allow for multithreading. However, with this method of pre- and post-processing, an effective high-speed sort can be accomplished.
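One common way to realise a single tail-recursive call — which may or may not match the variant in Listing 9 — is to recurse only into the smaller partition and loop on the larger one, keeping the recursion depth bounded. A hedged VB6 sketch, with illustrative names:

```vb
' Quick sort with tail-call elimination: recurse into the smaller
' partition, then loop on the larger one, so each level makes only
' one recursive call and the stack depth stays near log2(n).
Sub QuickSortTail(List() As Long, ByVal First As Long, ByVal Last As Long)
    Dim Low As Long, High As Long
    Dim Pivot As Long, Tmp As Long
    Do While First < Last
        Pivot = List(First + Int(Rnd * (Last - First + 1)))
        Low = First
        High = Last
        Do While Low <= High
            Do While List(Low) < Pivot
                Low = Low + 1
            Loop
            Do While List(High) > Pivot
                High = High - 1
            Loop
            If Low <= High Then
                Tmp = List(Low)
                List(Low) = List(High)
                List(High) = Tmp
                Low = Low + 1
                High = High - 1
            End If
        Loop
        ' Recurse into the smaller half, iterate on the larger half
        If High - First < Last - Low Then
            QuickSortTail List, First, High
            First = Low
        Else
            QuickSortTail List, Low, Last
            Last = High
        End If
    Loop
End Sub
```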