Algorithm Improvement through Performance Measurement: Part 1

By Victor J. Duvanenko, September 15, 2009

Optimizing sort algorithms for today's CPUs

Improving Selection Sort

Since Selection Sort is similar in form to Insertion Sort, it may be possible to eliminate the self-assignment operation, as was done for Insertion Sort. The issue is visible in the original implementation of Selection Sort (Listing One), where the swap operation sits outside the inner for loop and executes even when the first comparison of the inner for loop fails and that loop never runs. In that case a redundant self-assignment, a[i] = a[i], is performed.
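The shape of the problem can be seen in this hedged reconstruction of the original algorithm (Listing One is not reproduced here; the body is a sketch, not the article's code):

```cpp
#include <cstddef>

// Reconstruction of the original Selection Sort: the swap runs
// unconditionally after the inner loop, so when a[i] is already the
// minimum of the unsorted portion, it degenerates into a[i] = a[i].
template <class T>
void selectionSortOriginal(T* a, unsigned long n)
{
    // Note: with unsigned n, an array size of zero makes n - 1 wrap
    // around -- the zero-size issue the improved versions fix.
    for (unsigned long i = 0; i < n - 1; i++)
    {
        unsigned long min = i;
        for (unsigned long j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        T tmp  = a[i];      // executed even when min == i:
        a[i]   = a[min];    // redundant self-assignment
        a[min] = tmp;
    }
}
```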

The first step of this process is shown in Listing Seven (the selectionSort_FewerWrites function), where an if statement has also been added at the front to fix the issue with an array size of zero in the original implementation. The second step eliminates this if statement by modifying the outer loop limits, and thus avoids its overhead for all other array sizes (the selectionSort_FewerWrites2 function).

As with Insertion Sort, the self-assignment is eliminated by unrolling the loop iteration where it would have occurred and restructuring the code accordingly. In Selection Sort, however, the inner for loop starts one element ahead of the current element and searches for the minimum from that point on. This minimum is then compared against the current element, and the swap is performed only if the current element is not the minimum. In other words, if the current element a[i] is already the minimum, nothing needs to be done, and the self-assignment is eliminated without performing any extra work.
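The restructured algorithm can be sketched as follows (the function name comes from Listing Seven; the body is a reconstruction, not the article's listing):

```cpp
#include <cstddef>

// Restructured Selection Sort: the minimum search starts one element
// ahead of a[i], and the swap runs only when a[i] is not already the
// minimum, so the self-assignment disappears at no extra cost. The
// loop bound i + 1 < n also handles a zero-size array without a
// separate if statement at the front.
template <class T>
void selectionSort_FewerWrites2(T* a, unsigned long n)
{
    for (unsigned long i = 0; i + 1 < n; i++)
    {
        unsigned long min = i + 1;          // search starts one ahead of a[i]
        for (unsigned long j = i + 2; j < n; j++)
            if (a[j] < a[min])
                min = j;
        if (a[min] < a[i])                  // swap only if a[i] is not the minimum
        {
            T tmp  = a[i];
            a[i]   = a[min];
            a[min] = tmp;
        }
    }
}
```

On presorted input the condition a[min] < a[i] is never true, so this version performs no array writes at all.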

Performance measurements comparing the original Selection Sort algorithm with the improved version are shown in Tables 7, 8 and 9.

Table 7

Table 8

Table 9

Selection Sort is known for performing fewer writes to the array than other sorting algorithms. This may be important in some environments where writes are an expensive operation, such as flash memory [4]. The number of writes has been reduced further by eliminating self-assignment, which improves performance for some input data set statistics.

In general, the probability of self-assignment is low, because it occurs only when the current element happens to be the minimum of the unsorted portion of the array. For presorted input data, however, it occurs on every iteration; in that case the improved Selection Sort writes no data to the array at all. For reverse-sorted input the current element is never the minimum, and for random input it seldom is. A nice attribute of this optimization is that it does no extra work: no additional if statements were added that would slow down the other cases. If the input data statistics tend toward presorted data, a performance benefit is realized.

Insertion Sort vs. Selection Sort

Table 10 compares the four sorting algorithms by showing their distinguishing features.

Table 10

To cover the case where writes to the array are what matters, only those writes are counted, shown in the top line of each row of the table. To also cover the case where all item writes must be counted (to the array and to local variables), all writes are counted, shown in the bottom line of each row.

Insertion Sort and Selection Sort have many similarities. They are of the same order, due to their similar structure of two nested for loops. The notable difference is that Insertion Sort can adapt and exit its inner for loop early, whereas Selection Sort cannot. Another difference is that Insertion Sort copies array elements, whereas Selection Sort copies indexes. This gives Insertion Sort an additional dependency on the size of the array element, which Selection Sort does not have. Thus, we expect Insertion Sort to slow down as array elements grow in size, due to the additional data to copy.

The measurement results above demonstrate that as element size increases, Insertion Sort run time grows consistently for all tested element sizes, running at least O(n²) and at times approaching O(n³), as seen from the slope of nearly 3 at times on the log plot in Figure 4. This behavior follows from the algorithm comparison in Table 10, in which Insertion Sort is characterized as n(n+1)/2 - 1 + w(w-1)/2 + (w-1), which is approximately O(n² + w²). Since both parameters, n and w, are squared, no matter which of them dominates, the resulting graph stays on a squared path.

The measurement results above demonstrate that as element size increases, Selection Sort run time grows at different rates for the various tested element sizes, running at O(n²) for small element sizes and approaching O(n) for large element sizes, as seen from the slope of nearly 1 in Figure 5. This behavior follows from the algorithm comparison in Table 10, in which Selection Sort is characterized as n(n-1)/2 + 3(w-1), which is approximately O(n² + w). The resulting performance depends on which of the two parameters, n or w, dominates: a squared graph for small w and large n, a linear graph for small n and large w, and in between in other cases.

From the measurement data it can be seen that for an element size of 1 float, Insertion Sort and Selection Sort have nearly equal performance, consistent with the previously measured data points of the best Insertion Sort and Selection Sort implementations. However, this holds only for the reverse-sorted input data set, where Insertion Sort gets no chance to exploit its adaptive capability. Measurements for the Insertion and Selection Sort algorithms on a random-number input data set are shown below in Table 13, Figure 6, Table 14, and Figure 7.

Table 13

Figure 6

Table 14

Figure 7

With a random-number input data set, Insertion Sort consistently dominates Selection Sort for an element size of 1 float, by 2X. However, at element sizes of 10 floats and larger, Selection Sort takes the performance lead by 2X to 4X, and this lead grows substantially as the element size grows.

Insertion Sort is adaptive to its input data, but it performs a substantial amount of element movement in its inner loop. As element size grows, the work spent moving elements begins to dominate. Selection Sort, on the other hand, is not adaptive (in its original implementation) and performs the same number of comparisons in the best and worst cases. Selection Sort does not move elements in the inner loop, but only keeps track of the index of the minimum element; it moves two array elements (a swap) in the outer loop, and this advantageous behavior shows up as element size increases.
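The difference in element movement can be made concrete by counting array writes. The sketch below (my own illustration, not the article's benchmark) instruments both algorithms on the same reversed input: Insertion Sort shifts elements one slot at a time, so its write count grows quadratically, while Selection Sort performs at most one swap (two array writes) per outer iteration:

```cpp
#include <cstddef>

// Insertion Sort on ints, returning the number of writes to the array.
static unsigned long insertionSortWrites(int* a, unsigned long n)
{
    unsigned long writes = 0;
    for (unsigned long i = 1; i < n; i++)
    {
        int cur = a[i];
        unsigned long j = i;
        while (j > 0 && a[j - 1] > cur)
        {
            a[j] = a[j - 1];    // one array write per element shifted
            writes++;
            j--;
        }
        a[j] = cur;             // final placement of the current element
        writes++;
    }
    return writes;
}

// Selection Sort (improved form) on ints, returning the number of
// writes to the array: two per swap, and no writes when a[i] is
// already the minimum.
static unsigned long selectionSortWrites(int* a, unsigned long n)
{
    unsigned long writes = 0;
    for (unsigned long i = 0; i + 1 < n; i++)
    {
        unsigned long min = i + 1;
        for (unsigned long j = i + 2; j < n; j++)
            if (a[j] < a[min])
                min = j;
        if (a[min] < a[i])
        {
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
            writes += 2;        // two array writes per swap
        }
    }
    return writes;
}
```

For reversed input of n elements, the Insertion Sort count follows n(n-1)/2 plus the final placements, while the Selection Sort count stays linear in n, matching the table's characterization.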

The number of comparisons performed by Insertion Sort and Selection Sort is nearly equal in the worst case, O(n²), but Insertion Sort is linear in the best case, whereas Selection Sort stays at O(n²).

Cost of Comparison

All sorting algorithms that compare elements are sensitive to the cost of performing the comparison. For example, sorting an array of integers performs differently than sorting an array of variable-length strings, and the cost of comparisons can be significant. The cost of the alternate program flow in comparison if statements is also high, due to the long pipelines of today's processors. The cost of comparisons and its influence on the choice of sorting algorithm will be explored in a future article.

Acknowledgements

Thank you to the authors of the Wikipedia pages on sorting and the other websites cited below. You produced incredible resources with some of the deepest thoughts on the topic and terrific animations that truly help visualize the algorithms in action and provide insight. Your efforts prevented many time-wasting attempts at reinventing the wheel.

[7] http://en.wikipedia.org/wiki/Insertion_sort is a great website with substantial detail on various sorting algorithms implemented in C++, with benchmarking code posted as well (which I have not tried). Sadly, however, the Insertion Sort implementation is mistakenly the Selection Sort algorithm instead.
