Dr. Quack: comparing two ways of approaching the same sort algorithm

I was surprised to learn that it's been almost a year since I wrote this post on a way of sorting two sets of search results in way I'd been thinking about for a while. I called it a "duck sort".

Anyway, when I wrote the function to do the sort I was comparing two lists, specifically the first item of each. When I decided which list's item "won" the comparison, I would remove the item from the winning list and continue to make the comparisons until one of the lists became empty.

I'd always wondered if it would be better to traverse the lists instead of continually removing the first item of each list. In terms of clarity in my head, I went with the "remove" approach instead of the "traverse" approach because it was easier for me while just trying to figure out the basic logic of the sort method. But now I have this Python timer thing I just posted about here …

So, I made a "traverse" version of the sort method and made a little test script to take an increasingly larger data set and time the two approaches to the algorithm and see what happened.

Not surprisingly, traversing the lists was increasingly more efficient and faster than my original "remove" approach, i.e. altering/re-drawing the lists.

So, here is the new "traverse" approach to the "duck sort" and the original "remove" approach.

I ran the test on the same "Hansel and Gretel" data from my original post on the "duck sort". But this time, I simply multiplied the data set by a number starting from "0" and ending in "10,000" in increments of 5,000. This way I could compare the speed of the two approaches on a larger and larger data set.

The results are interesting. Here's a table with the time (seconds) of the "traverse" and "remove" approaches. The "data_length" column is simply the size of the multiplier used to augment the data set.

data_length

traverse

remove

0

0

0

5000

0.0130000114

0.0939998627

10000

0.0279998779

0.3059999943

15000

0.0360000134

0.5829999447

20000

0.0479998589

1.0199999809

25000

0.0610001087

1.7070000172

30000

0.0789999962

2.3399999142

35000

0.0850000381

3.2030000687

40000

0.1010000706

4.1009998322

45000

0.111000061

5.2359998226

50000

0.1240000725

7.9670000076

55000

0.132999897

8.3550000191

60000

0.1679999828

12.876999855

65000

0.1769998074

16.1730000973

70000

0.1919999123

16.2030000687

75000

0.2049999237

15.4270000458

80000

0.2190001011

17.5679998398

85000

0.2230000496

19.131000042

90000

0.2339999676

21.3770000935

95000

0.243999958

24.8929998875

100000

0.254999876

25.8870000839

Perhaps more interesting is a graph of the two approaches, below. With the graph, it's hard to even see that the "traverse" approach is trending upward.