

It would help if you were to tell us a little bit about the data. There is no one answer to which is the fastest. It depends on the data, and the resources available. Is this a homework question?
– EvilTeach, Nov 18 '09 at 3:22

No, it isn't a homework question. The data alternates between numbers and strings, and there are around 25 fields in each record. The question was intended to be open-ended.
– ihtkwot, Nov 18 '09 at 3:24

15 Answers

There is no such thing as a "fastest" sorting algorithm. Like most things in computer science, when you pick an algorithm you accept certain trade-offs.

That being said, for general purposes, quicksort is likely to be the fastest sorting algorithm for in-memory sorts in most typical cases, although in certain cases it is slower than other sorts, especially if your data set is mostly sorted already and the pivot is chosen naively.

Again, there are always trade-offs depending on your requirements and data set. For example, merge sort is usually a bit slower than quicksort, but unlike quicksort it is stable (i.e., same-value keys retain their original order). Insertion sort has worse average-case time complexity than quicksort, but it is likely to be faster than quicksort for very small data sets. Bubble sort also has worse average-case time complexity than quicksort, but it can be much faster than quicksort for data sets which are already sorted. Radix sort may also be faster than quicksort if you are sorting integers. And for extremely large data sets that don't fit in memory, external merge sort would be faster than quicksort.

Again, it's all about trade-offs and your data set. But the best answer for general purposes is probably quicksort. For that reason, most implementations of the C++ standard library std::sort function use quicksort (or some quicksort hybrid algorithm, like introsort) internally.
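To make the stability trade-off mentioned above concrete, here is a minimal sketch using the standard library. The Record type and the sort_by_key_stable helper are hypothetical names made up for this illustration; std::stable_sort is the standard function that guarantees equal keys keep their original relative order, which std::sort does not promise.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type used only for this illustration.
struct Record {
    int key;
    std::string label;
};

// Sorts by key only. std::stable_sort guarantees that records with equal
// keys keep their original relative order; std::sort makes no such promise.
std::vector<Record> sort_by_key_stable(std::vector<Record> records) {
    std::stable_sort(records.begin(), records.end(),
                     [](const Record& a, const Record& b) { return a.key < b.key; });
    return records;
}
```

For example, sorting {2,"a"}, {1,"b"}, {2,"c"}, {1,"d"} this way always yields the labels in the order b, d, a, c, because ties are never reordered.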

Actually, for quicksort the worst-case sequence depends on how the pivot is chosen; with median of 3 (which I thought is what stl/std usually use), a mostly sorted sequence should be equivalent to most random sequences.
– CoderTao, Nov 18 '09 at 3:13

That depends on implementation. Mergesort has the disadvantage that it can't be done in place; that whole O(n) memory overhead thing. But it does have the advantage that it is guaranteed to run in O(n*log(n)).
– CoderTao, Nov 18 '09 at 3:24

FYI : (same-value keys retain their original order) is usually called a "Natural" sort.
– dar7yl, Nov 18 '09 at 3:35

This will give you practice with C/C++ and also a feel for the fact that each sort algorithm has its benefits and drawbacks...

...including the Bozo sort, although that one's advantage is mostly that it is fun, with occasional use in the context of unrelated stochastic algorithms.

Without going to the extreme of reading TAOCP Vol. 3 (Donald Knuth), maybe start with this comparative list of sort algorithms.
(BTW, if you get serious about programming, reading TAOCP should be part of your long-term plan; it is one of those things to do!)

Cocktail sort! I must have missed that one in my Knuth book.
– Loki Astari, Nov 18 '09 at 4:06


@Martin Y, actually, Knuth's got it too; it is not overly famous because it is one of many "twists" on bubble sort. That said, we should see it more often, and maybe after a few of these, we can be ready for some bozo sort (itl.nist.gov/div897/sqg/dads/HTML/bozoSort.html)
– mjv, Nov 18 '09 at 4:18

Like DigitalRoss said, radix sort is the fastest for the CPU architecture. It is able to cheat by using the architecture of the RAM and CPU setup. It is not often taught in computer science because it has this cheating effect and is not really doing the sorts by itself. In most cases radix is your fastest solution. The only time it would not be fastest is when the list is really small (like fewer than 10 items) or when the list is almost sorted already. It really only sorts integers, but most things, such as a float, can be converted to an integer in a clock cycle or two. Converting an IEEE 754 float to a "sortable like int" can be done by flipping the sign bit of non-negative values and flipping all the bits of negative values; the keys can then be sorted as unsigned integers, and the transformation is reversed after sorting.
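A rough sketch of the float-to-integer key conversion, assuming IEEE 754 single-precision floats and ignoring NaNs. The function names float_to_key and key_to_float are made up for this example. Note that flipping only the sign bit orders non-negative values correctly; negative values need all of their bits flipped for the unsigned order to match the float order.

```cpp
#include <cstdint>
#include <cstring>

// Maps an IEEE 754 single-precision float to a 32-bit unsigned key whose
// unsigned ordering matches the float ordering (NaNs are not handled here).
std::uint32_t float_to_key(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bit pattern
    // Negative values: flip every bit. Non-negative values: set the sign bit.
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

// Inverse of float_to_key: restores the original float from its key.
float key_to_float(std::uint32_t key) {
    std::uint32_t bits = (key & 0x80000000u) ? (key & 0x7FFFFFFFu) : ~key;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

The keys can be sorted with any unsigned integer sort and then mapped back with key_to_float.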

How it basically works is it looks at a number, say 88, and puts it into memory bank 88. Then it looks at the next number, say 23, and puts it in memory bank 23. If one of the items overlaps then it puts it in the next empty bank. And then when you’re done with the list, the list is actually sorted in memory.

I just explained the very basic nature of a radix sort just so you get the general idea. There are some other technical details that I left out to keep it simple - I'm sure someone is going to tell me about it. :)
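For readers who want some of the left-out details, here is a minimal sketch of one common variant, a least-significant-digit radix sort on 32-bit unsigned keys that handles one byte per pass. It is not the exact "memory bank" scheme described above: instead of one bank per value, it uses 256 buckets per pass and a counting step to decide where each bucket starts. The function name radix_sort is chosen for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Least-significant-digit radix sort on 32-bit unsigned keys,
// processing one byte (256 buckets) per pass.
void radix_sort(std::vector<std::uint32_t>& data) {
    std::vector<std::uint32_t> buffer(data.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // Count how many keys fall into each bucket for this byte.
        std::array<std::size_t, 257> offsets{};
        for (std::uint32_t v : data) {
            ++offsets[((v >> shift) & 0xFFu) + 1];
        }
        // Prefix sums turn the counts into each bucket's starting position.
        for (std::size_t i = 1; i < offsets.size(); ++i) {
            offsets[i] += offsets[i - 1];
        }
        // Copy keys into the buffer in bucket order. Keys in the same bucket
        // keep their relative order, which is what makes the passes compose.
        for (std::uint32_t v : data) {
            buffer[offsets[(v >> shift) & 0xFFu]++] = v;
        }
        data.swap(buffer);
    }
}
```

No key is ever compared against another key; the work is four linear passes regardless of the input order, at the cost of a second buffer of the same size.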

Actually, a modified (bi-directional) bubble can be extremely efficient, such as in the case where the data set is already sorted before adding one element at the end. I've actually used it in one application where a built-in sort wasn't available and that sort was the quickest to implement. Bubble sort will, in that case, execute as if it were an O(n) algorithm, based on n being the number of items added.

That's because the efficiency of an algorithm depends entirely on the properties of the data being sorted. The extreme case can be seen if the data is already sorted - bubble sort makes one pass of the data with zero swaps.
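The already-sorted case can be seen in a small sketch. This is a plain one-directional bubble sort with an early exit, not the bi-directional variant mentioned above, and it returns the number of passes purely so the data dependence is visible; the function name and return value are choices made for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that stops as soon as a pass makes no swaps.
// Returns the number of passes over the data.
std::size_t bubble_sort(std::vector<int>& data) {
    std::size_t passes = 0;
    bool swapped = true;
    std::size_t end = data.size();
    while (swapped && end > 1) {
        swapped = false;
        ++passes;
        for (std::size_t i = 1; i < end; ++i) {
            if (data[i - 1] > data[i]) {
                std::swap(data[i - 1], data[i]);
                swapped = true;
            }
        }
        --end;  // the largest remaining element is now in its final place
    }
    return passes;
}
```

On input that is already sorted, the first pass makes no swaps and the function returns after that single pass; on reversed input of five elements it needs four passes.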

So asking which is the fastest is not really valid. You should ask which will have the best behavior for random unsorted data sets.

Keep in mind that sort performance doesn't really matter that much if you're only sorting 100 items. The difference between 0.1 seconds and 1 second is not that relevant (unless you're sorting multiple times per second, in which case you'd be better off finding a better way to manage your data). The performance only really becomes an issue once the data sets get large.

I would say that, if you have a built-in sort in whatever environment you're working in, just use that. If you have to implement your own, figure out first the likely disposition of your data sets then choose based on that.

Bubble sort is a terrible way of sorting, and even for the case you mentioned, where the data is already sorted and you only insert some number, a binary search, with or without some LUT methods, will most likely reduce the sort to <=O(logN).
– user0002128, Dec 15 '12 at 13:00


With respect, that's wrong (the "terrible" comment, I mean). Discounting any algorithm without understanding why is a bad idea. It's in the same vein as people discounting gotos or multiple exit points without knowing why. Yes, they're usually bad but not in all cases. For example, for data sets of a small enough size, there is no difference between most sort algorithms and, if bubble sort is faster to implement, it has a clear advantage.
– paxdiablo, Dec 15 '12 at 14:00

If you 1) can't make the assumption that the data is partially sorted, 2) can't use std::sort, or 3) want to understand the algorithm as opposed to just using a boxed one, here is a well-performing sort for indexable collections (random access iterators in STL terms).

People like to criticize quicksort for poor performance with certain inputs, especially when the user has control of the input. The following approach yields performance matching midpoint selection, but expected complexity exponentially approaches O(n log n) as the list grows in size.

Initialize the quicksort with a random positive integer I. That value will be used throughout the sorting process (you don't have to generate multiple numbers).

Pivot is selected as I mod SectionSize.

For additional performance, you should always switch your quicksort to shell sort for "small" list segments - I've seen lengths from 15-100 chosen as the cutoff.

It has an incredible amount of detail, yet is still accessible to someone that may not be too familiar with all of the concepts. Not to mention that the animations are great for seeing the algorithm in action - something that you normally do not get to see. I hope this helps!

It is probably the fastest of them all for appropriate inputs, and although not frequently referenced today, it is possibly the oldest automated sorting algorithm known. (A more widely applicable sort with similar time complexity would be the aptly-named quicksort.)

Radix sort has the built-in advantage of automatically limiting its comparisons based on the maximum possible number of values for its key.