I've noticed that many numerical sorting methods seem to sort as 1, 10, 2, 3... rather than the expected 1, 2, 3, 10... I'm having trouble coming up with a scenario where I would need the first method and, as a user, I get frustrated whenever I see it in practice. Are there legitimate use cases for the first style over the second? If so, what are they? If not, how did the first sort style ever come into being? What are the official names for each sort method?

Not an answer to your question, but if you have to sort a list of strings which could contain numbers, you probably want to use the Alphanum algorithm: davekoelle.com/alphanum.html
– TehShrike Jan 1 '12 at 11:28

It's very simple. A plain string sort scans from left to right, one character at a time. When it compares a 1 with a 5, the 5 is larger, and it just dumbly goes with that EVEN if the 1 is the first digit of a larger number like 134234. To know that 134234 is larger than 5, you have to scan to the end of the number, see how many digits it has, and realize that the leading 1 actually stands for 100000, which is much larger than 5. Your typical blind sort doesn't do this; it just compares character to character, ignoring what comes after (or before) in the string.
– AbstractDissonance Jun 30 '17 at 17:07

If you read en.wikipedia.org/wiki/Natural_sort_order it should make sense. In natural order, each run of digits is treated as a single "character". Not physically, just logically, so we still compare position by position as in the first case, but digit runs are compared with digit runs as whole integers rather than character by character, which lets us compare the full value. All sorts should be this way because this is how we humans read things (for numbers, place value is counted from the right, even in a left-to-right string: 1234 = 1000+200+30+4, not 4000+300+20+1).
– AbstractDissonance Jun 30 '17 at 17:11
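
A minimal Python sketch of the idea in the comment above, assuming non-negative integers embedded in ordinary text. The name natural_key and the sample file names are made up for illustration; this is not the Alphanum algorithm or any particular library's implementation, and real natural-sort implementations differ in how they treat leading zeros, signs, and decimals.

```python
import re

def natural_key(text):
    # Splitting on a captured digit run keeps the digits in the result:
    # "file10.txt" -> ["file", "10", ".txt"]
    parts = re.split(r"(\d+)", text)
    # Odd positions are always digit runs, so each one becomes a single
    # integer "character"; even positions stay plain text.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ["file10.txt", "file2.txt", "file1.txt"]

plain = sorted(names)                      # character by character
natural = sorted(names, key=natural_key)   # digit runs compared as numbers

print(plain)    # ['file1.txt', 'file10.txt', 'file2.txt']
print(natural)  # ['file1.txt', 'file2.txt', 'file10.txt']
```

Because text and digit runs always alternate in the key, two keys are compared text against text and integer against integer at each position.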

4 Answers

That is lexicographic sorting, which basically means the language treats the values as strings and compares them character by character ("200" is greater than "19999" because '2' is greater than '1').
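
As a quick illustration in Python (the sample values are arbitrary), comparing the same numbers as strings and as integers gives different orders:

```python
values = ["1", "2", "3", "10", "200", "19999"]

# Strings are compared one character at a time, so the first differing
# character decides: '2' > '1', whatever follows.
string_compare = "200" > "19999"
int_compare = 200 > 19999

as_strings = sorted(values)

print(string_compare)  # True
print(int_compare)     # False
print(as_strings)      # ['1', '10', '19999', '2', '200', '3']
```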

To fix this you can:

ensure that the values are treated as integers,

prepend '0' to the strings so they all have equal length (only viable when you know the max value).
This is why you'll see episode numbering on media files (S1E01) with a prepended 0, so that a lexicographic sort doesn't mess things up and programs can simply play/display in alphabetical order,

or make a custom comparator that first compares the lengths of the strings (shorter strings being smaller integers) and, when the lengths are equal, compares them lexicographically (be careful about leading '0's).
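
The three options above could look roughly like this in Python. This is only a sketch, assuming every value is a non-negative integer written in ASCII digits; compare_digit_strings is a name made up for this example.

```python
from functools import cmp_to_key

values = ["10", "1", "200", "3", "19999", "2"]

# Option 1: treat the values as integers while sorting.
by_int = sorted(values, key=int)

# Option 2: left-pad with zeros to a common width, then sort as strings.
# This needs the maximum width to be known up front.
width = max(len(v) for v in values)
padded = sorted(v.zfill(width) for v in values)

# Option 3: compare lengths first, then compare lexicographically.
def compare_digit_strings(a, b):
    # Strip leading zeros so "007" does not look longer than "10".
    a = a.lstrip("0") or "0"
    b = b.lstrip("0") or "0"
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    return (a > b) - (a < b)

by_cmp = sorted(values, key=cmp_to_key(compare_digit_strings))

print(by_int)  # ['1', '2', '3', '10', '200', '19999']
print(padded)  # ['00001', '00002', '00003', '00010', '00200', '19999']
print(by_cmp)  # ['1', '2', '3', '10', '200', '19999']
```

Option 3 never converts to an integer type, which can matter when the digit strings are longer than a fixed-width integer could hold in other languages.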

Alphabetically, 1 comes before 2. Whenever you see the first method, it's not because it's desirable, but because the sorting is strictly alphabetical (and happens left-to-right, one character at a time): 1, 2, 10 makes sense to you but not to a computer that only knows alphabetic comparison. There's no way in that kind of simple comparison to know that a one followed by a 0 actually comes after a two.

When you see mixed word-and-number sorting that treats numbers correctly, it's because the sorting is more intelligent, and even then it usually only handles numbers at the beginning or end of a string.

Others have answered what this sort is, but no one ever really answered your question about why you see it. The answer isn't really that exciting. It's usually a bug. Most sorting methods default to one or the other, and the programmer was likely careless about changing the default when sorting numbers.

In mixed alphabetic/numeric contexts, experienced users will tend to prefer lexicographic sorting, because it's consistent and predictable. Every app that tries to "intelligently" mix lexicographic and numeric sorting does so a little bit differently, making the sort of questionable utility.
– j__m Jan 3 '17 at 4:08
