Tuesday, December 13, 2011

Merge sort is based around the principle that merging two sorted sets into a single sorted set is easy. If we take two sorted sets, S1 and S2 such that:

S1 = [1, 3, 4] and S2 = [2, 5]

And we want to merge them into a new set, S3, which is also sorted, here's how we do it the mergesort-way. First, start with the two original sets, S1 and S2, and have S3 be empty.

S1 = [1, 3, 4] and S2 = [2, 5] and S3 = []

What we want to do is add the elements to S3 in order so that once the merge is complete the sorting is already finished. So the first step is to add the smallest element to S3. Since the sets S1 and S2 are already sorted, the smallest elements of each are their first elements. All we need to do to find the smallest element is look at the first element of each and take whichever is smaller. So we look at:

      v                  v
S1 = [1, 3, 4] and S2 = [2, 5]

and notice that 1 is smaller than 2, so we can be certain that 1 is the smallest element of either of the sets. So we can take that out of S1 and add it as the first element of S3. After that we just repeat the process and compare the first elements again.

      v               v
S1 = [3, 4] and S2 = [2, 5] and S3 = [1]

In this case 2 is less than 3, so we append it to S3 and repeat.

      v               v
S1 = [3, 4] and S2 = [5] and S3 = [1, 2]

3 is less than 5, so the next step looks like:

      v            v
S1 = [4] and S2 = [5] and S3 = [1, 2, 3]

4 is less than 5, so we append it to S3, leaving S1 empty.

S1 = [] and S2 = [5] and S3 = [1, 2, 3, 4]

Now that S1 is empty, we can just append the entire contents of S2 onto the end of S3, since they're already in order, and we end up with S3 as the complete, sorted merge of S1 and S2.

S1 = [] and S2 = [] and S3 = [1, 2, 3, 4, 5]
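The merge walked through above can be sketched in a few lines of Python. This is only an illustration under a couple of assumptions of mine: the function name merge is made up, and instead of physically removing elements from the front of S1 and S2 it keeps an index into each list that plays the role of the "first element" pointer.

```python
def merge(s1, s2):
    """Merge two already-sorted lists into a new sorted list."""
    s3 = []
    i = j = 0  # positions of the current "first" element of s1 and s2

    # Compare the two front elements and take whichever is smaller.
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:
            s3.append(s1[i])
            i += 1
        else:
            s3.append(s2[j])
            j += 1

    # One list is used up; the rest of the other is already in order,
    # so it can be appended as-is (one of these two slices is empty).
    s3.extend(s1[i:])
    s3.extend(s2[j:])
    return s3


print(merge([1, 3, 4], [2, 5]))  # [1, 2, 3, 4, 5]
```

Using indices rather than popping from the front is just a convenience: removing the first element of a Python list shifts everything after it, so the index version does less work while following the same steps.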

Of course this whole time you've been wondering, how does this apply to sorting in general? We don't normally get the luxury of starting with two sorted lists so this whole thing is stupid and you're stupid for writing it. Well, although I'm going to have to disagree about the me being stupid part, I see where you're coming from. The way it works, just like with quicksort, is that it has to be applied recursively. Here is a trivial example. Let's use merge sort to sort this small set:

S1 = [4, 1, 3, 2]

You should notice two differences between this and the starting conditions of the other example: it's one list instead of two, and it's not sorted. The first difference is easy to fix. Let's just split it in half down the middle.

S1 = [4, 1] and S2 = [3, 2]

There's one problem solved, now let's work on the other. Neither S1 nor S2 is sorted, so we'll have to sort them. The way we do that is, of course, with mergesort! Let's start with S1.

S1 = [4, 1]

Once again we've got one unsorted set, and once again, we begin by splitting it in half.

S1 = [4] and S2 = [1]

Now the sets only contain one element each, which makes them sorted by definition! Just like in quicksort, the base case of mergesort is that a set containing 1 or 0 elements is automatically sorted. Now we just need to merge them like we did before, comparing the first element of each set and adding the smaller one to the result. I won't bore you with the step-by-step of the merge process since we already went over it in depth, but just trust me when I say that the result will look like this:

S1 = [] and S2 = [] and S3 = [1, 4]

Just to clarify, I've left the empty sets S1 and S2 in just to illustrate that we got to this result the same way as before: by taking elements one at a time out of S1 and S2 until they were both empty. Now we've got one of our subsets sorted and we can bring it back a level to get here:

S1 = [1, 4] and S2 = [3, 2]
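For anyone who does want to see the recursion written down, here is a rough Python sketch of the whole idea: split in half, sort each half with mergesort, then merge. The names merge_sort and merge are my own choices for illustration, and returning a new list rather than sorting in place is an assumption made to keep it short.

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly taking the smaller front element."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Whatever is left over in either list is already in order.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    # Base case: a list of 0 or 1 elements is sorted by definition.
    if len(items) <= 1:
        return list(items)

    # Split down the middle, sort each half recursively, then merge.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    return merge(left, right)


print(merge_sort([4, 1]))        # [1, 4]
print(merge_sort([4, 1, 3, 2]))  # [1, 2, 3, 4]
```

The first call mirrors the small case just shown, where [4, 1] splits into [4] and [1] and merges back to [1, 4].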

Now we just need to do mergesort on S2 and we'll be able to merge them together. Once again we split S2 into two sets:

One of the problems with the explanations of quicksort that I've noticed is that they often fail to separate the concept from the implementation. It's easy to get bogged down in the details of iteration, element swapping, etc. without even getting to the meaty center of the discussion: how quicksort works. So here's my attempt to explain the concept without getting into the implementation.
Let's take the set:

[5, 3, 4, 9, 1, 7]

First pick any element of the set to be the pivot. Let's just pick the first one, 5. So now we have:

Pivot - 5, remaining set [3, 4, 9, 1, 7]

Now iterate through the remaining set and compare each element to the pivot. If an element is "less than" the pivot, place it in one set; if it is "greater than" the pivot, place it in the other set. We should end up with this:

Less than [3, 4, 1], pivot - 5, greater than [9, 7].
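The comparison step on its own might look something like this in Python. It is a sketch, not the usual in-place version with element swapping: the name partition is made up, it builds two new lists, and it assumes a non-empty input. The example set has no duplicates, so sending elements equal to the pivot to the "greater than" side is an assumption of mine.

```python
def partition(items):
    """Split a non-empty list into (less_than, pivot, greater_than)."""
    # Use the first element as the pivot, as in the walkthrough.
    pivot, rest = items[0], items[1:]

    # Elements keep their original relative order within each set.
    less_than = [x for x in rest if x < pivot]
    # Assumption: elements equal to the pivot go to the "greater than" side.
    greater_than = [x for x in rest if x >= pivot]
    return less_than, pivot, greater_than


print(partition([5, 3, 4, 9, 1, 7]))  # ([3, 4, 1], 5, [9, 7])
```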

From here we can see that all the elements are in place with respect to the pivot, 5. If we somehow sorted the "less than" and "greater than" sets, we could just merge all the sets together and the whole thing would be sorted. The trick is to sort those "less than" and "greater than" sets, and the way to do that is, of course, using quicksort! Let's sort the "less than" set first. So we have:

[3, 4, 1]

Let's pick the first element to be the pivot again, so we end up with:

pivot - 3, remaining set [4, 1]

Repeating the comparison step from before we get:

less than [1], pivot - 3, greater than [4]

Now we're in the same situation as before, with one difference: the "less than" and "greater than" sets only have one element each, and a one-element set is automatically sorted! This is the base case of quicksort: performing quicksort on a one-element set (or an empty set) just returns the set. So now all we have to do is merge the three sets together and it's sorted. We end up with:

[1, 3, 4]
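Putting the pivot step and the base case together, the concept fits in a handful of lines of Python. Treat this as a sketch of the idea rather than a practical implementation: the name quicksort is my own, it always uses the first element as the pivot, it builds new lists instead of swapping elements in place, and it puts elements equal to the pivot on the "greater than" side (an assumption, since the example has no duplicates).

```python
def quicksort(items):
    """Return a new sorted list containing the elements of items."""
    # Base case: an empty or one-element list is already sorted.
    if len(items) <= 1:
        return list(items)

    # Pick the first element as the pivot and split the rest around it.
    pivot, rest = items[0], items[1:]
    less_than = [x for x in rest if x < pivot]
    greater_than = [x for x in rest if x >= pivot]

    # Sort each side with quicksort, then join: less than, pivot, greater than.
    return quicksort(less_than) + [pivot] + quicksort(greater_than)


print(quicksort([3, 4, 1]))  # [1, 3, 4]
```

Note that the joining step here is plain concatenation. Because every element on the left is smaller than the pivot and every element on the right is at least as large, no comparisons are needed when putting the pieces back together.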

Now we go back to where we were before, only now we have the "less than" set already sorted. So we're here:

Less than [1, 3, 4], pivot - 5, greater than [9, 7]

Now we just have to sort that "greater than" set. Once again using quicksort and selecting the first element as the pivot, we get:

less than [7], pivot - 9, greater than []

The less than set and greater than set each have one or fewer elements, so they're already sorted and we can just merge them together! We get:

[7, 9]

Bringing this sorted set back to where we started, we get:

less than [1, 3, 4], pivot - 5, greater than [7, 9]

Now the less than and greater than sets are sorted and we can just merge the whole thing together! We get: