Monday, 21 July 2014

Mergesort was invented by John von Neumann in 1945. Despite its age, it is still widely used in practice and remains one of the methods of choice for sorting. Mergesort is also the preferred way to sort linked lists.

I will start the discussion with an example.

Suppose you have a sheet of paper. Tear it in half to get 2 pieces, then tear those 2 pieces into 4, and repeat this a few more times. Now write a number on each piece of paper.

Take half of the pieces in one hand and the other half in the other hand, and arrange each pile in order from lowest to highest.

Now let's see how to sort these numbered pieces in ascending order.

Compare the top pieces of the two piles and place the one with the smaller number on the desk. The number of pieces left in your hands is now 1 less than before.

Continue this procedure until you can no longer compare two pieces. You will end up in one of the following situations:

a. No pieces are left in either hand.
b. Some pieces are left in either the left hand or the right hand.

Because each pile was arranged in order beforehand, any leftover pieces are already sorted. All you have to do is take each remaining piece from your hand, one at a time, and place it on the desk.

Finally, all the pieces are arranged on the desk in order from lowest to highest.
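The paper procedure above can be sketched in Python. This is only an illustration: the name merge_piles and the use of a deque for each hand are assumptions made here, not part of the original description.

```python
from collections import deque


def merge_piles(left_pile, right_pile):
    """Merge two already-sorted piles into one sorted list (the "desk")."""
    left, right = deque(left_pile), deque(right_pile)
    desk = []
    # While both hands hold pieces, move the smaller top piece to the desk.
    while left and right:
        if left[0] <= right[0]:
            desk.append(left.popleft())
        else:
            desk.append(right.popleft())
    # At most one hand still holds pieces, and they are already in order,
    # so they can be laid down as they are.
    desk.extend(left)
    desk.extend(right)
    return desk


print(merge_piles([2, 5, 9], [1, 6, 7, 10]))  # [1, 2, 5, 6, 7, 9, 10]
```

Each comparison moves exactly one piece to the desk, so merging two piles takes a number of steps proportional to the total number of pieces.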

We can treat this simple example as the inspiration for Mergesort.

Mergesort solves the problem by divide and conquer: split the list of objects in half, sort the halves, and then merge the sorted halves together. This is the idea behind Mergesort.

Conceptually, Mergesort is one of the simpler sorting algorithms, and it performs well both asymptotically and in empirical running time.

Mergesort sorts its input recursively. First, it divides the list into two halves. Then it sorts the left sublist and the right sublist. Finally, it merges the two sorted sublists.
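As a minimal sketch of these three steps (divide, sort each half, merge), here is a recursive version in Python. The function name merge_sort is chosen for illustration, and this variant copies the halves with slicing for readability rather than efficiency.

```python
def merge_sort(items):
    """Return a new sorted list; the input list is left unchanged."""
    if len(items) <= 1:
        return list(items)  # zero or one element is already sorted

    # Divide: split the list into two halves.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge: repeatedly take the smaller front element of the two halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the rest of the other half is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```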

How Mergesort Works?

Mergesort splits the input array into two subarrays. In practice, no memory is allocated for the subarrays: each recursive call passes the bounds of a subarray to the Mergesort method, which in turn calculates the middle position of that subarray. This division continues recursively until the begin index and the end index are the same, i.e. until the subarray contains only one element. The division process requires about log n (base 2) levels of recursion.

Mergesort merges the subarrays with the help of a temporary second array. Note that this approach needs about twice the space of insertion sort, selection sort, and bubble sort, which all sort in place; this is a serious disadvantage of Mergesort. It is possible to merge the subarrays without a second array, but this is extremely difficult to do efficiently and is not really practical.
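The bounds-passing recursion and the temporary second array can be sketched together as follows. This is one possible arrangement under stated assumptions: the names merge_sort_in_array, sort, merge, and buffer are illustrative, and the bounds are treated as inclusive.

```python
def merge_sort_in_array(values):
    """Sort `values` in place, using one temporary buffer of the same length."""
    buffer = [None] * len(values)  # the temporary second array

    def sort(begin, end):
        # `begin` and `end` are inclusive bounds; no subarray is ever copied out.
        if begin >= end:
            return  # one element (or none): nothing to sort
        mid = (begin + end) // 2
        sort(begin, mid)
        sort(mid + 1, end)
        merge(begin, mid, end)

    def merge(begin, mid, end):
        # Copy the current range into the buffer, then merge back into `values`.
        for k in range(begin, end + 1):
            buffer[k] = values[k]
        i, j = begin, mid + 1
        for k in range(begin, end + 1):
            if i > mid:                  # left run exhausted
                values[k] = buffer[j]
                j += 1
            elif j > end:                # right run exhausted
                values[k] = buffer[i]
                i += 1
            elif buffer[j] < buffer[i]:  # right element is strictly smaller
                values[k] = buffer[j]
                j += 1
            else:                        # ties go to the left run
                values[k] = buffer[i]
                i += 1

    sort(0, len(values) - 1)


data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort_in_array(data)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The buffer is allocated once and reused by every merge, so the extra space stays at one array of length n.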

Mergesort is a good choice when the data structure does not support efficient random access, such as singly or doubly linked lists. It is also widely used for external sorting, where random access can be expensive compared to sequential access.
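To illustrate why linked lists suit Mergesort, here is a rough sketch that sorts a singly linked list by relinking nodes, using only sequential access. The Node class and the helpers from_list and to_list are hypothetical names introduced for this example.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def sort_linked_list(head):
    """Sort a singly linked list by relinking its nodes; return the new head."""
    if head is None or head.next is None:
        return head

    # Find the middle with slow/fast pointers (no random access needed).
    slow, fast = head, head.next
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    second = slow.next
    slow.next = None  # cut the list into two halves

    left = sort_linked_list(head)
    right = sort_linked_list(second)

    # Merge by relinking nodes; no second array is required.
    dummy = tail = Node(None)
    while left is not None and right is not None:
        if left.value <= right.value:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left if left is not None else right
    return dummy.next


def from_list(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the values of a linked list into a Python list."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(to_list(sort_linked_list(from_list([4, 1, 3, 9, 2]))))  # [1, 2, 3, 4, 9]
```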

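Properties of Mergesort:

1. Extra space: O(n)
2. Running time: O(n log n), the same in the best, average, and worst cases
3. Stable: elements with equal keys keep their original relative order (provided the merge takes from the left half on ties)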
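In the usual sense of the word, a stable sort keeps records with equal keys in their original relative order. The sketch below shows this for Mergesort. The function merge_sort_by, its key parameter, and the sample records are all made up for illustration.

```python
def merge_sort_by(items, key):
    """Return a new list sorted by key(item), keeping ties in input order."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_by(items[:mid], key)
    right = merge_sort_by(items[mid:], key)

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left half on ties; this keeps the sort stable.
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


# Illustrative records: (name, age). Ann precedes Cid, and Bob precedes Dee.
people = [("Ann", 30), ("Bob", 25), ("Cid", 30), ("Dee", 25)]
print(merge_sort_by(people, key=lambda person: person[1]))
# [('Bob', 25), ('Dee', 25), ('Ann', 30), ('Cid', 30)]
```

Changing "<=" to "<" in the comparison would still sort correctly by age, but equal ages could then come out in swapped order.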

Measuring Time Complexity of Mergesort:

The analysis of Mergesort is straightforward, even though it is a recursive algorithm. The merging step takes O(i) time, where i is the total length of the two subarrays being merged. The array to be sorted is repeatedly split in half until subarrays of size 1 are reached; these are then merged into subarrays of size 2, those into subarrays of size 4, and so on. Thus the depth of the recursion is log n for n elements.

The first level of recursion can be thought of as working on one array of size n, the next level as working on two arrays of size n/2, the next on four arrays of size n/4, and so on. The bottom of the recursion has n arrays of size 1. Thus, n arrays of size 1 are merged (requiring O(n) total steps), then n/2 arrays of size 2 are merged (again requiring O(n) total steps), and so on. At each of the log n levels of recursion, O(n) work is done, for a total cost of O(n log n). This cost is unaffected by the relative order of the values being sorted, so the analysis holds for the best, average, and worst cases.
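The level-by-level argument can be checked numerically with a small sketch. It assumes, for simplicity, that merging two subarrays with i elements in total costs exactly i steps and that a subarray of size 1 costs nothing. The name merge_steps is illustrative.

```python
import math


def merge_steps(n):
    """Total elements written by all merges when sorting n elements."""
    if n <= 1:
        return 0  # a single element needs no merging
    half = n // 2
    # Cost of sorting both halves, plus n steps to merge them.
    return merge_steps(half) + merge_steps(n - half) + n


for n in (8, 1024):
    print(n, merge_steps(n), n * int(math.log2(n)))
# 8 24 24
# 1024 10240 10240
```

For powers of two the count equals n log n exactly (with a base-2 logarithm), and it does not depend on the values being sorted, only on n.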