I have to write a quicksort algorithm that uses the median of the array as the pivot. From what I read in a book, I have to use a select algorithm that splits the array into n/5 sub-arrays, sorts each sub-array using insertion sort, finds the median of each, then recursively calls the select algorithm to find the median of the medians. I'm not sure how to start this and I'm pretty confused. The SelectMedian call is supposed to look something like this: SelectMedian(int first, int last, int i), where i is the ith index I want to select (in this case the middle index, so array.length/2). The book I'm reading gives this description of it:

I believe the "book" the OP mentions is CLRS. Just a thought.
– Óscar López, Oct 28 '11 at 21:13

The exact question is "Write a quick sort algorithm that uses the median as the pivot". This is not the book's description of the quicksort algorithm; it is the description of a select algorithm, which looks for the "i-th" largest element in the array.
– Wonger, Oct 28 '11 at 23:13

Using median of three means the time complexity of finding the median will be greater than O(NlogN) - ruining our goal of getting to O(N^2). Using a random pivot means you'll expect O(NlogN), but performance will not be consistent, which is required for some applications.
– Kane, Oct 28 '11 at 20:45

What are you talking about? Our "goal" isn't O(N²), that's worst case, and finding the median of three elements is quicker than partitioning the array and finding the median of five for each part.
– AusCBloke, Oct 28 '11 at 21:02

Apologies, I meant ruining our goal of getting to better than O(N^2). While it is true that searching 3 items takes less time than searching 5, they are both O(1) operations. However, searching n/3 arrays for medians is O(N^2) and searching n/5 arrays for medians is O(NlogN). You need to do the T analysis to see this - I believe CLRS has an example. As an aside, how do you do the superscript? I couldn't find it in the help.
– Kane, Oct 28 '11 at 21:06

Oh, you're talking about searching n/3 smaller arrays. I'm only talking about getting three elements, getting the median of those and using that as the pivot. Only doing that once, simple as that. As for the "²", it's an actual character I just copied off the net; I don't think the editor will do that for you.
– AusCBloke, Oct 28 '11 at 21:13

Because, according to the book, using the median of the array guarantees an even split, which therefore guarantees O(nlgn), assuming the algorithm to find the median is written correctly.
– Wonger, Oct 28 '11 at 23:18

I assume you can figure out how to create the n/5 sub-arrays of 5 elements each.

Finding the median of a sub-array is fairly easy: look at each element and find the one that has exactly two smaller elements (this assumes five distinct values).

For example, you have 1 4 2 3 5. 1 has no smaller elements. 4 has three smaller elements. 2 has one smaller element. 3 has two smaller elements; this is the one you want.
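The counting approach above could be sketched in Java roughly as follows. The class and method names are hypothetical, and the tie-break by index is an added assumption so that groups containing duplicate values still have exactly one element of each rank.

```java
// Sketch of the counting approach described above; names are hypothetical.
class GroupMedian {
    // Returns the median of a small group by counting, for each element,
    // how many other elements rank before it. Equal values are ranked by
    // index (an assumption added here) so duplicates do not break the count.
    static int medianByCounting(int[] group) {
        int target = (group.length - 1) / 2; // 2 for a group of five
        for (int i = 0; i < group.length; i++) {
            int smaller = 0;
            for (int j = 0; j < group.length; j++) {
                if (group[j] < group[i] || (group[j] == group[i] && j < i)) {
                    smaller++;
                }
            }
            if (smaller == target) {
                return group[i];
            }
        }
        throw new IllegalArgumentException("group must not be empty");
    }

    public static void main(String[] args) {
        // Same group as the example: 3 has exactly two smaller elements.
        System.out.println(medianByCounting(new int[] {1, 4, 2, 3, 5})); // prints 3
    }
}
```

For groups of at most five elements the quadratic counting loop is a constant amount of work, so it does not affect the overall complexity.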

Now you have found n/5 medians. You want to find the median of the medians, so you run the algorithm again.

Example:

1 7 2 4 9 0 3 8 5 6 1 4 7 2 3

[1 7 2 4 9][0 3 8 5 6][1 4 7 2 3]

findMedian([1 7 2 4 9]) = 4;

findMedian([0 3 8 5 6]) = 5;

findMedian([1 4 7 2 3]) = 3;

[4 5 3]

findMedian([4 5 3]) = 4;

4 is your pivot.
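The worked example can be reproduced with a short Java sketch. All names here are hypothetical, and `Arrays.sort` on a copy stands in for the insertion sort the book describes. Note that this follows the example literally by regrouping the medians until at most five remain, which gives a good pivot but not necessarily the exact median of the medians; the book's version calls the select algorithm on the medians instead.

```java
import java.util.Arrays;

// Hypothetical names throughout; this mirrors the worked example only.
class PivotSketch {
    // Median of a[from..to), found by sorting a copy of the group.
    // Arrays.sort stands in for the insertion sort the book describes.
    static int medianOfGroup(int[] a, int from, int to) {
        int[] group = Arrays.copyOfRange(a, from, to);
        Arrays.sort(group);
        return group[(group.length - 1) / 2]; // lower median for even sizes
    }

    // Reduces the array to its group medians repeatedly until at most
    // five values remain, then returns the median of those.
    static int choosePivot(int[] a) {
        if (a.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        if (a.length <= 5) {
            return medianOfGroup(a, 0, a.length);
        }
        int groups = (a.length + 4) / 5; // ceil(n / 5); last group may be short
        int[] medians = new int[groups];
        for (int g = 0; g < groups; g++) {
            int from = 5 * g;
            int to = Math.min(from + 5, a.length);
            medians[g] = medianOfGroup(a, from, to);
        }
        return choosePivot(medians);
    }

    public static void main(String[] args) {
        int[] a = {1, 7, 2, 4, 9, 0, 3, 8, 5, 6, 1, 4, 7, 2, 3};
        System.out.println(choosePivot(a)); // prints 4, as in the example
    }
}
```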

The reason you do this is to split your array as evenly as possible: if the splits are consistently lopsided you get O(N^2) performance, and if they are even you get O(NlogN) performance.
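Putting the pieces together, a quicksort that always pivots on the true median might look like the sketch below. This is one possible arrangement under stated assumptions, not the book's code: the names are hypothetical, `select` takes the array explicitly (unlike the `SelectMedian(int first, int last, int i)` signature in the question), and a three-way partition is used so that duplicate values are handled. Because `select` leaves the median in its final position with nothing larger on its left and nothing smaller on its right, `quickSort` only has to recurse on the two halves.

```java
import java.util.Arrays;

// Hypothetical sketch: quicksort whose pivot is the exact median,
// located with a median-of-medians select. All bounds are inclusive.
class MedianQuickSort {
    static void quickSort(int[] a, int first, int last) {
        if (first >= last) return;
        int mid = first + (last - first) / 2;
        select(a, first, last, mid); // a[mid] is now the median, halves split around it
        quickSort(a, first, mid - 1);
        quickSort(a, mid + 1, last);
    }

    // Rearranges a[first..last] so a[k] holds the value it would have if
    // the range were sorted, with no larger value before it and no
    // smaller value after it.
    static void select(int[] a, int first, int last, int k) {
        while (first < last) {
            int pivot = medianOfMedians(a, first, last);
            int lt = first, i = first, gt = last;
            while (i <= gt) { // three-way partition around the pivot value
                if (a[i] < pivot) swap(a, lt++, i++);
                else if (a[i] > pivot) swap(a, i, gt--);
                else i++;
            }
            // a[first..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..last] > pivot
            if (k < lt) last = lt - 1;
            else if (k > gt) first = gt + 1;
            else return;
        }
    }

    // Returns the median of the medians of the groups of five in a[first..last].
    static int medianOfMedians(int[] a, int first, int last) {
        int n = last - first + 1;
        if (n <= 5) {
            insertionSort(a, first, last);
            return a[first + (n - 1) / 2];
        }
        int groups = 0;
        for (int lo = first; lo <= last; lo += 5) {
            int hi = Math.min(lo + 4, last);
            insertionSort(a, lo, hi);
            swap(a, first + groups, lo + (hi - lo) / 2); // collect medians at the front
            groups++;
        }
        int mid = first + (groups - 1) / 2;
        select(a, first, first + groups - 1, mid); // recursive call on the medians
        return a[mid];
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int x = a[i], j = i - 1;
            while (j >= lo && a[j] > x) { a[j + 1] = a[j]; j--; }
            a[j + 1] = x;
        }
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {1, 7, 2, 4, 9, 0, 3, 8, 5, 6, 1, 4, 7, 2, 3};
        quickSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));
    }
}
```

Each call splits its range in half after linear work, which is where the O(NlogN) bound comes from. In practice the constant factor of this approach is large, so it is mainly of interest when a guaranteed worst case matters.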

Selecting a random pivot means you could get either one. The expected running time is O(NlogN), but a lot of applications want consistent performance, and random quicksort is not consistent from run to run.

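The reason we use 5 (instead of 3 or 7) is that searching for the median adds another term to the time complexity. That term has to stay at O(N) per call, the same order as partitioning, so that the whole sort remains O(NlogN), and you want it to be as small as possible. With groups of 5, the selection recurrence T(N) <= T(N/5) + T(7N/10) + O(N) solves to O(N), because 1/5 + 7/10 < 1. With groups of 3, the corresponding recurrence is T(N) <= T(N/3) + T(2N/3) + O(N); the fractions sum to 1, so the standard analysis only gives O(NlogN) for selection. Larger odd group sizes such as 7 also give linear time, but 5 is the smallest for which the standard analysis does.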

(The algorithm to find the median in linear time was given by Blum, Floyd, Pratt, Rivest and Tarjan in their 1973 paper "Time bounds for selection", and answered a famous open problem.)