Quicksort And Random Pivoting
Quicksort is the most widely used sorting algorithms in practice. It works extremely well both in terms of time and space complexity compared to its counterparts. Given an unsorted list , quicksort works by picking a pivot, creating two sublists, one with elements strictly less than the pivot and the other with elements strictly greater than the pivot. Quicksort is later recursively called on both sublists. Below is pseudocode for quicksort
Below is an animation I made illustrating the above algorithm.
The base case is when there is a single element to be sorted. Choosing a pivot plays a pivotal role (pun intended) in the run time of the algorithm. The main cost associated with quicksort is the number of comparisons between the pivot and elements in the list. Ideally, we want to pick a pivot that is not too far from the median to prevent the depth of recursion (number of function calls) from being large and the number of comparisons being low. For example, in the best case, we pick the median to be the pivot and it splits the list into two equal halves. Suppose is the cost of quicksort, it would lead to,
Solving the recurrence, we get .
In the worst case, quicksort has runtime of . We will give an example of the worst case instance. Let be the list we wish to sort.
In the worst case, in the first step, the pivot . In this case, the sublist and or . In the first step, we are doing comparisons. When we run quicksort on , in the worst case, the pivot is and the number of comparisons in . This trend continues where the size of the list decreases by 1 as the depth increases by 1. When the depth is , the number of comparisons is . In this worst case instance, the total number of comparisons is,
In randomized quicksort, instead of picking some arbitrary element to be the pivot, we pick the pivot uniformly at random. In the worst case, the run time of randomized quicksort is , but in the average case, or on expectation, it does extremely well and achieves runtime. We will now get into the analysis and prove that the expected runtime of randomized quicksort is .
Let where be a Bernouilli or a 0-1 random variable where
In the worst case scenario, we mentioned before, all where
But what were’re interested in is, i.e., the expectated number of comparisons. We will use the fact that if and are random variables, then . This property is called linearity of expectation.
Using linearity of expectation, all we need to do compute is computing the value of . Since is a bernouilli random variable,
What we are left with computing is the probability of two elements and being compared? Suppose we take a random ordering of numbers from 1 to 10 . Suppose we pick 5 to be the pivot. One sublist would be and the other sublist would be . In this case, 5 is compared to every number in the original list, but no element in one sublist would be compared with a member of the other sublist.
Suppose a pivot is picked, there will be no future comparison between and where . We will make two important observations,
- Suppose and are compared, then either or is the pivot in that step.
- Suppose any elements in is chosen as the pivot, then and will never be compared.
Hence, the only way and are compared is if or is picked as a pivot before any of the elements in . This can be also rephrased as saying, what is the probability or is picked as the pivot first from the set .
We showed the expected runtime of randomized quicksort is . It is quite interesting to analyze the probability of the runtime being far from the expected runtime. It can be shown randomized quicksort runs in with probability . I’ll save this analysis for another post.