CLRS tells us to exchange A[r] with A[i], where i is a random number between p and r. However, if I randomly pick an element as the pivot inside the quicksort function itself and exchange the values, what will the time complexity be now? Will the algorithm perform worse than the one given in the book?
Here is the code:
package Clrs;

import java.util.Random;

public class Clrs
{
    public static void main(String[] args)
    {
        int[] a = new int[]{7, 6, 5, 4, 3, 2, 1};
        quicksort(a, 0, a.length - 1);
        System.out.println();
        for (int i : a) {
            System.out.println(i);
        }
    }

    static void quicksort(int[] a, int p, int r)
    {
        if (p < r)
        {
            // Pick a pivot index uniformly at random in [p, r] and move it to a[r]
            Random random = new Random();
            int y = p + random.nextInt(r - p + 1);
            int temp = a[y];
            a[y] = a[r];
            a[r] = temp;

            int q = partition(a, p, r);
            quicksort(a, p, q - 1);
            quicksort(a, q + 1, r);
        }
    }

    // Lomuto partition around the pivot a[r]; returns the pivot's final position
    static int partition(int[] a, int p, int r)
    {
        int x = a[r];
        int i = p - 1;
        for (int j = p; j <= r - 1; j++) {
            if (a[j] <= x)
            {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, r);
        return i + 1;
    }

    static void swap(int[] a, int x, int y)
    {
        int t = a[x];
        a[x] = a[y];
        a[y] = t;
    }
}
The theoretical complexity remains the same: O(n log n) in the average case and O(n^2) in the worst case. The idea of choosing a random pivot is not to eliminate the worst-case performance of O(n^2) - that can still happen, with low probability. The idea is to make the algorithm more resilient to maliciously crafted inputs.
It is much harder to predict which array will cause the worst-case behavior when the pivot is pseudo-randomly selected, so if someone wants to attack our program and make it run significantly slower, it will be much harder for them to do so.
Another benefit is that if your data has a tendency to appear in a certain pattern, the worst case won't (with high probability) repeat itself over and over again.
As a side note, taking the first (or last) element as the pivot, as the "classic" quicksort does, is a bad idea, since the probability of the array being sorted or almost sorted in real-life applications is much higher than one would expect, causing the algorithm to fall into its worst-case behavior quite often. More info about it can be found in this thread: Why are we interested in how long it takes to sort a file that is already sorted?
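To see the effect on a sorted input, here is a minimal sketch (not from the original post; the class name PivotComparison and the comparison counter are illustrative) that counts the element-to-pivot comparisons made by the fixed last-element pivot versus a randomized pivot on an already sorted array:

import java.util.Random;

public class PivotComparison
{
    static long comparisons;
    static final Random RANDOM = new Random();

    public static void main(String[] args)
    {
        int n = 1000;
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i;   // already sorted input

        comparisons = 0;
        quicksort(sorted.clone(), 0, n - 1, false);  // fixed last-element pivot
        System.out.println("fixed pivot comparisons:      " + comparisons);

        comparisons = 0;
        quicksort(sorted.clone(), 0, n - 1, true);   // randomized pivot
        System.out.println("randomized pivot comparisons: " + comparisons);
    }

    static void quicksort(int[] a, int p, int r, boolean randomized)
    {
        if (p < r)
        {
            if (randomized)
            {
                // pick a pivot index uniformly in [p, r] and move it to a[r]
                int y = p + RANDOM.nextInt(r - p + 1);
                swap(a, y, r);
            }
            int q = partition(a, p, r);
            quicksort(a, p, q - 1, randomized);
            quicksort(a, q + 1, r, randomized);
        }
    }

    static int partition(int[] a, int p, int r)
    {
        int x = a[r];
        int i = p - 1;
        for (int j = p; j < r; j++)
        {
            comparisons++;                 // count each element-to-pivot comparison
            if (a[j] <= x)
            {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, r);
        return i + 1;
    }

    static void swap(int[] a, int x, int y)
    {
        int t = a[x];
        a[x] = a[y];
        a[y] = t;
    }
}

On a sorted array of 1000 elements the fixed-pivot variant performs exactly n(n-1)/2 = 499,500 comparisons (every partition is maximally unbalanced), while the randomized variant typically stays on the order of n log n.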