Best case performance of quicksort (tilde notation)

1.2k views Asked by At

My professor stated that the best case for quicksort is when the split is balanced (i.e. when the pivot is always the element in the middle).

Now the complexity of best-case quicksort can be determined by using a recurrent relation, as folows:

T(n) = 2*T(n/2) + partitioning(n)

In the next step he states:

thus = ~n * log2(n)

Could someone elaborate how exactly you can calculate this. I've searched around a lot, but all explanations are in big-Oh notation or don't really explain how the recurrency is solved/calculated.

2

There are 2 answers

0
amit On BEST ANSWER

I am going to address Tilde Notation in this answer, with the assumption of partitioning(n) is itself ~n, which means it runs in time G(n) = n + C.

In this case, complexity function of best case is:

With base clause of T(1) = 1+C

T(n) = 2T(n/2) + (n+C)

Claim: For all n>0:

T(n) = n(1+logn) + (2n-1)C

Proof by induction.
Base clause: For T(1) the claim holds as the base.

T(n) = 2T(n/2) + (n + C)
T(n) = 2(n/2 log(n/2) + n/2 + (n-1)C) +(n + C) //induction hypothesis
T(n) = nlog(n/2) + n + (2n-2)C + n  + C
T(n) = nlog(n/2) + nlog(2) + n + (2n-2)C + C
T(n) = nlog(n/2 * 2) + n +(2n-2+1)C
T(n) = n(logn+1) + (2n-1)C

Since T(n) = nlogn + (2C+1)n - C, we can conclude it is indeed ~nlogn.

1
skrtbhtngr On

The answer comes from the Master Theorem.

The recurrence relation for the best-case running time of Quicksort is:

T(n) = 2T(n/2) + Θ(n)

Master theorem generalizes a recurrence relation to the form:

T(n) = aT(n/b) + f(n)

Comparing the two equations:

a = 2

b = 2

f(n) = Θ(n)

Since f(n) = Θ(n), from Case 2 of the Master theorem:

If f(n) = &theta;(n<sup>log<sub>b</sub>a</sup>), then T(n) = &theta;(n<sup>log<sub>b</sub>a</sup>lg(n))

So, the time complexity of Quicksort in the best-case reduces to

T(n) = Θ(n lg(n))

EDIT: If you are asking why it is in tilde notation ~n * log2(n) and not in Big-O notation, it is because in your case, the goal must be to create an approximate model of the running time (without ignoring the constants if there were any) instead of developing an upper bound on the complexity. Moreover, tilde notation is very useful for predicting performance whereas Big-O notation is not.