
And? Quicksort generally runs faster than heapsort, yet the worst-case performance of the former is O(n^2) while the latter's is only O(n log n). If they implemented it, I'm sure they found that there's a performance boost for large datasets.



Note that properly implemented quicksort has expected O(n log n) time independent of the data (dependent only on the random choices of pivots), whereas interpolation sort's performance is data dependent. That's a pretty important distinction.
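For concreteness, here is a minimal sketch of the kind of randomized quicksort being described, assuming rand() as the pivot source (so pseudo-random in practice, as discussed below); the expected O(n log n) bound depends only on the pivot choices, not on the input ordering:

    /* Sketch: quicksort with a randomly chosen pivot (Lomuto partition). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Partition a[lo..hi] around a pivot index chosen uniformly at random. */
    static int partition_random(int a[], int lo, int hi)
    {
        int p = lo + rand() % (hi - lo + 1); /* random pivot index */
        swap(&a[p], &a[hi]);                 /* move pivot to the end */
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap(&a[i++], &a[j]);
        swap(&a[i], &a[hi]);
        return i;
    }

    static void quicksort_random(int a[], int lo, int hi)
    {
        if (lo >= hi) return;
        int m = partition_random(a, lo, hi);
        quicksort_random(a, lo, m - 1);
        quicksort_random(a, m + 1, hi);
    }

    int main(void)
    {
        srand((unsigned)time(NULL));
        int a[] = {5, 3, 8, 1, 9, 2, 7};
        quicksort_random(a, 0, 6);
        for (int i = 0; i < 7; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }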


Interestingly, that's not quite true: see http://www.scss.tcd.ie/~pbiggar/jea-2008.pdf, section 8.3.

I'll point out, too, that no quicksort implementation uses a truly random choice of pivot. Typically they use median-of-three pivot selection as a pseudo-random stand-in.
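For reference, a minimal sketch of median-of-three pivot selection, assuming the common convention of taking the first, middle, and last elements as the three candidates:

    /* Sketch: pick the pivot as the median of a[lo], a[mid], a[hi]. */
    #include <stdio.h>

    static int median_of_three(const int a[], int lo, int hi)
    {
        int mid = lo + (hi - lo) / 2;
        if ((a[lo] <= a[mid] && a[mid] <= a[hi]) ||
            (a[hi] <= a[mid] && a[mid] <= a[lo]))
            return mid;
        if ((a[mid] <= a[lo] && a[lo] <= a[hi]) ||
            (a[hi] <= a[lo] && a[lo] <= a[mid]))
            return lo;
        return hi;
    }

    int main(void)
    {
        int a[] = {9, 4, 1, 7, 3, 8, 2};
        int p = median_of_three(a, 0, 6);
        printf("pivot index %d, value %d\n", p, a[p]); /* median of 9, 7, 2 is 7 */
        return 0;
    }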


Your second point may be valid, though not literally. I've implemented quicksort, and I used (pseudo) random pivots, damnit! :)

The article you point to doesn't seem to address the complexity of quicksort with random pivots. However, that is proven to be expected O(n log n) time for any data in Motwani's Randomized Algorithms book, and I don't think the correctness of that has ever been questioned.


> The article you point to doesn't seem to address the complexity of quicksort with random pivots.

That paper is about experimental algorithmics, and avoids discussions of things which aren't practical (such as using truly random pivots).

> However, that is proven to be expected O(n log n) time for any data in Motwani's Randomized Algorithms book, and I don't think the correctness of that has ever been questioned.

Right, but since there are no truly random quicksorts, it's hardly a concern.


Please clarify: Why are random pivots not practical? And why do you say there are no truly random quicksorts? Because pseudorandom numbers aren't really random, or for some other reason?


> Why are random pivots not practical?

The cost of true randomness is really high (in execution time).

> And why do you say there are no truly random quicksorts?

Because pseudo-random numbers aren't really random.

The only reason this is important is that the pseudo-randomness means it's possible to construct an adversarial input, which could drive the quicksort to O(n^2) time.
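To illustrate the adversarial-input idea with a simpler case than reverse-engineering a PRNG seed, here is a sketch using a deterministic first-element pivot, where an already-sorted array is adversarial and the comparison count blows up to roughly n^2/2:

    /* Sketch: naive quicksort (first-element pivot) on an adversarial,
     * already-sorted input; counts comparisons to show quadratic growth. */
    #include <stdio.h>
    #include <stdlib.h>

    static long comparisons;

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    static void quicksort_first(int a[], int lo, int hi)
    {
        if (lo >= hi) return;
        int pivot = a[lo], i = lo;
        for (int j = lo + 1; j <= hi; j++) {
            comparisons++;
            if (a[j] < pivot)
                swap(&a[++i], &a[j]);
        }
        swap(&a[lo], &a[i]);
        quicksort_first(a, lo, i - 1);
        quicksort_first(a, i + 1, hi);
    }

    int main(void)
    {
        enum { N = 2000 };
        int *a = malloc(N * sizeof *a);
        for (int i = 0; i < N; i++) a[i] = i; /* sorted input is adversarial here */
        comparisons = 0;
        quicksort_first(a, 0, N - 1);
        printf("sorted input: %ld comparisons (~n^2/2 = %d)\n",
               comparisons, N * N / 2);
        free(a);
        return 0;
    }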


> If they implemented it, I'm sure they've found that there's a performance boost for large datasets.

Argh! Perhaps I take things too literally, but the logic in this statement is just painful.


Only naive implementations of quicksort take O(n^2) time. Most good implementations will switch to another sort (generally heapsort) after detecting adversarial conditions.
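A minimal sketch of that switch-over idea (essentially introsort, as the reply below notes), assuming a recursion-depth limit of roughly 2*log2(n); this is only an illustration, not any particular library's implementation:

    /* Sketch: quicksort that falls back to heapsort once the recursion
     * depth exceeds a bound, keeping the whole sort O(n log n). */
    #include <stdio.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Max-heap sift-down on a[lo..hi], used by the heapsort fallback. */
    static void sift_down(int a[], int lo, int root, int hi)
    {
        for (;;) {
            int child = lo + 2 * (root - lo) + 1;
            if (child > hi) return;
            if (child + 1 <= hi && a[child] < a[child + 1]) child++;
            if (a[root] >= a[child]) return;
            swap(&a[root], &a[child]);
            root = child;
        }
    }

    static void heapsort_range(int a[], int lo, int hi)
    {
        for (int i = lo + (hi - lo) / 2; i >= lo; i--)
            sift_down(a, lo, i, hi);
        for (int end = hi; end > lo; end--) {
            swap(&a[lo], &a[end]);
            sift_down(a, lo, lo, end - 1);
        }
    }

    static int partition_last(int a[], int lo, int hi)
    {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap(&a[i++], &a[j]);
        swap(&a[i], &a[hi]);
        return i;
    }

    static void introsort_rec(int a[], int lo, int hi, int depth_limit)
    {
        if (lo >= hi) return;
        if (depth_limit == 0) {          /* adversarial conditions detected */
            heapsort_range(a, lo, hi);   /* switch to heapsort */
            return;
        }
        int m = partition_last(a, lo, hi);
        introsort_rec(a, lo, m - 1, depth_limit - 1);
        introsort_rec(a, m + 1, hi, depth_limit - 1);
    }

    static void introsort(int a[], int n)
    {
        int depth = 0;
        for (int k = n; k > 1; k >>= 1) depth += 2; /* ~2 * log2(n) */
        introsort_rec(a, 0, n - 1, depth);
    }

    int main(void)
    {
        int a[] = {7, 2, 9, 4, 1, 8, 3, 6, 5};
        introsort(a, 9);
        for (int i = 0; i < 9; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }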


In which case they cease to be quicksort, and are instead called "introsort": http://en.wikipedia.org/wiki/Introsort


My point was that we should give good hackers the benefit of the doubt when they say quicksort, as they are likely to actually be talking about introsort. GNU C uses introsort to implement qsort, and if I were to implement quicksort for an important job, I certainly wouldn't overlook the fact that it can degrade to O(n^2).




