
And? Quicksort generally runs faster than heapsort, yet the worst-case performance of the former is O(n^2) while the latter's is only O(n log n). If they implemented it, I'm sure they found that there's a performance boost for large datasets.



Note that properly implemented quicksort has expected O(n log n) time independent of the data (dependent only on the random choices of pivots), whereas interpolation sort's performance is data dependent. That's a pretty important distinction.
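For concreteness, here is a minimal sketch of the kind of randomized quicksort being described, assuming rand() as the pivot source (so pseudo-random in practice, as discussed below); the expected O(n log n) bound depends only on the pivot choices, not on the input ordering:

    /* Sketch: quicksort with a randomly chosen pivot (Lomuto partition). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Partition a[lo..hi] around a pivot index chosen uniformly at random. */
    static int partition_random(int a[], int lo, int hi)
    {
        int p = lo + rand() % (hi - lo + 1); /* random pivot index */
        swap(&a[p], &a[hi]);                 /* move pivot to the end */
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap(&a[i++], &a[j]);
        swap(&a[i], &a[hi]);
        return i;
    }

    static void quicksort_random(int a[], int lo, int hi)
    {
        if (lo >= hi) return;
        int m = partition_random(a, lo, hi);
        quicksort_random(a, lo, m - 1);
        quicksort_random(a, m + 1, hi);
    }

    int main(void)
    {
        srand((unsigned)time(NULL));
        int a[] = {5, 3, 8, 1, 9, 2, 7};
        quicksort_random(a, 0, 6);
        for (int i = 0; i < 7; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }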


Interestingly, that's not quite true: see http://www.scss.tcd.ie/~pbiggar/jea-2008.pdf, section 8.3.

I'll point out, too, that no quicksort implementation uses a truly random choice of pivot. Typically they use median-of-three pivot selection as a pseudo-random stand-in.
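For reference, a minimal sketch of median-of-three pivot selection, assuming the common convention of taking the first, middle, and last elements as the three candidates:

    /* Sketch: pick the pivot as the median of a[lo], a[mid], a[hi]. */
    #include <stdio.h>

    static int median_of_three(const int a[], int lo, int hi)
    {
        int mid = lo + (hi - lo) / 2;
        if ((a[lo] <= a[mid] && a[mid] <= a[hi]) ||
            (a[hi] <= a[mid] && a[mid] <= a[lo]))
            return mid;
        if ((a[mid] <= a[lo] && a[lo] <= a[hi]) ||
            (a[hi] <= a[lo] && a[lo] <= a[mid]))
            return lo;
        return hi;
    }

    int main(void)
    {
        int a[] = {9, 4, 1, 7, 3, 8, 2};
        int p = median_of_three(a, 0, 6);
        printf("pivot index %d, value %d\n", p, a[p]); /* median of 9, 7, 2 is 7 */
        return 0;
    }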


Your second point may be valid, though not literally. I've implemented quicksort, and I used (pseudo) random pivots, damnit! :)

The article you point to doesn't seem to address the complexity of quicksort with random pivots. However, that is proven to be expected O(n log n) time for any data in Motwani's Randomized Algorithms book, and I don't think the correctness of that has ever been questioned.


> The article you point to doesn't seem to address the complexity of quicksort with random pivots.

That paper is about experimental algorithmics, and avoids discussions of things which aren't practical (such as using truly random pivots).

> However, that is proven to be expected O(n log n) time for any data in Motwani's Randomized Algorithms book, and I don't think the correctness of that has ever been questioned.

Right, but since there are no truly random quicksorts, it's hardly a concern.


Please clarify: Why are random pivots not practical? And why do you say there are no truly random quicksorts? Because pseudorandom numbers aren't really random, or for some other reason?


> Why are random pivots not practical?

The cost of true randomness is really high (in execution time).

> And why do you say there are no truly random quicksorts?

Because pseudo-random numbers aren't really random.

The only reason this is important is that the pseudo-randomness means it's possible to construct an adversarial input, which could drive the quicksort to O(n^2) time.
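To illustrate the adversarial-input idea with a simpler case than reverse-engineering a PRNG seed, here is a sketch using a deterministic first-element pivot, where an already-sorted array is adversarial and the comparison count blows up to roughly n^2/2:

    /* Sketch: naive quicksort (first-element pivot) on an adversarial,
     * already-sorted input; counts comparisons to show quadratic growth. */
    #include <stdio.h>
    #include <stdlib.h>

    static long comparisons;

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    static void quicksort_first(int a[], int lo, int hi)
    {
        if (lo >= hi) return;
        int pivot = a[lo], i = lo;
        for (int j = lo + 1; j <= hi; j++) {
            comparisons++;
            if (a[j] < pivot)
                swap(&a[++i], &a[j]);
        }
        swap(&a[lo], &a[i]);
        quicksort_first(a, lo, i - 1);
        quicksort_first(a, i + 1, hi);
    }

    int main(void)
    {
        enum { N = 2000 };
        int *a = malloc(N * sizeof *a);
        for (int i = 0; i < N; i++) a[i] = i; /* sorted input is adversarial here */
        comparisons = 0;
        quicksort_first(a, 0, N - 1);
        printf("sorted input: %ld comparisons (~n^2/2 = %d)\n",
               comparisons, N * N / 2);
        free(a);
        return 0;
    }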


> If they implemented it, I'm sure they've found that there's a performance boost for large datasets.

Argh! Perhaps I take things too literally, but the logic in this statement is just painful.


Only naive implementations of quicksort take O(n^2) time. Most good implementations will switch to another sort (generally heapsort) after detecting adversarial conditions.
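A minimal sketch of that switch-over idea (essentially introsort, as the reply below notes), assuming a recursion-depth limit of roughly 2*log2(n); this is only an illustration, not any particular library's implementation:

    /* Sketch: quicksort that falls back to heapsort once the recursion
     * depth exceeds a bound, keeping the whole sort O(n log n). */
    #include <stdio.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Max-heap sift-down on a[lo..hi], used by the heapsort fallback. */
    static void sift_down(int a[], int lo, int root, int hi)
    {
        for (;;) {
            int child = lo + 2 * (root - lo) + 1;
            if (child > hi) return;
            if (child + 1 <= hi && a[child] < a[child + 1]) child++;
            if (a[root] >= a[child]) return;
            swap(&a[root], &a[child]);
            root = child;
        }
    }

    static void heapsort_range(int a[], int lo, int hi)
    {
        for (int i = lo + (hi - lo) / 2; i >= lo; i--)
            sift_down(a, lo, i, hi);
        for (int end = hi; end > lo; end--) {
            swap(&a[lo], &a[end]);
            sift_down(a, lo, lo, end - 1);
        }
    }

    static int partition_last(int a[], int lo, int hi)
    {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap(&a[i++], &a[j]);
        swap(&a[i], &a[hi]);
        return i;
    }

    static void introsort_rec(int a[], int lo, int hi, int depth_limit)
    {
        if (lo >= hi) return;
        if (depth_limit == 0) {          /* adversarial conditions detected */
            heapsort_range(a, lo, hi);   /* switch to heapsort */
            return;
        }
        int m = partition_last(a, lo, hi);
        introsort_rec(a, lo, m - 1, depth_limit - 1);
        introsort_rec(a, m + 1, hi, depth_limit - 1);
    }

    static void introsort(int a[], int n)
    {
        int depth = 0;
        for (int k = n; k > 1; k >>= 1) depth += 2; /* ~2 * log2(n) */
        introsort_rec(a, 0, n - 1, depth);
    }

    int main(void)
    {
        int a[] = {7, 2, 9, 4, 1, 8, 3, 6, 5};
        introsort(a, 9);
        for (int i = 0; i < 9; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }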


In which case they cease to be quicksort, and are instead called "introsort": http://en.wikipedia.org/wiki/Introsort


My point was that we should give good hackers the benefit of the doubt when they say quicksort, as they are likely to actually be talking about introsort. GNU C uses introsort to implement qsort, and if I were to implement quicksort for an important job, I certainly wouldn't overlook the fact that it can degrade to O(n^2).




