Timsort

shuzchen · on Nov 9, 2011

Why not read about the algorithm in a write-up by the author himself:

http://hg.python.org/cpython/file/tip/Objects/listsort.txt

Sits in the actual source code to cpython. One of the many reasons I love python. (check out the notes on how dicts are implemented too)

ch0wn · on Nov 9, 2011

    > (check out the notes on how dicts are implemented too)

Here's the link. It's absolutely worth reading: http://hg.python.org/cpython/file/tip/Objects/dictnotes.txt

stevep98 · on Nov 9, 2011

A customer once reported a problem with a product I worked on. The issue involved a user interface intended to show a list of users which were fetched from the database. They said that the software took about an hour to display all the users. The expectation was that it would be pretty much instant.

In our testing, we had only tested to 100 users, but when we tried 4000, we couldn't reproduce the problem. It took 5 seconds to show the data, not an hour.

It took a long time to track this down. It turned out that the results from the database were in sorted order, and then the UI sorts them for display with quicksort. A list in pre-sorted order is the worst case (n^2) for quicksort.

(As an aside, the results were in sorted order, even though the query did not specify that they should be sorted. The reason for this was that this table was loaded into the database from some other source, and that other source outputted the data in sorted order... So, it wasn't just a case of removing a 'ORDER BY' from the query)

After thinking about this a bit, I decided to randomize the data after it was fetched, and this brought the time required back down into the sub-10-second range. The problem with sort algorithms is that the worst cases are sometimes fatal to your application, and aren't as rare as you might think. Having an algorithm like timsort that is tuned for these not-so-rare cases sounds quite useful.

afhof · on Nov 9, 2011

While Quick Sort does have the worst case runtime of O(n^2), a proper implementation defeats this behavior almost completely. Rather than randomize the data, it would be better to randomize the algorithm (choose a random pivot). Not only is the expected runtime O(n log n) but the probability of quick sort taking even twice as long (2 * n log n) is exponentially small.

stevep98 · on Nov 10, 2011

Interesting. The quicksort we were using was in one of the standard JDK classes from some ancient JDK.

sanxiyn · on Nov 9, 2011

Timsort in Java 7 written by Google:

http://hg.openjdk.java.net/jdk7/tl/jdk/file/tip/src/share/cl...

tef____ · on Nov 9, 2011

notably, java 7 uses timsort for the default stable sort (on objects)

and dual pivot quick sort as the default unstable sort.

dual-pivot quicksort is worth a look too, if you enjoy timsort

http://gdtoolbox.com/DualPivotQuicksort.pdf

shin_lao · on Nov 9, 2011

On C++ the tests I did two years ago showed that introsort was still the best "all around" algorithm for "uniformly distributed data". I also tested an algorithm called Smoothsort that was supposed to work better for "almost sorted data" but in practice it was slower. Timsort might be a good candidate, although I wonder if the gain will translate to the C++ world.

What was surprising is that in hinting the compiler through template hacking I was able to squeeze a little more speed out of introsort (If you're interested: https://github.com/flexsortea/flex_sort).

JoachimSchipper · on Nov 9, 2011

Timsort isn't intended for uniformly distributed data. In fact, I'm really surprised that quicksort didn't beat introsort in your tests - it may have some nasty edge cases, but you'll never hit those on truly random input and avoiding the (minimal) extra work of introsort should help.

jemfinch · on Nov 9, 2011

Timsort optimizes for partially sorted data, but that's a long way from saying it's not intended for uniformly distributed data. It's intended for all data, that's why it's the default sort in Python. It's no less intended for all data than merge sort is.

Even on random data, 50% of the time you'll pick a bad pivot. Get a slightly unlucky run and you're into quadratic behavior for that subset; the more data you're sorting, the more likely introsort's switch to heapsort on deep recursion (and insertion sort on small ranges) will help.

JoachimSchipper · on Nov 9, 2011

Did you ever actually time qsort? It's variability is nowhere near "it goes quadratic 50% of the time". Timings from http://www.joachimschipper.nl/posts/20111109/qsort_timing.c (portable C, note the appended non-introspective qsort() implementation):

qsort: 39 mergesort: 39 qsort: 38 mergesort: 40 qsort: 37 mergesort: 39 qsort: 38 mergesort: 40 qsort: 38 mergesort: 40 qsort: 38 mergesort: 40 qsort: 38 mergesort: 39

jemfinch · on Nov 9, 2011

Did you read my post? What I said was nowhere near "it goes quadratic 50% of the time". You're arguing against a straw man.

shin_lao · on Nov 9, 2011

Introsort only adds a depth check and size check to each recursion, that's only a couple of instructions with little memory access.

CJefferson · on Nov 9, 2011

I tried having a very quick dig through your code, but I couldn't actually see what you were doing.

Could you perhaps post a quick description, and the gains you were seeing? I'm interested because if it isn't too hard it could always be ported into g++ / clang's libc++.

shin_lao · on Nov 9, 2011

It should work to g++ easily, provided the usual "typename" adjustments.

Basically I wrote each sort block as individual functors (partitioning, merging). You run that through a filter stack that decides what to do depending on the input size and the current recursion depth.

It's faster (I think) because when the recursion depth is a compile-time value, the compiler can unroll the recursion.

It's old code I wrote for Boost but never had the time to finish & submit. As I quickly read through it there's a couple of rough edges to polish.

KaeseEs · on Nov 9, 2011

Function objects are not 'functor's. Please do not call them that. Functor is a word that has a specific meaning in category theory that winds up being relevant in some other programming languages (Haskell, ML and friends) and we don't need to shoot ourselves in the foot by overloading it with another meaning.

shin_lao · on Nov 10, 2011

A functor is another word for "function object".

http://en.wikipedia.org/wiki/Function_object

gfx · on Nov 10, 2011

I've ported Java's timsort to C++:

https://github.com/gfx/cpp-TimSort

and concluded that it can be useful for scripting languages like Python but not so useful for C++.

ww520 · on Nov 9, 2011

Worst case of O(nlogn) is pretty good, better than QuickSort.

tylerneylon · on Nov 9, 2011

A bad implementation of quicksort has O(n^2) worst case, and that's not really a fair comparison. I'm not defending quicksort, just pointing out that a fair comparison should keep in mind that we can easily select better pivots to achieve best-, average-, and worst-case complexity of O(n log n) for quicksort.

Comparing sort algorithms like this shows that we really care about more than just time complexity. We care about memory used or swapped down to the byte, quirks on popular processors, and expected run times on real-world data.

jemfinch · on Nov 9, 2011

You say that you're not "defending quicksort"--not that it really matters if you were, it's a great algorithm--but you are. And you're wrong.

We can't easily select better pivots for quicksort. For every deterministic selection of a pivot based on a fixed number of elements, a case can easily be constructed which will demonstrate its descent into quadratic behavior.[1]

In order to select a "good enough" pivot for quicksort, you can't just examine 3 elements or 5 elements or 10 elements; you must examine 0.5N elements or 0.25N elements or 0.1N elements. If you pick the median of that subset of elements (which you must do in linear time, mind you! Have you read CLR's "median of medians" algorithm, the one you have to use because quickselect doesn't guarantee linear time?) then you're guaranteed to complete in O(n log n) time, but at what cost? You've massively increased both the complexity and the constant factors of your solution.

[1] http://www.cs.dartmouth.edu/~doug/mdmspe.pdf is one example of how to do so.

eru · on Nov 9, 2011

> [...] but at what cost? You've massively increased both the complexity and the constant factors of your solution.

That's what you get for derandomizing your algorithms.

tef____ · on Nov 9, 2011

also merge sort is stable :-)

rkalla · on Nov 9, 2011

Yea, Tim is an organized guy.... jazz hands

haileys · on Nov 9, 2011

This is not reddit

jemfinch · on Nov 9, 2011

The downvotes will suffice in the future, I think.

tef____ · on Nov 9, 2011

the approach is used in functional languages too (without galloping)

for example 'run sort' in prolog

http://www.scss.tcd.ie/publications/tech-reports/reports.05/...