The Skip list

silentbicycle · on March 6, 2010

Previously: http://news.ycombinator.com/item?id=229668

First things first, read Pugh's original paper (ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf). That would be the better link, but HN doesn't let you submit links to FTP sites.

Also: A lecture I watched compared skip lists to subway routes with multiple (randomly routed) express lines. That's one of the best metaphors for a data structure I've ever heard.

jaydub · on March 7, 2010

A lecture I watched compared skip lists to subway routes with multiple (randomly routed) express lines.

I had Prof. Pugh for an honors CS course @ UMD and that's how he first described it to the class.

CoryOndrejka · on March 6, 2010

I read Pugh's original paper in the late 90's and thought skip lists were a heck of lot simpler than red-black trees. Ended up using them all over Second Life's original code base. By 2004 or so we switched the code over the STL and the LLSkipList class was retired to the dark depths of the CVS/SVN repository.

tptacek · on March 6, 2010

Maybe simpler, but better? Probably not so much. I wanted red-blacks on a project that couldn't use STL, and found that porting STLPort's <map> red-black to C-with-void*'s was pretty straightforward. Didn't even have to write and debug a red-black tree: <map> is bulletproof.

eru · on March 6, 2010

You can even let the type system guarantee the red-black-tree laws. That's even better than idiot-proof. (http://www.cs.kent.ac.uk/people/staff/smk/redblack/rb.html has an implementation in Haskell.)

Shouldn't trees be a nightmare to maintain without a GC? (Are you modifying in-place?)

CoryOndrejka · on March 6, 2010

In hindsight that would have made a lot of sense :-) at the time we had looked at STL, but getting it to work across Debian and Win2k (which is what SL was running on at the time) looked like a PITA. Didn't even think about porting STL, especially since skiplists were so simple to implement.

antirez · on March 6, 2010

Redis sorted sets are implemented using skip lists (augmented skip lists for O(log(N)) Rank operation in Redid master).

Redis Sorted Sets additional info here: http://code.google.com/p/redis/wiki/SaddCommand

ntoshev · on March 6, 2010

I was looking at Redis yesterday and noticed this. Is there any particular reason you chose skip list instead of btrees except for simplicity? Skip lists consume more memory in pointers and are generally slower than btrees because of poor memory locality so traversing them means lots of cache misses.

I also suggested a way to improve throughput when you guarantee each command's durability (at the end of the wiki page): http://code.google.com/p/redis/wiki/AppendOnlyFileHowto

Also, have you thought about accommodating read-only traffic in an additional thread as a way to utilize at least two cores efficiently while sharing the same memory?

antirez · on March 6, 2010

There are a few reasons:

1) They are not very memory intensive. It's up to you basically. Changing parameters about the probability of a node to have a given number of levels will make then less memory intensive than btrees.

2) A sorted set is often target of many ZRANGE or ZREVRANGE operations, that is, traversing the skip list as a linked list. With this operation the cache locality of skip lists is at least as good as with other kind of balanced trees.

3) They are simpler to implement, debug, and so forth. For instance thanks to the skip list simplicity I received a patch (already in Redis master) with augmented skip lists implementing ZRANK in O(log(N)). It required little changes to the code.

About the Append Only durability & speed, I don't think it is a good idea to optimize Redis at cost of more code and more complexity for a use case that IMHO should be rare for the Redis target (fsync() at every command). Almost no one is using this feature even with ACID SQL databases, as the performance hint is big anyway.

About threads: our experience shows that Redis is mostly I/O bound. I'm using threads to serve things from Virtual Memory. The long term solution to exploit all the cores, assuming your link is so fast that you can saturate a single core, is running multiple instances of Redis (no locks, almost fully scalable linearly with number of cores), and using the "Redis Cluster" solution that I plan to develop in the future.

Andys · on March 7, 2010

I have implemented a skip list library that stores all nodes of the same "level" together in the same file/memory map.

This fixes the cache locality problem.

Skip lists also lend themselves well to concurrent multi-threaded access, especially update and deletion.

crux_ · on March 6, 2010

Without knowing how Redis works, maybe they were chosen because it's really quite easy to build a highly concurrent (lock-free) skip list?

ntoshev · on March 6, 2010

No, Redis is single-threaded so they sidestep all concurrency issues. This and persisting the data in a forked process are my favorite parts of Redis' architecture.

eru · on March 6, 2010

How do you develop a persistent (i.e. purely functional) version of them?

camccann · on March 6, 2010

Brief nitpicking: "persistent" and "purely functional" are not really the same thing.

On the "purely functional" end, any stochastic algorithm is by definition not pure in the "referentially transparent" sense, so that's right out, and building a PRNG into the data structure is probably suboptimal.

As far as persistence/immutability go, the skip list doesn't seem well suited to that, either--any time you change an element, you'd have to replace all the elements with pointers to it, and elements with pointers to those, and so on recursively. There are standard solutions to this, but all would entail nontrivial changes to the structure.

In the end, it would be easy to make something not obviously broken that looked like a skip list, but I don't know whether it would have the same performance characteristics, or even worthwhile performance at all.

Given that skip lists are, in essence, a way of retrofitting some of the benefits of a tree structure onto a linked list, I suspect it would be more worthwhile to find an existing immutable, functional tree structure and retrofit some of the benefits of a linked list onto it.

eru · on March 6, 2010

> Brief nitpicking: "persistent" and "purely functional" are not really the same thing.

You are right about that. But I did not have time for a longer explanation and some people take "persistent" to mean "can be saved to disk".

For stochastic algorithms you will probably use the same trick as with monadic I/O --- you can make the description of the algorithm purely functional, but not its execution.

camccann · on March 6, 2010

For stochastic algorithms you will probably use the same trick as with monadic I/O --- you can make the description of the algorithm purely functional, but not its execution.

Certainly possible, but how well this works depends on how, and to what extent, referential transparency is enforced by the language.

For instance, in Haskell, there really isn't a satisfying solution that I'm aware of. Threading a PRNG (as with the state monad) keeps the code "pure", but enforces a strict sequencing on use of the structure, wrecking laziness. On the other hand, any nondeterminism without threading a PRNG probably requires unsafePerformIO.

chancho · on March 6, 2010

'Probabilistic' doesn't have to mean 'nondeterministic'. Hash tables are deterministic and probabilistic. You could probably do the same with a skip list: hash the key to see what level of the skip list it gets promoted to.

eru · on March 7, 2010

If you split your PRNG, you can 'tree' it through your algorithm instead of threading it. That should help with lazyness.

keefe · on March 6, 2010

local state, I like to have it... I think this is a nice argument for using oop

chancho · on March 6, 2010

local state, you will not like what concurrent threads do to it... For every nice argument there is a counter.

keefe · on March 6, 2010

I'm absolutely fine with what concurrent threads would do to it? Thread synch stuff is really not hard once you've done it a while. It's actually one of the more fun pieces of programming, given that you have jprofiler (or similar) to inspect what's going on in the jvm as you try different things. To me, it is a very logical idea and powerful tool to pair up a bit of memory with a bit of algorithm.

eru · on March 8, 2010

But --- is it composable?

keefe · on March 8, 2010

Probably not, but I'm getting good throughput and low latency in performance tests and I never see a deadlock so who cares?

pmjordan · on March 6, 2010

There aren't any cycles, so this shouldn't be difficult. I'm not sure how efficient it'll be though, even with laziness.

crux_ · on March 6, 2010

An easy first guess as to efficiency would be O(n) for insertions and deletions, since that's what you need to maintain a sorted list (which is the lowest level of a skip-list).

chancho · on March 6, 2010

It doesn't have to be a linked list though. There are other purely functional sequence data structures which support O(log n) insertion and deletion, e.g. Haskell's Data.Sequence, which is a finger tree of some sort. If you could find a way to do binary search on one of these in better than O(log^2 n) time you'd pretty much have your purely functional skip list.

pkrumins · on March 6, 2010

MIT Algorithms Video Lecture on skip lists:

http://www.catonmat.net/blog/mit-introduction-to-algorithms-...

keefe · on March 6, 2010

It's certainly a very interesting data structure. However, I don't think this level of an optimization should show up in your code often. I prefer coding to Map<K,V> and constructing the particular instance in some function that starts with a hash table and gets optimized later depending on exactly where performance problems are.

lucifer · on March 6, 2010

There is a very nice implementation of a concurrent skiplist map in Java collections: http://java.sun.com/javase/6/docs/api/java/util/concurrent/C...

__bjoernd · on March 6, 2010

Is this the week of posting Wikipedia links to HN then?

Interesting, though.