Hacker News new | past | comments | ask | show | jobs | submit login
A Fast Wait-Free Hash Table that Scales Linearly to Hundreds of Threads (youtube.com)
59 points by caustic on May 16, 2010 | hide | past | favorite | 14 comments



If you are into wait-free and lock-free data structures, you owe it to yourself to get Herlihy and Shavit's "Art of Multiprocessor Programming."

http://www.amazon.com/Art-Multiprocessor-Programming-Maurice...

Not only does it provide a well-debugged, well-written collection of wait-free data structures; it teaches you to think about concurrency in a structured way. The mutual exclusion chapter doesn't teach you which pthread routines to use, but instead tells you how to implement mutual exclusion, and the trade-offs inherent in, e.g., spin locks vs. queueing locks. A significant bonus is Maurice Herlihy's distinctive prose voice: correct, concise, and funny in that order. It is the best computer book I've read recently, and the one I recommend to all colleagues who are trying to take their concurrency chops up to the next level.

(I took a course from Herlihy in '99, but am otherwise unaffiliated.)


I just got this book as an interlibrary loan based on this recommendation, and so far I'm not finding it that helpful. This isn't because it's a bad book: to the contrary, just as you say it's very readable. But it's also very strongly a book about concurrency in Java.

My first thought was that this wouldn't matter --- an algorithm is an algorithm. But for the purposes I'm interested in (multiprocessor malloc replacements, fast userspace read-write locks, generally low-level Linux on x86) I'm finding the high-level theoretical view to be impractical.

It's not that that using Java for the examples is a problem, but that apart from the appendices there is very little discussion of what I find are the real performance issues: cache misses, memory bandwidth, and processor pipelines. All of which could be considered hardware specific implementation details, but all of which make the difference between a 'provably correct solution' and one that actually performs well in the real world.

Anyway, a fine book, but perhaps better titled "The Theory of Concurrency in Java". I'll keep reading it, and I'm sure I'll learn from it, but I'll also keep looking for something more nitty and gritty.



He also gave a version of the talk linked here as a Google Tech Talk http://video.google.com/videoplay?docid=2139967204534450862#


This is actually a two-part talk. At the 50 minute mark, there's a case study on optimizing a heavily-threaded Java app. It's partially about throwing heaps of RAM and cores at the problem (using the Azul platform), but it's also a general refresher on how to attack performance problems that involve lots of threads.


Now that is something worth patenting... I'm kidding of course (I am all against them), but it seems non-trivial and non-obvious approach, unlike many other ridiculous patents.


Is there a downloadable version of the video anywhere?


How does this compare to Clojure's approach?


Clojure relies on persistent containers - i.e. immutable containers where every mutating operation returns a logically new container, which are cheap (logarithmic) when implemented in terms trees - with transactional memory slots, which may refer to such a container. An update might be in the form of a function mapping the old container value to the new container value (with a value added or removed), but it may be applied multiple times before it "gets in", according to transactional semantics (rollback if conflict etc.).

This approach is different; for the specific purpose of concurrent updates to a hash table, Cliff's approach is far more scalable to multiple threads. But in many ways, the fact that it's a hash table is beside the point. Cliff is presenting this as a different way of thinking about lock free concurrent programming: consider every possible state and transition, and make sure that they all make sense.

As a way of reasoning about multithreaded programs, Cliff's approach is not as scalable as Clojure's approach, but it's very useful for thinking about low-level implementation details.


OK, but let me restate my question then:

Does Cliff's approach result in a faster hash table than Clojure's? Is it "better"? In some situations or in all? If so, which?

When would you want to use Cliff's instead of Clojure's built in one? Would you at all?


Far faster. You'd want to use Cliff's version if you have thousands of threads all potentially mutating the table at the same time.

But thinking in these terms is crude; it's more important to understand the principles, hence my answer. It's like asking which is better: taking the train or driving in a car?


Sure, I understand, but I don't think my question was crude. I agree it's important to understand the principles.

I understand the basic idea of how Clojure's data structures work with STM and how they're immutable and how a "new" data structure is created upon mutation without copying the old one. But that understanding isn't helping me.

I don't understand Cliff's version well enough to be able to perform benchmarks in my head, nor do I understand either of the data structures well enough to understand the principles (as you said) to know when to pick one over the other.

So you said that when thread-count > 1000 I should pick Cliff's? Is it that simple? I would suspect not. What if there's only 2 threads involved? I'd just like a generalized overview of which is better for what types of situations, and by all means, to understand the "why" of it would be great too.

Or is the idea that because Cliff's version is mutable and does the whole "conflating state w/identity" thing, that it has no place in Clojure at all?


With STM and persistent data structures, you're fundamentally limited by how fast you can change a single memory location; and you can't freely modify the memory location because of the risk of missed updates. Cliff's table operates concurrently over the whole data structure - there's no single bottleneck, rather every slot in the table is potentially an update location.

When I said thousands of threads, I was stressing the massively concurrent potential of Cliff's design, but more importantly, the point where the hardware is fully subscribed and physical hardware threads are frequently trying to update the data structure. If there are only 2 threads involved on a dual-core machine, depending on everything else (constant factors), I would still expect Cliff's table to be faster.

Clojure's approach is at a different level of abstraction than Cliff's approach. Probably a better analogy would be whether it's better to manufacture a car or use a travel agent. Not everybody could safely manufacture a car, but almost anyone can use a travel agent with moderate phone, social and organizational skills. But they also solve problems with radically different scopes.


I don't think you can have a hashtable in closure. From what I understand, none of their collections are array based, where as this is.

Clojure uses trees whereas java uses maps primarily.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: