With STM and persistent data structures, you're fundamentally limited by how fas...

With STM and persistent data structures, you're fundamentally limited by how fast you can change a single memory location; and you can't freely modify the memory location because of the risk of missed updates. Cliff's table operates concurrently over the whole data structure - there's no single bottleneck, rather every slot in the table is potentially an update location.

When I said thousands of threads, I was stressing the massively concurrent potential of Cliff's design, but more importantly, the point where the hardware is fully subscribed and physical hardware threads are frequently trying to update the data structure. If there are only 2 threads involved on a dual-core machine, depending on everything else (constant factors), I would still expect Cliff's table to be faster.

Clojure's approach is at a different level of abstraction than Cliff's approach. Probably a better analogy would be whether it's better to manufacture a car or use a travel agent. Not everybody could safely manufacture a car, but almost anyone can use a travel agent with moderate phone, social and organizational skills. But they also solve problems with radically different scopes.