Remember -- this is an on-disk compactor, so it's not quite the same as collecti...

karterk · on March 5, 2015

In another thread, you mentioned that "I don't think we'd use this design again". I'm curious to hear how you would design such a system differently, without using the LSM tree approach that's popular these days.

coffeemug · on March 5, 2015

I don't think the LSM tree approach really panned out (in the sense that the benefits don't outweigh the drawbacks). You get much better insert and update throughput at the cost of significant production stalls. Most people don't need that level of insert throughput (and if they do, they can get it by scaling horizontally). Even if you need that throughput on a single box, most people aren't OK with long stalls. Facebook has been doing some work to minimize stalls in an LSM storage engine, but this is a significant engineering effort that only really makes sense for a few companies.

RethinkDB's storage engine uses a different architecture -- it gets you better insert/update performance on SSDs without stalls (though not as much throughput as LSM-based engines), in exchange for significant engineering effort to make the engine bulletproof. Again, most of the time, people can get that by scaling horizontally.

I think that in 99% of cases a traditional storage engine approach works just fine. We all tried to reinvent the wheel, but ultimately it turned out to be a lot of work for fairly little benefit.

Padding · on March 5, 2015

> I think that in 99% of cases a traditional storage engine approach works just fine. We all tried to reinvent the wheel, but ultimately it turned out to be a lot of work for fairly little benefit.

Please publish this in a paper or at least a blog article so I can properly quote you the next time a discussion on ACID comes up. :)