it seem that the performance gain of not having all update go to a single log co...

ezrahoch · on March 12, 2017

1) Keeping a log mechanism per bucket would be expensive, since it is a heavy mechanism. It will cost memory & network traffic. The additional information needed by Bizur for a bucket is minimal, basically, just a version.

2) In Bizur, rejoining the cluster includes the equivalent of copying the snapshot. However, that node can start participating in the Bizur even before it finished reading the entire snapshot, because of the key-value data model. Once a bucket has been copied, it can start serving it, which is much faster than log-based algorithms.

3) You are correct about the size of a bucket. In our implementation we aim to have up to 8-10 keys in a bucket, to avoid the overhead you mentioned. That's why it's especially important to be very efficient pet-bucket (see point #1).

skyde · on March 15, 2017

Thanks a lot for clarification.