
Cassandra user in production here. Could you elaborate a little on what was messed up (size, false positives, etc.) and which version this was? Did you have trouble with Cassandra itself that you tracked down to the bloom filters, or were you trying to reuse the bloom filters in another project?



I'm only familiar with a Python clone of the Cassandra implementation ("hydra" -- used it a while back), but two issues I do remember are: 1) I believe it only uses 32-bit ints for the bit array addressing, so you can overflow it (this may also be less than ideal from a hash-distribution perspective, but I don't know offhand); and 2) as someone coming from a different background, I found the whole thing to have a bit too much "OO" magic, with several helper classes to set up the filter that (to me) obfuscate what's really going on.
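
To make the 32-bit addressing concern concrete, here is a rough sketch. It is not hydra's or Cassandra's actual code, and every name in it (`required_bits`, `fits_in_int32`, `wrap_int32`) is made up for illustration. It uses the standard bloom filter sizing formula m = -n ln(p) / (ln 2)^2 and assumes a signed 32-bit index, which is my recollection rather than something I've verified.

```python
import math

# Largest value a signed 32-bit integer can hold (assumed index type).
INT32_MAX = 2**31 - 1


def required_bits(n_items: int, fp_rate: float) -> int:
    """Bit-array size from the standard formula m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))


def fits_in_int32(n_items: int, fp_rate: float) -> bool:
    """True if the highest bit index (m - 1) is addressable with a signed int32."""
    return required_bits(n_items, fp_rate) - 1 <= INT32_MAX


def wrap_int32(index: int) -> int:
    """Simulate two's-complement wraparound of a signed 32-bit index.

    Python ints never overflow on their own, so the wrap is emulated here.
    """
    return (index + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    for n in (1_000_000, 1_000_000_000):
        m = required_bits(n, 0.01)
        print(f"n={n}: needs {m} bits, fits in int32: {fits_in_int32(n, 0.01)}")
    # An index one past INT32_MAX wraps to a negative number.
    print(wrap_int32(INT32_MAX + 1))
```

With a 1% false-positive target, a million items fit comfortably, but a billion items need roughly 9.6 billion bits, well past what a signed 32-bit index can address.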



