Hacker News

What happens if the table sees a large number of deletes? The Bloom filter's false-positive rate will keep increasing over time, since deleted rows can't be removed from the filter.

One way to address this is to rebuild the filter every n deletes, but that sounds similar to the AUTOVACUUM issues in PostgreSQL and might cause unexpected drops in performance.
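To illustrate why deletes are a problem in the first place, here is a minimal Bloom filter sketch (the class and names are hypothetical, not from the article): once a row's bits are set, there is no way to unset them safely, so deleted rows keep answering "maybe present".

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch (hypothetical illustration)."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k     # m bits, k hash functions
        self.bits = 0             # bit array packed into one int

    def _positions(self, item):
        # Derive k positions by hashing the item with different salts.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all(self.bits >> p & 1 for p in self._positions(item))

bf = BloomFilter()
bf.add("row-1")
# ... "row-1" is later deleted from the table, but its bits remain set,
# so the filter still answers True for it — a stale false positive.
assert bf.might_contain("row-1")
```

Because there is no delete operation, the only way to shed those stale entries is to rebuild the filter from the live rows, which is exactly the rebuild-every-n-deletes cost mentioned above.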




We use rotating filters for that. Items are added to both the current and the next Bloom filter; on rotation we stop serving from the current one, serve from the next, delete the old filter, and start a fresh next one.
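The rotation scheme described above can be sketched as a double-buffered pair of filters (class names are hypothetical; a set stands in for a real Bloom filter so the behavior is easy to follow). Deleted keys simply age out because they stop being re-added before the next rotation.

```python
class SetFilter:
    """Stand-in for a real Bloom filter (a set has no false positives)."""
    def __init__(self):
        self.items = set()
    def add(self, item):
        self.items.add(item)
    def might_contain(self, item):
        return item in self.items

class RotatingBloom:
    """Double-buffered filters: writes go to both, reads hit only one."""

    def __init__(self, make_filter):
        self.make_filter = make_filter
        self.current = make_filter()  # the filter being served
        self.next = make_filter()     # pre-warmed successor

    def add(self, item):
        self.current.add(item)
        self.next.add(item)

    def might_contain(self, item):
        return self.current.might_contain(item)

    def rotate(self):
        # Drop the old filter (and any deleted keys baked into it),
        # promote the pre-warmed successor, start a fresh next filter.
        self.current = self.next
        self.next = self.make_filter()
```

A key that is never re-added after its row is deleted disappears from the served filter within two rotations, so the false-positive rate is bounded without ever doing a full rebuild.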

Another option is a cuckoo filter; it is like a Bloom filter but supports deletion.
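A minimal sketch of the idea (hypothetical class, simplified from the standard partial-key cuckoo hashing design): each key stores a short fingerprint in one of two candidate buckets, and because the fingerprint physically occupies a slot, deleting means removing it again.

```python
import hashlib
import random

class CuckooFilter:
    """Minimal cuckoo filter sketch. num_buckets must be a power of two
    so the XOR trick below maps each bucket to its partner and back."""

    def __init__(self, num_buckets=256, bucket_size=4, max_kicks=200):
        self.buckets = [[] for _ in range(num_buckets)]
        self.n, self.bucket_size = num_buckets, bucket_size
        self.max_kicks = max_kicks

    def _hash(self, data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item):
        return self._hash(b"fp:" + item.encode()) % 255 + 1  # 1..255

    def _indices(self, item, fp):
        i1 = self._hash(item.encode()) % self.n
        i2 = (i1 ^ self._hash(str(fp).encode())) % self.n  # partial-key
        return i1, i2

    def add(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indices(item, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: evict a random fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            victim = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][victim] = self.buckets[i][victim], fp
            i = (i ^ self._hash(str(fp).encode())) % self.n
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # table too full

    def contains(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indices(item, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indices(item, fp)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

The tradeoff versus a plain Bloom filter: inserts can fail when the table gets too full, and a fingerprint collision can delete the "wrong" key, but delete-heavy workloads no longer accumulate stale entries.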


The article says "at the start of the join operation the bits will be set in the bloom filter".

So maybe this is built for every query? It would be needed anyway if there is a WHERE clause on the joined table.


You could clear the individual bit positions for the deleted key, which would also invalidate every other item that hashes to those bits, but that may still be cheaper than recalculating everything or mass invalidation.
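A tiny illustration of that tradeoff (the bit positions here are made up for the example): clearing one of a deleted key's bits does remove it, but any other key sharing that bit becomes a false negative too.

```python
# Two keys whose hash positions happen to overlap at bit 7 (hypothetical):
bits = 0
for pos in (3, 7, 12):    # key A's bit positions
    bits |= 1 << pos
for pos in (7, 20, 31):   # key B's bit positions
    bits |= 1 << pos

def maybe_present(positions):
    return all(bits >> p & 1 for p in positions)

bits &= ~(1 << 7)         # "delete" key A by clearing one of its bits

assert not maybe_present((3, 7, 12))   # A is gone, as intended...
assert not maybe_present((7, 20, 31))  # ...but B is now a false negative
```

False negatives break the Bloom filter's contract (it must never miss a present item), so this only works if the affected keys are re-inserted or the query path can tolerate re-checking the base table.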

My guess, though, is that many of the use cases are data stores that don't experience deletes at all.



