I wonder why they are having so much trouble getting this to work properly with smaller RAM footprints. We have been using commercial storage appliances that have been able to do this for at least a decade now, even on systems with "little" RAM (compared to the amount of disk storage attached).
Just store fingerprints in a database, run through it at night, and fix up the block pointers...
That's why. Due to reasons[1], ZFS does not have the capability to rewrite block pointers. It's been a long-requested feature[2], as it would also allow for defragmentation.
I've been thinking this could be solved using block pointer indirection, like virtual memory, at the cost of a bit of speed.
But I'm by no means a ZFS developer, so there's surely something I'm missing.
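To make the indirection idea concrete, here is a toy sketch in Python. It is not how ZFS works; `IndirectStore`, `remap`, and `dedup_pass` are hypothetical names invented for illustration, and SHA-256 as the fingerprint is an assumption. Callers hold logical block IDs that never change, and a nightly pass merges duplicates by editing only the mapping table, so nothing that points at a block has to be rewritten.

```python
import hashlib


class IndirectStore:
    """Toy block store: logical IDs are immutable, a table maps them to slots."""

    def __init__(self):
        self.physical = {}  # physical slot -> block contents
        self.remap = {}     # logical ID -> physical slot (the indirection table)
        self.next_id = 0

    def write(self, data: bytes) -> int:
        """Store a new block in its own slot and return its logical ID."""
        lid = self.next_id
        self.next_id += 1
        self.physical[lid] = data
        self.remap[lid] = lid
        return lid

    def read(self, lid: int) -> bytes:
        # The extra table lookup on every read is the "bit of speed" being paid.
        return self.physical[self.remap[lid]]

    def dedup_pass(self) -> int:
        """Offline pass: point duplicate blocks at one slot, return slots freed."""
        seen = {}  # fingerprint -> slot kept for that content
        freed = 0
        for lid in sorted(self.remap):
            slot = self.remap[lid]
            fingerprint = hashlib.sha256(self.physical[slot]).digest()
            keep = seen.setdefault(fingerprint, slot)
            if keep != slot:
                # Only the table entry changes; the logical ID stays valid.
                self.remap[lid] = keep
                # In this toy, a slot that is not the kept one has exactly
                # one referrer (blocks are never overwritten), so drop it.
                del self.physical[slot]
                freed += 1
        return freed
```

Writing the same 4 KiB block twice and then calling `dedup_pass()` leaves one physical copy while both logical IDs keep reading the same data. The sketch ignores snapshots, crash consistency, and where the table itself lives, which is presumably where the real difficulty is.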
It looks like they’re now experimenting with reusing the indirection mechanism (created for vdev removal) for other features. One of the recent summit hackathons sketched out using indirect vdevs to perform rebalancing.
Once you get a lot of snapshots, though, the indirection costs start to rise.
You can also use DragonFlyBSD with Hammer2, which supports both online and offline deduplication. It is very similar to ZFS in many ways. The big drawback, though, is the lack of file transfer protocols using RDMA.
I've also heard there are some experimental branches that make it possible to run Hammer2 on FreeBSD. But FreeBSD also lacks RDMA support. For FreeBSD 15, Chelsio has sponsored NVMe-oF target and initiator support. I think this is just TCP though.