Just that I'm trusting the OS to re-duplicate it at block level on file write. The idea that block by block you've got "okay, this block is shared by files XYZ, this next block is unique to file Z, then the next block is back to XYZ... oh we're editing that one? Then it's a new block that's now unique to file Z too".
I guess I'm not used to trusting filesystems to do anything but dumb write and read. I know they abstract away a crapload of amazing complexity in reality, I'm just used to thinking of them as dumb bags of bits.
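For what it's worth, the bookkeeping described above can be sketched in a few lines. This is a toy model only, not how any real filesystem implements it: the `DedupStore` class, its method names, and the use of SHA-256 as the block key are all assumptions made up for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store: identical blocks are stored once and reference-counted."""

    def __init__(self):
        self.blocks = {}   # digest -> block bytes
        self.refs = {}     # digest -> number of file slots pointing at the block
        self.files = {}    # file name -> list of digests, one per block

    def _acquire(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest

    def _release(self, digest):
        self.refs[digest] -= 1
        if self.refs[digest] == 0:   # nothing points at it any more
            del self.refs[digest]
            del self.blocks[digest]

    def write_file(self, name, chunks):
        old = self.files.get(name, [])
        self.files[name] = [self._acquire(c) for c in chunks]
        for digest in old:
            self._release(digest)

    def write_block(self, name, index, data):
        """Editing one block never touches the copy that other files share."""
        old = self.files[name][index]
        self.files[name][index] = self._acquire(data)
        self._release(old)

    def read_file(self, name):
        return [self.blocks[d] for d in self.files[name]]


store = DedupStore()
for name in ("X", "Y", "Z"):
    store.write_file(name, [b"aaaa", b"bbbb", b"cccc"])
stored_before = len(store.blocks)   # three files, one stored copy of each block

store.write_block("Z", 1, b"ZZZZ")  # Z's middle block is now unique to Z
stored_after = len(store.blocks)
```

After the edit, X and Y still read back their original middle block, and only the one changed block costs extra space.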
ZFS is COW with checksums, so it never edits blocks in place, and you can send snapshots to another pool (which may or may not use deduplication).
Deduplication comes with a performance cost, though. I had all my photos spread across various disks and external media, sometimes with extra copies (as I did not trust certain disks). If I remember correctly, I went from 3.6T to 2.2T of usage by consolidating all my photos onto a deduplicated pool. All fine, but the zpool wanted way more RAM and felt slower than my other pool.
After I removed the duplicates (with the help of https://github.com/sahib/rmlint), I migrated my photos to an ordinary zpool instead.
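If anyone wants to see the general idea behind file-level duplicate hunting, here is a minimal sketch. It is not rmlint's algorithm or interface; `file_digest` and `find_duplicates` are hypothetical helper names, and the approach (group by size first, then hash only the candidates) is just one common way to do it.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large photos need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups (sorted lists of paths) of files with identical content."""
    # Cheap first pass: files of different sizes cannot be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Expensive second pass: hash only files that share a size.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates("/path/to/photos")` (path is a placeholder) only reports the groups; deciding which copy to keep is left to you.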
I mean, it's happening whether you're using snapshots or not; ZFS basically never overwrites things in place. Snapshots just mean it doesn't free the old copy, because something still references it.
CoW is always happening, but without snapshots you know that every block has exactly zero or one references, and it's much simpler. No garbage collection, no complicated data structures, all you need is a single tree and a queue of dead blocks.
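A toy model of that difference, assuming a flat block map instead of a real tree. `CowVolume` and its fields are invented for illustration, and scanning every snapshot on each write is a deliberately naive stand-in, not how ZFS tracks references.

```python
class CowVolume:
    """Toy copy-on-write volume: writes always go to a fresh address."""

    def __init__(self):
        self.next_addr = 0
        self.disk = {}        # address -> data
        self.live = {}        # logical block number -> current address
        self.snapshots = []   # each snapshot is a frozen copy of the block map
        self.dead = []        # queue of addresses that can be freed

    def write(self, block_no, data):
        addr = self.next_addr
        self.next_addr += 1
        self.disk[addr] = data              # never overwrite in place
        old = self.live.get(block_no)
        self.live[block_no] = addr
        if old is None:
            return
        # Without snapshots the old copy has zero references and is dead at once.
        # With snapshots it stays allocated while any snapshot still points at it.
        if not any(old in snap.values() for snap in self.snapshots):
            self.dead.append(old)

    def snapshot(self):
        self.snapshots.append(dict(self.live))


vol = CowVolume()
vol.write(0, b"v1")
vol.write(0, b"v2")      # no snapshot yet: address 0 goes straight to the dead queue
vol.snapshot()
vol.write(0, b"v3")      # the snapshot still references address 1, so it is kept
```

With no snapshots the `any(...)` check is trivially false, which is the "zero or one references" case: every replaced block can go straight onto the dead queue.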