Deduplicating compressed data is difficult if you want to be able to reconstruct the original byte-for-byte. Zip, for example, has many ways to compress the same data, so it's difficult to decompress a zip and keep the information necessary to reconstruct the exact same compressed file.
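To illustrate what I mean, here is a minimal sketch using Python's standard `zlib` module (the same DEFLATE family that zip normally uses). The same input gives different compressed bytes depending on the level, even though both decompress to identical data. The helper `find_zlib_level` is a hypothetical name of my own: it brute-forces the level that reproduces a stream exactly. It only has a chance of working when the stream was produced by the same zlib build with default settings, which is an assumption that will not hold for arbitrary zip files.

```python
import zlib


def find_zlib_level(compressed: bytes):
    """Return a zlib level that reproduces `compressed` exactly, or None.

    Hypothetical helper: assumes the stream came from this same zlib
    build with default strategy and window size.
    """
    raw = zlib.decompress(compressed)
    for level in range(0, 10):
        if zlib.compress(raw, level) == compressed:
            return level
    return None


data = b"hello world " * 1000

fast = zlib.compress(data, 1)  # fastest compression level
best = zlib.compress(data, 9)  # best compression level

# Same content after decompression, different bytes on disk.
same_content = zlib.decompress(fast) == zlib.decompress(best) == data
different_bytes = fast != best

# If a matching level is found, storing (raw data, level) is enough to
# rebuild the original stream; otherwise the compressed bytes must be kept.
level = find_zlib_level(best)
rebuilt = zlib.compress(data, level) if level is not None else None
```

If `find_zlib_level` returns `None`, the stream was made with settings this search does not cover, and the only safe option is to keep the compressed bytes as they are.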
I'm aware of one tool for zip:
https://pypi.org/project/deterministic-zip/
Of course, that only helps if it was used to create the original zip.
Is there a list of "reversible" formats somewhere? Are there other ways to deal with this?