Because people use it as a unique identifier for data. Let's say you have a file...

bick_nyers · on April 25, 2022

I would expect a tiered approach when it comes to deleting data. That is, before deleting anything you actually compare byte by byte, but because that is too expensive to run against every file, you use the hash to narrow down which files get tested against each other (and maybe put an even cheaper hash in front of that one to decide whether the expensive hash is worth the CPU cycles at all).

Perhaps that's how it's actually done, I'm not sure.
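
For what it's worth, here is a rough sketch of the tiering I have in mind, not a claim about how any real deduplication tool does it. The function names, the choice of file size as a free first filter, CRC32 over the first 4 KiB as the "cheap hash", and SHA-256 as the "expensive hash" are all my own assumptions for illustration.

```python
import filecmp
import hashlib
import os
import zlib
from collections import defaultdict


def cheap_hash(path, chunk_size=4096):
    """CRC32 of the first chunk only: fast, but collides easily (assumed tier)."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read(chunk_size))


def full_hash(path, block_size=1 << 20):
    """SHA-256 of the whole file, read in blocks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def refine(groups, key):
    """Split each candidate group by key(path) and drop groups of one."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        out.extend(g for g in buckets.values() if len(g) > 1)
    return out


def confirmed_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    groups = [list(paths)]
    # Cheapest test first; each tier only sees survivors of the previous one.
    for key in (os.path.getsize, cheap_hash, full_hash):
        groups = refine(groups, key)

    result = []
    for group in groups:
        # Final tier: byte-by-byte comparison, so a hash collision alone
        # can never cause a file to be treated as a duplicate.
        clusters = []
        for path in group:
            for cluster in clusters:
                if filecmp.cmp(cluster[0], path, shallow=False):
                    cluster.append(path)
                    break
            else:
                clusters.append([path])
        result.extend(c for c in clusters if len(c) > 1)
    return result
```

The point of the ordering is that the expensive work (full hash, then full compare) only runs on files that already matched on everything cheaper, and deletion would only ever be based on the last step.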