
How would you calculate the hash of a file once it's broken, so you can look it up?



You have all the hashes in the .torrent file. All you need is to check the file against it regularly.

(but then the .torrent file itself has to be stored on a storage that resists bit flipping)


If you’re worried about bit-flipping, you could just store multiple copies of the hash and then do voting, since it’s small. If you’re worried about correlated sources of error that helps less, though.
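A minimal sketch of the voting idea, assuming the hash is stored as several independent copies and that a strict majority of them is still intact. The function name `vote_on_hash` is hypothetical, and SHA-1 is used only as an example digest:

```python
import hashlib
from collections import Counter


def vote_on_hash(copies: list[bytes]) -> bytes:
    """Return the value held by a strict majority of the stored copies.

    Hypothetical helper: raises ValueError when no value has a strict
    majority, since the copies can then no longer be trusted.
    """
    if not copies:
        raise ValueError("no copies to vote on")
    value, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no strict majority among the copies")
    return value


# Three stored copies of the same digest, one of which has a flipped bit.
good = hashlib.sha1(b"file contents").digest()
flipped = bytes([good[0] ^ 0x01]) + good[1:]
recovered = vote_on_hash([good, flipped, good])
```

As the comment says, this only helps against independent errors. If the same fault damages most copies in the same way, the vote will settle on the wrong value.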


>storage [...] bit flipping

As someone with no storage expertise I'm curious, does anyone know the likelihood of an error resulting in a bit flip rather than an unreadable sector? Memory bit flips during I/O are another thing, but I'd expect a modern HDD/SSD to return an error if it isn't sure about what it's reading.


Not sure if this is what you mean, but most HDD vendors publish reliability data like “Non-recoverable read errors per bits read”:

https://documents.westerndigital.com/content/dam/doc-library...


Thanks for the link. I think that 10^14 figure is the likelihood of the disk's error correction failing to produce a valid result from the underlying media, returning a read error and adding the block to the pending bad sectors. That is the typical read error that is caught by the OS and prompts the user to replace drives.

What I understand by bit flip is a corruption that gets past that check (i.e. the "flips balance themselves" and produce a valid ECC) and returns bad data to the OS without producing any errors. Only a few filesystems that make their own checksums (like ZFS) would catch this failure mode.

It's one reason I still use ZFS despite the downsides, so I wonder if I'm being too cautious about something that essentially can't happen.
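To put the 10^14 figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the spec means one non-recoverable read error per 10^14 bits read, that errors are independent (a Poisson approximation), and a hypothetical 12 TB drive. Real drives may behave quite differently, and this says nothing about silent bit flips:

```python
import math

BITS_PER_ERROR = 1e14  # assumed: one non-recoverable read error per 10^14 bits read


def expected_read_errors(bytes_read: float, bits_per_error: float = BITS_PER_ERROR) -> float:
    """Expected number of non-recoverable read errors for a given read volume."""
    return bytes_read * 8 / bits_per_error


def prob_at_least_one_error(bytes_read: float, bits_per_error: float = BITS_PER_ERROR) -> float:
    """Chance of at least one error, treating errors as independent (Poisson)."""
    return 1 - math.exp(-expected_read_errors(bytes_read, bits_per_error))


drive_bytes = 12e12  # hypothetical 12 TB drive, read once end to end
print(f"expected errors: {expected_read_errors(drive_bytes):.2f}")
print(f"P(at least one): {prob_at_least_one_error(drive_bytes):.2f}")
```

Under these assumptions 10^14 bits is 12.5 TB, so one full read of a large drive is already close to one expected reported read error.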


Just hash it before it's broken.


Maybe this is a joke that’s over my head, but the OP wants a system where damaged media can be repaired. They have the damaged media so there’s no way to make a hash of the content they want.


How far would error correction go?


If you store the Merkle tree that was used to download it, you'll be able to know exactly which chunk of the file got a bit flip.
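A simplified sketch of that idea, using a flat list of per-chunk hashes (roughly the leaves of a Merkle tree) rather than a full tree. The function names, the chunk size, and the choice of SHA-256 are assumptions for illustration, not any particular client's format:

```python
import hashlib


def chunk_hashes(data: bytes, chunk_len: int) -> list[bytes]:
    """Hash each fixed-size chunk of the data (the last chunk may be shorter)."""
    return [
        hashlib.sha256(data[i:i + chunk_len]).digest()
        for i in range(0, len(data), chunk_len)
    ]


def find_damaged_chunks(data: bytes, expected: list[bytes], chunk_len: int) -> list[int]:
    """Return indices of chunks whose hash differs from the stored one.

    Chunks missing entirely (truncated data) are reported as damaged too.
    """
    actual = chunk_hashes(data, chunk_len)
    return [
        i for i, want in enumerate(expected)
        if i >= len(actual) or actual[i] != want
    ]


# Hash the file while it is known to be good, then check a damaged copy later.
original = bytes(range(256)) * 4          # 1024 bytes of sample data
stored = chunk_hashes(original, 256)      # kept alongside the file
damaged = bytearray(original)
damaged[600] ^= 0x01                      # flip one bit in the third chunk
bad_chunks = find_damaged_chunks(bytes(damaged), stored, 256)
```

Only the chunks listed in `bad_chunks` would then need to be fetched again or repaired, instead of the whole file.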


You could do a rolling hash and say that a chunk with a given hash should appear between two other chunks of certain hashes.


That seems like a recipe for nefarious code insertion.


oh shit yeah it does lol


Just use the sector number(s) of the damaged parts.



