How would you calculate the hash of a file in order to look it up, once the file is already broken?

rakoo · 2024-04-16T08:41:16.000000Z

You have all the hashes in the .torrent file. All you need is to check the data against them regularly.

(but then the .torrent file itself has to be stored on a storage that resists bit flipping)
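
A rough sketch of what that regular check could look like, assuming a v1-style torrent where the `pieces` field is a concatenation of 20-byte SHA-1 digests, one per piece. The function names are made up for illustration, and the bencode parsing needed to get `piece_length` and `pieces` out of the .torrent file is left out:

```python
import hashlib

SHA1_LEN = 20  # each stored piece digest is 20 bytes


def make_pieces(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece (hypothetical helper for the demo)."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def find_bad_pieces(data: bytes, piece_length: int, pieces: bytes) -> list[int]:
    """Return the indices of pieces whose SHA-1 no longer matches the stored digest."""
    bad = []
    for index in range(len(pieces) // SHA1_LEN):
        expected = pieces[index * SHA1_LEN:(index + 1) * SHA1_LEN]
        chunk = data[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(chunk).digest() != expected:
            bad.append(index)
    return bad


original = bytes(range(256)) * 4          # 1024 bytes of sample data
pieces = make_pieces(original, 256)       # stands in for the .torrent hash list

damaged = bytearray(original)
damaged[600] ^= 0x01                      # flip one bit inside piece 2
bad = find_bad_pieces(bytes(damaged), 256, pieces)
```

Only the pieces listed in `bad` would need to be fetched again; everything else is known to be intact.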

arijun · 2024-04-16T09:19:19.000000Z

If you’re worried about bit-flipping, you could just store multiple copies of the hash and then do voting, since it’s small. If you’re worried about correlated sources of error that helps less, though.
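
A minimal sketch of the voting idea, assuming the copies are simply read back as byte strings from wherever they are stored. `vote` is a hypothetical name, and it does nothing about correlated errors that hit a majority of copies the same way:

```python
from collections import Counter


def vote(copies: list[bytes]) -> bytes:
    """Return the value held by a strict majority of the stored copies."""
    winner, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no strict majority among stored copies")
    return winner


good = bytes.fromhex("aa" * 20)
flipped = bytes.fromhex("ab" + "aa" * 19)  # one copy with a flipped bit
recovered = vote([good, flipped, good])
```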

Dibby053 · 2024-04-16T10:33:07.000000Z

>storage [...] bit flipping

As someone with no storage expertise I'm curious: does anyone know the likelihood of an error resulting in a bit flip rather than an unreadable sector? Memory bit flips during I/O are another thing, but I'd expect a modern HDD/SSD to return an error if it isn't sure about what it's reading.

halfcat · 2024-04-16T11:20:14.000000Z

Not sure if this is what you mean, but most HDD vendors publish reliability data like “Non-recoverable read errors per bits read”:

https://documents.westerndigital.com/content/dam/doc-library...

Dibby053 · 2024-04-16T12:17:48.000000Z

Thanks for the link. I think that 10^14 figure is the likelihood of the disk's error correction failing to produce a valid result from the underlying media, returning a read error and adding the block to the pending bad sectors. That is the typical read error that is caught by the OS and prompts the user to replace drives.

What I understand by bit flip is a corruption that gets past that check (i.e. the "flips balance themselves" and produce a valid ECC) and returns bad data to the OS without producing any errors. Only the few filesystems that compute their own checksums (like ZFS) would catch this failure mode.

It's one reason I still use ZFS despite the downsides, so I wonder if I'm being too cautious about something that essentially can't happen.
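
For a sense of scale on the 10^14 figure, here is a back-of-the-envelope sketch. It assumes the spec means about one reported (not silent) unrecoverable read error per 10^14 bits read, treats that as an average rate, and models errors as independent events; all three are assumptions, and the function names are made up:

```python
import math


def expected_read_errors(bytes_read: float, bits_per_error: float = 1e14) -> float:
    """Expected number of unrecoverable read errors for a given amount read."""
    return bytes_read * 8 / bits_per_error


def prob_at_least_one(bytes_read: float, bits_per_error: float = 1e14) -> float:
    """Chance of hitting at least one such error, under a Poisson model."""
    return 1 - math.exp(-expected_read_errors(bytes_read, bits_per_error))


# 10^14 bits is 12.5 TB, so one full read of that much data averages one error.
errors_12_5_tb = expected_read_errors(12.5e12)
chance_12_5_tb = prob_at_least_one(12.5e12)
```

This says nothing about the rate of silent corruption that slips past the drive's ECC, which is the failure mode the question is really about.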

everfree · 2024-04-16T05:04:19.000000Z

Just hash it before it's broken.

jonhohle · 2024-04-16T14:07:00.000000Z

Maybe this is a joke that’s over my head, but the OP wants a system where damaged media can be repaired. They have the damaged media so there’s no way to make a hash of the content they want.

OnlyMortal · 2024-04-17T16:45:28.000000Z

How far would error correction go?

alex_duf · 2024-04-16T07:55:22.000000Z

If you store the merkle tree that was used to download it, you'll be able to know exactly which chunk of the file got a bit flip.
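
A simplified sketch of how a stored tree pins down the damaged chunk. This is a generic binary Merkle tree over SHA-256, padded to a power of two with the hash of an empty chunk; it is not meant to match the exact layout any particular torrent client uses, and the function names are made up:

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks: list[bytes]) -> list[list[bytes]]:
    """Return tree levels: levels[0] holds leaf hashes, levels[-1] holds the root."""
    leaves = [_h(c) for c in chunks]
    size = 1
    while size < len(leaves):
        size *= 2
    leaves += [_h(b"")] * (size - len(leaves))  # pad to a power of two
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def find_damaged(stored: list[list[bytes]], current: list[list[bytes]]) -> list[int]:
    """Descend from the root, following only mismatching nodes, to the bad leaves."""
    top = len(stored) - 1
    if stored[top][0] == current[top][0]:
        return []
    frontier = [0]
    for depth in range(top - 1, -1, -1):
        frontier = [
            child
            for node in frontier
            for child in (2 * node, 2 * node + 1)
            if stored[depth][child] != current[depth][child]
        ]
    return frontier


chunks = [b"a", b"b", b"c", b"d", b"e"]
stored = build_tree(chunks)               # saved while the file was known good

damaged = list(chunks)
damaged[3] = b"D"                         # simulate corruption in chunk 3
bad_chunks = find_damaged(stored, build_tree(damaged))
```

When both trees are local, comparing the leaf hashes directly gives the same answer; the descent mostly pays off when only the root is trusted and the rest is fetched on demand.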

01HNNWZ0MV43FF · 2024-04-16T04:25:28.000000Z

You could do a rolling hash and say that a chunk with a given hash should appear between two other chunks of certain hashes

arijun · 2024-04-16T09:20:47.000000Z

That seems like a recipe for nefarious code insertion.

01HNNWZ0MV43FF · 2024-04-19T03:32:58.000000Z

oh shit yeah it does lol

selcuka · 2024-04-16T05:12:54.000000Z

Just use the sector number(s) of the damaged parts.