
Don't SSD have some kind of cyclic redundancy check?



> Don't SSD have some kind of cyclic redundancy check?

Yes, but they're pushing the technology so far to get higher densities that error correction is required for normal operation:

https://en.wikipedia.org/wiki/Multi-level_cell

> The primary benefit of MLC flash memory is its lower cost per unit of storage due to the higher data density, and memory-reading software can compensate for a larger bit error rate.[5] The higher error rate necessitates an error correcting code (ECC) that can correct multiple bit errors; for example, the SandForce SF-2500 Flash Controller can correct up to 55 bits per 512-byte sector with an unrecoverable read error rate of less than one sector per 10^17 bits read.

Making the cells smaller and cramming more bits per cell reduces the amount of energy/radiation required to trigger a bit flip, making the data more vulnerable. It sounds like they may also be increasing the energy put out by the scanners. Not a very good combination if you care about your data. Maybe they'll push the fraction of storage reserved for error-correction data even higher to compensate.

IIRC, radiation-hard chips often used processes with larger feature sizes and different materials in order to be more resistant to the effects of radiation.


Interesting! I had no idea there is increased need for error correction.

Regarding radiation, I just found this cool link about ICs in space: http://cpushack.com/space-craft-cpu.html

From my understanding, they often use redundant CPUs executing the same instructions in lockstep and roll back when they disagree.


I posted the same comment yesterday and for some reason got downvoted to oblivion. SSDs do not rely on bits being correct. Raw bit error rates in multi-level flash are generally over 1%. In other words, if you read a 4K page you are all but guaranteed to get several wrong bits. The only reason any of it works at all is because of sophisticated error-correcting codes.
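To put a rough number on "several wrong bits", here is a minimal sketch. It assumes a "4K page" means 4096 bytes and that bit errors are independent at a fixed raw bit error rate (RBER); real flash errors are correlated, so treat the result as a back-of-the-envelope estimate. The function names are made up for illustration.

```python
import math

# Assumption: "4K page" = 4096 bytes = 32768 bits.
PAGE_BITS = 4 * 1024 * 8


def expected_bit_errors(rber: float, bits: int = PAGE_BITS) -> float:
    """Mean number of flipped bits, assuming independent errors."""
    return rber * bits


def log10_prob_error_free(rber: float, bits: int = PAGE_BITS) -> float:
    """log10 of the chance that every bit in the page reads back correctly.

    Computed in log space because (1 - rber) ** bits underflows to a
    meaningless tiny number for realistic inputs.
    """
    return bits * math.log1p(-rber) / math.log(10)


if __name__ == "__main__":
    rber = 0.01  # the "over 1%" figure from the comment above
    print(f"expected errors per page: {expected_bit_errors(rber):.1f}")
    print(f"log10 P(no errors):       {log10_prob_error_free(rber):.1f}")
```

Under those assumptions a 1% RBER gives on the order of a few hundred flipped bits per page, and the chance of a completely clean read is negligible, which is why the ECC is not optional.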


> I posted the same comment yesterday and for some reason got downvoted to oblivion.

Probably because of the "90% bit error rate" part of the comment. Obviously SSDs can't tolerate anything close to 90% bit errors.

If that kind of error rate could be corrected in the general case, it would have profound implications for information theory in general, not just hard drive manufacturing. It would be equivalent to a data compression algorithm that could crunch anything by 90%.

Basically the only way this could work would be if the SSD's actual physical capacity were several times larger than specified, so that the vast majority of its space could be used for redundant encoding. I'm sure that's true to some extent, but I doubt a 100 GB drive is actually a 1 TB drive with 90% redundancy.


Based on publications from SSD controller vendors, it looks like current 3D TLC NAND has raw bit error rates on the order of 10^-3 or better when it's healthy. Raw bit error rates in the 1-2% range correspond to a drive that's either worn out its write endurance, or has been sitting on a shelf in high temperatures for years and has some data retention issues. It looks like most SSD controllers are designed to maintain some degree of usability with ~1% raw bit error rates (albeit with performance penalties), but 2% RBER is pushing the limits of even the last-resort layer of ECC.


Your logic is flawed and/or the statement misleading.

While storing more than a single bit with a bit error rate >= 50% does require at least some error modeling, it does not have profound implications for information theory¹.

Yes, normal SSDs would not have raw bit error rate anywhere near 90%. Still, I recall tales of flash media approaching raw 90% error rate - mostly shitty SD cards making use of the not-completely-trashed sectors of scrap high-capacity chips with heaps of ECC.

¹ unless you really do mean general (theoretical) 90% error correction, which would indeed imply things such as P = NP and P != NP, yet it cannot be applied as an argument on specific real-world errors.
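For a rough sense of the limits being argued about here, this sketch computes the Shannon capacity of a binary symmetric channel, C = 1 - H2(p). Modeling flash as a memoryless symmetric channel is an assumption made only for illustration (real NAND errors are asymmetric and correlated), and the function names are made up. The error rates are the ones quoted in this thread.

```python
import math


def binary_entropy(p: float) -> float:
    """H2(p) in bits; defined as 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Usable data bits per stored raw bit on a binary symmetric channel."""
    return 1.0 - binary_entropy(p)


def min_raw_bits_per_data_bit(p: float) -> float:
    """Theoretical lower bound on raw bits needed per data bit."""
    capacity = bsc_capacity(p)
    return math.inf if capacity == 0.0 else 1.0 / capacity


if __name__ == "__main__":
    for p in (1e-3, 0.01, 0.02, 0.5, 0.9):
        print(f"p={p:<6} capacity={bsc_capacity(p):.4f} "
              f"raw bits per data bit >= {min_raw_bits_per_data_bit(p):.3f}")
```

In this model, 50% is the worst case: the capacity is zero and no amount of redundancy helps. A 90% error rate has the same capacity as 10%, because a decoder that knows the error model can simply invert every bit first. That is one way to read the "error modeling" point above; it is a theoretical bound, not a claim about what any real controller achieves.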


Anyway it’s irrelevant because to the filesystem the SSD is supposed to look like it does not in fact have all these errors. ZFS doesn’t really care about the storage format apart from perhaps supporting TRIM.



