Why the heck did they decide to focus on CRC? Premature optimization?
First, CRC16 is in fact used as a hash function (a "perfect hash function" in the loose sense of one that spreads keys evenly). It is confusing to focus on the implementation and not the purpose, which is to distribute data evenly across buckets. That evenly distributing hash function is what they should name and focus on: CRC is just one special case among many totally legitimate candidates, and if tomorrow a super hyper fast one is out there ...
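To make the "purpose, not implementation" point concrete, here is a minimal sketch in which the bucketing logic only knows about "some hash function" and CRC16 is just the default plugged into it. The names `bucket_of`, `histogram` and `blake2_16` are hypothetical, and `binascii.crc_hqx` with an initial value of 0 is assumed to be a fair stand-in for the CRC16 in question (polynomial 0x1021, XMODEM variant).

```python
import binascii
import hashlib
from typing import Callable, Iterable, List

HashFn = Callable[[bytes], int]


def crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    return binascii.crc_hqx(data, 0)


def blake2_16(data: bytes) -> int:
    """A different 16-bit hash, standing in for any alternative candidate."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=2).digest(), "big")


def bucket_of(key: bytes, n_buckets: int, hash_fn: HashFn = crc16) -> int:
    """Map a key to a bucket; the hash function is a replaceable detail."""
    return hash_fn(key) % n_buckets


def histogram(keys: Iterable[bytes], n_buckets: int,
              hash_fn: HashFn = crc16) -> List[int]:
    """Count how many keys land in each bucket."""
    counts = [0] * n_buckets
    for key in keys:
        counts[bucket_of(key, n_buckets, hash_fn)] += 1
    return counts


if __name__ == "__main__":
    keys = [f"user:{i}".encode() for i in range(20000)]
    for fn in (crc16, blake2_16):
        counts = histogram(keys, 16, fn)
        # Print the emptiest and fullest bucket for each candidate function.
        print(fn.__name__, min(counts), max(counts))
```

Swapping the hash function changes one argument and nothing else, which is the sense in which the spec should talk about the distribution property rather than about CRC.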
CRC64 is used for a valid CRC use case (a checksum), BUT in an invalid domain.
Mathematically I could prove they are wrong. I know where they have a probability problem, how they are sharpening the transition between the functional and dysfunctional states, and how they are making the failure so irreversible that they will be FUBAR.
1) They are of course leaving their ass open to a preimage attack by a malevolent attacker who has access to the storage they checksum. The checksum is used almost as a digital signature, so it should be a cryptographic hash function.
2) The probability of random collisions is not balanced by the fact that a cause altering the signal normally also has some odds of altering the CRC. Normally the signal also serves as a validation of the CRC, which is why a CRC in real life should be a fixed fraction of the signal, sized according to the probability of co-alteration of both. So if you have a tolerance for faulty storage, you will end up using it, and you might also reconstruct data from a randomly corrupted copy that has the same CRC. It will be a mess on the very exceptional but predictable day shit happens, that's all I am saying. Every safeguard will be green :)
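Point 1) can be illustrated with a small sketch. It uses a 16-bit CRC (`binascii.crc_hqx`) as a stand-in so that brute force finishes instantly. This is an assumption for illustration, not the CRC64 discussed here, but wider CRCs are linear in the same way, so an attacker would use algebra instead of brute force. The payloads, the secret key and the helper names `forge_suffix` and `tag` are all hypothetical.

```python
import binascii
import hashlib
import hmac


def crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    return binascii.crc_hqx(data, 0)


def forge_suffix(tampered: bytes, target_crc: int) -> bytes:
    """Find 2 bytes that make crc16(tampered + suffix) equal target_crc.

    For a fixed prefix, the 65536 possible suffixes map onto the 65536
    possible CRC values one to one, so the search always succeeds.
    """
    for s in range(1 << 16):
        suffix = s.to_bytes(2, "big")
        if crc16(tampered + suffix) == target_crc:
            return suffix
    raise ValueError("no suffix found")


original = b"balance=100"
tampered = b"balance=999"
forged = tampered + forge_suffix(tampered, crc16(original))

# A keyed cryptographic tag (HMAC-SHA256) for comparison; the key is made up.
SECRET_KEY = b"hypothetical-secret-key"


def tag(data: bytes) -> bytes:
    return hmac.new(SECRET_KEY, data, hashlib.sha256).digest()


if __name__ == "__main__":
    print("CRC matches: ", crc16(forged) == crc16(original))
    print("HMAC matches:", hmac.compare_digest(tag(forged), tag(original)))
```

The forged record passes the CRC check even though its content changed, while the keyed tag does not match. That is the gap between an error-detecting code and something used "almost as a digital signature".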
A dict/document/transaction may be a fractal-like structure, but alterations do not happen with the same shape or locality.
After reading this http://www.slideshare.net/Dataversity/redis-in-action , I am surprised.
Something seems fishy.