
Page 5 of https://www-staging.commandprompt.com/uploads/images/Command... says "This system can checksum data at about 300 MB/s per core."

The document lacks page numbers. Page 5 is the first page with a gray box at the top.




That's measuring 'cksum', which must have an awfully slow implementation. The document notes that this is distinct from measuring PG's checksum performance. (I think it's a pretty useless measurement.)

Earlier (page 4):

> How much CPU time does it take to checksum...

> ...a specific amount of data? This is easy to estimate because PostgreSQL uses the crc32 algorithm which is very simple, and (GNU) Linux has a command line program that does the same thing: cksum.

Yeah, using cksum as an estimate here appears to be very flawed.
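For anyone who wants a number from their own machine rather than from cksum, here is a minimal sketch that times a library CRC-32 over an in-memory buffer. It assumes Python's zlib.crc32 as the stand-in; the function name and buffer sizes are made up for illustration, and it measures zlib's implementation, not PostgreSQL's checksum code.

```python
import time
import zlib


def crc32_throughput_mb_s(size_mb: int = 64, repeats: int = 3) -> float:
    """Best-of-N throughput of zlib.crc32 in MB/s (hypothetical helper)."""
    # A zero-filled buffer held in memory, so disk I/O is not part of the timing.
    data = bytes(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        zlib.crc32(data)
        best = min(best, time.perf_counter() - start)
    return size_mb / best


if __name__ == "__main__":
    print(f"zlib.crc32: {crc32_throughput_mb_s():.0f} MB/s on one core")
```

Timing the checksum call on its own keeps process startup and file reads out of the figure, which timing a command-line tool on a file does not.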


That is weird. It seems CRC optimization is quite a rabbit hole.

https://github.com/komrad36/CRC has a massive section about it in the README. Really interesting.


Yeah. crc32 may be simple in theory, but doing it as fast as possible utilizing the various execution units of modern hardware is challenging.
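To illustrate the "simple in theory" part, here is a rough sketch of the two textbook forms of CRC-32: one bit at a time, and one byte at a time via a 256-entry lookup table. It assumes the reflected polynomial 0xEDB88320 used by zlib, and the function names are my own. Neither is the fast hardware-oriented code discussed in that README, which goes well beyond this.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, the variant zlib implements


def crc32_bitwise(data: bytes) -> int:
    """Process one bit per inner-loop step: simplest form, slowest."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def _make_table() -> list:
    """Precompute the CRC of every possible byte value."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ (POLY if c & 1 else 0)
        table.append(c)
    return table


TABLE = _make_table()


def crc32_table(data: bytes) -> int:
    """Process one byte per step using the precomputed table."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    msg = b"123456789"
    # Both forms should agree with zlib on the standard check string.
    print(hex(crc32_bitwise(msg)), hex(crc32_table(msg)), hex(zlib.crc32(msg)))
```

Both versions handle the input strictly one byte after another, so each step depends on the previous one. That serial dependency is what the faster approaches have to work around.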



