Making CRC calculations in Mojo 18x faster than Python and 3x slower than Python

molenzwiebel · 2024-10-06T12:40:21 1728218421

For those that thought the process of speeding up CRC was interesting, I strongly recommend reading [1]. It describes a step by step process on how a naive CRC implementation might be improved, until finally arriving at an implementation in assembly with a staggering throughput of 62 processed bits (almost 8 bytes) per CPU cycle. Yes, you read that right.

[1]: https://github.com/komrad36/CRC

fnands · 2024-10-06T12:51:06 1728219066

Yup, it's a fantastic read. I based most of my post off it (I clearly mention so) and it's worth it to read at least the first part of it before reading my post.

jorams · 2024-10-06T13:29:47 1728221387

For what it's worth, it appears the paper "Everything we know about CRC but afraid to forget" was originally published as part of the release of crcutil on Google Code[1]. This is a hg repository with one commit that includes the paper, the source of the paper, and an implementation.

[1]: https://code.google.com/archive/p/crcutil/

fnands · 2024-10-06T16:05:20 1728230720

Thanks! I'll add it to the post.

onethumb · 2024-10-06T20:36:55 1728247015

FYI, I forked and improved [1] a Rust implementation that supports both table- and SIMD-accelerated CRC-64/NVME [2] calculations. The SIMD-accelerated (x86/x86_64 and aarch64) version delivers 10X over the table (16-bytes at a time) implementation.

The original implementation [3] did the same thing but for CRC-64/XZ [4].

[1]: https://github.com/awesomized/crc64fast-nvme

[2]: https://reveng.sourceforge.io/crc-catalogue/all.htm#crc.cat....

[3]: https://github.com/tikv/crc64fast

[4]: https://reveng.sourceforge.io/crc-catalogue/all.htm#crc.cat....

Genbox · 2024-10-06T20:43:14 1728247394

How does it compare to the built-in CRC32 instruction? [1]

[1] https://www.intel.com/content/www/us/en/docs/ipp/developer-g...

onethumb · 2024-10-06T22:18:36 1728253116

This is computing CRC-64, not CRC-32, so there's not really a comparison. But perhaps most importantly, ours works with a variety of polynomials (there are a lot! [1])... we're just using the NVME one, but it's trivially adaptable to most (all?) of them. (The Intel instruction you link to only works with two - CRC32 and CRC32C)

Finally, it's based on Intel's paper [2], so they also believe it's extremely fast. :)

[1]: https://reveng.sourceforge.io/crc-catalogue/all.htm

[2]: https://web.archive.org/web/20131224125630/https://www.intel...

adsharma · 2024-10-06T15:15:18 1728227718

https://github.com/py2many/py2many/pull/653

Transpiling the python version in the blog to mojo gives me a 4x speedup.

Had to hand edit a couple of things:

    - List to bytearray conversion is not working yet
    - Can't iterate over SIMD. So a "for x in list" loop has to be rewritten as a range based loop.

With a bit more work, the manual edits won't be necessary.

adsharma · 2024-10-06T15:20:48 1728228048

Generated code: https://gist.github.com/adsharma/22ec18664ce4a59750b78fddc58...

Edited code: https://gist.github.com/adsharma/72516d8333a25fd8e350472568d...

fnands · 2024-10-06T16:04:35 1728230675

Oh! Cool project!

You mean 4x speedup over Python?

adsharma · 2024-10-06T16:14:57 1728231297

Yes. Details in the pull request linked above

bsaul · 2024-10-06T13:00:46 1728219646

How is mojo doing ? Has it made its way as a niche language in some places ?

fnands · 2024-10-06T16:03:12 1728230592

It's coming along. I don't think anyone is using it for anything serious yet, but it is starting to feel like a real language.

My guess is that it will start being used as a library language (i.e. have libraries written in Mojo being called from Python) before it really gets going as its own thing.

khimaros · 2024-10-06T14:54:29 1728226469

and is it still closed source?

fnands · 2024-10-06T15:56:54 1728230214

Yup, language is still closed, stdlib is open.

chrislattner · 2024-10-06T17:18:07 1728235087

Great post @fnands!

fnands · 2024-10-07T07:15:27 1728285327

Thanks Chris!

It has been about a year since I started playing around with Mojo and it's impressive how far it (and the community) has come!