https://github.com/htot/crc32c has some interesting implementations of CRC32 algorithms with different speeds. The fastest I see is (function, aligned, bytes, MiB/s):
FYI, CRC32C is not the same checksum as CRC32. CRC32C is used in iSCSI, btrfs, ext4. CRC32 is used in Ethernet, SATA, gzip. Intel's SSE4.2 provides an instruction implementing CRC32C but not CRC32. ARM defines instructions for both.
This article seems to miss that distinction but appears to be testing CRC32, so it's not quite correct to compare against something using Intel's CRC32C instruction.
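To make the distinction concrete, here is a minimal sketch in Python (not taken from the article or the linked repo). It assumes the usual reflected, bit-at-a-time formulation with an initial value and final XOR of 0xFFFFFFFF; the helper name `crc32_bitwise` is made up for illustration. The only difference between the two checksums is the polynomial.

```python
import zlib

# Reflected (LSB-first) forms of the two generator polynomials.
CRC32_POLY = 0xEDB88320   # CRC32 (Ethernet, gzip); normal form 0x04C11DB7
CRC32C_POLY = 0x82F63B78  # CRC32C (Castagnoli); normal form 0x1EDC6F41


def crc32_bitwise(data: bytes, poly: int) -> int:
    """Hypothetical helper: slow bit-at-a-time reflected CRC-32.

    Uses init = 0xFFFFFFFF and a final XOR of 0xFFFFFFFF, which is the
    convention shared by CRC32 and CRC32C.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


msg = b"123456789"
crc32 = crc32_bitwise(msg, CRC32_POLY)
crc32c = crc32_bitwise(msg, CRC32C_POLY)

print(f"CRC32  = {crc32:#010x}")
print(f"CRC32C = {crc32c:#010x}")
print("matches zlib.crc32:", crc32 == zlib.crc32(msg))
```

For the standard test string `123456789` the two results are the well-known check values 0xCBF43926 (CRC32, same as `zlib.crc32`) and 0xE3069283 (CRC32C), so a benchmark of one says nothing about the correctness of the other's output.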
It seems like a good idea to start a Code Golf competition.