
> And why 32 entries? I ran this benchmark with a bunch of different bucket sizes and 32 worked well. I have no idea why that worked out to be the best.

If you were using 2-byte ints, this is likely because cache lines are 64 bytes, so a 32-entry bucket is exactly one cache line. Each cache line then holds an entire bucket, reducing those expensive main-memory transfers.




I really like the way Knuth benchmarks many of his later programs. He essentially keeps a counter of how many times something has to be loaded from memory. I'd be curious whether you could approximate how many cache lines you have to fill, in the same way?


Yeah, when benchmarking across batch sizes it's common to see huge jumps at the boundaries of the memory hierarchy:

- word size (64 bits)
- cache line / fetch size (generally 64 bytes, as mentioned above)
- OS page size (4–16 KB)
- L1 size (~80 KB/core)
- L2 size (low single-digit megabytes)


With lots of bizarre artifacts if you don’t force alignment.



