
Yes, I consider this to fall under the umbrella of general microarchitectural improvements I mentioned. GCC and LLVM are regularly updated with microarchitectural scheduling models so that they emit code better matched to the underlying hardware, and they have had these for at least 5-7 years. There can be a big difference between, say, Skylake and Zen 2, so targeting things appropriately is a good idea. You can use the `-march` flag for your compiler to target specific architectures, for instance -march=tigerlake or -march=znver3.

But in general I think it's a bit of a red herring for the thrust of my original post. First off, you always design a benchmark to test a hypothesis; you don't run one in isolation for no reason. When I ran my own, for instance, my hypothesis was "General execution of bog standard scalar code is only up by about 50-60%", and using the exact same binary instructions was the baseline criterion for that. It was not "Does targeting a specific microarchitecture scheduling model yield specific gains?" If you want to test the second one, you need to run another benchmark.

There are too many factors for any particular machine for any such post to be comprehensive, as I'm sure you're aware. I'm just speaking in loose generalities.




> You can use the `-march` flag for your compiler to target specific architectures, for instance -march=tigerlake or -march=znver3

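Note that -march lets the compiler use instructions that might be unavailable on CPUs other than the target, so the resulting binary may not run on them. -mtune (which is implied by -march) is the flag that sets the tuning parameters (the cost tables used by instruction selection, cache line sizes, etc.) without changing which instructions may be emitted.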
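A quick way to see the difference for yourself is to check the feature macros GCC and Clang predefine for the instruction sets they are allowed to emit. This is only a rough sketch: the function name `compiled_simd_level` is made up for illustration, and it looks at just a handful of x86 SIMD macros rather than the full feature set.

```c
#include <stdio.h>

/* Hypothetical helper: reports the widest x86 SIMD extension the compiler
 * was allowed to emit for this translation unit, based on the feature
 * macros GCC and Clang predefine (__AVX512F__, __AVX2__, ...). */
const char *compiled_simd_level(void) {
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none of the checked x86 SIMD macros";
#endif
}

/* Hypothetical helper: prints the result so builds can be compared. */
void print_build_target(void) {
    printf("compiled with: %s\n", compiled_simd_level());
}
```

Calling `print_build_target()` from a small `main` and building once with -march=znver3 and once with only -mtune=znver3 should show the difference: the -march build is allowed to use AVX2, while the -mtune-only build stays at the compiler's default instruction set and only changes scheduling and cost decisions.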


I recently watched the CppCon 2019 talk by Matt Godbolt, "Compiler Explorer: Behind The Scenes"[1], and a cool feature he presented is the integrated LLVM Machine Code Analyzer (llvm-mca) tool. If you look at the "timeline" and "resource" views of how a Zen 3 core executes a typical assembly snippet, it is absolutely mind blowing. That beast of a CPU has so many resources it's just crazy.

[1] https://www.youtube.com/watch?v=kIoZDUd5DKw



