I'm talking about which instructions and idioms are optimal. AFAIK, with intrinsics the compiler won't completely change what you've written.
Back in the day, REP MOVSB was the fastest way to copy bytes; then the Pentium came along and rolling your own loop was better. Then CPUs improved and REP MOVSB was suddenly better again[1], for those CPUs. And then it changed again...
Similar story for other idioms where implementation details on CPUs change. Compilers can respond and target your exact CPU.
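To make that concrete, here's a rough sketch in C (the function names are made up for illustration). Both functions say "copy n bytes" without naming an instruction, so the toolchain gets to pick the strategy. With optimizations on, GCC and Clang may recognize the loop as a copy and vectorize it or turn it into a memcpy call, and something like -march=native lets them tune for the machine you're building on. What actually gets emitted depends on compiler version and flags, so check the disassembly rather than trusting me.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a plain byte-copy loop.
 * It states the intent only; an optimizing compiler is free to
 * vectorize it or replace it with a call to memcpy. */
static void copy_loop(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: defer to the C library.
 * The library and compiler choose the copy strategy (an inline
 * expansion, a vector loop, REP MOVSB, ...); on some platforms the
 * choice is made at run time based on the CPU. */
static void copy_libc(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Hard-coding REP MOVSB in inline asm pins you to one answer, which is exactly the kind of thing that goes from a win to a loss on the next CPU generation.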
[1]: https://github.com/golang/go/issues/14630 (notice how one commenter reports that the same patch that gives the OP a 1.6x boost gives them a 5x degradation)