If you implement your own ifunc instead of using the compiler-supplied FMV ifunc, you could do your benchmarks from your custom ifunc that runs before the program main() and choose the fastest function pointer that way. I don't think FMV can currently do that automatically, theoretically it could but that would require on additional modifications to GCC/LLVM. From the sounds of it, running an ifunc might be too early for you though, if you have to init OpenMP or something non-stateless before benchmarking.
SIMD Everywhere is for a totally different situation; if you want to automatically port your AVX2 code to ARM NEON/etc without having to manually rewrite the AVX2 intrinsics to ARM ones.
SIMD Everywhere is for a totally different situation; if you want to automatically port your AVX2 code to ARM NEON/etc without having to manually rewrite the AVX2 intrinsics to ARM ones.