I’m not sure what “suited to SIMD” means exactly in this context. I mean, it is clearly possible for a compiler to apply some SIMD optimizations. But the program is essentially expressed as a sequential thing, and then the compiler discovers the SIMD potential. Of course, we write programs that we hope will make it easy to discover that potential. But it can be difficult to reason about how a compiler is going to optimize, for anything other than a simple loop.
Suited to SIMD means you write the scalar equivalent of what you'd do on a single element in a SIMD implementation.
E.g. you avoid lookup tables when you can, or only use small ones you know will fit in one or two SIMD registers. gcc and clang can't vectorize it as is, but they do if you remove the branches that handle infinity and over/under-flow.
In the godbolt link I copied the musl expf implementation and icx was able to vectorize it, even though it uses a LUT too large for SIMD registers.
#pragma omp simd and equivalents will encourage the compiler to vectorize a specific loop and produce a warning if a loop isn't vectorized.
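For reference, a sketch of the pragma in use (hypothetical saxpy kernel; compile with -fopenmp-simd on gcc/clang, and use flags like gcc's -fopt-info-vec-missed or clang's -Rpass-missed=loop-vectorize to get diagnostics when a loop doesn't vectorize):

```c
#include <stddef.h>

/* The pragma asserts the iterations are safe to run in SIMD lanes,
   so the compiler doesn't have to prove independence itself. */
void saxpy(float a, const float *x, float *restrict y, size_t n) {
    #pragma omp simd
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Without -fopenmp-simd the pragma is simply ignored and the loop still computes the same result, so it's a low-risk annotation.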
I shouldn’t have started my comment with the sort of implied question or note of confusion. Sorry, that was unclear communication.
I agree that it is possible to write some C programs whose parallel potential some compilers can discover. But it isn’t very ergonomic or dependable. So, I think this is not a strong counter-argument to the thesis of the blog post. It is possible to write SIMD-friendly C, but often it is easier for the programmer to fall back to intrinsics to express their intent.