These are some strange complaints. As a C++ developer optimizing numerical code, I have never once worried about any of the things mentioned here. I care about instruction folding, vectorization, loop reordering, and efficient instruction emission. If your optimized program is worried about startup latency in the hundreds of milliseconds, you are doing something very, very wrong.
You don't really need the `@simd` macro, though. The compiler is pretty good at vectorizing on its own now.
From the docs: "In many cases, Julia is able to automatically vectorize inner for loops without the use of `@simd`. Using `@simd` gives the compiler a little extra leeway to make it possible in more situations."
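To make the "extra leeway" concrete, here is a minimal sketch of the usual case: a floating-point reduction. The function names `sum_plain` and `sum_simd` are made up for illustration; only `@simd` and `@inbounds` come from Julia itself. Without `@simd` the additions have to stay in source order, which generally blocks vectorizing the accumulation; with `@simd` the compiler is permitted to reorder them.

```julia
# Hypothetical helper names; only `@simd` and `@inbounds` are Julia built-ins.

# Plain loop: the floating-point additions must happen in source order.
function sum_plain(xs::Vector{Float64})
    s = 0.0
    for i in eachindex(xs)
        @inbounds s += xs[i]
    end
    return s
end

# With `@simd`, the compiler is allowed to reorder the additions,
# so it can keep several partial sums in vector lanes.
function sum_simd(xs::Vector{Float64})
    s = 0.0
    @simd for i in eachindex(xs)
        @inbounds s += xs[i]
    end
    return s
end

xs = rand(1_000)
# The two results may differ in the last bits because of reassociation,
# so compare approximately rather than with ==.
println(isapprox(sum_plain(xs), sum_simd(xs)))
```

Whether the second version is actually faster depends on the Julia version and the hardware, so it is worth checking with `@code_llvm` or a benchmark rather than assuming.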
I guess negativity gets clicks.