> As opposed to writing worse performing and less readable code?
I can count the coworkers I have with inline assembly experience on one hand (the count is less than 1). Also, the first reaction I usually get to vector intrinsics is questions about the wtfness of shuffle instructions. And no, you can't make that readable without sacrificing performance.
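
To illustrate the shuffle complaint, here is a minimal sketch of a horizontal sum over four floats, assuming x86 SSE. The helper name `hsum4` is made up for this example, and the scalar fallback only exists so the snippet still builds on other targets.

```c
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1 /* hypothetical macro, only used in this sketch */
#endif

/* Sum the four floats at p. hsum4 is a hypothetical helper name. */
static float hsum4(const float *p)
{
#ifdef HAVE_SSE_SKETCH
    __m128 v = _mm_loadu_ps(p);                      /* [a, b, c, d]         */
    /* Swap neighbours within each pair of lanes. */
    __m128 swapped = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
                                                     /* [b, a, d, c]         */
    __m128 pairs = _mm_add_ps(v, swapped);           /* [a+b, a+b, c+d, c+d] */
    /* Copy the upper two lanes into the lower two. */
    __m128 high = _mm_movehl_ps(pairs, pairs);       /* [c+d, c+d, c+d, c+d] */
    /* Only lane 0 matters from here on. */
    return _mm_cvtss_f32(_mm_add_ss(pairs, high));   /* (a+b) + (c+d)        */
#else
    /* Portable fallback with the same summation order. */
    return (p[0] + p[1]) + (p[2] + p[3]);
#endif
}
```

Two lines of scalar code turn into a shuffle, a half-register move, and lane bookkeeping that only makes sense with the comments next to it.
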
> Then just write it in the simplest way and hope autovectorization will help you.
Spoiler: It won't. Compilers often don't have the context, and some of the biggest hot spots I had to deal with simply used the scalar (single-value) forms of the vector instructions.
Or rather: If that had worked, I wouldn't be there hand-optimizing the code.
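
One concrete case of "missing context", as a hedged sketch rather than the code I was actually dealing with: a float reduction. Reordering float additions can change the rounded result, so without flags such as `-ffast-math` GCC and Clang typically keep the additions in source order, which means one scalar add per element. The function names `sum_scalar` and `sum_sse` are invented for this example, and x86 SSE is assumed.

```c
#include <stddef.h>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_SUM_SKETCH 1 /* hypothetical macro, only used in this sketch */
#endif

/* The "simplest way": strictly sequential float additions. */
static float sum_scalar(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return acc;
}

/* Hand-vectorized variant: four partial sums, combined at the end.
 * The summation order differs from sum_scalar, so results can differ
 * in the last bits for general input. */
static float sum_sse(const float *x, size_t n)
{
#ifdef HAVE_SSE_SUM_SKETCH
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(x + i)); /* 4 adds per step */

    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);

    for (; i < n; ++i) /* leftover elements when n is not a multiple of 4 */
        total += x[i];
    return total;
#else
    return sum_scalar(x, n); /* fallback for targets without SSE */
#endif
}
```

The programmer knows that a slightly different summation order is acceptable here; the compiler doesn't, unless you tell it through flags or by writing the vector code yourself.
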
Yes, shoot portability in the foot so you can keep your code free of well-defined language constructs.