The sibling comment recommending compiler intrinsics is probably the best way to go for writing SIMD code. A mixture of `<i32 x 16>` style types and intrinsics to specify instructions is a solid 90% solution compared to assembly.
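To make the "vector types plus intrinsics" mix concrete, here is a minimal sketch, assuming GCC or Clang (the `vector_size` attribute is a compiler extension, not standard C++). It uses an 8-lane `int16_t` type rather than a 16-lane `i32` one so it runs on baseline x86-64, but the pattern should carry over to wider types. The names `i16x8`, `add_sat` and `halved_sat_sum` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Eight lanes of 16-bit integers (GCC/Clang vector extension, not standard C++).
typedef int16_t i16x8 __attribute__((vector_size(16)));

// Hypothetical wrapper around one specific instruction: saturating 16-bit add.
static inline i16x8 add_sat(i16x8 a, i16x8 b) {
#if defined(__SSE2__)
    // PADDSW via its intrinsic; the casts only reinterpret the 128-bit value.
    return (i16x8)_mm_adds_epi16((__m128i)a, (__m128i)b);
#else
    // Fallback for other targets: clamp each lane by hand.
    i16x8 r;
    for (int i = 0; i < 8; ++i) {
        int32_t s = int32_t(a[i]) + int32_t(b[i]);
        r[i] = int16_t(s > INT16_MAX ? INT16_MAX : (s < INT16_MIN ? INT16_MIN : s));
    }
    return r;
#endif
}

// dst[i] = saturating(a[i] + b[i]) >> 1, eight lanes per iteration.
void halved_sat_sum(int16_t* dst, const int16_t* a, const int16_t* b, std::size_t n) {
    assert(n % 8 == 0 && "sketch handles whole vectors only");
    const i16x8 one = {1, 1, 1, 1, 1, 1, 1, 1};
    for (std::size_t i = 0; i < n; i += 8) {
        i16x8 va, vb;
        std::memcpy(&va, a + i, sizeof va);  // unaligned load
        std::memcpy(&vb, b + i, sizeof vb);
        i16x8 r = add_sat(va, vb) >> one;    // intrinsic, then a plain operator
        std::memcpy(dst + i, &r, sizeof r);  // unaligned store
    }
}
```

The vector type gets ordinary operators for the boring parts, and the wrapper pins down the one instruction the compiler might not pick on its own.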
If you want that last 10%, I think macros are putting the emphasis in the wrong place. They're a somewhat easy way to build up a language abstraction which will work if held carefully, but I'm confident the dev experience using this abstraction when you write invalid code will be deeply confusing.
I'd suggest writing a parser instead of the macros. That'll tell you clearly when the syntax is invalid (though possibly not with much precision), and it gives you a place to put semantic analysis for the cases where valid syntax encodes nonsense. Do the equivalent of the macro expansions on the parsed tree instead of on the text. Emit asm as the "back end".
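A rough sketch of that shape, for an invented toy language with one `op vD, vA, vB` instruction per line. The language, the `Instr`/`Result` types and the `compile` function are all hypothetical; the emitted mnemonics assume AVX-512 (`vpaddd`/`vpmulld` on `zmm0`..`zmm31`) in Intel syntax. There's no expansion pass here, but it would slot in as a transformation over `program` before the back end runs.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Instr { std::string op; int dst, a, b; };

struct Result {
    std::string asm_text;             // filled only when there are no errors
    std::vector<std::string> errors;  // "line N: message"
};

// Parse "vN" into N; returns -1 if the token is not a register name.
static int parse_reg(const std::string& tok) {
    if (tok.size() < 2 || tok[0] != 'v') return -1;
    int n = 0;
    for (std::size_t i = 1; i < tok.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(tok[i]))) return -1;
        n = n * 10 + (tok[i] - '0');
        if (n > 9999) return -1;  // reject absurdly long numbers before they overflow
    }
    return n;
}

Result compile(const std::string& source) {
    Result out;
    std::vector<Instr> program;
    std::istringstream lines(source);
    std::string line;
    int line_no = 0;
    while (std::getline(lines, line)) {
        ++line_no;
        // Simplification: commas are treated as whitespace, so they are optional.
        for (char& c : line) if (c == ',') c = ' ';
        std::istringstream toks(line);
        std::string op, d, a, b, extra;
        if (!(toks >> op)) continue;  // blank line
        const std::string where = "line " + std::to_string(line_no) + ": ";
        // Syntax: one mnemonic followed by exactly three register tokens.
        if (!(toks >> d >> a >> b) || (toks >> extra)) {
            out.errors.push_back(where + "expected 'op vD, vA, vB'");
            continue;
        }
        Instr in{op, parse_reg(d), parse_reg(a), parse_reg(b)};
        if (in.dst < 0 || in.a < 0 || in.b < 0) {
            out.errors.push_back(where + "operands must be registers like v3");
            continue;
        }
        // Semantics: the line parsed fine, but it may still be nonsense.
        if (op != "add" && op != "mul") {
            out.errors.push_back(where + "unknown operation '" + op + "'");
            continue;
        }
        if (in.dst > 31 || in.a > 31 || in.b > 31) {
            out.errors.push_back(where + "register index out of range (v0..v31)");
            continue;
        }
        program.push_back(in);
    }
    if (!out.errors.empty()) return out;
    // Back end: walk the parsed program and emit assembly text.
    for (const Instr& in : program) {
        assert(in.op == "add" || in.op == "mul");
        out.asm_text += (in.op == "add" ? "vpaddd" : "vpmulld");
        out.asm_text += " zmm" + std::to_string(in.dst) + ", zmm" +
                        std::to_string(in.a) + ", zmm" + std::to_string(in.b) + "\n";
    }
    return out;
}
```

Feeding it `add v0, v1` gets you "line 1: expected 'op vD, vA, vB'" instead of whatever the preprocessor would have produced three expansions deep.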
So far, intrinsics supplemented with functions that wrap single instructions cover my needs. The thing I occasionally want to be easier is manually configurable unrolling. For example, if you have a kernel that processes the input in 3 streams, or that loads 5 registers' worth of input from a single stream and processes all of them in the same way, the resulting code duplication is a useless place where typos can give you a bad time. A macro system slightly different from the one C actually has could let you get rid of the duplication and just change a constant to control the unrolling.
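If C++ is an option, a template parameter gets part of the way there. This is only a sketch with a made-up `sum_unrolled` function and a scalar body standing in for the real kernel. It also leans on the optimizer to flatten the fixed-count inner loop, which is typical but not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sum `n` values using `Unroll` independent accumulators.
// Changing the template argument is the only edit needed to change the unrolling.
template <std::size_t Unroll>
std::uint64_t sum_unrolled(const std::uint32_t* data, std::size_t n) {
    static_assert(Unroll > 0, "need at least one accumulator");
    assert(data != nullptr || n == 0);
    std::uint64_t acc[Unroll] = {};
    std::size_t i = 0;
    for (; i + Unroll <= n; i += Unroll) {
        // Trip count is a compile-time constant, so the body is written once
        // and the compiler is free to replicate it Unroll times.
        for (std::size_t u = 0; u < Unroll; ++u) acc[u] += data[i + u];
    }
    std::uint64_t total = 0;
    for (std::size_t u = 0; u < Unroll; ++u) total += acc[u];
    for (; i < n; ++i) total += data[i];  // leftover tail elements
    return total;
}
```

`sum_unrolled<3>(p, n)` and `sum_unrolled<5>(p, n)` then share one body, so there is no copy-pasted block to mistype.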