Isn't there a sort of obvious "best of both worlds" by having a vector instructi...

codedokode · on May 23, 2022

In my opinion "the best" would be to support only one fixed vector size, the largest, and emulate legacy instructions for smaller sizes using microcode (this is slow, but requires the minimum transistor count). There is no need to optimize for legacy software; instead it should be recompiled, and the compiler should generate versions of the code for all existing generations of CPUs.

This way the CPU design can be simplified and all the complexity moved into compilers.
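To make the "compiler generates a version per CPU generation" idea concrete, here is a minimal sketch in C of the runtime-dispatch half of it. It is only an illustration under stated assumptions: the names `add_fn`, `add_scalar`, `add_wide`, `cpu_has_wide_vectors` and `select_add` are made up for this example, the "wide" version is plain C standing in for code a compiler would emit for a newer vector extension, and the feature check relies on the GCC/Clang builtin `__builtin_cpu_supports`, falling back to the baseline everywhere else.

```c
#include <stddef.h>

/* Hypothetical kernel signature: element-wise add of two float arrays. */
typedef void (*add_fn)(const float *a, const float *b, float *out, size_t n);

/* Baseline version: plain scalar loop, runs on every CPU generation. */
static void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Stand-in for the "newest generation" version. It handles 8 floats per
 * step in plain C; in a real multi-version build the compiler would emit
 * wide vector instructions for this copy (an assumption of this sketch). */
static void add_wide(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        for (size_t j = 0; j < 8; j++) {
            out[i + j] = a[i + j] + b[i + j];
        }
    }
    for (; i < n; i++) {  /* leftover tail elements */
        out[i] = a[i] + b[i];
    }
}

/* Nonzero if the running CPU reports AVX2. Only checked on x86 with
 * GCC or Clang; any other target conservatively reports 0. */
static int cpu_has_wide_vectors(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2");
#else
    return 0;
#endif
}

/* Pick one of the compiled versions once, at run time. */
add_fn select_add(void) {
    return cpu_has_wide_vectors() ? add_wide : add_scalar;
}
```

A caller would do `add_fn add = select_add();` once at startup and then call `add(a, b, out, n)`. Note that this only helps binaries that were built with a version for the CPU they end up running on, which is exactly the objection raised in the replies.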

brucehoult · on May 23, 2022

The entire Wintel monopoly has been built on people NOT recompiling old software. They still want it to run as fast as currently possible.

adgjlsfhk1 · on May 23, 2022

That would be awful for power consumption and only works well if everyone compiles everything from source. Otherwise every binary has awful performance for almost everyone.

jrockway · on May 23, 2022

> everyone compiles everything from source

One thing I learned from the M1 transition is that people will do it if someone tells them to. I bought a Mac this weekend to do exactly that; lots of users complaining about lack of M1 support. Time to add it (in a way that I can test). I have no choice.

astrange · on May 23, 2022

M1 has a very good x86 emulator for the moment. The users probably underestimate how good it is.

But GPU programs get compiled from source at runtime all the time, and sort of are all about vectors. (M1’s GPU doesn’t actually have vectors.)

snvzz · on May 23, 2022

Intel tried that (Itanium, with a variation of VLIW called EPIC). Didn't go so well. It was more like Epic Fail.

They poured a lot of money into the compiler. It didn't work out.

Again and again, complexity is proven cancerous. RISC is the way to go.

_0w8t · on May 23, 2022

Itanium tried to move branch prediction from the CPU to the compiler. Plus, its failure was not necessarily related to CPU design; it may well have been bad business execution.

The grandparent comment is about a compiler generating multiple copies of the code for different iterations of a CPU architecture, so that later CPUs do not have to support all legacy instructions natively, or can at least implement them via microcode emulation.

userbinator · on May 23, 2022

> and all the complexity moved into compilers.

Do not want. Especially seeing how hostile to programmers they've become.