Itanium tried to move branch prediction from the CPU to the compiler. Plus, its failure was not necessarily related to CPU design; it could well have been bad business execution.
The grandparent comment is about a compiler generating multiple copies of the code for different iterations of a CPU architecture, so that later CPUs don't have to support every legacy instruction, or can at least implement those via microcode emulation.
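To make the idea concrete, here is a rough Python sketch of what "multiple code copies" could look like at load time. Everything in it is made up for illustration: the ISA level names (`v1`..`v3`), the `resolve` helper, and the two `sum_*` variants are hypothetical and don't correspond to any real toolchain or CPU.

```python
# Hypothetical ISA revisions, oldest first. The names are illustrative only.
ISA_ORDER = ["v1", "v2", "v3"]


def sum_v1(xs):
    # Copy compiled for the oldest revision: plain scalar loop.
    total = 0
    for x in xs:
        total += x
    return total


def sum_v3(xs):
    # Copy compiled for a newer revision (stands in for newer instructions).
    return sum(xs)


# What the compiler would ship: one function, several compiled copies.
SUM_VARIANTS = {"v1": sum_v1, "v3": sum_v3}


def resolve(variants, cpu_supports):
    """Pick the newest compiled copy that this CPU implements natively."""
    for level in reversed(ISA_ORDER):
        if level in variants and level in cpu_supports:
            return variants[level]
    raise LookupError("no compiled variant runs on this CPU")


# An old CPU implements only v1; a newer one has dropped v1 entirely.
old_cpu = {"v1"}
new_cpu = {"v2", "v3"}

print(resolve(SUM_VARIANTS, old_cpu).__name__)
print(resolve(SUM_VARIANTS, new_cpu).__name__)
```

The point of the sketch is that `new_cpu` never needs the `v1` instructions in hardware, because the binary already carries a copy it can run.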
They poured a lot of money into the compiler. It didn't work out.
Again and again, complexity is proven cancerous. RISC is the way to go.