Have decent speedups been gotten by previous CPUs by the addition of conditional moves? IIRC for some the SPECcpu impact was negligible, and many RISCs don't have it. RISC is about quantifying this kind of thing and skipping marginal additions, after all.


> Have decent speedups been gotten by previous CPUs by the addition of conditional moves?

This is not a direct answer to your question, but: I recently had to tune the conditional move generation heuristics in the GraalVM Enterprise Edition compiler. My experience has been that you can absolutely get decent speedups of 10-20% or more with a few well-placed conditional moves. The cases where this matters are rare, but they do occur in some real-world software, where sticking a conditional move in some very hot place will have such an impact on the entire application. Conversely, you can get slowdowns of the same magnitude with badly placed conditional moves.

It's a difficult trade-off, since most branches are fairly predictable, and good branch prediction and speculative execution can very often beat a conditional move.
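To make the trade-off concrete, here is a minimal C sketch (not GraalVM code; the function names and the threshold-sum workload are made up for illustration). The first loop uses a branch whose cost depends on how predictable the data is; the second expresses the same computation as a select, which a compiler may lower to a conditional move. Whether it actually does so depends on the compiler, flags, and target, so treat this as an illustration of the two shapes rather than a guarantee about generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: cheap when the comparison is predictable,
 * expensive when the data makes it close to a coin flip. */
int64_t sum_at_least_branchy(const int32_t *v, size_t n, int32_t threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] >= threshold) {
            sum += v[i];
        }
    }
    return sum;
}

/* Select form: the condition becomes data. A compiler is free to emit
 * a conditional move here, but it is also free to emit a branch. */
int64_t sum_at_least_select(const int32_t *v, size_t n, int32_t threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t x = v[i];
        sum += (x >= threshold) ? x : 0;
    }
    return sum;
}
```

Both functions return the same result for any input; only the shape handed to the compiler differs, which is the kind of choice a cmov heuristic has to make.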


I'm not sure about this "RISC way" stuff. From a uarch standpoint the RISC vs CISC distinction is moot and from an ISA standpoint the only real quantifiable difference seems to be being a load-store architecture.

ISAs without conditional moves tend to have predicated instructions, which are functionally the same thing. I'm not actually aware of any traditionally RISC architectures that have neither conditional moves nor predicated instructions. While ARMv8 (AArch64) removed predicated instructions as a general feature, it gained a few "conditional data processing" instructions (e.g. CSEL is basically cmov), so clearly at least ARM thinks there's a benefit even with modern branch predictors.

Conditional instructions are really, really handy when you need them. It's an escape hatch for when you have an unbiased branch and need to turn control flow into data flow.


We were talking ISAs so let's focus on that.

The quantifiability comes from measuring results when you give compilers new instructions, vs paying implementation complexity (time, money and future baggage to support the insn forever). The upsides and downsides here come in different units so it's still tricky.

Lots of instructions can be proposed with impressive qualitative speeches convincing you how dandy they are, but in the end it's down to the real world speedup yield vs the price you pay in complexity and resulting second order effects.

(In rarer cases the instructions might be added not for performance reasons but to ease complexity and cost, that's where qualitative arguments still have a place when arguing for adding instructions).

It's fine if we don't have the evidence in this thread - I was just asking on the off chance that someone can point to a reference.


It's not like someone is proposing some crazy new instruction to do vector math on binary coded decimals while also calculating CRC32 values as a byproduct. It's conditional move. Every ISA I can think of has that.


This prompted me to look through some RISC ISAs (+x86). There may be errors, since I made just a cursory pass.

The following seem to have conditional moves: MIPS since MIPS IV, Alpha, x86 since the Pentium Pro, SPARC since SPARCv9

The following seem to omit conditional moves: AVR, PowerPC, Hitachi SH, MIPS I-III, x86 up to Pentium, SPARC up to SPARCv8, ARM, PA-RISC (?)

PA-RISC, PowerPC, and ARM at least do a lot of predication and invest heavily in conditional operations (by dedicating a lot of bits in the insn layout to them), but they also end up using them a lot more often than conditional move tends to be used.


ARMv7's Thumb2 has general predication of hammocks via "if-then", and ARM itself had general predication. ARMv8 has conditional select, which is quite a bit richer than conditional move. POWER has "isel". Seeing an ISA evolve a conditional move later in life is pretty strong evidence that it was useful enough to include. So I would modify your list to be:

ISAs that evolved conditional move:

  - MIPS
  - SPARC
  - x86
  - POWER (isel)

ISAs that started life with it:

  - ARM (via general predication)
  - Alpha
  - IA64 (via general predication)


Good list.

Observation re the list of ISAs that evolved conditional move vs ISAs that omit it: MIPS, POWER, x86, and SPARC all targeted high-power "fat core" applications at the point where it got added. AVR, Hitachi SH, and PowerPC didn't add it while being driven more by low-power / embedded applications. And many ISAs continued to see wide use in their pre-cmov versions in the embedded space (e.g. MIPS) after the additions. (PowerPC even removed it when being modeled after POWER.)


To be clear for anyone not so up-to-speed on this: what AArch64 has (conditional select) is strictly less expressive than AArch32 (general predication).

The takeaway there is that general predication was found to be overly complex, while the vast (vast!) majority of the benefit can be modelled with conditional select.


It's less than general predication, but a little bit more than cmov/csel. The second argument can be optionally incremented and/or complemented. Combined with the dedicated zero register, you can do all sorts of interesting things to turn condition-generating instructions into data. A few interesting ones include:

   y = cond ? 0 : -1;
   y = cond ? x : -x;
   x = cond ? 0 : x+1;  //< look ma, circular addressing!
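The three idioms above can be written as small C functions. This is only a sketch: the function names are made up for illustration, and while each body has the shape that AArch64's CSINV, CSNEG, and CSINC are meant for, whether a given compiler actually selects those instructions is an assumption to check against its output.

```c
#include <stdbool.h>
#include <stdint.h>

/* y = cond ? 0 : -1
 * Shape of a conditional select-invert against the zero register. */
int32_t zero_or_all_ones(bool cond) {
    return cond ? 0 : -1;
}

/* y = cond ? x : -x
 * Shape of a conditional select-negate.
 * Assumes x != INT32_MIN, since negating that value overflows. */
int32_t keep_or_negate(bool cond, int32_t x) {
    return cond ? x : -x;
}

/* x = cond ? 0 : x + 1, with cond = "the next index would run off the end".
 * Shape of a conditional select-increment: advances an index through a
 * ring buffer without a modulo. Assumes size > 0 and i < size. */
uint32_t ring_next(uint32_t i, uint32_t size) {
    return (i + 1 == size) ? 0 : i + 1;
}
```

For example, calling `ring_next` repeatedly with `size` 4 walks the index through 0, 1, 2, 3 and back to 0.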


Yes. There are cases where cmov is a killer beast and for example it makes your browser faster.

JSC goes to great lengths to select it in certain cases where it’s a statistically significant overall speedup. I think the place where it’s the most effective for us is converting hole-or-undefined to undefined on array load. Even on x86 where cmov is hella weird (two operands, no immediates) it ends up being a big win.


You get a 2x speedup on Quicksort and all related algorithms using CMOV instructions, so: yes.

https://cantrip.org/sortfast.html
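As a rough idea of what "Quicksort using CMOV" can look like, here is a minimal sketch of a branch-free partition step in C. This is not the code from the linked article; it is a well-known branchless variant of the Lomuto partition, and the function name is made up for illustration. The data-dependent comparison never feeds a branch: every iteration does the same swap, and the comparison result is only added to an index. Whether the compiler keeps it branch-free in the generated code is an assumption worth verifying.

```c
#include <stddef.h>

/* Moves every element smaller than pivot to the front of a[0..n)
 * and returns how many such elements there are. */
size_t partition_branchless(int *a, size_t n, int pivot) {
    size_t store = 0;              /* a[0..store) holds elements < pivot */
    for (size_t i = 0; i < n; i++) {
        int v = a[i];
        int is_less = v < pivot;   /* 0 or 1, used as data, not as a branch */
        a[i] = a[store];           /* unconditional swap of a[i] and a[store] */
        a[store] = v;
        store += (size_t)is_less;  /* keep v in the front region only if smaller */
    }
    return store;
}
```

The only branch left is the loop condition, which is highly predictable; the unpredictable comparison against the pivot has been turned into arithmetic.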



