Several things:

- The fprem1 instruction is actually a long microcode sequence and is quite slow: 26-50 cycles on SNB (Sandy Bridge) according to Agner Fog. Some operands need several iterations of that loop for a complete reduction.

- There is no analogous instruction on ARM (or really, on any platform that isn't x86) anyway.

- If you're using the floating-point remainder operation in a performance-sensitive context, You're Doing It Wrong. Programmers are so used to remainder being slow that there is little value in optimizing it; it is rarely used in situations where the optimization would matter.




This was just a random example I picked; I'm not currently using that function. But let's just say abs(), sin(), cos()... instead.



