
It would sidestep the overflow, but prima facie it appears to be slower than the provided:

    low + (high - low)/2;

Your solution takes, depending on the architecture, between several more and nearly double the trips to the ALU.

Gimme a sec; I've got bugger all to do at work. I'll compile it and profile it.
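For anyone following along, here is a minimal sketch of the overflow being sidestepped, assuming 32-bit unsigned indices and low <= high. The function names are made up for illustration and are not from the thread.

```c
#include <stdint.h>

/* Naive midpoint: the intermediate sum low + high can wrap around
   when both values are large. (Hypothetical name.) */
uint32_t mid_naive(uint32_t low, uint32_t high) {
    return (low + high) / 2;
}

/* Subtract-first midpoint, as in the expression quoted above.
   Assumes low <= high, so high - low cannot underflow. */
uint32_t mid_subtract(uint32_t low, uint32_t high) {
    return low + (high - low) / 2;
}
```

With low = 4000000000 and high = 4200000000, the naive sum exceeds 2^32 and wraps, while the subtract-first form stays in range.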




Would be interested in seeing the profile. Re-ordering it to the following (which the compiler might do too)

  (low & high & 1) + (low >> 1) + (high >> 1)

should compute in a single cycle if the register coloring is working in the pipeline. The <reg> >> 1 terms come out of the barrel shifter stage and the low & high & 1 resolves in the load, so you end up with a single sum of three operands. Since it's being stored in a separate register, that would avoid a write stall in the pipeline as well.
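A rough sketch of that shift-and-mask form next to the subtract-first one, assuming 32-bit unsigned values. It is written with explicit parentheses because in C `+` binds tighter than `>>` and `&`, so the operands have to be grouped by hand. The function names are hypothetical, and this only checks that the two forms agree; it says nothing about cycle counts.

```c
#include <stdint.h>

/* Subtract-first form; assumes low <= high. (Hypothetical name.) */
uint32_t mid_reference(uint32_t low, uint32_t high) {
    return low + (high - low) / 2;
}

/* Shift-and-mask form: halve each operand separately, then add back
   the one unit that is lost when both operands are odd. No intermediate
   value exceeds the larger input, so it cannot overflow. */
uint32_t mid_shift_mask(uint32_t low, uint32_t high) {
    return (low & high & 1u) + (low >> 1) + (high >> 1);
}
```

Both functions round down, so they should return the same value whenever low <= high.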



