486 ALU stuff is mostly 1 cycle as well (other than shifts, MUL and DIV, and all the weird stuff like AAD), and so are stores and loads. Taken conditional jumps are 3 cycles, non-taken 1 cycle, so loop unrolling helps a lot on 486.
You can approach 0.8 IPC on 486, assuming you know about AGI stalls and other quirks.
General 486-era C-compiler produced code, though, would be lucky to hit 0.4 IPC.
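To put rough numbers on the unrolling point, here is a back-of-the-envelope sketch. It uses only the timings quoted above (1-cycle simple instructions, 3-cycle taken jump, 1-cycle non-taken jump) and assumes a hypothetical loop whose body is all 1-cycle instructions, closed by a `dec` + `jnz` pair. The `loop_cost` function and its parameters are made up for illustration; cache misses, AGI stalls and prefetch effects are ignored, so real code will do worse than this.

```python
def loop_cost(iterations, body_instrs, unroll=1):
    """Estimate (instructions, cycles, ipc) for a simple counted loop.

    Toy model, not a 486 simulator. Assumptions:
      - every body instruction takes 1 cycle
      - the loop is closed by dec (1 cycle) + jnz
      - jnz costs 3 cycles when taken, 1 cycle when it falls through
      - no cache misses, AGI stalls or other quirks
    """
    if iterations % unroll != 0:
        raise ValueError("iterations must be a multiple of unroll")

    trips = iterations // unroll              # times the loop body is entered
    per_trip = body_instrs * unroll + 1       # unrolled body + dec, 1 cycle each
    instructions = trips * (per_trip + 1)     # + jnz on every trip
    # jnz is taken on every trip except the last one
    cycles = trips * per_trip + (trips - 1) * 3 + 1
    return instructions, cycles, instructions / cycles


if __name__ == "__main__":
    for u in (1, 2, 4, 8):
        instrs, cycles, ipc = loop_cost(64, 2, unroll=u)
        print(f"unroll={u}: {instrs} instrs, {cycles} cycles, IPC {ipc:.2f}")
```

With a 2-instruction body and 64 iterations, this model gives 382 cycles with no unrolling and 190 cycles when unrolled 4x, since the 3-cycle taken jump is paid a quarter as often.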
I was taking Intel's 80486 promo materials at face value. Plus, the 486 ALU may be 1-cycle, but loads and stores are not 2-cycle like they are for C-M0, so it'll still lose :)