I think the only real way to compare IPC is to actually talk to the architects. Writing microbenchmarks is a fool's errand when you aren't aware of how the CPU processes the instructions you give it. Are you actually stressing the FPU, or is the CPU speculatively executing its way through a loop whose branches it predicts perfectly (common for micro loops)? If it is, is that what you meant to test? Are you trying to compare like for like (in which case you have to write assembly), or are you trying to write performance benchmarks (in which case the only meaningful metric is CPU time)?
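To make the "CPU time" point concrete, here is a minimal sketch of timing a workload by CPU time rather than wall-clock time, using Python's standard `time.process_time()` and `time.perf_counter()`. The names `measure`, `fp_sum`, and `sleepy` are hypothetical helpers made up for illustration, and the loop is only a stand-in workload. It says nothing about what the FPU or branch predictor is actually doing underneath, which is exactly the problem described above.

```python
import time


def measure(workload, *args):
    """Run workload(*args) once; return (result, cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer, includes waiting
    cpu_start = time.process_time()    # CPU time used by this process only
    result = workload(*args)
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return result, cpu_elapsed, wall_elapsed


def fp_sum(n):
    """Hypothetical floating-point workload: sum of i * 0.5 for i in [0, n)."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def sleepy(seconds):
    """Hypothetical workload that mostly waits instead of computing."""
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    _, cpu, wall = measure(fp_sum, 1_000_000)
    print(f"fp_sum: cpu={cpu:.4f}s wall={wall:.4f}s")
    _, cpu, wall = measure(sleepy, 0.05)
    print(f"sleepy: cpu={cpu:.4f}s wall={wall:.4f}s")
```

For the compute-bound loop the two numbers should be close; for the sleeping workload the wall time includes the wait while the CPU time stays near zero. Even then, CPU time only tells you how long the work took on that machine, not why.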
This is an interesting idea, but I'm not sure how you could derive meaning from comparing two vastly different architectures at such a high level.