IPC is _usually_ a good measure for the last phase of optimization. But only the local Δ is meaningful; comparing IPC across different vendors is only useful as a gross measure.
It's not even useful as a gross measure, unfortunately. Too many moving parts in the way.
Say you went by IPC alone: you'd probably pick the latest Apple ARM CPU. Except none of its subunits can clock as high as the top AMD and Intel parts, its cache is slower, and its memory bandwidth is abysmal in comparison.
What is useful is time to completion in seconds, or performance per watt (unit is 1/(W*s)), measured on the workload you actually want to run.
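To make the 1/(W*s) unit concrete, here is a minimal sketch. The machine names and the time/power figures are made up for illustration, not measurements of any real CPU, and `perf_per_watt` is just a helper name chosen for this example.

```python
def perf_per_watt(seconds: float, avg_watts: float) -> float:
    """Runs per joule: (1 / seconds) / watts, i.e. unit 1/(W*s)."""
    if seconds <= 0 or avg_watts <= 0:
        raise ValueError("seconds and avg_watts must be positive")
    return 1.0 / (seconds * avg_watts)


# Hypothetical results for the same workload: (wall time in s, average power in W).
machines = {
    "cpu_a": (10.0, 100.0),  # finishes sooner, draws more power
    "cpu_b": (16.0, 25.0),   # finishes later, draws less power
}

for name, (seconds, watts) in machines.items():
    print(f"{name}: {seconds:.1f} s, {perf_per_watt(seconds, watts):.4f} runs/J")
```

With these invented numbers cpu_a wins on seconds and cpu_b wins on performance per watt, so which metric matters depends on what you are constrained by.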
You cannot even easily estimate anything from microbenchmarks anymore, since vendors expanded per-unit local clocking in x86... (AMD in Zen+, expanded in Zen 2; most ARM mobile CPUs; Intel since Broadwell-E, expanded in Skylake.)
You get traps such as going for AVX and locally overheating the CPU, where the SSE2 equivalent would run faster in real life. It's all funny business.
IPC also heavily favors RISC over SIMD, and it is likewise biased against multicore (though not as much).
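A rough sketch of the SIMD bias, assuming invented instruction and cycle counts for one job done with scalar code and with 8-wide vector code at the same clock. The `Run` class and all figures are hypothetical; real counts would come from hardware performance counters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    instructions: int  # retired instructions (hypothetical)
    cycles: int        # core cycles spent (hypothetical)
    clock_hz: float    # assumed constant clock

    @property
    def ipc(self) -> float:
        return self.instructions / self.cycles

    @property
    def seconds(self) -> float:
        return self.cycles / self.clock_hz


# Same job: many cheap scalar instructions vs. 8x fewer, slower vector ones.
scalar = Run(instructions=1_000_000_000, cycles=250_000_000, clock_hz=3.0e9)
simd = Run(instructions=125_000_000, cycles=125_000_000, clock_hz=3.0e9)

for name, run in (("scalar", scalar), ("simd", simd)):
    print(f"{name}: IPC {run.ipc:.2f}, {run.seconds * 1e3:.1f} ms")
```

In this made-up case the scalar version reports four times the IPC of the SIMD version and still takes twice as long, because each vector instruction does more work.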
What counts as an instruction anyway?