Is there a reasonable way to do this without disabling preemption? All of these examples require the code to be in the kernel or for the kernel to be modified, which isn't really useful in all the places you need to benchmark.
It's very easy to take two timestamps with Chrono, one at the start and one at the end of the routine, and subtract them. No CPU pinning or preemption control is necessary.
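A minimal sketch of that two-timestamp approach, assuming "Chrono" here means C++'s `std::chrono`. The names `work` and `time_ns` are made up for illustration, and `work` is only a stand-in for whatever routine you want to measure:

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workload standing in for the routine under test:
// returns the sum of i*i for i in [0, n).
static std::uint64_t work(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        sum += i * i;
    }
    return sum;
}

// Hypothetical helper: calls f once and returns the elapsed wall-clock
// time in nanoseconds. steady_clock is used because it is monotonic,
// so the difference of two readings is never negative.
template <typename F>
std::int64_t time_ns(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

Usage would look something like `volatile std::uint64_t sink = 0; auto ns = time_ns([&] { sink = work(1000000); });`, where writing to a `volatile` keeps the compiler from optimizing the call away. Since nothing is pinned and preemption is left on, any time the thread spends descheduled lands in the measured interval too.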
https://www.intel.com/content/dam/www/public/us/en/documents...
3.2.1 The Improved Benchmarking Method