This is a microbenchmark: a tight loop that calls the function 500 million times, and the function itself just adds two numbers. That's pretty close to the best possible case for inlining a function. If that's what your program does, and it's a big part of what your program does, then inlining it may be a big performance win. Even then, inlining by hand may not be a worthwhile tradeoff, and, as others have pointed out, you could use a minimizer.
If your program is not CPU-bound, or if it doesn't have a function that's called millions of times in a tight loop, or if that loop doesn't account for the majority of CPU time, or if the function does more than execute a couple of instructions, then the performance difference will likely be far smaller.
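For anyone who wants to see the shape of the thing being argued about, here is a rough sketch of that kind of microbenchmark. It is not the code from the post: the names `add`, `sumWithCall`, `sumInlined`, and `timeMs` are made up for illustration, and the iteration count is cut well below 500 million so it finishes quickly. Whatever numbers you get depend heavily on the engine, which may well inline `add` on its own.

```typescript
// Hypothetical stand-in for the trivial function the benchmark calls.
function add(a: number, b: number): number {
  return a + b;
}

// Tight loop that goes through a function call on every iteration.
function sumWithCall(iterations: number): number {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total = add(total, 1);
  }
  return total;
}

// Same work with the addition written out by hand (manually inlined).
function sumInlined(iterations: number): number {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total = total + 1;
  }
  return total;
}

// Wall-clock time of a single run, in milliseconds.
function timeMs(fn: () => number): number {
  const start = Date.now();
  fn();
  return Date.now() - start;
}

// Assumed iteration count, far smaller than the 500 million in the post.
const iterations = 5000000;
console.log("with call:", timeMs(() => sumWithCall(iterations)), "ms");
console.log("inlined:  ", timeMs(() => sumInlined(iterations)), "ms");
```

Note that the loop body here is nothing but the call, which is exactly why any difference shows up so strongly. Put real work in the body and the call overhead becomes a much smaller slice of the total.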
Agreed that's interesting (in a mostly academic sense), but the thrust of the post is about using that fact to make choices in writing code, and there's an awful lot of discussion here about that idea. That's what I was addressing.