That's not really true - it's just (possibly partial) defunctionalisation. The problem isn't that we don't know how to do it, but that the whole-program compilation it requires has various drawbacks.
See Stalin and MLton for examples of static compilers performing such analyses.
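To make that concrete, here's a minimal OCaml sketch of what (possibly partial) defunctionalisation does: every function that can reach a call site becomes a constructor of a first-order datatype, and a single apply function dispatches on it. The Invert/Brighten transforms are made up for illustration; this isn't what Stalin or MLton literally emits.

    (* Before: the pixel transform is passed as an ordinary closure. *)
    let map_image (f : int -> int) (pixels : int array) : int array =
      Array.map f pixels

    (* After defunctionalisation: each possible transform becomes a constructor,
       and a captured free variable (the brighten amount) becomes a field. *)
    type transform =
      | Invert
      | Brighten of int

    let apply (t : transform) (p : int) : int =
      match t with
      | Invert -> 255 - p
      | Brighten delta -> min 255 (p + delta)

    let map_image_defun (t : transform) (pixels : int array) : int array =
      Array.map (apply t) pixels

Once every call goes through apply, the "unknown function" is just an ordinary match that a static compiler can analyse and specialise.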
Consider the case of a program which applies a processing function to pixels in an image. Which processing function to run depends on a command-line parameter. How would whole-program analysis help you know which function you are going to use? But a JIT will see that you keep calling the same function and inline it. Not even profile-directed feedback will help you if each time you run the program you use a different function.
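Roughly this shape, with a placeholder flag and placeholder transforms:

    (* The transform is chosen once from a command-line flag. A static compiler
       sees only some unknown [int -> int] at the call site; a JIT observes that
       the same function is called for every pixel and can inline it. *)
    let invert p = 255 - p
    let threshold p = if p > 128 then 255 else 0

    let () =
      let f =
        match Sys.argv.(1) with          (* e.g. "invert" or "threshold" *)
        | "invert" -> invert
        | _ -> threshold
      in
      let pixels = Array.init 1_000_000 (fun i -> i mod 256) in
      ignore (Array.map f pixels)        (* hot loop: every pixel goes through f *)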
I know Stalin and MLton but not the research you mention - can you point me at any papers?
It's true that whole-program compilation doesn't cover speculation (and many other cases of dynamism, like running code that you download or construct at runtime). But it does allow inlining through a function pointer as in the OP, which you suggested is impossible for a static compiler.
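To make that concrete with the pixel example above: flow analysis over the whole program shows that only invert and threshold can reach f, so the dispatch can be hoisted out of the loop and both branches inlined. A rough sketch of the transformation (not MLton's actual output):

    (* Both candidate transforms are known statically, so the indirect call is
       replaced by one dispatch outside the hot loop, with each branch inlined.
       Unlike a JIT, both branches get compiled; nothing is learned at runtime. *)
    let () =
      let pixels = Array.init 1_000_000 (fun i -> i mod 256) in
      ignore
        (match Sys.argv.(1) with
         | "invert" -> Array.map (fun p -> 255 - p) pixels
         | _ -> Array.map (fun p -> if p > 128 then 255 else 0) pixels)

The cost is the one mentioned upthread: you need the whole program in view at compile time, and you still can't speculate on which branch is actually hot.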
The classic paper on defunctionalisation is Reynolds' "Definitional Interpreters for Higher-Order Programming Languages". There's also a huge whack of papers at http://mlton.org/References, some of which go into MLton's compilation strategy (I don't remember which ones to point you at, though).