Sure! I tried pybind11, and some other things. cppyy was the first I tried that didn't give me any trouble. I've been using it pretty heavily for about a year, and still no trouble.
It seems like you might be able to enable some optimizations with EXTRA_CLING_ARGS. Since it's based on cling, it's probably subject to whatever limitations cling has.
To be honest, I don't know much about the speed, as my use-case isn't speeding up slow code.