Beyond the reader and the compiler, that kind of threaded code really helps performance: it lets the CPU's branch target buffer predict the branches of interpreted code well, and good branch prediction is one of the key factors in getting current CPUs to perform well.
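
To make the branch-prediction point concrete, here is a minimal sketch of threaded dispatch for a toy stack machine. It is an illustration under stated assumptions, not the implementation being discussed: the opcode names, the `run_threaded` function, and the fixed 64-slot stack are all hypothetical, and the code relies on the GCC/Clang "labels as values" extension, which is not standard C. The idea it shows is that every handler ends in its own indirect jump, so the branch target buffer can keep a separate prediction per handler rather than one shared prediction for a central `switch`.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny stack machine (illustrative names). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs the bytecode and returns the top of the stack at OP_HALT.
   Assumes well-formed code: no stack bounds checks are performed.
   Requires the GCC/Clang "labels as values" extension. */
long run_threaded(const long *code)
{
    /* One handler address per opcode, indexed by opcode number. */
    static void *const dispatch[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[64];
    size_t sp = 0;
    const long *pc = code;

/* Each handler ends with its own indirect jump to the next handler,
   so each jump site gets its own branch target buffer entry. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();

do_push:
    stack[sp++] = *pc++;        /* operand follows the opcode */
    NEXT();
do_add:
    sp--;
    stack[sp - 1] += stack[sp]; /* pop two, push the sum */
    NEXT();
do_mul:
    sp--;
    stack[sp - 1] *= stack[sp]; /* pop two, push the product */
    NEXT();
do_halt:
    return stack[sp - 1];
#undef NEXT
}
```

Calling `run_threaded` on the sequence `OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT` evaluates (2 + 3) * 4. With a single `switch` in a loop, all opcodes would instead share one indirect branch, which is harder to predict on CPUs that key their prediction on the branch address.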