In addition to the other answers, "closer to the machine" isn't necessarily true either.
The obvious implementation of a register file for an interpreter is using an array of registers, something like:
uintptr_t registers[N];
You can then access the i-th register as registers[i], like the machine does. Except that these registers live in memory, not in CPU registers, so you aren't really that close after all! And the compiler cannot in general map these array entries to CPU registers, because they are accessed indirectly, through an index that is only known at run time. Also, the number of virtual registers typically differs from the number of registers the actual CPU has.
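To make that concrete, here is a minimal sketch of such an interpreter loop. The instruction layout, the opcode names (OP_LOADI, OP_ADD, OP_HALT), NUM_REGS and run_register_vm are all made up for illustration; they are not taken from any real VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_REGS = 8 };                 /* hypothetical register count */
enum { OP_LOADI, OP_ADD, OP_HALT };    /* hypothetical opcodes */

/* Hypothetical fixed-width instruction: opcode plus three register indices. */
typedef struct {
    uint8_t   op;
    uint8_t   dst, a, b;   /* indices into registers[] */
    uintptr_t imm;         /* immediate operand, used by OP_LOADI only */
} Insn;

/* Runs code until OP_HALT and returns the value of the register named by
 * the halting instruction's `a` field. */
uintptr_t run_register_vm(const Insn *code)
{
    uintptr_t registers[NUM_REGS] = {0};

    for (const Insn *pc = code; ; pc++) {
        assert(pc->dst < NUM_REGS && pc->a < NUM_REGS && pc->b < NUM_REGS);
        switch (pc->op) {
        case OP_LOADI:
            registers[pc->dst] = pc->imm;
            break;
        case OP_ADD:
            /* The indices come from the bytecode at run time, so each
             * operand is an indexed access into the array in memory,
             * not something the C compiler can pin to a CPU register. */
            registers[pc->dst] = registers[pc->a] + registers[pc->b];
            break;
        case OP_HALT:
            return registers[pc->a];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

Every "register" operand here turns into a load or store on the registers array, which is the point above: the virtual registers are just memory slots with small indices.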
So the advantages of register machines do not come from this theoretical closeness to the machine. Register machines can still be faster, though, because you eliminate a lot of bytecode whose only job is shuffling data around. I think the canonical source for the possible speedups is "Virtual Machine Showdown: Stack vs. Registers": http://www.usenix.org/events/vee05/full_papers/p153-yunhe.pd...
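A rough illustration of the "shuffling" point, as a sketch with invented opcodes (S_LOAD, S_STORE, S_ADD, R_ADD and so on are hypothetical, not from the paper): the same statement x = y + z is run once as stack code and once as register code, and each interpreter counts how many instructions it had to dispatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { S_LOAD, S_STORE, S_ADD, S_HALT };   /* hypothetical stack opcodes */
enum { R_ADD, R_HALT };                    /* hypothetical register opcodes */
enum { STACK_MAX = 16 };

typedef struct { uint8_t op; uint8_t slot; } StackInsn;
typedef struct { uint8_t op; uint8_t dst, a, b; } RegInsn;

/* Stack interpreter: operands must be pushed and results popped explicitly.
 * Returns the number of instructions dispatched, including the halt. */
size_t run_stack(const StackInsn *code, uintptr_t *locals)
{
    uintptr_t stack[STACK_MAX];
    size_t sp = 0, dispatched = 0;

    for (;; code++) {
        dispatched++;
        switch (code->op) {
        case S_LOAD:                       /* push a local */
            assert(sp < STACK_MAX);
            stack[sp++] = locals[code->slot];
            break;
        case S_STORE:                      /* pop into a local */
            assert(sp > 0);
            locals[code->slot] = stack[--sp];
            break;
        case S_ADD:                        /* pop two, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        default:                           /* S_HALT */
            return dispatched;
        }
    }
}

/* Register interpreter: locals double as virtual registers, so one
 * instruction names both sources and the destination. */
size_t run_reg(const RegInsn *code, uintptr_t *locals)
{
    size_t dispatched = 0;

    for (;; code++) {
        dispatched++;
        switch (code->op) {
        case R_ADD:
            locals[code->dst] = locals[code->a] + locals[code->b];
            break;
        default:                           /* R_HALT */
            return dispatched;
        }
    }
}
```

With locals {x, y, z}, the stack version needs LOAD y, LOAD z, ADD, STORE x before the halt, while the register version needs a single ADD x, y, z. Both touch memory for every operand; the difference is in how many times you go around the dispatch loop.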
This paper also gives a very nice copy propagation algorithm that eliminates much of the need for a real register allocator, if you start from stack-based input code.