I recall it was a microprogrammed processor - a set of chips, including a ROM, 'executed' the instruction set. More of an emulator than a processor. So maybe there was no place in that design for a cache?
Possibly - the design started back in 1975, when we were still figuring out a lot of how processors should be designed. That's also far enough back that, frankly, I don't have much intuition for what people were thinking.
By the way, I am struck by how close their object model sounds to the JVM's. A major difference, though, is that the JVM can do extra work at runtime to figure out when it is profitable and safe to throw away the overhead caused by safety and isolation - JITing. Then it can just execute instructions optimized for the hardware, not constrained by the object model. When your hardware maps directly to your object model, you can't do that.
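As a toy illustration of that point (my own sketch, not anything from the 432 or a specific JIT's internals): in a counted loop a JIT like HotSpot can prove the index always stays in bounds and drop the per-access check entirely, whereas hardware that enforces the object model performs the equivalent check on every single memory reference with no way to opt out.

```java
// Sketch: a counted loop where a JIT can prove 0 <= i < a.length
// and eliminate the per-element bounds check. On 432-style hardware
// the check is part of every memory reference and cannot be removed.
public class JitSketch {
    static long sum(int[] a) {
        long s = 0;
        // The loop bound is a.length, so each a[i] is provably in
        // bounds; an optimizing JIT compiles this to a raw loop.
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = new int[1000];
        for (int i = 0; i < a.length; i++) a[i] = i;
        System.out.println(sum(a));
    }
}
```

The source-level semantics (an out-of-bounds access must still throw) are preserved either way; the JIT just proves the check is redundant here, which is exactly the kind of runtime judgment a fixed hardware object model forecloses.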
The IBM 801 project started around the same time, but its proto-RISC approach was kind of the anti-432. The 801 papers give a pretty readable explanation of "why we are proposing the opposite of what everyone else is doing".
I don't. I asked my brother but he just reminisced:
"I was at Intel as that project ramped up. One of my friends moved to Portland to work on it. I tried to move, but they didn't want any system software people. If I had, my life would have gone a totally different path...