That is very cool. I'm particularly interested in the compressed-first approach because I have some projects where minimising BRAM usage is paramount, so code density really matters. The use of microcode to emulate 32-bit instructions reminds me a lot of ZPU (I still have a soft spot for that architecture) - was that an influence?
I've heard of the ZPU in passing but never looked at it in much detail - I didn't realize there was a GCC back-end for these machines. James Bowman's J1 CPU [0] is also stack-based and has definitely helped shape my preferences.