Isn't the problem the lack of advanced features for executing the current ISA with speed? I thought RISC-V chips seen in the wild do not pipelining, out-of-order, register renaming, multiple int/float/logic units, speculation, branch predictors, multi-tier caching, etc. The lack of speed isn't really related to a few missing vector instructions.