It’s difficult to write a fast interpreter in Go. The compiler doesn’t optimize interpreter inner loops particularly well, you can’t do the data structure tricks you can do in C, and there is no JIT compiler to lean on, as there is in Java.
The best way forward might be an option to compile some Joker code by generating Go code for it.
(This is based on writing an interpreter a few years ago. Perhaps Go’s compiler has improved?)
Just speculating on what GP specifically meant, but Go lacks the ability to pack structs, and probably lacks the equivalent of tricks like computed gotos [0] to increase bytecode interpreter speeds. In general, Go seems to (intentionally) lack a lot of low-level control of code generation, preferring a "there's one way to do things, and it either works, or it's a bug we'll fix" approach.
Which is probably for the best in "most" software, but interpreters typically use weird hacks to squeeze more performance out of the rock. Wren is a good example [1] of some of these optimizations, and has splendid comments/documentation.
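For context, here is roughly what the portable alternative to computed gotos looks like in Go: a switch inside a for loop, with every instruction going through the same dispatch point. This is only a sketch; the opcode names and the tiny stack machine are made up for illustration and are not taken from Joker or Wren.

```go
package main

import "fmt"

// Opcodes for a hypothetical stack machine (illustrative names only).
const (
	opPush byte = iota // push the next byte as an integer
	opAdd              // pop two values, push their sum
	opMul              // pop two values, push their product
	opHalt             // stop and return the top of the stack
)

// run dispatches with a plain switch in a loop. Every instruction
// returns to the top of the loop before the next one is decoded,
// which is the shared dispatch point that computed gotos avoid in C.
func run(code []byte) int {
	var stack []int
	pc := 0
	for {
		op := code[pc]
		pc++
		switch op {
		case opPush:
			stack = append(stack, int(code[pc]))
			pc++
		case opAdd:
			n := len(stack)
			stack[n-2] += stack[n-1]
			stack = stack[:n-1]
		case opMul:
			n := len(stack)
			stack[n-2] *= stack[n-1]
			stack = stack[:n-1]
		case opHalt:
			return stack[len(stack)-1]
		default:
			panic(fmt.Sprintf("unknown opcode %d", op))
		}
	}
}

func main() {
	// (2 + 3) * 4
	prog := []byte{opPush, 2, opPush, 3, opAdd, opPush, 4, opMul, opHalt}
	fmt.Println(run(prog)) // prints 20
}
```

Whether the compiler turns that switch into a jump table or a chain of comparisons is up to the toolchain; the source gives you no way to ask for one or the other.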
For one thing you have C unions. You can store different types in the same place and use information stored elsewhere to know what the bits really mean. It’s horribly unsafe compared to a Go interface or the tagged unions in other languages, but uses less space.
Also, there is the NaN boxing trick [1], which allows you to store a value that is essentially a tagged union that is either a double, a pointer, or an integer, in the same space as a double. To get these in a safe language, they essentially have to be built in already.
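To make the idea concrete, here is a minimal sketch of NaN boxing in Go that covers only doubles and 32-bit integers. The tag bit and all the names (Value, FromInt, and so on) are my own choices, not a standard layout. Pointers are deliberately left out: a pointer hidden inside a uint64 is invisible to Go's garbage collector, which is the part you can't do safely.

```go
package main

import (
	"fmt"
	"math"
)

// Value is a NaN-boxed word: either a float64 or a 32-bit integer.
type Value uint64

const (
	// Quiet NaN: exponent bits all ones plus the top mantissa bit.
	qnan uint64 = 0x7ff8000000000000
	// Hypothetical tag bit inside the NaN payload marking an integer.
	tagInt  uint64 = 0x0001000000000000
	intMask        = qnan | tagInt
)

// FromFloat stores a double as-is, except that every NaN is replaced
// by one canonical quiet NaN so its payload can never look like a tag.
func FromFloat(f float64) Value {
	if f != f { // true only for NaN
		return Value(qnan)
	}
	return Value(math.Float64bits(f))
}

// FromInt stores the integer in the low 32 bits of a tagged NaN.
func FromInt(i int32) Value { return Value(intMask | uint64(uint32(i))) }

// IsInt reports whether the word carries the integer tag.
func (v Value) IsInt() bool { return uint64(v)&intMask == intMask }

// Int returns the low 32 bits; only meaningful when IsInt is true.
func (v Value) Int() int32 { return int32(uint32(v)) }

// Float reinterprets the word as a double; only meaningful when IsInt is false.
func (v Value) Float() float64 { return math.Float64frombits(uint64(v)) }

func main() {
	a, b := FromInt(-7), FromFloat(1.5)
	fmt.Println(a.IsInt(), a.Int())   // prints: true -7
	fmt.Println(b.IsInt(), b.Float()) // prints: false 1.5
}
```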
Go’s interface type has very high overhead compared to the union types used in other languages: an interface value is two words, and an entire pointer (to the type information) serves as the tag. Though it only matters in special cases like interpreters. Most performance-intensive code isn’t nearly so dynamic.
Yes, the problem is that you still need pointers, and furthermore, pointers to different kinds of objects, of different sizes. Also, Go’s garbage collector needs to know about the pointers.
You could represent the heap as an array, use array offsets instead of pointers, and do your own garbage collection, but that’s pretty low level and you might as well compile to WebAssembly.
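A minimal sketch of the array-as-heap idea, assuming a heap that only holds cons-style cells; Ref, Heap, and Cons are hypothetical names. It shows offsets standing in for pointers and nothing more: there is no collector here, so cells are never reclaimed.

```go
package main

import "fmt"

// Ref is an index into Heap.cells; index 0 is reserved to mean nil.
type Ref uint32

// cell holds a value and a link to the next cell, by index rather
// than by pointer, so Go's garbage collector sees no pointers in it.
type cell struct {
	value int64
	next  Ref
}

// Heap is one flat slice that owns every cell.
type Heap struct {
	cells []cell
}

// NewHeap returns a heap with slot 0 taken, so Ref(0) can mean nil.
func NewHeap() *Heap { return &Heap{cells: make([]cell, 1)} }

// Cons allocates a new cell at the end of the slice and returns its index.
func (h *Heap) Cons(v int64, next Ref) Ref {
	h.cells = append(h.cells, cell{v, next})
	return Ref(len(h.cells) - 1)
}

// Sum walks the list starting at r and adds up the values.
func (h *Heap) Sum(r Ref) int64 {
	var total int64
	for ; r != 0; r = h.cells[r].next {
		total += h.cells[r].value
	}
	return total
}

func main() {
	h := NewHeap()
	list := h.Cons(1, h.Cons(2, h.Cons(3, 0)))
	fmt.Println(h.Sum(list)) // prints 6
}
```

Reclaiming dead cells would mean writing your own mark-and-sweep or copying pass over the slice, which is where the "pretty low level" part comes in.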