Well, Go and Rust can do better threading on this problem than a not-really-optimized C code with OpenMP. Thats a start. Message passing has to be shown next, and then one can decide whether the installation and runtime of Go/Rust/Python/Julia on something like an IBM Blue Gene/Q with its custom compute node OS is easier than a link flag.