Hah, that was me ~12 years ago, trying to get WebCL (OpenCL) through the same gatekeepers. Meanwhile, in Python, we are doing multi-node multi-GPU. Maybe OpenAI's and soon Apple's success with LLMs will change the economics for them.
This is why I don't like Khronos APIs, even though those are actually the ones I know relatively well. The way they work ends up being a much worse experience than writing backend-specific plugins ourselves with much better tooling. The extension spaghetti also ultimately doesn't save us from multiple code paths anyway, given the differences between some of those extensions.
To pick your example, something like PyTorch ends up being a much better developer experience, similar to game engines, than relying on Khronos APIs.
https://registry.khronos.org/webgl/specs/latest/2.0-compute/
https://github.com/9ballsyndrome/WebGL_Compute_shader/issues...