A problem that's getting bigger is accelerator diversity.
The choices for regular devs used to be:
- CUDA
- (unoptimized) GPU shaders
- Some weird proprietary block that only does weird proprietary things, like mobile NPUs or the old Intel blocks.
But now we have proper matrix instructions in AMD/Intel GPUs, proper AMD/Intel NPUs in their laptops, Apple GPUs and NPUs, a growing number of reasonably affordable and increasingly ergonomic non-GPU cloud accelerators...