Hacker News
ML serving is about optimization and portability (hamel.dev)
5 points by cgwu on Feb 16, 2023 | 1 comment



A problem that's getting bigger is accelerator diversity.

The choices for regular devs used to be:

- CUDA

- (unoptimized) GPU shaders

- Some weird proprietary block that only does weird proprietary things, like mobile NPUs or the old Intel blocks.

But now we have proper matrix instructions in AMD/Intel GPUs, proper AMD/Intel NPUs in their laptops, Apple GPUs and NPUs, a growing number of reasonably affordable and increasingly ergonomic non-GPU cloud accelerators...



