Hacker News

Is GPU assembly an actually in-demand skill?





So I would say skill at GPU assembly is in demand for the elite tier of GPU performance work. That doesn't necessarily mean writing much of it (though see [1] for an example: the kernel of multisplit as used in Nvidia's Onesweep implementation), but it definitely means being able to read it, so you can understand what the compiled code is actually doing. As further evidence, I'll cite the incredible work of the engineers on Nanite. They describe writing the core of the microtriangle software renderer in HLSL, then analyzing the assembly output to optimize down to the cycle level, in their "deep dive into Nanite virtualized geometry" talk [2] (the timestamp points to the reference to instruction-level micro-optimization).

[1]: https://github.com/NVIDIA/cccl/blob/2d1fa6bc9235106740d9373c...

[2]: https://www.youtube.com/watch?v=eviSykqSUUw&t=2073s


Only as a debugging skill: it might be helpful when going through graphical debuggers for GPUs, but that is about it.

Each card model is a snowflake in what it supports, which is why dynamic compilers are used.


In games, only one console vendor allows you to write shaders in asm, though it is not very productive, especially with RDNA. Reading the compiler output is a good skill to have, however, for teasing the compiler into better register usage, reducing divergence, identifying problematic folded math, and debugging live GPU hangs.

It is in demand in China and other places where you want to squeeze all the performance out of GPUs from a generation or two ago. But it's not a portable skill set (Google won't hire you), so be careful what you wish for...


