
Designing it is easy and always has been. Programming it is the bottleneck. Otherwise Nvidia wouldn't be in the lead.


But programming it is just "import torch" - nothing Nvidia-specific there.
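To make that concrete, here's a toy sketch (my own example, not from the article) of how user-level PyTorch code stays hardware-agnostic - the only vendor-specific bit is the device string:

    import torch

    # device-agnostic user code: swap the device string, nothing else changes
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(4, 8, device=device)
    w = torch.randn(8, 2, device=device)
    y = x @ w  # dispatched to whatever backend implements matmul for that device

Everything vendor-specific lives below that line, in whichever kernels back the device.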

The mainstream press is very impressed by CUDA, but at least for AI (and this article is exclusively about AI), it's not the right interface.

And in fact, Nvidia's lead, to the extent it exists, is because they pushed tensor hardware earlier.


Someone does, in fact, have to implement everything underneath that `import` call, and that work is _very_ hard to do for things that don't closely match Nvidia's SIMT architecture. There's a reason people don't like using dataflow architectures, even though from a pure hardware PoV they're very powerful -- you can't map CUDA's, or PyTorch's, or TensorFlow's model of the world onto them.
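To make "everything underneath" concrete, here's a rough Python-level sketch (my example; the "myhw" namespace and op are made up) of what registering even one kernel for one backend looks like via torch.library - an accelerator vendor has to do the equivalent, usually in C++, for every op and dispatch key:

    import torch

    # declare a new operator in a custom namespace
    lib = torch.library.Library("myhw", "DEF")
    lib.define("scaled_add(Tensor a, Tensor b, float alpha) -> Tensor")

    # provide the actual kernel for one backend; a new accelerator would
    # register its own implementation under its own dispatch key
    def scaled_add_cpu(a, b, alpha):
        return a + alpha * b

    lib.impl("scaled_add", scaled_add_cpu, "CPU")

    x = torch.ones(3)
    y = torch.arange(3.0)
    print(torch.ops.myhw.scaled_add(x, y, 0.5))

And that's the easy case: if the hardware isn't SIMT-shaped, writing the kernels themselves is where the real pain is.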


I'm talking about adding PyTorch support for your special hardware.

Nvidia's lead is due to them having PyTorch support.


Eh, if you're running in production you'll want something lower level and faster than eager PyTorch.
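For example (a quick sketch of one common route, not the only one): compile the model, or export it to a separate serving runtime, instead of running eager PyTorch in the hot path:

    import torch

    model = torch.nn.Linear(16, 4).eval()
    example = torch.randn(1, 16)

    # JIT-compile the eager model into fused kernels (PyTorch 2.x)
    fast_model = torch.compile(model)
    fast_model(example)  # first call triggers compilation

    # or export it so a lower-level runtime (ONNX Runtime, TensorRT, ...) serves it
    torch.onnx.export(model, (example,), "linear.onnx")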



