I've been critical of AMD's failure to compete in AI for over a decade now, but ...

matchagaucho · 2024-02-12T16:39:41 1707755981

But AMD was formed to shadow Intel's x86?

atq2119 · 2024-02-12T18:49:13 1707763753

AMD was founded at almost the same time as Intel. X86 didn't exist at the time.

But yes, AMD was playing the "follow x86" game for a long time until they came up with x86-64, which evened the playing field in terms of architecture.

modeless · 2024-02-12T16:43:44 1707756224

ISAs are smaller and less stateful and better documented and less buggy and most importantly they evolve much more slowly than software APIs. Much more feasible to clone. Especially back when AMD started.

paulmd · 2024-02-12T16:57:10 1707757030

PTX is just an ISA too. Programming languages annd ISA representations are effectively fungible, that’s the lesson of Microsoft CLR/Intermediate Language and Java too. A “machine” is a hardware and a language.

modeless · 2024-02-12T17:00:52 1707757252

PTX is not a hardware ISA though, it's still software and can change more rapidly.

paulmd · 2024-02-12T17:33:06 1707759186

Not without breaking the support contract? If you change PTX format then CUDA 1.0 machines can no longer it and it's no longer PTX.

Again, you are missing the point. Java is both a language (java source) and a machine (the JVM). The latter is a hardware ISA - there are processors that implement Java bytecode as their ISA format. Yet most people who are running Java are not doing so on java-machine hardware, yet they are using the java ISA in the process.

https://en.wikipedia.org/wiki/Java_processor

https://en.wikipedia.org/wiki/Bytecode#Execution

any bytecode is an ISA, the bytecode spec defines the machine and you can physically build such a machine that executes bytecode directly. Or you can translate via an intermediate layer, like how Transmeta Crusoe processors executed x86 as bytecode on a VLIW processor (and how most modern x86 processors actually use RISC micro-ops inside).

these are completely fungible concepts. They are not quite the same thing but bytecode is clearly an ISA in itself. Any given processor can choose to use a particular bytecode as either an ISA or translate it to its native representation, and this includes both PTX, Java, and x86 (among all other bytecodes). And you can do the same for any other ISA (x86 as bytecode representation, etc).

furthermore, what most people think of as "ISAs" aren't necessarily so. For example RDNA2 is an ISA family - different processors have different capabilities (for example 5500XT has mesh shader support while 5700XT does not) and the APUs use a still different ISA internally etc. GFX1101 is not the same ISA as GFX1103 and so on. These are properly implementations not ISAs, or if you consider it to be an ISA then there is also a meta-ISA encompassing larger groups (which also applies to x86's numerous variations). But people casually throw it all into the "ISA" bucket and it leads to this imprecision.

like many things in computing, it's all a matter of perspective/position. where is the boundary between "CMT core within a 2-thread module that shares a front-end" and "SMT thread within a core with an ALU pinned to one particular thread"? It's a matter of perspective. Where is the boundary of "software" vs "hardware" when virtually every "software" implementation uses fixed-function accelerator units and every fixed-function accelerator unit is running a control program that defines a flow of execution and has schedulers/scoreboards multiplexing the execution unit across arbitrary data flows? It's a matter of perspective.

modeless · 2024-02-12T19:30:10 1707766210

You are missing the point. PTX is not designed as a vendor neutral abstraction like JVM/CLR bytecode. Furthermore CUDA is a lot more than PTX. There's a whole API there, plus applications ship machine code and rely on Nvidia libraries which can be prohibited from running on AMD by license and with DRM, so those large libraries would also become part of the API boundary that AMD would have to reimplement and support.

Chasing CUDA compatibility is a fool's errand when the most important users of CUDA are open source. Just add explicit AMD support upstream and skip the never ending compatibility treadmill, and get better performance too. And once support is established and well used the community will pitch in to maintain it.