NVidia is making boatloads of money because their driver works and they have a software library called CUDA that accelerates neural networks.
Nobody expects AMD to match them, but George thought they could at least write a GPU driver. If AMD can get that driver out, then George could provide a competitor to CUDA (for neural networks only). They'd both make boatloads of money.
However, AMD was less capable than expected and their drivers were too buggy to run neural networks like those needed for the MLPerf benchmark. So now, it appears that AMD, Tinybox, and investors like me won't be making boatloads of money.
>However, AMD was less capable than expected and their drivers were too buggy to run neural networks like those needed for the MLPerf benchmark. So now, it appears that AMD, Tinybox, and investors like me won't be making boatloads of money.
This is where the melodrama kicks in.
They reverse-reversed course less than a week later and now they're back on AMD again.
He talks in his streams about how horrible the communication he’s received from AMD is. From what I listened to, that seemed to be the bigger reason he was giving up initially. Why do a bunch of free work for a massive company that won’t even communicate with you? Especially when he’s doing them a massive favor for next to no investment on their end?
There's a chance they (maybe specifically the lawyers) know something we don't. I mean, maybe they are absurdly incompetent in listening to feedback, while at the same time achieving technically great things in hardware. But after so many years and seeing all the AI money going to the competitor... That seems less and less likely every day.
"Organizationally unable to make competent software, perfectly able to make great hardware" seems to be the common case with hardware companies, if not the de facto standard. Exceptions are rare.
Apart from "trying to implement this will cost us more in CUDA API copying lawsuits than it could earn", I don't know.
But it's not just that they can't make competent software. It's that everyone tells them they should try, that it looks like a pile of money ready to pick up, that people try doing it on their own... and AMD does nothing. They're not even taking the chance to fail/succeed. Can you imagine that Lisa Su doesn't get asked about this at least once a week?
One individual doesn't suffer from the problem of being pulled in multiple different directions by multiple people. A company is not typically led by one person "dictator-style" but instead groups of people who try to make decisions together, sometimes not agreeing.
Sure, but I don't think this is "too many cooks in the kitchen," I think it's the opposite: hardware companies tend to be structurally incapable of spending as much as they should on software because everyone in the hardware space has the same bias. The economics of the space select for it in the short term and against it in the long term, creating the neverending foot-gun party we observe.
AMD has simply never invested in software. Their code has been atrocious since even before AMD bought ATI. ATI "Catalyst Control Center" was their consumer driver code before Vista and into Windows 7 IIRC, and that was utter trash. Granted, nVidia's drivers were ALSO trash back then, accounting for literally 65% of ALL Vista BSODs.
nVidia decided to redouble their efforts, and now they might still crash occasionally, but are largely way better at driver stability and they brought CUDA into the world at the same time.
AMD decided that shitty software didn't seem to stop them from selling GPUs. They were also too busy desperately surviving a decade of Intel anti-competitive practices that nearly killed the company, and bet everything on Ryzen. They also worked to make pretty good physical GPU hardware. Meanwhile, their GPUs still couldn't run Blender as fast as a similarly specced nVidia card because their OpenCL implementation was god awful. They ran at literally half the render speed of a similar nVidia GPU. It was stuck on OpenCL 1.x the whole time, because the 2.x implementation was literally broken. They nearly didn't have ANY hardware render solution for an update to the Blender rendering engine in 3.x because OpenCL 1.x literally couldn't do what they wanted, and ROCm is a joke. AMD engineers helped put together an emergency, late-breaking fix to create a HIP implementation, and that works, at least mostly.
My pet theory is that not only does AMD not give a fuck about software, but they saw how nVidia was struggling with market segmentation from consumer cards being effective compute cards, and didn't want to run into those same struggles if they had a real CUDA competitor. Instead, they got to rub nVidia's face into the dirt with their GPUs that had way more VRAM, and not worry that it would chew into their professional GPU profit margins, because you can't compute on consumer cards. Oh, I forgot to mention, the whole time this nonsense is going on, AMD is pushing really hard to get their professional GPUs into Supercomputer clusters, and has several premier supercomputer implementations where their GPUs have no problem being used for top level compute tasks, almost like they CAN actually write GPU compute software and just don't give it to consumers.
Slight mistake in your description: CUDA is an out of date API that was replaced by Khronos's own official compute APIs. Khronos is a standards consortium that Nvidia is a founding member of.
Although the marketing department at Nvidia still pushes for greenfield CUDA codebases, no new code should be written in it, and they should opt for open source international standards only. Khronos APIs are implemented by over 120 vendors.
In my past experience at Khronos, NVIDIA is indeed a member and they sent decent guys to the meetings -- but only for strategic reasons, rather than "advocating for open standards" as you described. My experiences there actually told me the opposite: they will never drop CUDA. Objectively, they also have an incentive to keep it: fighting with 100-ish companies to ratify something is always slower than rolling out a feature in an ecosystem you have total control of.
How many standards has Khronos endorsed over the years/decades for compute? From an uneducated, external view, it seems there is a new standard every 2-5 years.
Ultimately 2: OpenCL and Vulkan (via its compute shaders).
SYCL's job isn't that; it's meant to abstract implementations of common components across different kinds of hardware, and it doesn't force you into any particular style of impl. As in, I could write a component for SYCL for my GPU in OpenCL, and what SYCL would abstract away from the consumer of my component would be the entire usage of OpenCL itself; but I could also write a component for a DSP that uses an entirely closed source, entirely opaque SDK for that hardware, and a SYCL user could use that impl for that function if they owned that DSP (instead of a CPU-based or GPU-based impl).
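To make the pattern described above concrete, here's a rough sketch in plain Python. This is not the SYCL API (SYCL itself is C++), and every name in it (`register`, `pick_backend`, `vector_add`, the backend labels) is made up for illustration. It only shows the idea of one operation with several interchangeable backend impls, where the caller never sees which SDK sits underneath.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical registry: backend name -> implementation of the same operation.
VectorAdd = Callable[[Sequence[float], Sequence[float]], List[float]]
_BACKENDS: Dict[str, VectorAdd] = {}


def register(name: str) -> Callable[[VectorAdd], VectorAdd]:
    """Decorator that files an implementation under a backend name."""
    def wrap(fn: VectorAdd) -> VectorAdd:
        _BACKENDS[name] = fn
        return fn
    return wrap


@register("cpu")
def _add_cpu(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [x + y for x, y in zip(a, b)]


@register("gpu_opencl")
def _add_gpu(a: Sequence[float], b: Sequence[float]) -> List[float]:
    # Stand-in only: a real impl would build and enqueue an OpenCL kernel here.
    return [x + y for x, y in zip(a, b)]


def pick_backend(available: Sequence[str],
                 preference: Sequence[str] = ("dsp_vendor_sdk", "gpu_opencl", "cpu")) -> str:
    """Return the most preferred backend that is both present and registered."""
    for name in preference:
        if name in available and name in _BACKENDS:
            return name
    raise RuntimeError("no usable backend")


def vector_add(a: Sequence[float], b: Sequence[float],
               available: Sequence[str] = ("cpu",)) -> List[float]:
    """Caller-facing entry point: same call regardless of the hardware underneath."""
    return _BACKENDS[pick_backend(available)](a, b)
```

A caller just writes `vector_add(a, b, available=...)`. Whether that ends up in OpenCL, a closed vendor SDK, or a CPU loop is the implementer's problem, which is the point being made about the abstraction.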
Also, Vulkan's compute doesn't replace OpenCL (not even in the sense that Vulkan, as a graphics API, replaces OpenGL). They're different levels of abstraction. Most Vulkan games are written almost entirely in compute shaders (ex: the powerhouse that is the Doom 2016 and Doom Eternal engines; and why they perform so fucking amazingly on paltry hardware like the original revision Xbox One, or hell, even the Switch).
In addition, I almost consider DX12 a flavor of Vulkan. Same job, written largely by the same people from the same companies, but instead of being OpenGL C-dialect flavored, it's D3D C++-dialect flavored. They both have entirely equivalent APIs that often call the same driver internals and produce nearly identical MIR. Microsoft did this on purpose to reflect the nature of how modern GPUs are almost entirely software renderers, sans certain parts of the texture units.