>I'm not a big fan of $150K high-end servers filled with $5000 GPUs that can be bested with clever code on a $25K server filled with $1200 consumer GPUs. But I am a huge fan of charging what you can while you are unopposed. It's just that I think that state is temporary.
There is virtually no "enterprise"-grade product that can't be made at least 50% cheaper (or sometimes 10x...) with off-the-shelf, hacked consumer-grade hardware....
Enterprise products always carry a pretty steep markup, but what you lose with those $1200 GPUs is both features (e.g. virtualization, thin provisioning, DMA/CUDA Direct, etc.) and support.
When you buy a $5000 CPU over a $500 one with the same performance, what you pay for is reliability and support. If you don't care about that, fine, but when you need to launch a $100M service on top of that platform you won't really care about the price tag; it's all in the cost of doing business.
Virtualization? Don't care; in fact, virtualization is what disabled P2P copies and created craptastic upload/download performance on AWS until the P2 instances.
DMA/CUDA Direct? Say hello to P2P and staged MPI transfers: faster, cheaper, and usually better. Know your PCIe tree FTW.
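The "know your PCIe tree" point can be sketched with a toy model (the topology dict, GPU/switch names, and chunk size below are all hypothetical, not from any real system): GPUs hanging off the same PCIe switch can do a cheap direct P2P copy, while pairs that only meet at the root complex get staged through host memory in chunks.

```python
# Toy sketch of picking a transfer path from a simplified PCIe topology.
# The topology and device names are made up for illustration.

PCIE_TREE = {            # gpu -> the PCIe switch it hangs off
    "gpu0": "switch0",
    "gpu1": "switch0",   # shares a switch with gpu0 -> P2P is cheap
    "gpu2": "switch1",   # only meets gpu0 at the root complex
}

def can_p2p(a, b):
    """P2P copies are fast when both GPUs sit under the same switch."""
    return PCIE_TREE[a] == PCIE_TREE[b]

def staged_copy(src, chunk=4):
    """Stage a buffer through host memory in chunks. Real code would
    double-buffer so copy-in overlaps copy-out; this is serialized."""
    dst = bytearray()
    for off in range(0, len(src), chunk):
        host_buf = src[off:off + chunk]   # device -> host staging buffer
        dst += host_buf                   # host -> destination device
    return bytes(dst)

def transfer(src_gpu, dst_gpu, payload):
    if can_p2p(src_gpu, dst_gpu):
        return payload                    # direct P2P copy
    return staged_copy(payload)           # fall back to host staging
```

In real CUDA code the topology check would be `cudaDeviceCanAccessPeer` and the direct path `cudaMemcpyPeerAsync`; `nvidia-smi topo -m` shows the actual tree on a live box.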
Support? As someone who has been playing with GPUs for over a decade: bugs get fixed in the next CUDA release, Tesla or GeForce alike, if they get fixed at all.
$100M service? Yep I'm with you. But I prefer a world without a huge barrier to entry to building that service, especially a barrier built 99% on marketecture. I want to build on commodity hardware and deploy in the datacenter.
Unfortunately, sales types seem to hate that outlook.
I don't understand your argument, so maybe this is off base, but if you're saying people in industry aren't replacing their supercomputers with commodity GPUs, you're wrong: both Apple and Google have massive purchase orders for commodity NVIDIA GPUs because they aren't just cheaper, they're better at this application. And I imagine other companies are as well.
Edit: "replace" is probably not the right word, this is work that the old systems don't do well, but they aren't throwing out x86 racks for gpus of course. It's just instead of buying more of the same for machine learning applications.
They aren't buying consumer GPUs. They may not be buying NVIDIA's dedicated servers, but they aren't running GeForce chips either.
If nothing else, that's because you cannot virtualize GeForce-line GPUs; there's no CUDA Direct or NVLink support, etc.
If you're telling me that Google is buying GeForce GPUs and flashing them with a custom BIOS ripped off a Quadro card so they can do PCIe passthrough in a hypervisor and initialize the cards, then sorry, I'm not buying it.
Containers would imply there's no hypervisor involved: only a DRI device exposed by the kernel and bind-mounted into the namespace. You'd still need support for multiple contexts, but that doesn't require multiple (virtual) PCI devices or an IOMMU.
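That container path can be sketched in a couple of commands; the device path, rootfs path, and image name here are assumptions for illustration, not something from this thread:

```shell
# No hypervisor: the kernel's DRM render node is simply bind-mounted
# into the container's mount namespace. Device path and image name
# are hypothetical.
docker run --rm --device /dev/dri/renderD128 my-gpu-image

# Roughly the same thing by hand with a mount namespace (needs root):
#   unshare -m sh -c 'mount --bind /dev/dri /path/to/rootfs/dev/dri; ...'
```

The point is that the container only needs the device node and a driver in the host kernel; no virtual PCI device or IOMMU is involved.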