>I'm not a big fan of $150K high-end servers filled with $5000 GPUs that can be bested with clever code on a $25K server filled with $1200 consumer GPUs. But I am a huge fan of charging what you can while you are unopposed. It's just that I think that state is temporary.
There is virtually no "enterprise"-grade product that can't be made at least 50% cheaper (or sometimes 10x...) with off-the-shelf, hacked consumer-grade hardware....
Enterprise products always carry a pretty steep markup, but what you lose with those $1200 GPUs is both features (e.g. virtualization, thin provisioning, DMA/CUDA Direct, etc.) and support.
When you buy a $5000 CPU over a $500 one with the same performance, what you pay for is reliability and support. If you don't care about that, fine, but when you need to launch a $100M service on top of that platform you won't really care about the price tag; it's all in the cost of doing business.
Virtualization? Don't care; in fact, virtualization is what disabled P2P copies and created craptastic upload/download performance on AWS until the P2 instances.
DMA/CUDA Direct? Say hello to P2P and staged MPI transfers: faster, cheaper, and usually better. Know your PCIe tree FTW.
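The "know your PCIe tree" point can be sketched with a toy model (the topology dict, GPU/switch names, and chunk size below are all hypothetical, not from any real system): GPUs hanging off the same PCIe switch can do a cheap direct P2P copy, while pairs that only meet at the root complex get staged through host memory in chunks.

```python
# Toy sketch of picking a transfer path from a simplified PCIe topology.
# The topology and device names are made up for illustration.

PCIE_TREE = {            # gpu -> the PCIe switch it hangs off
    "gpu0": "switch0",
    "gpu1": "switch0",   # shares a switch with gpu0 -> P2P is cheap
    "gpu2": "switch1",   # only meets gpu0 at the root complex
}

def can_p2p(a, b):
    """P2P copies are fast when both GPUs sit under the same switch."""
    return PCIE_TREE[a] == PCIE_TREE[b]

def staged_copy(src, chunk=4):
    """Stage a buffer through host memory in chunks. Real code would
    double-buffer so copy-in overlaps copy-out; this is serialized."""
    dst = bytearray()
    for off in range(0, len(src), chunk):
        host_buf = src[off:off + chunk]   # device -> host staging buffer
        dst += host_buf                   # host -> destination device
    return bytes(dst)

def transfer(src_gpu, dst_gpu, payload):
    if can_p2p(src_gpu, dst_gpu):
        return payload                    # direct P2P copy
    return staged_copy(payload)           # fall back to host staging
```

In real CUDA code the topology check would be `cudaDeviceCanAccessPeer` and the direct path `cudaMemcpyPeerAsync`; `nvidia-smi topo -m` shows the actual tree on a live box.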
Support? As someone who has been playing with GPUs for over a decade: bugs get fixed in the next CUDA release, Tesla or GeForce alike, if they get fixed at all.
$100M service? Yep I'm with you. But I prefer a world without a huge barrier to entry to building that service, especially a barrier built 99% on marketecture. I want to build on commodity hardware and deploy in the datacenter.
Unfortunately, sales types seem to hate that outlook.
I don't understand your argument, so maybe this is off base, but if you're saying people in industry aren't replacing their supercomputers with commodity GPUs, you're wrong: both Apple and Google have massive purchase orders for commodity NVIDIA GPUs because they aren't just cheaper, they're better at this application. And I imagine other companies are as well.
Edit: "replace" is probably not the right word, this is work that the old systems don't do well, but they aren't throwing out x86 racks for gpus of course. It's just instead of buying more of the same for machine learning applications.
They aren't buying consumer GPUs. They may not be buying NVIDIA's dedicated servers, but they aren't running GeForce chips either.
If nothing else, that's because you cannot virtualize GeForce-line GPUs; there's no CUDA Direct or NVLink support, etc.
If you're telling me that Google is buying GeForce GPUs and flashing them with a custom BIOS ripped off a Quadro card so they can do PCIe passthrough in a hypervisor and initialize the cards, then sorry, I'm not buying it.
Containers would imply there's no hypervisor involved: only a DRI device exposed by the kernel and bind-mounted into the namespace. You'd still need support for multiple contexts, but that doesn't require multiple (virtual) PCI devices or an IOMMU.
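That container path can be sketched in a couple of commands; the device path, rootfs path, and image name here are assumptions for illustration, not something from this thread:

```shell
# No hypervisor: the kernel's DRM render node is simply bind-mounted
# into the container's mount namespace. Device path and image name
# are hypothetical.
docker run --rm --device /dev/dri/renderD128 my-gpu-image

# Roughly the same thing by hand with a mount namespace (needs root):
#   unshare -m sh -c 'mount --bind /dev/dri /path/to/rootfs/dev/dri; ...'
```

The point is that the container only needs the device node and a driver in the host kernel; no virtual PCI device or IOMMU is involved.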