Suppose I ask for two H100s. Will I have GPU P2P capabilities?

ekzhang · 2024-12-02T22:54:35 1733180075

Yep! This is something we have internal tests for, haha. You have good instincts that it can be tricky. Here's an example of using it for multi-GPU training: https://modal.com/docs/examples/llm-finetuning
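If you want to check P2P yourself from inside a container, here is a minimal sketch, assuming PyTorch is installed. `torch.cuda.can_device_access_peer` is a real PyTorch call; the helper names `p2p_matrix` and `check_p2p` are made up for illustration and are not part of Modal's API or its internal tests.

```python
from itertools import permutations


def p2p_matrix(device_count, can_access):
    """Return {(src, dst): bool} for every ordered pair of distinct devices.

    `can_access` is any callable (src, dst) -> bool, so the pairing logic
    can be exercised without a GPU.
    """
    return {
        (src, dst): bool(can_access(src, dst))
        for src, dst in permutations(range(device_count), 2)
    }


def check_p2p():
    """Query PyTorch for peer access between all visible GPUs.

    Returns None if PyTorch is not installed, and an empty dict if CUDA is
    unavailable or only one GPU is visible.
    """
    try:
        import torch  # assumption: PyTorch is available in the image
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return {}
    return p2p_matrix(torch.cuda.device_count(), torch.cuda.can_device_access_peer)


if __name__ == "__main__":
    print(check_p2p())
```

On a two-GPU machine this reports one entry per direction, (0, 1) and (1, 0). It only tells you whether the driver reports peer access as possible; it says nothing about the link type (NVLink vs. PCIe) or achieved bandwidth.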

doctorpangloss · 2024-12-02T23:27:48 1733182068

Okay, well, think very deeply about what you are saying about isolation, the topology of the hardware, and why NVIDIA does not allow P2P access even in vGPU settings except in specific circumstances that are not yours. I think if it were that easy to make the isolation promises you are making, NVIDIA would already do it. Malformed NVLink messages make GPUs fall off the bus even in trusted applications.

thundergolfer · 2024-12-02T22:53:20 1733180000

Yes, you will.

(I work at Modal.)