
		cma 9 days ago \| on: How to Run DeepSeek R1 671B Locally on a $2000 EPY...

He's running quantized Q4 671b. However, MoE doesn't need cluster networking, so you could probably run the full thing unquantized on two of them. Maybe the router could be kept fully resident in GPU RAM, rather than offloading a larger percentage of everything there. Or is that already how his GPU offload config is set up?
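A rough back-of-the-envelope check of the "two of them" estimate, as a minimal sketch. Only the 671B parameter count comes from the thread. The bits-per-weight values are nominal (real Q4 formats carry some extra scale overhead), `RAM_GB_PER_BOX = 512` is a hypothetical figure for one machine rather than a confirmed spec of the build, and KV cache and runtime overhead are ignored.

```python
import math

PARAMS = 671e9  # total parameter count, from the "671B" in the thread


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes); ignores KV cache and overhead."""
    return params * bits_per_weight / 8 / 1e9


def boxes_needed(params: float, bits_per_weight: float, ram_gb_per_box: float) -> int:
    """Smallest number of machines whose combined RAM holds the weights."""
    return math.ceil(weight_gb(params, bits_per_weight) / ram_gb_per_box)


# Assumption: hypothetical RAM per machine, not taken from the thread.
RAM_GB_PER_BOX = 512

if __name__ == "__main__":
    # Nominal widths: 4-bit quantized, 8-bit, and 16-bit weights.
    for label, bits in [("4-bit", 4), ("8-bit", 8), ("16-bit", 16)]:
        gb = weight_gb(PARAMS, bits)
        n = boxes_needed(PARAMS, bits, RAM_GB_PER_BOX)
        print(f"{label}: ~{gb:.0f} GB of weights, {n} box(es) at {RAM_GB_PER_BOX} GB each")
```

Under these assumptions the two-machine figure only works out if "unquantized" means 8-bit weights; at 16 bits per weight the same arithmetic asks for a third machine.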