> but self-hosting these wont be a viable alternative to using providers like OpenAI for business applications for a while.
Why not? While 3-4 tok/s is on the lower end, it's still usable for any task that doesn't require real-time back-and-forth with the model.
In other words, I don't mind waiting a minute for a good-enough response on a topic that would take me multiples of that to research and compile on my own. It's a clear net win.
If you have so little throughput that you don't need more than 3 tokens a second, you are processing so little data that your costs from the LLM providers won't even sniff the $2,000 you will spend on the hardware to self-host.
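A quick back-of-envelope sketch makes the point. This assumes an illustrative ~$10 per million output tokens (a placeholder figure, not any specific provider's price) and the $2,000 hardware cost from above:

```python
# Back-of-envelope: how long until API spend reaches the hardware cost,
# given a workload capped at 3 tok/s of sustained output?
# Assumptions: $2,000 hardware, ~$10 per 1M output tokens (illustrative only).

HARDWARE_COST_USD = 2_000
TOKENS_PER_SECOND = 3
API_PRICE_PER_MILLION_TOKENS_USD = 10  # hypothetical figure, not a quote

# Tokens generated running flat out, 24/7, for a year.
tokens_per_year = TOKENS_PER_SECOND * 60 * 60 * 24 * 365

# What that same volume would cost from a provider.
api_cost_per_year = tokens_per_year / 1_000_000 * API_PRICE_PER_MILLION_TOKENS_USD

print(f"Tokens per year at 3 tok/s: {tokens_per_year:,}")            # ~94.6M
print(f"API cost for the same volume: ${api_cost_per_year:,.0f}")     # ~$946
print(f"Years to break even on hardware: {HARDWARE_COST_USD / api_cost_per_year:.1f}")  # ~2.1
```

Even running nonstop at that rate, it takes on the order of two years of API usage before you'd spend the $2,000 on tokens alone, and most workloads sit idle far more than that.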