
DeepSeek does charge for the API (just 20-30x cheaper than o1); I assume OP is eating the cost.

https://api-docs.deepseek.com/quick_start/pricing/
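For reference, their API is OpenAI-compatible, so trying it is only a few lines. A minimal sketch, assuming the `openai` Python package and the base URL / model name from their docs (double-check those before relying on them):

    import os
    from openai import OpenAI

    # Minimal sketch: DeepSeek exposes an OpenAI-compatible endpoint.
    # Assumes a DEEPSEEK_API_KEY env var; base_url and model name are
    # taken from their docs -- verify before use.
    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )

    resp = client.chat.completions.create(
        model="deepseek-reasoner",  # R1; "deepseek-chat" is the V3 chat model
        messages=[{"role": "user", "content": "Say hi in one sentence."}],
    )
    print(resp.choices[0].message.content)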

I see, and where does https://groq.com/ fit into this? They used to be the cheapest.

Right now DeepSeek's official hosting is cheaper than everyone else who can manage to run the model, including Deepinfra. I haven't seen any good hypotheses as to why, other than their large batch sizes and speculative decoding.
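For those who haven't seen it: speculative decoding uses a cheap draft model to propose a few tokens, which the big model then verifies in a single forward pass. A toy greedy version (function names are made up, and the real thing accepts/rejects probabilistically rather than checking argmax equality):

    # Toy greedy speculative decoding. `draft` and `target_batch` are
    # hypothetical callables: draft(ctx) returns the draft model's next
    # token; target_batch(ctx) returns the target model's greedy choice
    # after each prefix of ctx that extends the original prompt, i.e.
    # k+1 predictions obtained from ONE forward pass over the proposals.
    def speculative_step(target_batch, draft, prefix, k=4):
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposed, ctx = [], list(prefix)
        for _ in range(k):
            t = draft(ctx)
            proposed.append(t)
            ctx.append(t)

        # 2. The expensive target model checks all proposals at once.
        #    checks[i] = target's token after prefix + proposed[:i]
        checks = target_batch(list(prefix) + proposed)

        # 3. Keep the longest agreeing prefix; on disagreement, take the
        #    target's own token and stop. If everything matched, we also
        #    get one bonus token from the target for free.
        accepted = []
        for i, t in enumerate(proposed):
            if checks[i] == t:
                accepted.append(t)
            else:
                accepted.append(checks[i])
                break
        else:
            accepted.append(checks[k])
        return accepted

The win is that the big model runs roughly once per batch of accepted tokens instead of once per token.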

DeepSeek-V2/V3/R1's model architecture is very different from what Fireworks/Together/... were used to.

That's their "business" model (okay, they don't care about business that much for now, but still) too: "you can't run it efficiently without doing the months of work we already did, so even with all the weights open you can't compete with us."
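The big architectural difference is MLA (multi-head latent attention): instead of caching full per-head K/V, you cache one small shared latent per token and up-project it back to K/V at attention time, which shrinks the KV cache by an order of magnitude. A rough sketch of just that idea (dimensions invented for illustration; the real design also has a decoupled RoPE path and query compression, omitted here):

    import torch.nn as nn

    d_model, n_heads, d_head, d_latent = 4096, 32, 128, 512  # made-up sizes

    W_down_kv = nn.Linear(d_model, d_latent, bias=False)           # compress
    W_up_k    = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to K
    W_up_v    = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to V

    def kv_for_cache(hidden):            # hidden: [seq, d_model]
        return W_down_kv(hidden)         # only [seq, d_latent] goes in the cache

    def kv_from_cache(c_kv):             # c_kv: [seq, d_latent]
        k = W_up_k(c_kv).view(-1, n_heads, d_head)
        v = W_up_v(c_kv).view(-1, n_heads, d_head)
        return k, v

    # Cache per token: d_latent = 512 floats, vs 2 * n_heads * d_head = 8192
    # for vanilla multi-head attention -- a ~16x reduction in this toy setup.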


Speaking of which, Groq just announced the distilled Llama 70B variant: https://x.com/GroqInc/status/1883734298488180826

The "why is DeepSeek so much cheaper than o1" question is currently a mystery, with the best guess being that compute in China is cheaper.

DeepSeek uses way less compute to achieve the same results. That's the whole difference.

Then a US compute provider should be able to launch a similarly-priced competitor (e.g. to capture the enterprise market concerned about the China associations) using the open-source version and drastically undercut OpenAI.

I suspect that won’t be the case.


> Then a US compute provider should be able to launch a similarly-priced competitor

Right, you just need a few months to implement efficient inference for MLA + their strange-looking MoE scheme + ..., easy! (There's a toy sketch of the routing idea below.)

Oh wait, the inference scheme described in their tech report is pretty much an exact fit for H800s. So if you run the recipe as-is on H100s, you are wasting the potential of your H100s. Otherwise, have fun making your own variations on the serving architecture.

To be fair, we had a chance. If someone had decided to replicate the effort to serve their models back in May 2024, when DeepSeek-V2 came out, we'd have it now. But nobody was interested, as DS-V2 was pretty mediocre. They (and whoever realized the potential) made a big bet and it is paying off.
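On the MoE scheme mentioned above: each token's FFN is replaced by a gate that routes it to a few small experts, so only a fraction of the parameters fire per token. A toy top-k router (sizes and k are placeholders, not DeepSeek's config; they also use shared experts and an auxiliary-loss-free balancing trick, omitted here):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyMoE(nn.Module):
        # Toy top-k expert routing; sizes and k are placeholders.
        def __init__(self, d_model=1024, d_ff=512, n_experts=16, k=2):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(d_model, n_experts, bias=False)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):                          # x: [tokens, d_model]
            scores = F.softmax(self.gate(x), dim=-1)
            topv, topi = scores.topk(self.k, dim=-1)   # k experts per token
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e, expert in enumerate(self.experts):
                    mask = topi[:, slot] == e          # tokens routed to expert e
                    if mask.any():
                        out[mask] += topv[mask, slot, None] * expert(x[mask])
            return out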


The model is much smaller. Compute in China should be more expensive, considering all the US restrictions.

No one knows the size of o1. The only hint was a paper that suggested it was 200B parameters.

Meanwhile, DeepSeek R1 is known to be 671B parameters because it is open-source.


R1 is a mixture-of-experts model with only 37B active params. So while it's definitely expensive to train, it's rather light on compute during inference. All you really need is lots of memory.
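Rough back-of-the-envelope (assuming FP8 weights, ~1 byte per parameter; these are public ballpark numbers, not measurements):

    # Why MoE inference is memory-bound: all 671B params must be resident,
    # but only ~37B are used per generated token.
    total_params  = 671e9
    active_params = 37e9

    weight_memory_gb = total_params / 1e9        # ~671 GB just for weights
    flops_per_token  = 2 * active_params         # ~74 GFLOPs per generated token

    print(f"weights in FP8: ~{weight_memory_gb:.0f} GB")
    print(f"compute per token: ~{flops_per_token / 1e9:.0f} GFLOPs")
    # A hypothetical dense 671B model would need 2 * 671e9 = 1342 GFLOPs/token,
    # roughly 18x more compute for the same memory footprint.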

Electricity prices in China are about half of what they are in the US; I expect the rest is similar.


