
The cost of inference has dropped by ~100x in the past 2 years.

https://a16z.com/llmflation-llm-inference-cost/




Hmm, the link is saying the price of an LLM that scores 42 or above on MMLU has dropped 100x in 2 years, equating GPT-3.5 with Llama 3.2 3B. In my opinion GPT-3.5 was significantly better than Llama 3.2 3B, and certainly much better than the also-equated Llama 2 7B. MMLU isn't a great marker of overall model capabilities.

Obviously the drop in cost for capability in the last 2 years is big, but I'd wager it's closer to 10x than 100x.
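To put the two estimates side by side, here's a rough sketch of the per-year drop each one implies. It assumes the price falls at a constant rate across the 2 years, which is my simplification and not something the linked post claims; `annualized_factor` is just a helper name made up for this example.

```python
def annualized_factor(total_factor: float, years: float) -> float:
    """Per-year price-drop factor implied by a total drop over `years`.

    Assumes a constant rate of decline (an illustrative simplification).
    """
    return total_factor ** (1.0 / years)


# Compare the 100x claim with the 10x guess, both over 2 years.
for total in (100.0, 10.0):
    per_year = annualized_factor(total, 2)
    print(f"{total:.0f}x over 2 years is about {per_year:.2f}x per year")
```

So under that assumption the disagreement is between roughly 10x per year and roughly 3.2x per year.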


*infernonce


*inference



