
Latest GPUs and efficiency are not mutually exclusive, right? If you combine them both presumably you can build even more powerful models.





Of course, optimizing for the best models would mean a mix of GPU spend and ML researchers working on efficiency. And it may not make much sense to spend money researching efficiency yourself since, as has happened, those gains are often shared for free anyway.

What I was cautioning people about was that you might not want to spend $500B on NVidia hardware only to find out rather quickly that you didn't need to. You'd have all this CapEx that you now have to try to recoup from customers for something that has essentially been commoditized. That's a whole lot of money to lose very quickly. Plus there is a zero-sum power dynamic at play between the CEO and the ML researchers.


Not necessarily, if you are pushing against a data wall. One could ask: after adjusting for DS's efficiency gains, how much more compute has OpenAI spent, and is their model correspondingly better? Or, put differently: DS could easily afford more than $6 million in compute, so why didn't they just push the scaling further?

Right, except that R1 is demoing an approach for moving beyond the data wall.

Can you clarify? How are they able to move beyond the data wall?

Because they’re able to get training signal from tons of newly generated tokens, scored on whether they lead to a correct answer, rather than just fitting to existing tokens.

it’s on the path to self play
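
A minimal sketch of that idea, with made-up pieces (the `generate()` and `verify()` functions here are hypothetical stand-ins, not DeepSeek's actual pipeline): sample completions, score each one only on whether the final answer is correct, and use that scalar as the training signal instead of a next-token loss on existing text.

    import random

    # Hypothetical verifier: reward 1.0 if the generated answer matches the
    # known ground truth, else 0.0. A real setup might check a math answer
    # or run unit tests; this is just a toy stand-in.
    def verify(problem, completion):
        return 1.0 if completion.strip() == problem["answer"] else 0.0

    # Hypothetical generator: in a real setup this would be the policy model
    # sampling a chain of thought plus a final answer. Here it just guesses.
    def generate(problem):
        return random.choice(["4", "5", problem["answer"]])

    problems = [{"question": "2 + 2 = ?", "answer": "4"},
                {"question": "3 * 3 = ?", "answer": "9"}]

    # Supervised fine-tuning fits the model to existing tokens; the loop
    # below instead scores freshly generated tokens by outcome, so every new
    # sample carries training signal even without new human-written data.
    for problem in problems:
        samples = [generate(problem) for _ in range(8)]
        rewards = [verify(problem, s) for s in samples]
        baseline = sum(rewards) / len(rewards)        # simple variance-reduction baseline
        advantages = [r - baseline for r in rewards]  # what a policy-gradient update would weight by
        print(problem["question"], list(zip(samples, advantages)))

Each run generates fresh tokens to learn from, which is why this sidesteps the data wall: the supervision comes from the verifier, not from a fixed corpus.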


That's Jevons Paradox in a nutshell
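
A toy illustration of the dynamic with made-up numbers (not a claim about actual prices or demand): if efficiency cuts the cost per unit of inference but demand grows more than proportionally in response, total compute spend goes up, not down.

    # Hypothetical figures chosen only to illustrate Jevons Paradox.
    cost_per_million_tokens = 10.0   # price before the efficiency gain
    tokens_demanded = 1_000          # millions of tokens consumed at that price

    efficiency_gain = 10             # cost per token drops 10x
    demand_multiplier = 30           # usage grows 30x as cheaper inference unlocks new uses

    new_cost = cost_per_million_tokens / efficiency_gain
    new_demand = tokens_demanded * demand_multiplier

    print("spend before:", cost_per_million_tokens * tokens_demanded)  # 10000.0
    print("spend after: ", new_cost * new_demand)                      # 30000.0 -> total spend rises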


