> 2) Almost, Chinchilla concerns itself with minimizing cost(training) + cost(in...

numeri · 2024-04-19T14:54:05 1713538445

Yeah, this is correct, and I'm not sure what paper GP was thinking of. Chinchilla is only about finding the point, for a given training compute budget, at which it becomes more useful to scale the model than to train longer.

Chinchilla-optimal scaling is not useful if you actually want to use the model, only if you want to beat some other model on some metric at minimal training cost.

candiodari · 2024-04-23T22:36:04 1713911764

Well, my point is that "scale the model" is equivalent to upping inference costs.
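
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the usual rules of thumb, none of which come from this thread: training costs about 6·N·D FLOPs, inference costs about 2·N FLOPs per token, and the Chinchilla-optimal ratio is roughly 20 training tokens per parameter. The function names are made up for illustration, and the sketch says nothing about model quality, only about cost.

```python
# Back-of-the-envelope cost model. All constants are rule-of-thumb
# assumptions, not values taken from the thread or computed from a paper:
#   training FLOPs  ~ 6 * N * D   (N = parameters, D = training tokens)
#   inference FLOPs ~ 2 * N       per generated token
TOKENS_PER_PARAM = 20  # commonly quoted Chinchilla ratio, approximate


def chinchilla_optimal(train_flops):
    """Return (params, tokens) such that D = 20 * N and 6 * N * D = train_flops."""
    params = (train_flops / (6 * TOKENS_PER_PARAM)) ** 0.5
    return params, TOKENS_PER_PARAM * params


def inference_flops_per_token(params):
    """Approximate forward-pass cost of serving one token."""
    return 2 * params


def lifetime_flops(params, train_tokens, served_tokens):
    """Training cost plus the cost of serving `served_tokens` tokens."""
    return 6 * params * train_tokens + inference_flops_per_token(params) * served_tokens


if __name__ == "__main__":
    budget = 1e23  # hypothetical training budget in FLOPs
    n_opt, d_opt = chinchilla_optimal(budget)

    # Same training budget, but half the parameters trained on twice the tokens.
    n_small, d_small = n_opt / 2, d_opt * 2

    served = 1e13  # hypothetical number of tokens served after training
    print(f"optimal: N={n_opt:.3e}, D={d_opt:.3e}")
    print(f"lifetime FLOPs (optimal):     {lifetime_flops(n_opt, d_opt, served):.3e}")
    print(f"lifetime FLOPs (small, long): {lifetime_flops(n_small, d_small, served):.3e}")
```

Both configurations spend the same training compute, but per-token inference cost scales linearly with N, so the larger model costs more on every token served. Whether the smaller, longer-trained model is good enough is a separate question this sketch does not model.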