The secret sauce is the data.

I wouldn't hold my breath on getting access to it.






Just about anything useful in the secret sauce data can be distilled from the model by inspecting the logits; for example, they published distills using Llama 3.3 70B, Qwen 32B, etc. as bases.
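To make "distilling from the logits" concrete, here is a rough sketch of the generic technique for a single token position: soften teacher and student logits with a temperature, then measure KL(teacher || student), which the student is trained to minimize. The function names and the `temperature` parameter are made up for illustration, and this is not a claim about how the published distills were actually produced.

```python
import math

def log_softmax(logits, temperature=1.0):
    # Numerically stable log-softmax over temperature-scaled logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]

def distillation_kl(teacher_logits, student_logits, temperature=1.0):
    # KL(teacher || student) for one token position (hypothetical helper).
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Usage: the loss is zero when the student matches the teacher exactly
# and grows as the two distributions drift apart.
loss = distillation_kl([2.0, 0.5, -1.0], [1.0, 1.0, 0.0], temperature=2.0)
```

In practice this would be averaged over every token position and backpropagated through the student only.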

There is no "secret" sauce. Only sauce.

Additionally, R1-Zero shows that you don't even really need much secret sauce data, since they trained it with zero SFT data. Take an existing base model, do GRPO RL, and ta-da: you have a SOTA reasoning model. SFT data improves it, but the secret sauce isn't in the data.
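For anyone wondering what the "group relative" part of GRPO means, here is a minimal sketch of just the advantage step: sample several completions for the same prompt, score each one, and normalize every reward against the group's mean and standard deviation, so no learned value model is needed. The function name, the `eps` guard, and the choice of population standard deviation are my assumptions; the full algorithm also has a clipped policy-ratio objective and a KL penalty that are not shown.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    # rewards: one scalar score per sampled completion of the SAME prompt.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std
    # eps (hypothetical guard) avoids division by zero when all rewards tie.
    return [(r - mean) / (std + eps) for r in rewards]

# Usage: four completions, two judged correct (reward 1) and two not (reward 0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and are reinforced; the rest are pushed down. If every completion gets the same reward, all advantages are zero and that prompt contributes no gradient.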


Indeed. Litigation exposure is just too great when releasing the training data.


