Interesting, has anyone been doing this? I.e. training/fine-tuning an LLM agains... | Hacker News


falcor84 3 months ago | on: RLHF is just barely RL

Interesting, has anyone been doing this? I.e., training/fine-tuning an LLM against an actual coding environment, as opposed to just tacking that on later as a separate "agentic" construct?

bjornsing 3 months ago

I suspect that the big vendors are already doing it, but I haven’t seen a paper on it.
