> but couldn't we add some training data to teach the LLM how to spell?

Sure, but then we would lose a benchmark for measuring the progress of emergent behavior.

The goal is not to add capabilities one at a time by hand, because that doesn't scale and we would never finish. The goal is for the model to pick up new capabilities automatically, all on its own.

Training data is already provided by humans and certainly already includes spelling instruction, which the model is blind to because of forced tokenization. Tokenizing on words is itself an arbitrary capability added one at a time; it's just the wrong one. LLMs should be tokenizing by letter, but they don't, because they aren't good enough yet, so they get a massive deus ex machina (human ex machina?) of word-ish tokenization.
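
For illustration, a minimal sketch using OpenAI's tiktoken library (an assumption on my part; any BPE-style tokenizer would make the same point) shows what the model actually receives in place of letters:

    import tiktoken

    # load the BPE vocabulary used by GPT-4-class models
    enc = tiktoken.get_encoding("cl100k_base")

    tokens = enc.encode("strawberry")
    print(tokens)                             # a short list of integer token IDs
    print([enc.decode([t]) for t in tokens])  # multi-letter chunks, e.g. 'str' / 'aw' / 'berry'

The exact split depends on the vocabulary, but whatever it is, the model only ever sees the integer IDs; the letters inside each chunk never reach it, which is exactly the blindness described above.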