
In-context learning may act like fine-tuning, but crucially it does not mutate the state of the system. The same model prompted with the same task thousands of times is no better at it the thousandth time than the first.
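A toy illustration of the distinction, as a sketch (a one-weight numpy "model", purely an analogy, not any real LLM API):

    import numpy as np

    w = np.array(0.5)                      # the model's only parameter

    def model(w, x):
        return w * x

    # Fine-tuning: each example mutates the weights, so the model
    # itself gets better over time.
    for _ in range(100):
        x, y = 3.0, 6.0                    # target behavior: double the input
        grad = 2 * (model(w, x) - y) * x   # d/dw of squared error
        w = w - 0.01 * grad                # state changes here

    # In-context analogue: the weights are frozen. Call it a thousand
    # times and the thousandth answer is exactly the first.
    frozen = np.array(0.5)
    outs = [model(frozen, 3.0) for _ in range(1000)]
    assert outs[0] == outs[-1]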



GPT-3 is horrible at arithmetic. Yet if you define the algorithmic steps to perform addition on 2 numbers, accuracy on addition shoots up to 98%, even on very large numbers (https://arxiv.org/abs/2211.09066). Think about what that means.
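To give a sense of the flavor (a rough sketch in the spirit of the paper, not its actual prompt; the demo format here is my own construction): you show the model a worked, digit-by-digit trace with explicit carries, then ask it to follow the same steps on a new pair.

    def addition_demo(a: int, b: int) -> str:
        """Spell out schoolbook addition digit by digit, right to left."""
        da, db = str(a)[::-1], str(b)[::-1]
        carry, lines = 0, []
        for i in range(max(len(da), len(db))):
            x = int(da[i]) if i < len(da) else 0
            y = int(db[i]) if i < len(db) else 0
            s = x + y + carry
            lines.append(f"digit {i}: {x} + {y} + carry {carry} = {s}, "
                         f"write {s % 10}, carry {s // 10}")
            carry = s // 10
        if carry:
            lines.append(f"final carry: write {carry}")
        lines.append(f"answer: {a + b}")
        return "\n".join(lines)

    prompt = (
        "Add two numbers digit by digit, tracking the carry at every step.\n\n"
        f"Example: 157 + 68\n{addition_demo(157, 68)}\n\n"
        "Now: 98345 + 23456\n"
    )

Because every intermediate step is spelled out, the model only has to imitate a mechanical procedure rather than recall memorized sums.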

"Mutating the system" is not a crucial requirement at all. In context learning is extremely over-powered.


> Yet if you define the algorithmic steps to perform addition on 2 numbers, accuracy on addition shoots up to 98%, even on very large numbers (https://arxiv.org/abs/2211.09066). Think about what that means.

That means that even with a giant model, you need to stuff even the most basic knowledge about that class of problems into the prompt to get it to work, cutting into conversation depth and per-response size? The advantage of GPT-4's big window, and the opportunity it provides for things like retrieval and deep iterative context, shrinks if I've got to stuff a domain textbook into the system prompt so it isn't just BSing me.
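Back-of-the-envelope on what that costs, with assumed numbers (the 8K window is GPT-4's at launch; the preamble and per-turn sizes are illustrative guesses):

    window = 8192      # GPT-4 8K context window
    preamble = 2000    # assumed size of an algorithmic "textbook" system prompt
    per_turn = 300     # assumed tokens per user+assistant exchange

    left = window - preamble
    print(f"{left} tokens left -> roughly {left // per_turn} turns of conversation")

A quarter of the window gone before the conversation even starts.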


> Think about what that means.

It means you have natural language programming. We would need to prove that natural language programming is more powerful than traditional programming at solving logical problems; I haven't seen such a proof.
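For comparison, traditional programming dispatches the same benchmark exactly, at any size, with no prompt budget at all:

    # Python integers are arbitrary precision: addition is exact at any
    # length, not ~98% accurate, and costs zero context-window tokens.
    a = 10**100 + 7
    b = 10**100 + 11
    print(a + b)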


> Yet if you define the algorithmic steps to perform addition on 2 numbers

You’re limited by the prompt size, which might be fine for simple arithmetic.





