Why is retraining not allowed in this scenario? Yes, the model will know about the breakthrough if you retrain. If you force the weights to stay static by fiat, then sure, it's harder for the model to learn, and it has to do so in-context or whatever. But that's true for you as well: if your brain isn't allowed to update any connections, I'm not sure how much you can learn either.
The reason the models don't learn continuously is that it's currently prohibitively expensive. Imagine OpenAI retraining a model every time one of its 800M users sends a message. That would make it instantly aware of every new development in the world, or in your life, without any context engineering. There's a research gap here too, but that will get fixed with time and money.
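To make the cost point concrete, here's a rough sketch (my own toy illustration with a small Hugging Face model, not anything OpenAI actually runs) of what "learn from every message" would mean at the weight level: a full forward/backward pass plus an optimizer step for each interaction, rather than just appending the message to the context window.

```python
# Toy illustration only: one gradient step per user message on a small causal LM.
# This is NOT OpenAI's pipeline, just a sketch of what per-message weight updates imply.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; any causal LM works for the sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

def learn_from_message(text: str) -> None:
    """Take one gradient step on a single user message (continual weight update)."""
    batch = tok(text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss on the message
    loss.backward()   # backward pass over the whole network
    opt.step()        # update every parameter
    opt.zero_grad()

# Doing this for every message from every user means a forward pass, a backward pass,
# and an optimizer-state update per interaction -- that's the expense being argued about,
# versus the near-free alternative of just putting the message in the prompt.
learn_from_message("Some new fact the model should remember.")
```

Nothing about the architecture forbids this; it's the per-interaction compute (and serving freshly updated weights to everyone) that makes it impractical today.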
But it's not a fundamental limitation of transformers, as you make it out to be. To me it's just that things take time. The exact same architecture will be learning continuously in 2-3 years, and all the "this is the wrong path" people will need to shift goalposts. Note that I didn't argue for AGI, just that this isn't a fundamental limitation.