A lot of the progress over the last 3-4 years was predictable from GPT-2, and especially GPT-3, onwards: combining instruction following and reinforcement learning with scaling up GPT. With research becoming more closed, this isn't so true anymore. The mp3 case was predictable in 2020 - some early Twitter GIFs showed vaguely similar things. Can you predict what will happen in 2026/7, though, with multimodal tech?
I simply don't see it as being the same today. The obvious element of scaling, or of techniques that imply a useful overlap, isn't there. Whereas researchers previously brought excellent, groundbreaking performance across different benchmarks and areas together as they worked towards GPT-3, since 2020, with the exception of instruction following, much less has been predictable.
Multimodal could change everything (things like the ScienceQA paper suggest so), but it might also not shift benchmarks. It's just not so clear that the future is as predictable, or will move faster, than the last few years. I do have my own beliefs, similar to Yann LeCun's, about what architecture, or rather infrastructure, makes the most sense intuitively going forward, but without the openness we used to have from top labs, there's no way to know whether they are heading in those directions or not. So you are absolutely right that it's hard to imagine where we will be in 20 years, but in a strange way, because it is much less clear than it was in 2020 where we will be three years out, I would say progress is much less guaranteed than many feel it is...
I was also thinking about how quickly AI may progress and am curious about your or other people's thoughts. When estimating AI progress, estimating orders of magnitude sounds like the most plausible approach, just as Moore's law has gotten the order of magnitude right for years. For AI, it is known that performance increases roughly linearly as model size (and compute) increases exponentially. Funding currently increases exponentially, which means performance should increase linearly. So AI will keep improving linearly as long as funding keeps growing exponentially. On top of this, algorithms may be made more efficient, which may occasionally yield an order-of-magnitude improvement. Does this reasoning make sense? I think it does, but I could be completely wrong.
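To make that chain of reasoning concrete, here is a minimal sketch, assuming (purely for illustration) that performance is linear in the log of training compute and that compute grows by a fixed factor each year; both constants are made up, not fitted scaling-law values.

    # Toy sketch of the argument above: if performance is linear in log(compute)
    # and compute grows exponentially (fixed yearly multiplier), then performance
    # grows linearly in time. Constants are illustrative assumptions, not data.
    import math

    BASE_COMPUTE = 1.0        # arbitrary compute units in year 0 (hypothetical)
    YEARLY_MULTIPLIER = 10.0  # hypothetical: compute/funding grows 10x per year

    def performance(compute: float) -> float:
        """Toy metric: +1 'unit' of performance per order of magnitude of compute."""
        return math.log10(compute)

    for year in range(6):
        compute = BASE_COMPUTE * YEARLY_MULTIPLIER ** year
        print(f"year {year}: compute {compute:.0e} -> performance {performance(compute):.1f}")

    # Performance climbs by a constant +1.0 per year (linearly) precisely because
    # compute climbs by a constant factor per year (exponentially).

Under those toy assumptions the conclusion follows mechanically; the real question is whether the log-linear relationship and the exponential funding growth both hold.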