I agree with you.
That's why all the small models are showing benchmarks that put them close to GPT-3.5 or even GPT-4: only because they use specific test tasks!
In a way, it just shows the amazing performance that will come from future small models.