The entrenchment of smartphones in society dramatically deepened between the iPhone 4 and the iPhone 14. Technical capability is just one axis.
Still, I think LLMs are different from phones in terms of scaling. Faster processors don't necessarily deliver more user value for phones, but scaling up LLMs seems to predictably improve performance/accuracy.
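The "predictable" part has an empirical shape: Kaplan et al. (2020) report that pretraining loss falls as a power law in model size (and likewise in data and compute). A sketch of the model-size version, with their approximate fitted constants:

```latex
% Loss vs. non-embedding parameter count N (Kaplan et al., 2020).
% N_c and \alpha_N are fitted constants, quoted approximately from the paper.
L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N},
\qquad N_c \approx 8.8 \times 10^{13},\quad \alpha_N \approx 0.076
```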
There are no signs of diminishing returns just yet, though, and no one knows whether they'll kick in at GPT-5 or GPT-5000. I suspect performance will keep increasing drastically at least until we have a model that's been trained on essentially all available text, video, and audio data. Who knows what will happen once we have something that's been trained on all of YouTube. After that, maybe we (or an AI) will have figured out how to keep improving without any more data.
Yeah it is. GPT-3.5 scored around the 10th percentile on the bar exam, while GPT-4 scored in the top 10% and is much better at math. Plus, having 4x the context alone gives it much more power.
It's just that it's different in capabilities: ChatGPT delivers different results, and each has its own characteristics.
GPT-4 being able to take images as input and decipher what's in them is another huge advancement.
Gen-2, another AI, can create amazing videos from a text prompt. Any director or filmmaker wannabe with more prowess in crafting the story than in filming it can now just use AI to create the film from their vision.
Even more exciting is the speed at which things are progressing. Getting ChatGPT-quality training down from millions of dollars to a $400K price was supposed to take 8 years; Stanford did it in 6 weeks with LLaMA and Alpaca. It cost under $600 and can run, albeit slower, on home PCs.
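For the $600 figure, the rough arithmetic from the Stanford Alpaca release (both numbers are the upper bounds they reported, so treat this as approximate):

```python
# Rough cost arithmetic behind Alpaca's "under $600" figure.
# Both figures are upper bounds from the Stanford Alpaca release.
data_generation_usd = 500  # < $500: ~52K instruction examples generated via the OpenAI API
fine_tuning_usd = 100      # < $100: ~3 hours fine-tuning LLaMA-7B on 8x A100s at cloud prices

# Since both inputs are upper bounds, the actual total comes in under $600.
print(f"upper bound: ${data_generation_usd + fine_tuning_usd}")  # upper bound: $600
```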
Analyzing thousands of trends, both industry/niche-specific and society-wide. Tracking campaigns that work by monitoring social-media likes, references to past slogans, etc. Potentially dedicating thousands of years' worth of brainpower and analysis to the coffee shop down the street's new slogan.
Currently I'm using it like directing a junior programmer.
After GPT has written some functions to my specs in natural language,
I can say, for example:
- "Add unit tests." It writes tests for all the functions. Not perfect, but not bad for an instruction that short (see the sketch after this list).
- "Rewrite x to include y", etc.
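A minimal sketch of the round trip I mean, assuming Python. Both the function (written to a spec) and the tests (from a bare "add unit tests" follow-up) are hypothetical illustrations, not actual model output:

```python
import unittest

# A small function GPT might write to a natural-language spec like
# "parse durations such as '2h', '30m', '45s' into seconds".
def parse_duration(text: str) -> int:
    """Parse strings like '2h', '30m', '45s' into a number of seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    value, unit = text[:-1], text[-1]
    if unit not in units or not value.isdigit():
        raise ValueError(f"invalid duration: {text!r}")
    return int(value) * units[unit]

# The kind of tests a bare "add unit tests" instruction tends to produce:
# one happy-path case per unit, plus an invalid-input case.
class TestParseDuration(unittest.TestCase):
    def test_hours(self):
        self.assertEqual(parse_duration("2h"), 7200)

    def test_minutes(self):
        self.assertEqual(parse_duration("30m"), 1800)

    def test_seconds(self):
        self.assertEqual(parse_duration("45s"), 45)

    def test_invalid_unit(self):
        with self.assertRaises(ValueError):
            parse_duration("10x")

if __name__ == "__main__":
    unittest.main()
```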
The original post way back was talking about marketing; they were underwhelmed. I recently generated some slogans. They sucked.
When someone mentions predictability/accuracy, how does that apply to marketing slogans? I know how it applies to writing unit tests. Unit-test writing comes pretty close to the original poster's characterization of GPT as filling out templates. The sucky slogans I got were also very template-like.
Would accuracy mean the slogans didn't suck?
At any rate, there seem to be a lot of things people want to use it for where the terms accuracy/predictability don't make much sense. So claims based on those qualities naturally make me ask how they apply to all these cases, such as slogan generation, where accuracy and predictability aren't normally metrics that apply.