Agreed. Release announcements and benchmarks always sound world-changing, but in practice each new model delivers smaller improvements to the end user than its predecessor did.
The point above is that the supposedly amazing multimodal version of ChatGPT was announced in May and is still not the actual way you interact with the service in September (despite the model being called "4 omni," it's still not actually doing multimodal IO). It could be a giant leap in practical terms, but that doesn't matter if you can't actually use what was announced.
This one, oddly, seems to actually be launching before that one, despite having only just been announced.
In the world of hype-driven vaporware AI products[1], giving people limited access is at least proof they're not lying about it existing or being able to do what they claim.
Ok, but the point is that they told me I'd have flirty ScarJo ASMR whispering to me at bedtime that I'm a good boy, and that's not what we got, is it?
I've been a subscriber since close to the beginning, cancelled 2 weeks ago. I got an email telling me that this is available, but only for Plus.
But for 30 posts per week I see no reason to subscribe again.
I'd rather be frustrated by unreliable quality because I'm not paying than have an equally unreliable experience as a paying customer. Not paying feels the same. It made me wonder if they sometimes just hand the chat over to a lower-quality model without telling the Plus subscriber.
The only thing I miss is being able to tell it to run code for me, but that's not worth the frustration.
Recently I was starting to think I'd imagined that. Back then they gave the impression it would be released within a week or so of the announcement. Have they explained the delay?
When you go into the regular, slow audio mode, there's a little info circle in the top-right corner. Over time that circle has been giving periodic updates. At one point the message said it would be delayed, and now it says it's "on its way" by the end of fall.
Not perfect, but that's where they've been putting their communications.
The text-to-text model is available. And you can use it with the old voice interface, which pipes Whisper + GPT + TTS together. But what was advertised is a model capable of direct audio-to-audio. That's not available.
Interestingly, the New York Times mistakenly reported on and reviewed the old features as if they were the new ones. So lots of confusion to go around.
Audio has only rolled out to a small subset of paying customers. There's still no word on the direct-from-4o image generation they demoed, let alone the video capabilities.
Yep, all these AI announcements from big companies feel like promises for the future rather than immediate products. I miss the days when you could actually use something right after it was announced, instead of waiting through some indefinite "coming soon."
As an entrepreneur, I do this often. In order to sleep better at night, I explain to myself that it’s somewhat harmless to give teasers about future content releases. If someone buys my product based on future promises or speculation, they’re investing into the development and my company’s future.
[0] https://openai.com/index/hello-gpt-4o/