Copilot isn't underwhelming, it's shit. What's impressive is how Microsoft managed to gut GPT-4 to the point of near-uselessness. It refuses to do work even more than OpenAI models refuse to advise on criminal behavior. In my experience, the only thing it does well is scan documents on corporate SharePoint. For anything else, it's better to copy-paste to a proper GPT-4 yourself.
(Ask Office Copilot in PowerPoint to create you a slide. I dare you! I double dare you!!)
The problem with demos is that they're staged, they showcase integrations that are never delivered, and probably never existed. But you know what's not hype and fluff? The models themselves. You could hack a more useful Copilot with AutoHotkey, today.
I have GPT-4o hooked up as a voice assistant via Home Assistant, and what a breeze that is. Sure, every interaction costs me some $0.03 due to inefficient use of context (HA generates too much noise by default in its map of available devices and their state), but I can walk around the house and turn devices on and off by casually chatting with my watch, and it works, works well, and works faster than it takes to turn on Google Assistant.
So no, I honestly don't think AI advances are oversold. It's just that companies large and small race to deploy "AI-enabled" features, no matter how badly made they are.
Basically, functional AI interactions are prohibitively resource intensive and expensive. Microsoft's non-coding Copilots are shit due to resource constraints.
Basically, yes. My last 4 days of playing with this voice assistant cost me some $3.60 for 215 requests to GPT-4o, amounting to a little under 700 000 tokens. It's something I can afford[0], but with costs like this, you can't exactly hand out GPT-4 access to people for free. This cost structure doesn't work. It doesn't work with GPT-4o, so it worked even less with earlier model iterations that cost more than twice as much. And yet, that is what you need if you want a general-purpose Copilot or Assistant-like system. GPT-3.5-Turbo ain't gonna cut it. Llamas ain't gonna cut it either[1].
In a broad sense, Microsoft lied. But they didn't lie about the capability of the technology itself - they just lied about being able to afford to deliver it for free.
--
[0] - Extrapolated to a hypothetical subscription, this would be ~$27 per month. I've seen more expensive and worse subscriptions. Still, it's a big motivator to go dig into the code of that integration and make it use ~2-4x fewer tokens by encoding "exposed entities" differently, and much more concisely.
[1] - Maybe Llama 3 could, but IIRC license prevents it, plus it's how many days old now?
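For anyone who wants to check the back-of-the-envelope math, here's a quick sketch. The inputs are the figures quoted above; the 30-day month and the "usage stays the same" part are my assumptions for the extrapolation.

```python
# Figures quoted in the comment above.
total_cost_usd = 3.60
requests = 215
total_tokens = 700_000  # "a little under" this, so treat it as an upper bound
days = 4

cost_per_request = total_cost_usd / requests
tokens_per_request = total_tokens / requests
cost_per_day = total_cost_usd / days
cost_per_month = cost_per_day * 30  # assumption: 30-day month, same usage

print(f"cost per request:   ${cost_per_request:.4f}")
print(f"tokens per request: {tokens_per_request:.0f}")
print(f"cost per month:     ${cost_per_month:.2f}")
```

That works out to roughly 3,256 tokens and a bit under two cents per request, and about $27 per month at the same pace.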
> they just lied about being able to afford to deliver it for free.
But they never said it'll be free - I'm pretty sure it was always advertised as a paid add-on subscription. With that being the case, why would they not just offer multiple tiers to Copilot, using different models or credit limits?
Contrary to what the corporations want you to believe -- no, you can't buy your way out of every problem. Most modern AI tools are oversold and underwhelming, sadly.
With the most recent update, it's actually very simple. You need three things:
1) Add the OpenAI Conversation integration - https://www.home-assistant.io/integrations/openai_conversati... - and configure it with your OpenAI API key. In there, you can control part of the system prompt (HA will add some stuff around it) and configure the model to use. With the newest HA, there's now an option to enable "Assist" mode (under the "Control Home Assistant" header). Enable this.
2) Go to "Settings/Voice assistants". Under "Assist", you can add a new assistant. You'll be asked to pick a name, language to use, then choose a conversation model - here you pick the one you configured in step 1) - and Speech-to-Text and Text-to-Speech models. I have a subscription to Home Assistant Cloud, so I can choose "Home Assistant Cloud" models for STT and TTS; it would be great to integrate third party ones here, but I'm not sure if and how.
3) Still in "Settings/Voice assistants", look for a line saying "${some number} entities exposed", under the "Add assistant" button. Click that, and curate the list of devices and sensors you want "exposed" to the assistant - "exposed" here means that HA will make a large YAML dump out of the selected entities and paste that into the conversation for you[0]. There's also other stuff (I saw the docs mentioning "intents") that you can expose, but I haven't looked into it yet[1].
That's it. You can press the Assist button and start typing. Or, for a much better experience, install HA's mobile app (and if you have a smartwatch, the watch companion app), and configure Home Assistant as your voice assistant on the device(s). That's how you get the full experience of randomly talking to your watch, "oh hey, make the home feel more like a Borg cube", and witnessing lights turning green and climate control pumping heat.
I really recommend everyone who can to try that. It's a night-and-day difference compared to Siri, Alexa or Google Now. It finally fulfills those promises of voice-activated interfaces.
(I'm seriously considering making a Home Assistant to Tasker bridge via HA app notification, just to enable the assistant to do things on my phone - experience is just that good, that I bet it'll, out of the box, work better than Google stuff.)
--
[0] - That's the inefficient token waster I mentioned in the previous comment. I have some 60 entities exposed, and best I can tell, it generates a couple thousand tokens' worth of YAML, most of which is noise like entity IDs and YAML structure. This could be cut down significantly if you named your devices and entities cleverly (and concisely), but I think my best bet is to dig into the code and trim it down. And/or create synthetic entities that stand for multiple entities representing a single device or device group, e.g. one "A/C" entity that combines multiple sensor entities from all A/C units.
[1] - Outside the YAML dump that goes with each message (and a preamble with current date/time), which is how the assistant knows the current state of every exposed entity, there's also an extra schema exposing controls via the "function calling" mechanism of the OpenAI API, which is how the assistant is able to control devices at home. I assume those "intents" go there. I'll be looking into it today, because there's a bunch of interactions I could simplify if I could expose automation scripts to the assistant.
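To make the token-waste point in [0] a bit more concrete, here's a rough sketch. Everything in it is made up for illustration: the entity names, the verbose layout and the compact layout are hypothetical, and this is not the actual format HA emits. Character count is also only a crude proxy for tokens.

```python
# Hypothetical exposed entities; IDs, names and states are illustrative only.
entities = [
    {"entity_id": "light.living_room_ceiling", "name": "Living room ceiling", "state": "on"},
    {"entity_id": "light.bedroom_lamp", "name": "Bedroom lamp", "state": "off"},
    {"entity_id": "sensor.ac_bedroom_temperature", "name": "Bedroom A/C temp", "state": "22.5"},
    {"entity_id": "sensor.ac_office_temperature", "name": "Office A/C temp", "state": "23.0"},
]


def verbose_dump(items):
    """YAML-like dump, one key per line (a stand-in for a noisy default)."""
    lines = []
    for e in items:
        lines.append(f"- entity_id: {e['entity_id']}")
        lines.append(f"  name: {e['name']}")
        lines.append(f"  state: '{e['state']}'")
    return "\n".join(lines)


def compact_dump(items):
    """One 'name=state' pair per entity, no IDs and no YAML structure."""
    return "; ".join(f"{e['name']}={e['state']}" for e in items)


def group_entities(items, prefix, group_id, group_name):
    """Fold every entity whose ID starts with `prefix` into one synthetic entity."""
    grouped = [e for e in items if e["entity_id"].startswith(prefix)]
    rest = [e for e in items if not e["entity_id"].startswith(prefix)]
    if not grouped:
        return rest
    state = "/".join(e["state"] for e in grouped)
    return rest + [{"entity_id": group_id, "name": group_name, "state": state}]


verbose = verbose_dump(entities)
compact = compact_dump(entities)
grouped = group_entities(entities, "sensor.ac_", "group.ac", "A/C")
grouped_compact = compact_dump(grouped)

print(len(verbose), "chars verbose")
print(len(compact), "chars compact")
print(len(grouped_compact), "chars compact + grouped")
```

The catch with dropping entity IDs is that whatever the model sends back via function calling still has to be mapped to a real entity, so in practice the names would need to be unique.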
I have it enabled company-wide at enterprise level, so I know what it can and can't do in day-to-day practice.
Here's an example: I mentioned PowerPoint in my earlier comment. You know what's the correct way to use AI to make you PowerPoint slides? A way that works? It's to not use the O365 Copilot inside PowerPoint, but rather, ask GPT-4o in ChatGPT app to use Python and pandoc to make you a PowerPoint.
I literally demoed that to a colleague the other day. The difference is like night and day.
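For reference, the kind of script you end up with looks roughly like the sketch below. This is my own minimal reconstruction, not the model's actual output; the slide content, file names and helper names are placeholders. It leans on pandoc being able to write .pptx from Markdown, where for input like this each level-1 heading starts a new slide.

```python
import shutil
import subprocess
from pathlib import Path


def build_markdown(slides):
    """Turn (title, bullets) pairs into Markdown with one '#' heading per slide."""
    parts = []
    for title, bullets in slides:
        body = "\n".join(f"- {b}" for b in bullets)
        parts.append(f"# {title}\n\n{body}\n")
    return "\n".join(parts)


def pandoc_command(md_path, pptx_path):
    """Command line that asks pandoc to convert the Markdown file to PowerPoint."""
    return ["pandoc", str(md_path), "-o", str(pptx_path)]


def render(markdown, md_path, cmd):
    """Write the Markdown and run pandoc if it is installed; report success."""
    if shutil.which("pandoc") is None:
        print("pandoc not found; would run:", " ".join(cmd))
        return False
    md_path.write_text(markdown, encoding="utf-8")
    return subprocess.run(cmd).returncode == 0


# Placeholder content for illustration.
slides = [
    ("Quarterly update", ["Revenue is up", "Costs are flat"]),
    ("Next steps", ["Hire two engineers", "Ship the beta"]),
]

markdown = build_markdown(slides)
md_path = Path("slides.md")
cmd = pandoc_command(md_path, Path("slides.pptx"))
render(markdown, md_path, cmd)
```

If you want corporate styling, pandoc also accepts a `--reference-doc` option pointing at a .pptx template.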
I've gone back to using GitHub Copilot with reveal.js [0]. It's much nicer to work with, and I'd recommend it unless you specifically need something from PowerPoint's advanced features.