
> I’ve never understood why voice recognition has always attempted to be complete understanding of arbitrary input, rather than follow a simple command language

Because the UI affordances (in this case the control language) wouldn’t be discoverable or memorable across a large range of devices or apps. Moreover, speaking is an activity that allows for an arbitrary range of symbol patterns, and a feedback loop between two parties in dialog can resolve complex matters even when they start from different positions.




I mean, right now the current state is effectively an undiscoverable control language: somewhat flexible, but generally failing or unreliable unless you restrict yourself to very specific language, language that differs based on the task being executed, often with similar-but-different formats required to do similar actions.

I’d argue that if the current state is at all acceptable, then a consistent, teachable and specific language format would be an improvement in every way. You could also have an “actual” feedback loop, because there’s a much more limited set of valid inputs, so your errors can be much more precise (and made human-friendly rather than merely programmer-friendly, I think).
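To make that concrete, here’s a minimal sketch in Python of what “limited set of valid inputs, precise errors” could look like; the verbs and targets are hypothetical, not any real assistant’s vocabulary:

    VERBS = {"play", "pause", "set", "turn"}
    TARGETS = {"music", "timer", "lights", "volume"}

    def parse(utterance):
        # Tokenize the spoken input and check it against the fixed grammar,
        # returning either an action tuple or a human-friendly correction.
        words = utterance.lower().split()
        if not words:
            return None, "Say a command, e.g. 'turn lights on'."
        verb, rest = words[0], words[1:]
        if verb not in VERBS:
            return None, "'%s' is not a command. Try: %s." % (verb, ", ".join(sorted(VERBS)))
        if not rest or rest[0] not in TARGETS:
            return None, "'%s' needs a target: %s." % (verb, ", ".join(sorted(TARGETS)))
        return (verb, rest[0], rest[1:]), None

    parse("turn lights on")    # (('turn', 'lights', ['on']), None)
    parse("illuminate room")   # (None, "'illuminate' is not a command. Try: pause, play, set, turn.")

Because every rejection names what was wrong and what would have been accepted, the errors teach the language instead of just saying “could not understand”.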

As it stands, I’ve never managed a dialogue with Siri/Alexa; it either ingests my input correctly, rejects it as an invalid action, does something completely wrong, or produces a “could not understand... did you mean <gibberish>?”.

Having the smart-AI dialogue would be great if I could have it, but for the last decade that simply isn’t a thing that occurs. Perhaps with GPT and its peers, but afaik GPT doesn’t have a response->object model that could be actioned on, so the conversation would sound smoother but be just as incompetent at actually understanding whatever you’re looking to do. I think this is basically the “sufficiently smart compiler” problem, which never comes to fruition in practice.
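For what it’s worth, the missing response->object layer doesn’t need to be exotic; the hard part is forcing the model into it. A rough sketch of what “actionable” could mean (the verb names and JSON shape are my own assumptions, not any vendor’s API): prompt the model to emit JSON, then validate it into a typed action before anything executes.

    import json
    from dataclasses import dataclass

    @dataclass
    class Action:
        verb: str    # e.g. "set_timer"
        args: dict   # e.g. {"minutes": 10}

    ALLOWED_VERBS = {"set_timer", "play_music", "send_message"}

    def to_action(model_output):
        # Refuse to act on anything that isn't valid JSON with a known verb.
        try:
            data = json.loads(model_output)
        except json.JSONDecodeError:
            raise ValueError("Model reply was not valid JSON; nothing to execute.")
        if data.get("verb") not in ALLOWED_VERBS:
            raise ValueError("Unknown verb %r; refusing to act." % data.get("verb"))
        return Action(verb=data["verb"], args=data.get("args", {}))

The conversation layer can be as fluent as it likes; the action layer only ever sees a small, checkable vocabulary.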


It's like using a CLI where the argument structure is inconsistent and there is no way to list commands and their arguments in a practical way.


Close your eyes and imagine that CLI system is instead voice / dialog based. The tedium. For bonus points, imagine you’re in a space shared with others. Doesn’t work that well…


What? No, I think it'd be great! I'd love to be able to say out loud "kube get pods pipe grep service" and the output to be printed on the terminal. I _don't_ want to say "Hey Google, list the pods in kubernetes and look for customer service".

The mapping between what I’d say and what I can type is already close. It starts becoming more complex once you need to add countless flags, but again, a structured approach can fix this.
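As a toy illustration of that structured approach, a spoken-CLI front end could be little more than a fixed table of spoken words to shell tokens (the mappings below are my own guesses, not an existing tool):

    # Map a handful of spoken words onto shell syntax; everything else passes through.
    SPOKEN = {"kube": "kubectl", "pipe": "|", "dash": "-", "redirect": ">"}

    def transcribe(utterance):
        return " ".join(SPOKEN.get(w, w) for w in utterance.lower().split())

    print(transcribe("kube get pods pipe grep service"))
    # kubectl get pods | grep service

Flags could get the same treatment: a spoken keyword per flag, expanded deterministically, rather than a model guessing intent.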



