
> My biggest problem with “OK Google” is that I don’t know by heart what it can do interactively

Maybe it’s just me but this feels unaddressed and that seems ridiculous.

Why is it so hard for me to find a single, precise location on my phone with an enumerated list of every command Siri or Google can work with?




> Why is it so hard for me to find a single, precise location on my phone with an enumerated list of every command Siri or Google can work with?

The likely answer here is that the engineers who work on such products would scoff at the idea that their work amounts to a simple list of commands. In their minds, they’re working on a natural language virtual assistant, whose understanding of user input is “intelligent”, and it should know what you want regardless of how you phrase it. Want to do something? Just ask! Treat it as if it's a person! The possibilities are endless!

Never mind that its actual functionality (y’know… the things it can do when it understands you) is embarrassingly finite and boils down to a “list of commands” anyway.


I'm not sure if that's what Google is doing. More than half of my queries result in a "Let me google that for you" response where it pulls up a search page.


Apple's "assistant" is similarly useless. The best use case I have for it is when I'm driving and pondering about something silly so I ask it: "How does X do Y?" or similar, and the response in 99% of the cases is "I can't show this to you right now".


Alexa takes "the customer is always right" a little too far:

  Me: Alexa, when do babies double their birth weight?
  Alexa: According to an Amazon customer, some time within the first month. Does that answer your question?
  Me: No!
(The true answer is more like 5 months: https://www.mayoclinic.org/healthy-lifestyle/infant-and-todd....)


Not having lists means they can collect more training data.

Build a database of all the attempted interactions. Cluster them by task. Sort by most used (or most monetizable) that the system can’t support today. Bam! You’ve got a rough future capabilities roadmap.

It’s more complicated than that, of course, but here you literally have a large customer base regularly telling you what it wants that your product can’t yet do.
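
As a very rough sketch (the libraries, field names, and clustering choice here are illustrative assumptions, not anything the vendors actually do):

  # Cluster failed assistant requests to surface the most-wanted missing
  # capabilities. Everything here is illustrative: swap in real logs and a
  # proper sentence-embedding model for anything serious.
  from collections import Counter

  from sklearn.cluster import KMeans
  from sklearn.feature_extraction.text import TfidfVectorizer

  # Hypothetical export: one row per utterance the assistant couldn't handle.
  failed_utterances = [
      "turn on the porch light at sunset",
      "turn the porch light on when it gets dark",
      "read me my work calendar for tomorrow",
      "what's on my work calendar tomorrow",
      "order more dish soap",
  ]

  # Embed the utterances (TF-IDF keeps the sketch dependency-light).
  vectors = TfidfVectorizer().fit_transform(failed_utterances)

  # Cluster them into candidate "tasks".
  labels = KMeans(n_clusters=3, n_init=10).fit_predict(vectors)

  # Rank clusters by volume: the biggest ones are roadmap candidates.
  for cluster_id, count in Counter(labels).most_common():
      examples = [u for u, l in zip(failed_utterances, labels) if l == cluster_id]
      print(f"cluster {cluster_id}: {count} requests, e.g. {examples[0]!r}")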


Unfortunately there's also drift in behavior that comes from retraining. "OK Google, play NPR news headlines" gets different results some days than others. Sometimes I get the latest hourly news, sometimes I get a robot voice reading headlines to me. Sometimes asking to dial someone calls them, sometimes it returns search results. Yadda.


Yes, is there a list of every command a human can execute or can work with?


Not sure what side of the debate you're taking here, but I think you've outlined the issue perfectly.

Engineers: "We couldn't have a list of commands, that's not how humans work, you're supposed to treat Alexa like a human, and the possibilities are endless"

Users: "Ok, then. Alexa, take out the trash."

Alexa: "Sorry, I can't do that."

(Ok, so obviously the possibilities aren't endless, right?)

I can somewhat understand general knowledge queries. For those, you can totally make the case that there's just too many things you can ask about to enumerate them all.

But imperative commands, like sending text messages, setting timers, or home automation? There's a finite list of those, since at the end of the day they actually have to be authored by some human who's writing a (say) Alexa skill. The number of utterances that may map to those skills is unbounded, but the number of skills isn't. So yes, at the end of the day, for "command"-like things, they really should be able to give a list of them.
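
To make that concrete, a toy sketch of the kind of inventory I mean (the registry and skill names are made up, not any real Alexa or Google API):

  # A finite registry of "skills" that could be dumped as exactly the kind of
  # command list users are asking for. All names here are invented for
  # illustration.
  from dataclasses import dataclass
  from typing import List

  @dataclass
  class Skill:
      name: str                     # the thing an engineer actually implemented
      description: str              # one line for the help page
      sample_utterances: List[str]  # a few (of the unbounded) ways to invoke it

  SKILLS = [
      Skill("set_timer", "Set a countdown timer",
            ["set a timer for 10 minutes", "start a 10 minute timer"]),
      Skill("send_text", "Send a text message to a contact",
            ["text Alice that I'm running late"]),
      Skill("lights_on", "Turn on a smart light",
            ["turn on the kitchen lights"]),
  ]

  def print_command_list() -> None:
      """The single, enumerated list of supported commands."""
      for skill in SKILLS:
          print(f"{skill.name}: {skill.description}")
          for utterance in skill.sample_utterances:
              print(f'  e.g. "{utterance}"')

  print_command_list()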


> (Ok, so obviously the possibilities aren't endless, right?)

This does not follow from the above. The set of positive integers is countably infinite. So is the set of positive even integers. Even if "half of the positive integers are missing!" there are still "endless" even positive integers.


By that logic the calculator app has an (effectively) infinite amount of functionality since there is an infinite number of integers which you can add together.

Somehow though they still list all the features.


> By that logic

This doesn't follow at all. It's not what I said and I find it difficult to believe that you even think it's what I said.


> This does not follow from the above

Well, I elaborated after. There's an actual finite set of skills that are coded up by actual engineers. A natural language system isn't hallucinating the ABI for the function calls that send text messages. There's code there which takes the utterance and sends the texts. What I'm saying is that you can take an inventory of what skills have been written (and/or are installed), and y'know... document them somewhere.


> you can take an inventory of what skills have been written (and/or are installed), and y'know... document them somewhere.

Sure. I didn't take exception with anything except the standard HN middlebrow dismissal.


I'm not giving a middlebrow dismissal. There exists a real discoverability problem with virtual assistants, and asking users to "just try things" leads them to try things that don't work, and then conclude that the assistant must not be as useful as they thought.

Moreover, when an assistant doesn't do a thing, you're unlikely to try it again later; instead most people will conclude "I guess it can't do that" and move on. If they add the feature later, it's too late.

With every failed request, your confidence that the assistant really is intelligent and can understand you diminishes a bit more. Every time a user hits a dead end with a virtual assistant, it doesn't encourage them to try more things that do work; it instead leaves them less confident that anything will.

I can't count the number of times my wife has been surprised I can get Siri to do things. Her typical response is "I can never get her to understand me so I just stick with timers." It's a real problem, and I'm not being dismissive of anything.

In contrast, reread your comment in this context. You're taking my comment, reading it in the least charitable way, condescending to me about the meaning of "finite" when the rest of my comment clarifies what I mean, and being completely dismissive of the point I'm trying to make. How can you say I'm the one issuing middlebrow dismissals?


You should do some self reflection on why you felt the need to make a comment just to make yourself look smart.


> why you felt the need to make a comment just to make yourself look smart.

I hardly think it made me look smart. It's borderline trivial. The parent comment was insanely reductive in the standard HN style. I was hoping to help reduce the appearance of future such comments.

Sibling comments indicate that it had no positive effect. Such is life.


> It's borderline trivial.

> I was hoping to help reduce the appearance of future such comments.

> Sibling comments indicate that it had no positive effect.

I'm really not trying to attack you here, but this honestly reads like a high-school kid trying to make themselves sound smart by emulating Spock from Star Trek.


Yes, actually, here: https://en.wikipedia.org/wiki/Basic_English

  If one were to take the 25,000 word Oxford Pocket English Dictionary and take away the redundancies of our rich language and eliminate the words that can be made by putting together simpler words, we find that 90% of the concepts in that dictionary can be achieved with 850 words. 

  The shortened list makes simpler the effort to learn spelling and pronunciation irregularities. The rules of usage are identical to full English so that the practitioner communicates in perfectly good, but simple, English.

  We call this simplified language Basic English. It was developed by Charles K. Ogden and released in 1930 with the book Basic English: A General Introduction with Rules and Grammar.
It even includes 200 picturables: http://ogden.basic-english.org/wordpic0.html

"A widely known 1933 book on this is a science fiction work on history up to the year 2106 titled The Shape of Things to Come by H. G. Wells. In this work, Basic English is the inter-language of the future world, a world in which after long struggles a global authoritarian government manages to unite humanity and force everyone to learn it as a second language."

- Sounds pretty close to Siri and the other digital assistants to me. Ever watch people from non-English-speaking countries use their smartphones? Not all of it is implemented yet, but this is almost all you need to run an empire.

Here it is deployed in favor of much needed disciplinary action for two Scottish people:

https://www.youtube.com/watch?v=BOUTfUmI8vs


> Here it is deployed in favor of much needed disciplinary action for two Scottish people

There was a moment when call centers started deploying “just say it” en masse - and I was literally in a panic. Luckily, they brought back “or enter” pretty soon, also en masse.

To be fair to the robots, you protein constructs are not much better. Within a two-mile radius of our company’s office, humans have trained themselves to understand a Russian accent pretty well. But beyond that…


I would love to see something similar to Basic English: A General Introduction with Rules and Grammar for other languages. It seems like it would be a great tool for learning a new language.


Anyone reminded of XKCD's "Up goer five" strip (https://xkcd.com/1133/), or is it just me?


He expanded the idea into a book: https://en.m.wikipedia.org/wiki/Thing_Explainer


There isn't, but a partial list could be assembled.

Most human interactions are context-triggered and heavily scripted.

This is easy to see on social media where responses to a popular trigger post fall into groups. A lot of people make one of a small number of generic expected responses, and there's an even smaller number of funny/off-beat posts - which all make the same joke.

Occasionally you get a truly original inventive reply. But only very rarely.

I have a vague memory of a fringe AI startup which has been trying to formalise that contextual database since the 90s.


It also annoys me that there's no (obvious) meta-interactions with most smart assistants to explore what's possible. I can't easily ask "can you do X?" or "what can you do with Y?"


This, plus the already-discussed lack of a list of working commands, further cements my belief that "voice assistants" are not there for the benefit of those who keep them in their homes.


There isn't even an enumerated list of all the features of the Google search engine (e.g. quotes for an exact phrase, minus to exclude words, etc.). And this might be the most popular web service in the world!


There is actually a fairly good list here: https://support.google.com/websearch/answer/2466433?hl=en

I know for a fact that it isn't complete. But most of the "secret" ones that I am aware of were very obscure and usually buggy, so maybe this is all of the officially supported ones.
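
For reference, a few of the operators documented there:

  "exact phrase"        results must contain these words in this order
  jaguar -car           exclude results that mention "car"
  site:example.com      restrict results to a single site
  filetype:pdf report   only return PDF results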


Other OK Google problems:

o it changes. who cares if you know even some "commands"... it'll break. I used to ask google maps when driving "Ok Google, ETA". It's been many many years since that stopped working.

o can't change name/ATTN keyword. how dumb is your AI that you can't even rename _your_ assistant. /s


Oh yeah the constant syntax changes got me to stop using it entirely.

Commands would suddenly lead to web searches, I'd then have to Google the new set of magic words to make it set a reminder or whatever, only for it to break again two weeks later.


> Why is it so hard for me to find a single, precise location on my phone with an enumerated list of every command Siri or Google can work with?

Because engineers (and managers) contrive problems like this to the point that the solutions are useless.


(I'm posting here because it's the most recent comment by your account).

You've unfortunately been breaking the site guidelines repeatedly and egregiously:

https://news.ycombinator.com/item?id=33585475

https://news.ycombinator.com/item?id=33550821

https://news.ycombinator.com/item?id=33547727

https://news.ycombinator.com/item?id=33472366

https://news.ycombinator.com/item?id=33472317

https://news.ycombinator.com/item?id=33468223

https://news.ycombinator.com/item?id=33451816

https://news.ycombinator.com/item?id=33447930

If you keep doing this, we're going to have to ban you. I don't want to ban you, so if you'd please review https://news.ycombinator.com/newsguidelines.html and use HN in the intended spirit, we'd really appreciate it.


Oh cool, even the site admins are ganging up on me. Most of those comments are taken entirely out of context, or direct replies to comments that were replies to my own, but you do you.

I am using HN in the intended spirit, and I'm sorry that you're letting the echo chamber color your perception of perfectly acceptable comments. You can email me if you'd like to continue discussing this in a more professional manner (why else would I fill it in...?), but I have to say this is an appalling showing of leadership on your part, and I hope you address it soon.

Public intimidation does not make for a safe and healthy culture. Period.


No one is ganging up on you—the examples I listed contain plenty of personal attacks. That's clearly against the site guidelines and of course we have to ban accounts that won't stop doing it, so please stop.


One, maybe two examples contain personal attacks. You're going deep into threads to find these. Sounds more like a witch hunt than a fair and balanced analysis of my contributions. Ganging up is an understatement.

The rest? Get real. Cherry-picking at its finest. You're barking up the wrong tree here and I have no problem defending those comments ad nauseam.

I'll say it again since you're really not getting it: feel free to email me if you'd like to continue this in a more professional manner. I'll keep flagging comments that blatantly ignore this.

How many times is enough? I've now politely asked three times to be contacted via email if you need to, twice specifically for this altercation. Please don't make the same mistake a fourth time, as an admin and representative of this dying site.


If you're posting publicly on the site, it's appropriate for moderators to respond publicly on the site. Moderation comments are important not just as a one-to-one conversation, but as a signal to the community.


If you publicly intimidate users with a bias, you are not facilitating a fair and safe place to share ideas.

It's a concept called inclusivity, I think you should read up on it a bit.


I've been thinking about wiring up whisper[0], mozilla's tts[1] and gpt-3 together to make a voice assistant of sorts. It wouldn't have access to device hardware, and there'd be no guarantees of correct answers, but it should blow Siri etc. out of the water in terms of understanding context.

[0] https://github.com/openai/whisper [1] https://github.com/mozilla/TTS
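
Roughly this kind of loop, as a minimal sketch (the model names and the TTS call are assumptions on my part, and audio recording / error handling are omitted):

  # Speech -> text (Whisper), text -> reply (GPT-3), reply -> speech (TTS).
  # Model names below are assumptions; the TTS API used is the Coqui
  # continuation of mozilla/TTS.
  import openai
  import whisper
  from TTS.api import TTS

  openai.api_key = "sk-..."  # your OpenAI key

  # 1. Transcribe the user's recorded question.
  stt = whisper.load_model("base")
  question = stt.transcribe("question.wav")["text"]

  # 2. Ask GPT-3 for an answer (legacy completions API).
  completion = openai.Completion.create(
      model="text-davinci-003",
      prompt=f"You are a helpful voice assistant.\nUser: {question}\nAssistant:",
      max_tokens=150,
  )
  answer = completion.choices[0].text.strip()

  # 3. Speak the answer back.
  tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")
  tts.tts_to_file(text=answer, file_path="answer.wav")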


It should also talk to spammers and provide them fake credit card numbers.


I would not count on Google to make public any such thing. But a third party could test it out to build such a list. And that could include caveats like "works if you ask in this form, not if you ask in this other form".


I stopped using Alexa after it almost burned my house down on Thanksgiving. Apparently “bake at 400 degrees Fahrenheit for thirty minutes” somehow became “microwave for thirty minutes” even though it got all the words except “bake”! Who sets a temperature with their microwave?

Anyway we meant to bake something but instead absolutely roasted a metal pan and wire rack that merged into the glass somehow.

My wife thinks it’s kind of funny because the Disneyworld “Carousel of Progress” shows a very similar event happening due to voice controls, which they predicted in the 1960s!


Third party integrations?

It's both a static list (available to everyone) and a dynamic list (available only to you).

Having seen all the dead products at Google: who would get rewarded or compensated for this? Would the complexity of building the list increase ongoing costs with an unclear return on investment?


Presumably, there is a list somewhere in Google’s internal documentation. All we’re asking is for them to copy and paste it from that documentation, clean it up a bit, and post it online.


There probably isn’t. There’ll be some hard coded “if this, say that”, but there are a lot of trained responses in the models that won’t be as simple as that.


The original Siri had such a list. I found it demonstrated here: https://youtu.be/agzItTz35QQ?t=716

Did that ever make it to release? I can't remember seeing it on my actual phone.


It would change constantly over time, and would eventually become very large. It's an interesting idea, though: a school subject on how to interact with your AI. Lots of grammar, machine learning theory, culture, a bit of security, etc., to second-guess it.


Students in most public schools don’t even learn English grammar now in most states. That went away at some point in the Bush or Obama administrations, probably due to the NCLB and Common Core initiatives. It is not uncommon now to encounter college students who simply have never heard of a “direct object.” They need classes on the grammar of their own language more than they need a school subject on interacting with an overhyped and underperforming Siri.


Common Core has English grammar as a foundational skill. Your specific example is taught in grade 5.

That said, English is a difficult language and I'm not at all surprised that people get through school without fully grasping the names for grammar concepts, even if they use them every day.


I learned this when I was about 20, met a foreign friend online who was formally learning English, and I couldn't answer most of her questions.

It was a bizarre but very educational moment: I use English the way I write code. I have no formal education in it, but I seem to do fine.


You can use an open-source assistant like Dicio (https://github.com/Stypox/dicio-android) instead and configure it the way you like.



