Hacker News
A closer look at Google Duplex (techcrunch.com)
222 points by Ours90 on June 27, 2018 | 167 comments



Guys, c'mon, Google didn't make Duplex so that it could revolutionize restaurant reservations. Everyone here is getting so hung up on how there are better/other/existing ways to book a table. Of course there are, but that's clearly not the point of Duplex.

Reservations are just a convenient test bed for the underlying technology. And the underlying tech is 100% the reason for Duplex. Not table booking. It's all about training for AI - speech generation, conversation, language parsing. Making a reservation is a quarter step up the long staircase Google sees for this tech.


i don't know what end goal you think they're training for, but duplex feels enough like an end goal to me. I know everybody likes to say everything google does is all about training the AI, but they have to be training for something.

This is a "something". it's a concrete, monetizable, and useful application of their AI work. making a reservation is the beginning of a sale. being able to book a reservation at any arbitrary restaurant with no extra work on the restaurateur's part makes google ads more valuable. They can make money off this quite directly.


> This is a "something". it's a concrete, monetizable, and useful application of their AI work... They can make money off this quite directly.

The interviews I've read point to an 80% success rate - hugely impressive, but 20% of interactions requiring human intervention is still a _very_ large, expensive, staff-hungry contact center to run when you are Google scale, for a product that ultimately doesn't really bring in any direct revenue. Each time Duplex fails in its current form, a human operator has to step in to finish the call.
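
To put rough numbers on that staffing concern, here is a back-of-envelope sketch in Python. Only the 80% success rate comes from the interviews; the call volume, minutes per rescued call, and shift length are made-up assumptions, and `operators_needed` is a hypothetical helper, not anything Google has described.

```python
import math

def operators_needed(calls_per_day, success_rate, minutes_per_fallback, shift_minutes=480):
    """Estimate how many operators it takes to finish the calls the bot cannot."""
    # Calls that fall through to a human (rounded to avoid float noise).
    failed_calls = round(calls_per_day * (1 - success_rate))
    # Total human talk time needed per day, spread over full shifts.
    operator_minutes = failed_calls * minutes_per_fallback
    return math.ceil(operator_minutes / shift_minutes)

# Assumed volume: 1,000,000 calls/day, 3 minutes of human time per rescued call.
print(operators_needed(1_000_000, 0.80, 3))  # 1250
```

Under those invented inputs, the fallback desk alone needs over a thousand operators per shift, and the estimate ignores breaks, peak hours, and languages.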

Even Google are telling us not to expect this to ship in anything anytime soon.

From Ron at Ars' demo:

“We’re actually quite a long way from launch, that’s the key thing to understand,” Fox explained at the meeting. “This is super-early technology, somewhere between technology demo and product.”


My point wasn't to say it's a product now, but rather that it will be a product if everything pans out. The supposition was that this is just another way for Google to collect data; I believe that it's more of a way to use data than a way to collect data.


Google 411 could have been considered an end goal too. It too was concrete and monetizable. Instead it was actually a way to get a whole lot of natural language voice samples. When it stopped being useful in that regard, they didn't turn around and try to monetize it, they just killed it.


Well, they could do what amazon does with aws lex, and offer it as a SaaS product for others to build "something" out of it, adding it to their gcloud stuff.


I guess the goal is to have something like the AI in the movie Her.


Exactly, can you hold a conversation with a machine on a given subject given an ontology and a knowledge graph. It is the sort of holy grail of dialog systems. Dr. Boyd-Graber and CU Boulder had a lab working on these problems while I was with IBM doing something similar in Watson.

The ontology provides a structural linkage for both constraining the conversation and surfacing situations where the conversation has broken down. The knowledge graph provides bounds around the conversation space concepts to ensure you don't go off the rails semantically.

"restaurant" and "service business" reservations have nicely constrained ontologies so that you can search them in real time and stay synced.

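As a minimal sketch of what constraining the conversation with an ontology could look like, assume a hand-written slot table for a restaurant booking. The slot names, allowed values, and the `update_state` helper are all hypothetical illustrations; this is not how Duplex or Watson is implemented.

```python
# Hypothetical "ontology": the slots a booking conversation may fill,
# and the values each slot is allowed to take.
RESERVATION_ONTOLOGY = {
    "party_size": set(range(1, 21)),
    "time": {"18:00", "18:30", "19:00", "19:30", "20:00"},
    "day": {"today", "tomorrow"},
}

def update_state(state, slot, value):
    """Fill a slot if the ontology allows it; otherwise flag a breakdown."""
    if slot not in RESERVATION_ONTOLOGY:
        return state, "off_topic"        # concept is outside the conversation space
    if value not in RESERVATION_ONTOLOGY[slot]:
        return state, "invalid_value"    # known slot, but the value is not allowed
    new_state = {**state, slot: value}
    missing = [s for s in RESERVATION_ONTOLOGY if s not in new_state]
    status = "complete" if not missing else "need:" + missing[0]
    return new_state, status

state, status = update_state({}, "party_size", 4)
print(status)  # need:time
state, status = update_state(state, "high_chair", True)
print(status)  # off_topic
```

The point of the small table is that every turn either fills a known slot or is flagged, which is what makes a narrow domain like reservations tractable.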

Every problem has an existing solution, but that doesn't mean everyone is using it. Another great use for Duplex is getting opening times for a business. Sure, in theory those times can probably be found on their website, social media, or some other place, but 1. there generally isn't a good standard all across the world and 2. there are dozens of problems like these and not all have existing solutions.

This is where Duplex comes in. It fills in the gap for the lack of standards. The assistant will obviously still try to use booking systems if it can, but when it can't, it can then resort to Duplex.


This. It's huge.

Enriching the Googlesphere with proprietary metadata at scale that other providers don't have is a clear value add.

And furthermore, customers are doing part of the work required for this.

In that way, it's no different than typical Google product progression.

E.g.: ship Maps with Android, get everyone using Maps, leverage usage share to provide features competitors can't (like real time traffic). Or Chrome.


Get accessibility information while you're in there - that's much, much harder to parse than opening times.


I’m rather confused why doing this as a consumer product in Assistant makes more sense than an enterprise product. There is so much room for improvement in call centers, and it would be a huge market opportunity.

This is hardly even a time saver. What would be helpful is if this is also able to stay on hold for you — probably an easy task.


In Ars Technica's article Nick Fox, VP of product design for Google Assistant, said:

https://arstechnica.com/gadgets/2018/06/google-duplex-is-cal...

> Fox didn't rule out selling this kind of tech to call centers, but that's another potential down-the-road addition. "There are companies that do that well," he said. "There are very big and established companies that provide software for call centers. It's not the core of our business. We'll see as we go whether we think if we can help there. But it's not a business that we're in today, and it's a pretty well-served business."


Do you mean a call center bot on the answer side? I would think that's massively harder to do in a way that's noticeably better than the phone tree systems that already exist. Everything simple is already covered by simpler automated systems, and everything that isn't simple is impossibly hard for this level of tech.

Now if Google could make a bot that can call the service line for a company, jump through whatever hoops are needed to get a person on the phone, then ring the call through to me when they got one, that I would pay money for.


As an enterprise product, Google would require all audio be sent to their servers, which I'm guessing most businesses would not be too keen on doing.


Enterprises are more demanding and have a lower tolerance for error. I have no doubt Duplex will be rolled out to enterprise once the technology is ready and has been thoroughly tested on consumers.


Probably because it's far less risky to mess up a dinner order than trash the good name of a company.

But once it's working, they'll be coming for the call center jobs...


I think what Google is doing here is using AI to make its products smarter and then feed that data back into AI to make AI smarter. That's Google's end goal. To use AI to make its product better and frictionless.


Minus "AI" and it's like the same goal for every product/service. I get AI, but the term is so overused now that it just implies "smarter", which is subjectively true. I don't always need smarter products, rather more relevant and useful ones. The simple fact is that a poor UX in front of "smarter" services can destroy the whole concept.

I do think there is a bigger picture than reservations. For example, if I knew that an action just took place, the opportunities for highly targeted advertising based on the context of what just happened could be significant. Feed Google's cash cow.


Google has dropped hints on how it may use Duplex in the way you say - reservations are just a way to train it. An example is to get better data on businesses in Google Maps by calling up businesses to see if they are open or closed on certain days, thus helping to improve one of their key platforms.


I have to disagree here; there is a huge strategic advantage to getting into the booking business in this particular manner. It pressures businesses to set up an account with Google to manage reservations, as it would be easy to upsell businesses that are either not online today or using other services instead. Once companies manage this part of their business via Google, it becomes even easier to upsell them on further services, including local advertising, which is still a massively untapped market when it comes to digital marketing.


I wouldn't say you're disagreeing because my point wasn't that duplex is useless/won't be used for this stuff. It was that booking tables is a small outgrowth of a much larger project.

Duplex is as much about reservations as DeepMind is about playing Go.


That's just another enormous data collection tool about users' and businesses' activity for Google. That's the main goal. Training is just a part of that, to make the data collection better.


The Ars Technica writeup is superior, IMHO:

https://arstechnica.com/gadgets/2018/06/google-duplex-is-cal...


Much better, and a very interesting read. Definitely answered a lot of questions.

> Part of the problem is that Google's training doesn't scale as well with Duplex as it does for other AI efforts. Wiretapping laws mean there isn't a treasure trove of millions of phone calls Google can train its AI with. All the training data needs to be made by Google, so the limited scope really helps.


I liked the Wired one, since she did what I would have wanted to do:

https://www.wired.com/story/google-duplex-gets-a-second-debu...

I then asked whether there were any allergies in the group. "OK, so, 7:30," the bot said. "No, I can fit you in at 7:45," I said. The bot was confused. "7:30," it said again. I also asked whether they would need a high chair for any small children. Another voice eventually interjected, and completed the reservation.

I hung up the phone feeling somewhat triumphant; my stint in college as a host at a brew house had paid off, and I had asked a series of questions that a bot, even a good one, couldn’t answer. It was a win for humans. “In that case, the operator that completed the call—that wasn't a human, right?” I asked Nygaard. No, she said. That was a human who took over the call. I was stunned; in the end, I was still a human who couldn’t differentiate between a voice powered by silicon and one born of flesh and blood.


It "feels revolutionary" to the author, but it is literally only capable of handling the simplest restaurant and hair salon reservations (the kind of thing you could do yourself in about 90 seconds). Seems like a silly headline, although the details are interesting.


> (the kind of thing you could do yourself in about 90 seconds)

I can also walk over to my kitchen, stepping around the dogs and cats, and pour a glass of milk in about 90 seconds too, but a robot that can reliably do that would be pretty damn revolutionary.

Don't underestimate the insane quantity of wetware computation going on during a short conversation.


Except that if you need to set it up for half an hour, and it fails 10% of the time, you just get up and get your glass of milk.


Actually it fails at least 20% of the time. Your floor cleaning robot will fail at cleaning it up 20% of the time. I'm not good at math, but I think that means you'll be spending 95% of your milk drinking time cleaning your floor...


What on Earth are you talking about? The point was not that anything humans can do within 90 seconds is non-revolutionary when implemented in machines. An artist can create something unique and beautiful in 90 seconds, and a mathematician can explain a short yet difficult proof in 90 seconds. Those things would absolutely be revolutionary if implemented in machines. Who cares about the quantity of computation? As if that matters for how "revolutionary" a software technology is.

It is entirely unclear to me that a bot that might make simple reservations at a small group of establishments that include only restaurants and hair salons is in any way revolutionary. Even if it worked 100% of the time, who is making daily hair and restaurant reservations that would benefit from the extra 90 seconds? It's silly the amount of hype this is eliciting. You have a very strange definition of "revolutionary".


> […] who is making daily hair and restaurant reservations that would benefit from the extra 90 seconds?

The wealthy have had assistants since practically time immemorial, and no one bats an eye at their "90 seconds". The promise of this tool is a democratization of both the convenience and the normalcy.


> It is entirely unclear to me that a bot that might make simple reservations at a small group of establishments that include only restaurants and hair salons is in any way revolutionary.

If it is an extensible technology that requires relatively minor tweaks to expand the domain of applicability, then it is revolutionary, though the effect won't be fully felt until it has been more broadly applied.

If it is narrowly tailored such that the next two domains will cost nearly as much time and money to build out as restaurants and hair salons, then, sure, it's minimally useful.

Google's been promoting it as the first, and that seems in character with what Google does.


> As if that matters for how "revolutionary" a software technology is.

Doesn't matter if you're only interested in it as a consumer. But then why would you be on HN?

Your entire comment would apply to introduction of the first transistors, which were worse than vacuum tubes in almost everything except size - and nobody at the time particularly cared for their size.


A chatbot that only works within a limited domain is not "revolutionary". It's been done before. The NLP part is relatively easy in a limited domain. Given enough time and enough users, it could be an ELIZA 2.0 level of programming.

The harder part is voice to text (not rocket science and that can be done pretty reliably) and natural sounding text to speech.


Can't wait to see the bot you use to make millions with in competition with Duplex


With Google's reputation of dropping products and letting them wither and die, and Google’s lack of ability to monetize anything outside of search, there is just as much likelihood that they won’t make money off of it as I won’t....


It's all simple in retrospect. The classic case, obviously, is Dropbox, where HN dismissed it as 'it's just rsync. What's the big deal'.


At the end of the day, Dropbox is becoming more of a feature than a product as Steve Jobs said.

For the $9.99 a month that Dropbox costs for a terabyte, you can get Office 365, which includes OneDrive with 1 TB each for up to 5 users.


It must feel at least a little bit revolutionary since half the internet was saying that the demo was faked.


Probably the best compliment Google could get.


People were fooled by hype, a usual occurrence with new "AI" technology. That says nothing about how "revolutionary" it is.


Technology should aim to solve problems both big and small. If all we did was life-extension and deep space exploration, nobody would be solving problems we have today -- tracking how much food we eat automatically or how much fuel should be injected into the engine based on driving conditions/style.

You might think this isn't new "AI" technology, but as others have shown, this is the first time we've seen an "AI" tech convincing the masses that it is a real human using voice.



From https://www.wired.com/story/google-duplex-gets-a-second-debu...

> I asked Huffman and Fox whether Google regretted showing off a carefully-produced Duplex demo back in May that offered little in terms of transparency or exposition. Fox didn’t say directly whether he regretted it. "We thought of the demo at I/O as much more of a technology demo, whereas what you see here is much more of the product side of the technology," Fox said.

So it's still not clear how edited or set up that call was at I/O. You can hear in the released video the voice is not as polished as the example at I/O, and also from the Wired article, you can see how easily the AI is tripped up. I don't think he needs to eat his hat yet.


Considering that he went out of his way to express his opinion (calling it a "fraud" and "con"), it'd be funny if he sits this one out.


He didn't, though. He said the terms of the demo made it seem like one


He said he thought Pichai was lying:

>Pichai claims these two examples are actual phone calls to actual businesses. I’m saying I don’t think that’s true.

I'm not sure what distinction you're making.


Ok, that's a firmer commitment than he made in the blog post.


I think his skepticism was warranted. It bothered me as well, that the demonstration was so controlled.


> >Pichai claims these two examples are actual phone calls to actual businesses. I’m saying I don’t think that’s true.

Skepticism and calling someone a liar aren't the same thing...


> For instance, asked to “repeat the last four numbers,” it restates the phone number in its entirety. It’s not a flaw, exactly, but it does show a simple place where the system is pushed to its limitations with regard to the understanding of the subtle nuances of human conversation.

I chuckled. From experience, it turns out some humans don't understand the subtle nuances of human conversation either.


It's not that I don't understand, but I am unable to say the last four digits of my number without saying the whole thing (or at least saying the leading digits in my head). This is just how my brain recalls it.


I repeat the whole number when people ask me to repeat part of it all the time.


Heh, I agree that is rather a stretch. There is no harm in repeating the whole thing. And if they misheard the "last 4" there is a chance they misheard the first few as well.


>“We want to make sure that we’re not wasting the business’s time,” Fox says. “We want to make sure throughout everything we do here, that this is good experience for the business and that they’re not getting frustrated talking to an assistant while they’re trying to run their business.”

What's the added value of sounding exactly like a real person then? Why can't it be a robotic voice following a simple, efficient algorithm?

- Hi. I'm a robot trying to book for 4 at 20 tonight for my client. Is it possible? Please say Yes or No

- Wait what?

- I'm a robot trying to book for 4 at 20 tonight for my client. Is it possible? Please say Yes or No

- No

- Is it possible at any other time tonight? If yes, when?

- Yes, at 21

- I would like to book for 4 at 21. Thank you.

We have had this kind of technology for years.
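
For concreteness, the script above can be sketched as a tiny fixed-branch dialog in Python. The `scripted_booking` function and the `answer` callback (which stands in for speech recognition of the callee's reply) are hypothetical, and the time parsing is deliberately naive.

```python
def scripted_booking(party_size, hour, answer):
    """Run the fixed yes/no script; `answer(prompt)` returns the callee's reply."""
    ask = (f"I'm a robot trying to book for {party_size} at {hour} tonight "
           "for my client. Is it possible? Please say Yes or No")
    reply = answer(ask).strip().lower()
    if reply not in ("yes", "no"):
        # Anything unexpected ("Wait what?"): repeat the question once.
        reply = answer(ask).strip().lower()
    if reply == "yes":
        return hour
    if reply == "no":
        alt = answer("Is it possible at any other time tonight? If yes, when?")
        # Naive parsing: keep only the digits, so "Yes, at 21" becomes 21.
        digits = "".join(ch for ch in alt if ch.isdigit())
        return int(digits) if digits else None
    return None  # still not understood after one repeat: give up

# Replaying the exchange above with canned replies.
replies = iter(["Wait what?", "No", "Yes, at 21"])
print(scripted_booking(4, 20, lambda prompt: next(replies)))  # 21
```

Note how little slack there is: any reply outside the expected words or digits ends the call without a booking.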


Because people will hang up on it like they do all other robocalls, or they will talk to it like many people talk to "computers". They won't use grammar, they will speak louder, they will slow down, and they will try to keep the conversation as simple as possible fearing that the system can't handle anything else.

It needs to sound human so that humans will interact with it correctly. If it sounds human, then the person on the other end of the call might say things like "We don't have anything available on the 3rd but we do on the 5th, is that okay?"

If it sounds like a robot, people will just respond to questions with "no", "yes", or say "phone number" instead of "can i have your phone number?"

Sounding human puts the human in the driver's seat; sounding robotic means the "robot" needs to drive the conversation.


> or they will talk to it like many people talk to "computers". They won't use grammar, they will speak louder, they will slow down, and they will try to keep the conversation as simple as possible fearing that the system can't handle anything else

So, pretty much how we've trained ourselves to use Google, the search engine.

Nobody types "I'd like to know what the best way is to braise pork with white wine" into Google. They type "pork recipe braise wine" because we've been trained by Google that this will produce better results.

If Google's current search worked as well as it claims Duplex will, then we'd finally be getting somewhere. But Google's taken its eye off the ball.

Right now, at least for the things I search for, searching on Google is like playing an Infocom game.


> Nobody types "I'd like to know what the best way is to braise pork with white wine" into Google

No, but lots of people say stuff like that to Google Assistant all the time, and Google is tuning their search engine to accommodate queries like that:

https://www.google.com/search?q=What's+the+best+way+to+brais...

The above query results in a card displaying step-by-step directions on how to braise pork with white wine. Maybe Google's better at this than you thought?


Name a search engine that's better than Google. Nobody types "I'd like to know what the best way is to braise pork with white wine", but that doesn't mean Google can't give you good search results if you typed that. You will get just as good a result if you type "pork recipe braise wine", but it's fewer words and more efficient to type.

It's less of "google trained us to type like that" and more of "We opted to use the more efficient way of doing that same thing". If Google didn't give us good results if we typed "pork recipe braise wine", it would be a pretty shitty search engine in my book.


> Name a search engine that's better than Google

I never stated that Google isn't the best search engine out there. I stated that I hoped the natural language learning from Duplex would filter down to the Google search engine.

But while we're on the topic, find a search query in which Google will return a list of movie reviewers in the city of Chicago. It can't. It is so focused on second-guessing search queries that it will only return lists of reviews of the film Chicago.

I had this exact very frustrating experience once trying to find the name of a woman I met who publishes a popular movie review blog in Chicago. There's even a "guild" of sorts of movie reviewers in Chicago, but Google could only surface that as high as the 10th page of results.

I know I'm not the only person frustrated with Google's desire to second-guess my searches because I heard a comedian on the radio last week who did a bit that lasted five whole minutes on Google second-guessing searches.


Kind of tangential to this, but I was really amazed at the accuracy of Google's search results when a few years back I tried to find the name of a song I'd heard at a nightclub in Hong Kong. I didn't know the artist, nor really any part of the lyrics (because it had been very loud and crowded), so I was pleasantly surprised when Google Search took my query "guys in jean overalls singing too ra loo ra loo" and actually returned "Come on Eileen".


I feel that we have such a high standard for Google search result quality that we can get frustrated by such highly specific search result edge cases. Google search is not perfect, but it's been the best search engine out there by a long margin for almost 20 years.

There will be a subset of users who will bemoan user privacy but expect highly specific, personally tailored, quality search results. You can't have both, and there will always be edge cases in such a difficult domain. The fact that so many industry heavyweights can't compete with Google in search result quality says a lot about how good Google is. Does it have problems? Yes, no one is denying it.


> we can get frustrated by such highly specific search result edge cases

Every case is an edge case.

"Edge case" has become the tech industry's excuse for everything.


Every case is not an edge case. You can't be serious if you call all of the following queries edge cases -

1. Chicago

2. Chicago movie

3. Chicago movie review

4. Movies in Chicago

5. Movie reviewers

6. Movie reviewers in Chicago

7. List of movie reviewers in Chicago.

If you still feel so, you may be talking about edge cases in a non-technical sense.

Disc: Googler but nowhere close to search or duplex.


> 1. Chicago

But with Google, there is no pure search for "Chicago."

There is only a search for "Chicago" by someone at a particular location with a particular device using a particular browser with a particular search history with a particular number of other parameters that Google's pieced together that we don't even know about.

So, I'll be more specific: Every Google search is an edge case. Google spends billions making sure no two searches return identical results.


"movie reviewers located in Chicago" returns the Chicago film critics association as the second result.


For you. Right now. But when I did the search about a year ago, the results were as described above.


I'd say your issue couldn't be reproduced.


Well, looks like your problem is fixed then.


> It's less of "google trained us to type like that" and more of "We opted to use the more efficient way of doing that same thing".

I'm a bit baffled at this, given it reads as though the past 20 years of searching hasn't been learning to adapt to what one might call search engine speak for better results. The obvious main issue was for years you had to guess how someone else would phrase something if using longer phrases, since the longer the query the less likely you'll find a matching result. If you didn't get the desired results you then would either re-phrase it to yet another natural sentence or (increasingly since it shortcuts the process) pare back the query to simpler keywords to have a better chance of success finding those same core keywords in sites.

Eventually people got used to just entering keywords instead since it wasted time trying one's luck on more natural sentences for the reasons above. Obviously with better natural language tech (and with more user content generated and indexed) some queries became more successful but even today it can be a complete dice roll for many longer, naturally phrased queries whether anything useful is returned.
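
A toy illustration of why longer queries matched less often under strict keyword matching. This assumes a simplistic engine that requires every query word to appear in a page, which is only a rough model of early search and not a description of how Google actually ranks results; the `matches` helper and the sample pages are invented.

```python
def matches(query, documents):
    """Return the documents that contain every word of the query (strict AND)."""
    terms = query.lower().split()
    return [d for d in documents if all(t in d.lower().split() for t in terms)]

# Invented sample "pages".
docs = [
    "braise pork in white wine",
    "pork recipe braise wine slow cooker",
    "easy pork chop recipe",
]

print(len(matches("pork", docs)))                     # 3
print(len(matches("pork recipe braise wine", docs)))  # 1
print(len(matches("what is the best way to braise pork with white wine", docs)))  # 0
```

Each extra word is another chance to miss, so the natural-language phrasing finds nothing while the pared-back keywords still hit.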


This might be true for a small subset of early or techie users, but in my experience watching some of my less tech-inclined friends and family use Google search, I find it fascinating how much trouble they go to, typing a complete sentence with proper grammar and spelling. I honestly don't know which one is more common, but I wouldn't be surprised if a majority of users type full sentences or long search queries.


In my experience the results are worse with full sentences, just as it was described.


Actually both are true. Google (or earlier search engines) did work on the basis of keywords.

The ability to handle full sentences well came much later in one of google's major search updates - I think Hummingbird which came out in 2013.


Basically everyone in my office types full sentences to Google, and it drives me crazy.


Did you actually try the former query? It works pretty darn well now -- for me, it brings up recipes (and your HN comment, of course).

Disclaimer: My personal opinion, not that of my employer.


You are comparing a generic computer to the best human conversation. But people make phone calls when driving, make phone calls with background noise, make phone calls while speaking to other people, make phone calls even if they don't know exactly what they want and waste the business's time, all the time.

They will hang up or be confused the first time. As soon as this becomes common, they will get used to it and actually waste way less time than with the usual human call and will look forward to the familiar efficient robotic voice instead of the third prank or drunk call of the night.

>It needs to sound human so that humans will interact with it correctly.

I disagree. It needs to sound understandable and easy to use so that humans will interact with it correctly. Making it more "human" is orthogonal to that. All the replies to my comments have something to do with interacting in a more human way. Does it ever provide value for simple use cases? I'm honestly asking.

(If I am calling to book a table and the restaurant suggests another day to me as in your example, I would be as stumped as the computer and would need to check my calendar and call them back, so your human sounding assistant is providing no value anyway. Is that even a real use case?)

I don't know, maybe an understand-all-manages-everything computer would be used more happily by people, but I'm not convinced. For these simple use cases, I still think a binary algorithm would be, on average, more efficient for both the caller and the business.


You say

> We have had this kind of technology for years.

and

> For these simple use cases, I still think a binary algorithm would be, on average, more efficient for both the caller and the business.

The fact that we have had this technology for years, and nobody has been able to make it work as you describe, suggests that it's not as simple as you think.

My take: sometimes you can persuade users to learn a new UI, but consider the stereotypical grandparent that still can barely use email; teaching a new UI is hard, normal non-tech people don't like it, and it will take more than a generation to reach 100% penetration.

The existing UI of using a telephone and conversing like a human is probably not at 100% penetration either (e.g. considering deaf individuals), but I'd wager it's as close as any one UI is going to get.


> Because people will hang up on it like they do all other robocalls

A business that is essentially closing a sale with customers? No way, the call might not be from a real human but the money that the reservation will bring in is no different than any other.


Real humans don't behave like that. They behave in all sorts of weird ways when they know that they're interacting with a machine. They SHOUT, O-VER-E-NUN-CI-ATE and use unnatural sentence fragments, which craters the accuracy rate of a speech parsing algorithm that was trained on natural speech. They get instantly frustrated, because they assume that this machine will be just as stupid as the archaic IVR system that their bank uses. They assume it's a scam robocall or a prank and hang up without saying anything.

Acting like a human is by far the most efficient option for a system that is designed to do what Duplex does.


How are you supposed to tell whether an automated system is intelligent or not? Sometimes the voice prompt tells me to "speak in full sentences", but it doesn't understand what I'm saying anyway.


Google solved this by having it talk to you naturally so you never even consider responding differently. Seems like a winning idea.


Only non-native speakers think Duplex sounds human


I'm native and I think it sounds pretty damn close. I could see myself getting through a simple conversation with it without getting suspicious (that's assuming that I had absolutely no reason to believe it was a system from the start)


How are you supposed to tell whether a human is intelligent or not?

Why wouldn't the same process hold here?


That would be a fault of the system then.


That would be wasting time, because humans are used to talking to humans, not to robots. Do you enjoy it when you call a business and you have to sit through the annoying automated system? If you want to make the experience better for businesses, this is the way to go, although much harder to achieve. The "um"s aren't there to trick the person; they actually make the conversation flow better.


If you try calling a restaurant and say "Hi. I'm trying to book for 4 at 20 tonight for my client. Is it possible? Please say Yes or No", I suspect you'll discover that it's in fact not more efficient than a typical human conversation.


Would you want Siri or the personal Google Assistant to talk this way?

A strict binary tree structure chirped at you in low-fi synth voice like this is really not that pleasant or efficient.


...because technology should adapt to humans, not the other way around.

These voice systems are annoying as hell, as shown by the fact that they have become a comedy trope similar to airline food.

It's bad enough to deal with them when you're initiating the call. Imagine being a business forced to interact with such a system because it brings in too much revenue to ignore.

I'd definitely spit in every soup served to a guest willing to annoy me just to save himself the hassle of a phone call.


That's assuming simple restaurant reservations are the end game. I'd be willing to bet that they will be expanding the utility of this far beyond tasks like these. This is just a way to get the right kind of training data for future projects and products, while providing value to Google users to keep them in the ecosystem.


Because talking to a robocall like that is annoying as hell. Not to mention the failure rate is high (at least for my voice).


What @Klathmon said. If I am a business, I will hang up at "hi I am robot..."


Why? People call businesses all the time and get robots, why would it be different the other way around? As soon as the boss figures out you're hanging up on people trying to get reservations you can bet hangups will stop ASAP.


Most of us hang up on robots too though. Robocalls are a nightmare. The best argument against Duplex IMO is that it’s going to enable a new generation of hellish scams and robocalls, if it ever leaves Google. Presumably Google knows this and will keep the tech locked down. Still, it’s only going to be a matter of time before it’s open one way or another, and then we’ll have to have the “should I be told up front that it’s a robot?” conversation again.


Most of us aren't paid to answer the phone though... If my job was to answer the phone for people making reservations and I started to hang up on reservation bots I would expect to be fired immediately. I also hang up on real people who call me unsolicited for sales pitches, it's not the bot part that annoys me.


Agreed. I would hang up at "This call is being recorded". No. I don't want this. How dare you just record me out of the blue without consent?


Why don't you want your business to get money for your products and services?


Don't get me wrong, I like the tech, but I don't get the sentiment. It is just a gimmick or a bridge technology. In the end our phones should be able to talk to the booking system and make the reservation without human interaction. I think the resources could be better allocated. I guess it's just a show and tell kinda project for Google, but the novelty wears off really fast after the first wrong reservation.


Everything in life is a "stopgap". In an ideal world we wouldn't need to make reservations at all, our phones would just sense that we were heading toward a restaurant and would work directly with people there to ensure we have a table and our food is ready and paid for by the time we get there.

But we don't live in an ideal world, and making a "perfect" system that nobody uses is useless.

This is a solution that real people can take advantage of right now for many places that don't have the budget or time to integrate with every service out there. It's a real solution to a real problem, and I don't think calling it "show and tell" is fair just because you can think of a "technically better" solution in 5 minutes that you'll never be able to roll out to 90%+ of stores and services out there.

"Don't let perfect be the enemy of good"


No, the good would be Google hosting a free booking software-as-a-service system and an app, or even Google Assistant, to interact with the system.

I'm sure companies would jump at the opportunity. Google gets more information about people's habits and the success rate would be a lot higher.

"Don't let overly complicated be the enemy of the simple"....


Which they can still do! And are in a way (they also showed integrations with assistant and APIs at the same time as showing Duplex).

But that assumes that businesses have the time to integrate, that they care to integrate, and that they can repeat the process for every proprietary API and service out there.

That requires a full-time developer or another service that does the integration for them. Which again isn't a problem for a lot of businesses, they might be more than happy to do that, but we won't ever hit 100% coverage with those services, especially not across all the "smart assistants" out there.

This allows users to work with almost 100% of businesses right now, and businesses can further integrate without Duplex if needed/wanted and the user will not have to care what system the business uses, or how it integrates.


> But that assumes that businesses have the time to integrate, that they care to integrate, and that they can repeat the process for every proprietary API and service out there.

Google is the 800 pound gorilla. Companies would integrate with them if they don't want to be left out.

> This allows users to work with almost 100% of businesses right now,

You're assuming that it will be good enough and that the businesses won't hang up on it like people do with every other spam automated system.


They already do this actually, but the coverage is pretty spotty. There are still restaurants that don't have websites and you're asking them to set up a digital booking system through Google. It's just not going to happen. Also, I see this tech as more of a way of training up their AIs on tasks that provide a bit of value while they wait for people to connect to the official booking APIs and such.


And this runs headlong into competitive issues. Google says they'll integrate the Google Assistant with OpenTable to let you book tables, but what's the chance OpenTable would nix that integration if Google was trying to make a free competitor? Google doesn't want to be in the restaurant booking system business, and this lets them offer a service in their assistant that no one else can.


Companies already go out of their way to play nice with Google (AMP pages). OpenTable really wouldn't have a choice. By being the default, Google has a competitive advantage.

> Google doesn't want to be in the restaurant booking system business

Google wants to be in every business where they can gather data to better target advertising.


Your comment misses the critical point though... this hits all the places that have no electronic booking systems. To those here that concept probably seems absurd because we're technically oriented. My barber, doctor, mechanic and favorite BBQ joint however still take phone calls to make appointments and you can talk to them about electronic scheduling till you're blue in the face, they have no desire to change. The goal of this technology is to bridge the gap... start thinking machine to human interaction where machine to machine isn't available.

Not to mention the language barriers. Think of a person still struggling with a language needing to make an appointment in that language. I can see an interface on this where you would select the location and time and have Assistant call them in the appropriate language to handle the exchange for you. No longer is that local pasta place out of my reach when I'm in Italy but can't speak Italian. This opens up a huge realm of possibilities.


> this hits all the places that have no electronic booking systems. To those here that concept probably seems absurd because we're technically oriented. My barber, doctor, mechanic and favorite BBQ joint however still take phone calls to make appointments and you can talk to them about electronic scheduling till you're blue in the face, they have no desire to change.

Sure they will change. My barber uses Stripe, my doctor does everything else on a computer (he's over 60 and three weeks away from retirement), and my mechanic has to be technical to operate on modern cars and uses computers for estimates, sending pictures to insurance agents, etc.


At what point in our industry have we ever said "Let's stop innovating and making things easier, we've come far enough, now the people have to adapt to us." Natural language processing and conversation is the next step right now... to me this is such a logical progression I'll be shocked if it's not being planned by many teams.


Natural language processing has been "the next step" since the 60s. How often have you called in to a customer service center and as soon as you knew you were talking to an automated system just yelled "operator" until you got a human?

The minute a busy waitstaff person knows they are talking to a computer they are just going to hang up.

A voice assistant on one side talking to an API on the other side can be close to perfect -- even Apple can get that right with Siri for the intents that it supports especially since you know you're talking to a computer. But going the other way -- not so much.


It's also worth noting that while they show and advertise Duplex making calls because it's pretty cool, they have said that Google Assistant itself will try to use electronic booking first and then try Duplex otherwise.
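
That "electronic booking first, Duplex otherwise" order is an ordinary fallback chain. A minimal sketch of the idea, assuming two made-up channels: `electronic_booking` and `phone_call_booking` are hypothetical stand-ins, not real Google APIs, and the restaurant names are invented.

```python
from typing import Callable, Optional

# A booking channel takes (restaurant, party_size, time) and returns a
# confirmation string, or None if it could not complete the booking.
Channel = Callable[[str, int, str], Optional[str]]


def book_with_fallback(channels: list[tuple[str, Channel]],
                       restaurant: str, party_size: int,
                       time: str) -> Optional[tuple[str, str]]:
    """Try each channel in order; return (channel_name, confirmation) for the first success."""
    for name, channel in channels:
        confirmation = channel(restaurant, party_size, time)
        if confirmation is not None:
            return name, confirmation
    return None  # every channel failed


# Hypothetical stand-ins below: assumptions for illustration only.
def electronic_booking(restaurant: str, party_size: int, time: str) -> Optional[str]:
    # Pretend only some restaurants have an online booking integration.
    integrated = {"Chain Bistro"}
    if restaurant in integrated:
        return f"online:{restaurant}:{party_size}@{time}"
    return None


def phone_call_booking(restaurant: str, party_size: int, time: str) -> Optional[str]:
    # Pretend the automated phone call always succeeds.
    return f"phone:{restaurant}:{party_size}@{time}"


CHANNELS = [("electronic", electronic_booking), ("phone", phone_call_booking)]

print(book_with_fallback(CHANNELS, "Chain Bistro", 4, "20:00"))
print(book_with_fallback(CHANNELS, "Local BBQ", 2, "18:30"))
```

The user-facing call is the same either way, which is the point: the caller never has to know which channel the business supports.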


I don't fully disagree, but you have to realize that there are a great number of people and places that are not hooked up to the internet, due to geography, politics, history, or other reasons.

Some of the best restaurants, hotels, and other experiences in the world are not reachable via an online reservation system.

The place I stayed last weekend can only be reserved by phone. Not because the owners are luddites, but to keep out the bargain hunting riff-raff. There was a piece on CBS News a few weeks ago about a restaurant in Maine that's so popular you can only make a reservation by mail.

Sure hotels.com and opentable.com may nail the 80% of what's popular to the kinds of people who go to those sorts of places. But this world is so much bigger and more interesting than the internet or even Google can imagine.

For me the killer Duplex app will be when it's able to make a reservation for me in a language I don't speak. But even though it's lost the "beta" badge, Google Translate still isn't ready for primetime.[0]

[0] I say this based on the feedback from the professional translators I work with daily. I put Google Translate text in my work as placeholders, and they have to then re-translate the projects correctly before they can be published.


Would raising the prices be enough to weed out the bargain hunting riff raff? That phone only model just sounds like an excuse or a bias against young people who tend to use computers more.


> Would raising the prices be enough to weed out the bargain hunting riff raff?

With airline miles, hotels.com free night promotions, etc... probably not. People who want something for nothing will try anything.


Booking appointments is one of those areas that suffers terribly from a lack of network effects. Since it is so easy to create a new booking system, there are millions of different ones - your dentist probably uses a way different system from your doctor, for instance, never mind another dentist.

Because it’s easy to create a booking system, making a translation layer between systems is hard. There is no natural incentive for these booking systems to standardize, because each has its constituent users, and none of them are complaining.

So, for now, and probably for a long time to come, booking will be done by humans.


> Booking appointments is one of those areas that suffers terribly from a lack of network effects. Since it is so easy to create a new booking system, there are millions of different ones - your dentist probably uses a way different system from your doctor, for instance, never mind another dentist.

Now imagine what would happen if Google swooped in and gave one away for free and said it would easily integrate with Android and iOS devices. Then imagine them giving away cheap tablets to run it on....


>Now imagine what would happen if Google swooped in and gave one away for free and said it would easily integrate with Android and iOS devices.

Very little.

The overwhelming majority of businesses won't switch their booking system unless they have a truly compelling reason. A large proportion of small businesses still use pen-and-paper to manage their bookings and have no intention of changing.

There is still a non-trivial market for dot-matrix printers and NCR paper. Many national retail chains use terminal emulators for their Point of Sale systems. For perfectly understandable reasons, businesses tend to be very conservative about their technology choices. Their current solution might not be perfect, but they know that it works, for a value of "works" that is compatible with their business remaining profitable. They know from experience that new technologies are often more trouble than they're worth.


> The overwhelming majority of businesses won't switch their booking system unless they have a truly compelling reason. A large proportion of small businesses still use pen-and-paper to manage their bookings and have no intention of changing.

People would have said the same about payment systems before Stripe came along....


Here is an example of who I think you’d have a hard time convincing: I’ve been going to the same hair stylist for about 22 years. Her clients have followed her as she moved around salons in Cupertino. She’s busy enough that many of her clients (including me) sit down at the beginning of the year and book an entire year’s worth of appointments. She only accepts checks or cash.

She’s popular enough that she has felt no market pressure to start accepting payment cards, and certainly has not had any trouble staying busy.

I think there’s a huge chunk of small/single-person businesses that will continue to do things using paper simply because they don’t want to change and they have enough clients that they don’t need to. Once they retire/expire they may be replaced by people that are more amenable to such things.


Do you think she would be patient enough to work with a less than perfect automated voice controlled system?


I think she’d be open to the idea, it’s not like she’s actively refusing business. Taking some reservations from a computer voice is a lot less change and (more importantly), it’s not change she has to initiate.


They already have. https://support.google.com/reserve/answer/7514650?hl=en

Coverage is spotty. It's crazy how much people here underestimate the lack of willingness of many businesses to use these systems. I live 45 minutes away from SF and recently got a haircut at a decently popular place and it was cash only. Adoption of new technology by small businesses is really slow.


I agree with the sentiment about slow adoption, but in this case, it might not make sense to switch.

If it's decently popular and is fully booked, going cash only would reduce their costs and either give them healthier profits or let them charge less. Most businesses accept cash and card, so they likely already have the daily task of taking money to the bank or having it picked up by a secure truck.


I'd suspect a cash-only business of tax evasion before technophobia.


They would generate some excitement, get marginal marketshare, quietly stop updating the system, let it languish for a few years, then just ghost the system?


And then imagine them shutting it down, and costing vendors billions to replace it.


now there are 17 competing standards...


... clarification: by humans I mean the receptionists, not the Google Duplex.


No man. Nobody cares about the booking of reservations. This is about AIs having useful conversations with people. Start with a very limited context and goal, and grow out from there.


The world is full of stop-gap technologies, because switching is hard and expensive. I have a PNG of my signature and a PDF-to-fax service, because I sometimes need to interface with legacy systems. I still have a checkbook, which I use once or twice a year. I have a full set of metric and SAE wrenches. Hell, I've still got some RS-232 cables.

Duplex is an adapter, a shim, an abstraction layer. It's not perfect, but it solves a problem.


I don't think this is about automated bookings. This is about Google advancing AI.


With this attitude, messy and outdated protocols like HTTP and TCP/IP would be shelved in favor of clean and efficient theoretical protocols that transcend the foolish bonds of our existing infrastructure. i.e. No Internet


This is a huge step. No longer does a computer need an API to communicate with someone.


Google Duplex is not really about restaurant reservations.

AlphaGo is not really about winning at the game of Go.

They're both about something else. I'll leave that as an exercise to the reader.


> Google says that, in testing, the system has also gotten tripped up encountering another machine by way of a phone tree. Listening closely because our menu options have changed doesn’t appear to compute just yet.

An assistant which deals with a response tree, handles hold time, and gets to the point where there's a human at the other end and they've been given whatever numbers and addresses are appropriate would be useful. Before the humans talk, the automated assistants on both sides should have all the routine stuff out of the way.


> Before the humans talk, the automated assistants on both sides should have all the routine stuff out of the way.

Imagine if there wasn't quite enough security built into all of this.

Store-bot: To confirm identity, what is the user's full name?

Phone-tree-bot: Thomas Arnold Pellington

Store-bot: What is Thomas Arnold Pellington's address?

Phone-tree-bot: 186 North 15th Street Minneapolis, Minnesota

Store-bot: What is Thomas Arnold Pellington's Social Security Number?

Phone-tree-bot: 183-44-5975

Now what if Phone-tree-bot called the wrong store number?

I remember a case a while back where someone changed the phone number for the FBI or CIA on Google Maps. IIRC, they forwarded the calls to the correct number, but could have just as easily intercepted the communications. A human has a chance of noticing that "the person on the other end is asking some weird questions that make me feel uneasy, maybe I should hang up". Hopefully bots could develop the same ability? Having a centralized source of phone trees for all the cable, cell phone, and service providers would certainly help.


> Imagine if there wasn't quite enough security built into all of this.

Fortunately, it seems that security was a consideration. From the Ars Technica article: At one point, the caller's email was asked for and Duplex responded with "I'm afraid I don't have permission to share my client's email."


I'm just imagining how to secure that in a voice only system:

Phone-bot: To confirm store identity please sign phrase with store private key.

Store-bot: One second

Store-bot: Data 5 7 4 9 a c e 7 f 6 d 0 0 2.....

Phone-bot: Identity confirmed to match Store public key

... sensitive convo ...

Of course it's ridiculous, but I got a chuckle
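
For the curious, the exchange above is a challenge-response check. A minimal sketch of that shape, with one loudly stated substitution: it uses an HMAC over a shared secret rather than a private-key signature, because Python's standard library has no asymmetric signing. The key, function names, and the "read the hex aloud" helper are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> str:
    """Caller side: pick a fresh random phrase so old answers cannot be replayed."""
    return secrets.token_hex(8)


def answer_challenge(shared_key: bytes, challenge: str) -> str:
    """Store side: prove knowledge of the key by computing a MAC over the challenge."""
    return hmac.new(shared_key, challenge.encode(), hashlib.sha256).hexdigest()


def spoken(hex_digest: str) -> str:
    """Render a hex string the way it would be read aloud, one digit at a time."""
    return " ".join(hex_digest)


def verify(shared_key: bytes, challenge: str, heard: str) -> bool:
    """Caller side: rebuild the expected answer and compare in constant time."""
    expected = answer_challenge(shared_key, challenge)
    return hmac.compare_digest(expected, heard.replace(" ", ""))


# Assumption: both bots were provisioned with this key out of band.
key = b"hypothetical-store-key"
challenge = make_challenge()
reply = spoken(answer_challenge(key, challenge))
print("Store-bot: Data", reply[:23], "...")
print("Identity confirmed:", verify(key, challenge, reply))
```

A real public-key version would let the phone-bot verify without holding any secret, which is what the joke dialogue implies; the shared-secret variant only shows the message flow.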


"Thep Thai’s owner insists that such a service would be something of a godsend for the 100-plus reservations the restaurant fields on a daily basis."

I'm a little confused by this and it never seems to be explained. How does this system make the restaurant's job easier?


I would guess many of these 100-plus reservations don't go smoothly. Callers might not speak English so well, or they might call amidst street noise, hesitate on their wishes... It's plausible that a standardized call format from the Google Assistant would be easier for the employees and take less time?


Two possibilities:

1. She wants it to answer the calls

2. The average person, when told "6pm on Tuesday is not available" probably either spends a bunch of time on the phone looking for another open spot on their calendar, or just calls back later.


Third possibility: Google is pushing this narrative because Step Two of their Duplex scheme is to automate the receiving end as well. No need for any human interaction that way.


So two standard APIs talking to each other or the person calling to make the reservation knows they are talking to a computer with limited domain understanding and speaks accordingly and the backend calls an API -- like programming an Alexa skill....


This doesn't make sense. They should just use an API. Why bother speaking like human when it is a bot at the other end!! :D


Duplex doesn't solve #2


Eventually they will release a system for the restaurant to take reservations.

Then in the future it will just be robots talking to robots. They should just develop some sort of API language to communicate more efficiently. Like a robot Morse code with compressed information.


What about the question of no-shows? If your assistant can make bookings easily - you can make a load and then choose where you actually go later. That's actually one reason why restaurants like phone booking - because if you've had the human interaction of talking you're less likely to not show up. The likely upshot is that the industry will have to move to paying for bookings.


Off topic: to all the UI/UX designers out there, I instinctively click all prominent "X" symbols on a page before trying to determine what they mean. Every time I visit TechCrunch I end up clicking the X and then hitting the back button to bring the article back up. A "Done" button would be better. Non-modal article content would be best...


> While the disclosures weren’t there in the earliest stage, the company has said since the beginning that it intended to add them.

> In my test call, I attempt to get Google Assistant to repeat that bit — it’s easy enough to not hear that opening line, particularly when you’ve got the phone up to your ear inside a crowded restaurant. But the AI just barrels on with the reservation. If you miss the disclosure, you’re out of luck — for now, at least. At present, the only way to opt out of being recorded is to just hang up the phone — not the best way to get repeat visitors.

This part makes me deeply uncomfortable. Opting out of being recorded likely means being fired for not taking its reservations. Consent feels weirdly coercive. For a tiny improvement in someone else's convenience, we may end up inching even closer to everything being recorded.


I get the idea of wanting to provide voiced communications for those that it may be difficult for (or for translation), but I still cannot rationalise why it needs to sound 'just like a real person.'

Would the initiator feel silly if a service called for them and sounded robotic, rather than like a friend?

Why does it need to distinctly act human? I cannot see the specific requirement for this imitation, other than it's cool/creepy. Or it's just tech marching on to deliver sci-fi dreams.


I believe some of the questions around Duplex were not whether or not the recordings were "real," but whether or not they were edited. The article was not clear on this...


From my understanding reading the article, the journalists who were there each got to try out the system themselves. While they don't provide exact audio, I assume they would've pointed out if what they experienced was wildly different. Their description of the conversation sounds a lot like what we had heard.


I think the refinement of saying "this is the google assistant" at the beginning is positive. Seems to solve the supposed ethical issues of the previous iteration.


There are a lot of vehement responses here --- with such strong naysayers, maybe they (Google) are setting themselves up for something that really pushes the boundary of what's possible and people are actually quite excited. =)

I'm personally just excited to see where this goes. Either it will never live up to expectations (i.e., good enough for people to reliably use), it will take far longer than everyone thinks, or we'll see it soon!


Interesting 'the next round will find Assistant inquiring about business hours.'

Probably a really valuable addition to Google Search Results, if they can use phones to verify open hours of businesses without having to pay people to make the calls (and without annoyance of older tech IVRs).

Google search and maps in particular are so great because of all these sources of data that come together to give me actionable info.


Clever for Google to offer a tool to collect even more data about users and businesses. The Duplex is a golden trove to get data about orders and context -- the part that Google missed when people order just over a phone. Now Google could be all over the conversation and collect every bit of business activity and user data.


Ethics aside, Duplex was built to overcome one key constraint: Restaurants are technology laggards.

Yes, booking a reservation with a few taps IS easier/more accurate, etc, but that requires restaurants to pay/integrate with those systems.

With Duplex, you can schedule a reservation at ~100% of restaurants, right now, for free.


Even the Google AI finds it frustrating to deal with automated IVR according to this article. That concept needs to get an award for one of the most awful UX to ever exist - to say nothing about its (mis)use in phone numbers that could field emergencies.


I'm curious how Google envisions the future of this technology. Will it be limited to specific, Google-defined use cases such as making reservations, or will it be a versatile platform allowing developers to create various agents, much like Dialogflow?


I wonder how Duplex would work if it rings a line powered by Duplex (for example, a business uses Duplex to schedule bookings). Would it just talk to the backend, or would Duplex talk to the receiving Duplex in the normal scheduling chat dance?


"The wait time at REQUESTED_RESTAURANT could be longer than usual, would you want to try GOOGLE_DUPLEX_PARTNER, instead? I already have a tentative booking on hold."


The article says "Eighty percent is pretty good.." Actually, 80% success is unusable. I expect that is fairly close to the ceiling. If they really do roll this out (spoiler, they won't) Google will need an army of humans to handle the other 20%. I guess I don't mind Google paying a human to be my personal assistant, but I don't know if Google will like it.

I hope the human assistants are not randomly selected, but weighted to select the same human you've used before. This will give the illusion that you really have your own personal assistant.
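
To put a rough number on that "army of humans": a back-of-the-envelope sketch, where only the 80% success rate comes from the thread. The daily call volume and the calls-per-operator figure are made-up assumptions, not anything Google has stated.

```python
import math


def calls_needing_humans(total_calls: int, success_percent: int) -> int:
    """Expected number of calls the automated system cannot finish on its own."""
    if not 0 <= success_percent <= 100:
        raise ValueError("success_percent must be between 0 and 100")
    return math.ceil(total_calls * (100 - success_percent) / 100)


def operators_needed(total_calls: int, success_percent: int,
                     calls_per_operator: int) -> int:
    """Operators required to absorb the failed calls, rounded up."""
    failures = calls_needing_humans(total_calls, success_percent)
    return math.ceil(failures / calls_per_operator)


# Hypothetical inputs: 1,000,000 calls per day, 80 calls per operator per day.
DAILY_CALLS = 1_000_000
CALLS_PER_OPERATOR = 80

print(calls_needing_humans(DAILY_CALLS, 80))
print(operators_needed(DAILY_CALLS, 80, CALLS_PER_OPERATOR))
```

Under those assumed inputs, a fifth of a million calls a day still land on people, so the staffing cost scales linearly with usage unless the success rate climbs.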


> Duplex represents a rare early look into an ongoing project from a company notorious for playing it close to the vest.

Wat. Google is notorious for announcing dozens of projects only to cancel/downplay them a year, a year and a half later.

Their I/O keynotes are usually replete with "awesome product, will be released some undefined time in the future" (many of those never materialise, or take on new forms).


2 ways to look at this, although I assume you know both these already.

1. Experimentation is key, and a willingness to accept mistakes and move on. If all a company does is play to its strength, you'd miss disruptive technologies. Some of these experiments work, some don't.

2. Pursuing an idea without a viable business opportunity is suicide. The idea might be "cool" and "well received" but if it can't sustain in the long-run, it doesn't serve anyone.


Right, this is basically inevitable if you're willing to try new things (and announce them significantly ahead of a public launch). You're never gonna be 100% accurate with guessing which potential products will turn out to be viable products.


Just because something is possible doesn't mean it should be done. Google needs some non-engineers.


Duplex is likely built with deep reinforcement learning, though I have not seen that disclosed yet.



