Machine Learning with TensorFlow Enabled Mobile Proof-Of-Purchase at Coca-Cola (googleblog.com)
93 points by zyngaro on Oct 17, 2017 | 31 comments



Man, software engineering at the cutting edge is getting harder and harder by the day. Not only are you expected to master coding, but also math heavy ML and economics. I guess a consequence of software eating the world is that more and more fields of human knowledge are folded into the engineering world. The upshot is that engineers really ought to read very broadly if they want to take advantage of all the low-hanging fruit the octopus arms stumble into.


Math heavy ML is simple: ∀ problems: solution is Convnets (CNN in the article).

But yes, you should learn to apply multilayer convnets to image problems.

And don't worry about architecture. There are two variables: number of parameters and number of layers. There's an optimum, which will be bigger than "advanced" architectures, but it will match the advanced architectures' performance very closely (1-2% worse perhaps).

The slight issue is that it takes 2-3 hours to test one (number of free parameters, number of layers) combination, so "hyperparameter tuning" (pick a random combination, test, repeat) takes a long time. But if you make it a nice pipeline, you can run it in the cloud in parallel on a regular basis (as new data gets added) and get extremely good results.
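To make the "pick random and test, repeat" loop concrete, here is a minimal sketch of random search over the two knobs mentioned above. Everything in it is an assumption for illustration: `sample_config`, `train_and_score`, and `random_search` are hypothetical names, the candidate ranges are arbitrary, and `train_and_score` is a toy stand-in for the multi-hour training run, not a real model. A thread pool stands in for parallel cloud workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_config(rng):
    """Draw one random (parameter budget, depth) combination."""
    return {
        "num_params": rng.choice([100_000, 300_000, 1_000_000, 3_000_000]),
        "num_layers": rng.randint(2, 12),
    }


def train_and_score(config):
    """Hypothetical stand-in for a multi-hour training run.

    A real version would build a convnet with this budget and depth, train
    it, and return validation accuracy. This toy score only exists so the
    sketch runs; it peaks at 1M parameters and 6 layers by construction.
    """
    param_penalty = abs(config["num_params"] - 1_000_000) / 3_000_000
    layer_penalty = abs(config["num_layers"] - 6) / 12
    return 1.0 - param_penalty - layer_penalty


def random_search(num_trials, seed=0, max_workers=4):
    """Sample configurations, score them in parallel, return the best."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(num_trials)]
    # Threads here play the role of independent cloud workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(train_and_score, configs))
    results = list(zip(scores, configs))
    best_score, best_config = max(results, key=lambda pair: pair[0])
    return best_config, best_score, results


if __name__ == "__main__":
    best_config, best_score, _ = random_search(num_trials=20)
    print(best_config, round(best_score, 3))
```

Rerunning the same search whenever new data arrives would just mean calling `random_search` again with a fresh seed and a scoring function that trains on the updated dataset.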


> Math heavy ML is simple: ∀ problems: solution is Convnets
Oh, if only that were true.

The article's application, however, is a good example of where they are very applicable.


I think software engineering is easier now than back in the day. I have heard from people who used to have to write the code by hand, change it to assembly, and then program their chip. Debugging was a pain. Everything was a pain. We have also these crazy tools now. We don't need to know mnemonics.


While the very early pre-assembly, "enter the boot code sizeof(WORD) bits at a time with toggle switches on the front panel[1]" days were very difficult, working with bare metal hardware could be tedious, but was usually no more difficult than today's similarly-sized projects. Debugging was sometimes easier with a logic analyzer watching values move across the CPU's data pins; it became a lot harder when you also had to worry about cache coherency.

> We don't need to know mnemonics.

Every specialized discipline has jargon and technical terms. Today everyone memorizes JavaScript frameworks instead.

> We have also these crazy tools now.

Yes, but we also have crazy tools that rapidly create complexity. The popularity of using tools we don't understand, didn't (directly) write, and cannot meaningfully inspect/audit (e.g. modern machine learning) is rapidly adding unknown, interdependent complexity. This complexity is already[2] spinning out of control. The only reason it seems easier today is that most of the problem is being ignored.

[1] https://en.wikipedia.org/wiki/Front_panel#Booting

[2] http://geer.tinho.net/geer.source.27iv17.txt


I agree with you. I suppose what I should have said is that software engineering is becoming much more lateral than ever before. Though it seems like the verticalness is decreasing. Hmm, I wonder if the total "area" of "verticalness" times "lateralness" has stayed constant over time :) Perhaps it is a reflection of how much time an engineer can spend on learning things, and I would expect the "area" is different across the distribution of programmers. But maybe the average area has remained invariant.


It's definitely easier. No longer do you have to wait for a book about your Turbo C++ compiler to arrive from far away to solve a problem. Now you just Google and have an answer so fast.

It probably depends on your industry though. Working on new experimental projects has a lot of benefits now, because the experiments are talked about in the open on forums like Reddit and GitHub.


This is undoubtedly incredibly impressive as a feat of engineering. Detecting codes on those sample images with >95% accuracy is no mean feat. No doubt Google uses its muscle for many worthier things than this, and the learning from this project will be applied to many more 'serious' problems. I think what this blog post triggers for some people is just a further step down the road from Google as it was - a tech-first, solutions-later company that solved interesting problems because they were interesting - to Google of today/the future - a big corporation that acts pretty much like any other, but with a few neat tools in its toolbox.


> No doubt Google uses its muscle for many worthier things than this

I hope this doesn't sound too negative, but since Google's main revenue is ads, "worthier things" ultimately means "improved ad targeting", doesn't it? In that case, it is not so different from the demonstrated application.


Pay no attention to the man behind the curtain...


> This is undoubtedly incredibly impressive as a feat of engineering.

For years, post offices around the world have been automatically recognizing handwritten text, which is of much poorer quality and consistency than those codes.

Not sure what is new here. That it runs on a smartphone?


Yes, and it runs on an image with a shadow occluding half the cap, held in someone's shaky hand, with low-contrast colouring. AFAIK, postal scanning is done in good lighting with known optical parameters of blue/black on white. The main part used for sorting is the postal/zip code, which is relatively clear and in caps/numerals. Almost all mail has printed labels. Accuracy rates are not published, and many people are hired to sort through the items that cannot be automatically classified.

It may not seem like a huge jump conceptually but in engineering terms it's a big deal.


Judging from the comments section, Google needs to improve their ML for scam-detection.


It's a hard problem. One person's scam is another person's business opportunity. I, for one, want to learn more about this amazing spell-casting.


Consider that the Latin character set doesn't change when it realizes that a new convnet is doing a good job at recognizing it...

Machine learning is hard. Adversarial machine learning is much harder.


This is where the current state of the art in industry is in machine learning: do "modern" things like scanning documents/codes/... on old processes without modifying the processes.

Say you want to do the famous quality check in factories (is the dog food box closed properly?), you can just do that with a convnet.


Really nice.

tl;dr:

- Customised text recognition of codes printed on goods

- UX flow designed to (i) help users correct errors, and (ii) gather additional labelled images


Also worth mentioning:

- Fast: Needed a one-second average processing time (product-code image -> OCR pipeline)

- Accurate: Goal was to achieve 95% string recognition accuracy (with improvement via active learning)
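
For anyone wondering why the 95% figure is a demanding target: as I read it, "string recognition accuracy" means the whole code must match exactly, so a single wrong character fails the entire string. A minimal sketch of the difference between that and per-character accuracy follows. The function names and the sample codes are made up for illustration and are not taken from the article.

```python
def _check_inputs(predictions, targets):
    """Shared validation so both metrics fail loudly on bad input."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")


def string_accuracy(predictions, targets):
    """Fraction of codes where every character is right (exact match)."""
    _check_inputs(predictions, targets)
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


def character_accuracy(predictions, targets):
    """Fraction of target character positions that were predicted correctly.

    Positions missing from a shorter prediction count as errors; extra
    predicted characters beyond the target length are ignored.
    """
    _check_inputs(predictions, targets)
    total = sum(len(t) for t in targets)
    correct = sum(
        sum(pc == tc for pc, tc in zip(p, t))
        for p, t in zip(predictions, targets)
    )
    return correct / total


if __name__ == "__main__":
    # Made-up codes: one character wrong in the third string.
    targets = ["A1B2C3", "X9Y8Z7", "K4L5M6", "Q1W2E3"]
    predictions = ["A1B2C3", "X9Y8Z7", "K4L5N6", "Q1W2E3"]
    print(string_accuracy(predictions, targets))
    print(character_accuracy(predictions, targets))
```

In this toy case one bad character out of 24 costs a quarter of the string-level score, which is why the string-level number is the harder one to push up.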


+ saving of millions by not having to replace old dot-matrix style printing equipment


Why not just use QR codes and Google Vision API? Kinda seems like reinventing the wheel here..


Changing bottle printing machines / label designs across the world vs updating an app


The printing is so consistent, it's practically a bar code.


AI at Coca-Cola... and it actually makes sense. Nice!


Late stage capitalism


Why "late stage"?

Nobody knows how long the current capitalism will stay alive. This might as well be "early stage" ... in the sense of "one of the many experiments of companies to find the best form of market segmentation".


It's just a meme people like to spew when capitalism gives us some heavy irony. For example a fossil fuel company pivoting to climate change themed toys. It is irrelevant to the article at hand.


Congrats, Google, on finally inventing OCR on fixed fonts.


On curved surfaces, under varying lighting conditions.

If you want to show that standard OCR can cope with the images they have on that page, then go for it. But I suspect you'll have issues.


Still good to have a healthy level of cynicism towards Deep Learning Hype articles.


As useful as Jian Yang's hot dog detector.


Kind of ironic that 2 comments on that article are spam... Maybe spend some time fixing that too.



