
Disclaimer: I do ML-based work in Google Cloud, but I am not on the AutoML team.

The post says there is transfer learning involved, which in practice means you need much less data than if you were creating a classifier from scratch. Of course, more (good) data may yield better results, but one of the goals behind this release seems to be giving high-performance, custom image classification (your own labels, not just generic object detection) to those who don't have access to Google-scale training sets.
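AutoML's internals aren't public, but a minimal sketch of the usual transfer-learning recipe looks something like this (the base model, input size, and label count below are illustrative assumptions, not AutoML's actual setup):

    import tensorflow as tf

    # Base network pretrained on ImageNet; its weights stay frozen, so the
    # only thing trained on our small custom dataset is the new head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False,
        weights="imagenet", pooling="avg")
    base.trainable = False

    n_classes = 5  # hypothetical number of custom labels
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=5)  # hundreds of images, not millions

Because only the final layer's weights are learned, a few hundred labeled images per class can be enough where training from scratch would need orders of magnitude more.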




What kind of datasets? What size? Is it something practical for a small business to gather?


Figure 4 of the DeCAF paper shows meaningful learning w/ only 10 examples! https://arxiv.org/abs/1310.1531


Transfer learning is hardly a panacea, however much some would like it to be.


Disclosure: I work at Google on Kubeflow

Can you say more? I don't think anyone is saying it's magic pixie dust, but it does dramatically reduce the amount of data you need.


I'd probably phrase it as "can" dramatically reduce the amount of data you need rather than "does". Getting transfer learning to work in any kind of reliable way is still very much open research, and the systems I've seen are heavily dependent on basically every variable involved: the specific data sets, domains, model architectures, etc., with sometimes pretty puzzling failures.

I don't doubt Google has managed to make something useful work, though I'm more skeptical of how general the ML tech is. One advantage of an API like this is that it allows control over many of those variables. I'm not sure if this is what it does, but you could even start out by making a transfer-learning system that's heavily tailored to transferring from one specific fixed model, which, coupled with some Google-level engineering/testing resources, could produce much more reliable performance than in the general case.
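Purely as speculation about what a "fixed upstream model" setup could look like (not a claim about AutoML's actual design): pin the pretrained network, treat it as a feature extractor, and train only a simple, well-understood classifier on its embeddings, which removes most of the architecture/domain variables from the equation:

    import tensorflow as tf
    from sklearn.linear_model import LogisticRegression

    # Fixed, never-updated pretrained model used purely as a feature extractor.
    extractor = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False,
        weights="imagenet", pooling="avg")

    def embed(images):
        # images: array of shape (n, 224, 224, 3), values in [0, 255]
        x = tf.keras.applications.mobilenet_v2.preprocess_input(images)
        return extractor.predict(x, verbose=0)

    # With the upstream model pinned, the only moving part is a linear
    # classifier, whose behavior is much easier to validate and tune.
    # clf = LogisticRegression(max_iter=1000).fit(embed(train_x), train_y)
    # preds = clf.predict(embed(test_x))

Testing and tuning one fixed extractor exhaustively is a much more tractable engineering problem than making transfer work across arbitrary source models.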


Disclosure: I work at Google on Kubeflow

As you can see here[1], we do provide quite a bit of information about the accuracy and training of the underlying model.

Additionally, AutoML already (often) provides better-than-human-level performance[2]. Your comment about a system heavily tailored to transferring from one specific fixed model is basically what it's doing: it takes something domain-specific (vision) and allows you to transfer it to your domain.

[1] https://youtu.be/GbLQE2C181U?t=1m15s

[2] https://static.googleusercontent.com/media/research.google.c...


I was about to type a very similar comment; this captures much of what I had in mind.

I've also seen transfer learning used to justify insufficient validation, resulting in strange generalization failures.


It depends on the domain. It works for images because image statistics are relatively stable across contexts. It doesn't work as well for text, where there's a lot of nuance to language patterns between groups (Yelp vs. Google reviews, for example).


You might want to try www.monkeylearn.com for text.


Um... I don't think anyone here is saying (or even implying) that.


Maybe I was misreading, but I took it as something like: transfer learning is going to be a general solution for not having enough data. That's really not the case.


That's a fair point. There are certainly technical challenges involved in bringing this to other domains.



