
When you talk about mapping (patterns of) the input pixels to discrete classes, I don't think that's entirely what people do. We have the ability to make "distributed representations" of concepts, e.g. word2vec, GloVe, etc., which capture the idea that a sheep is pretty similar to a dog. The classes are far from discrete.
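To make the similarity point concrete, here's a minimal sketch with made-up toy vectors standing in for real word2vec/GloVe embeddings (real ones have 100-300 dimensions; the values here are invented purely for illustration):

    import numpy as np

    # Toy 4-d vectors standing in for real word2vec/GloVe embeddings;
    # the values are invented for illustration only.
    emb = {
        "dog":   np.array([0.9, 0.8, 0.1, 0.2]),
        "sheep": np.array([0.8, 0.9, 0.2, 0.1]),
        "car":   np.array([0.1, 0.0, 0.9, 0.8]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(emb["dog"], emb["sheep"]))  # ~0.99: semantically close
    print(cosine(emb["dog"], emb["car"]))    # ~0.23: semantically distant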

I'm pretty sure people train image recognizers to output these representations, which can encode states like 51% certainty dog, 48% certainty sheep, 1% other; if you aren't sure, you take the best choice.
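For instance, a softmax over a classifier's final-layer logits yields exactly that kind of soft distribution (the logit values below are hypothetical, chosen to produce the 51/48/1 split mentioned above):

    import numpy as np

    def softmax(z):
        z = z - z.max()          # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    # Hypothetical final-layer logits for one image: dog, sheep, other.
    logits = np.array([2.30, 2.24, -1.60])
    probs = softmax(logits)      # ~[0.51, 0.48, 0.01]

    labels = ["dog", "sheep", "other"]
    for name, p in zip(labels, probs):
        print(f"{name}: {p:.0%}")
    print("best choice:", labels[int(np.argmax(probs))])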

It's such an intuitive idea to combine these things that if it hasn't happened in the literature yet (I only looked for 1 minute), it's because 1000 people have tried it, failed to improve the state of the art, and didn't publish.

On the other hand, we're generally pretty bad at inferring and generalizing 3D structure from images, so I tend to blame that.




If you look at the fast.ai lesson 11 YouTube video, the very beginning has an amazing example of this with fish!





