
Machine learning wouldn't help there, because it would learn from the existing categorizations. Garbage in, garbage out.



(1) Why would they train on the existing categorizations? You train on a smaller, known-good dataset.

(2) Outlier detection is a thing, even if they included the bad data. It would be trivial to detect that a small percentage of things categorized as "musical instruments" have facets/descriptions/images that are extremely dissimilar to everything else in that category, and very similar to things in a completely unrelated category.
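
To make (2) concrete, here is a rough sketch, not anything the site actually runs. It assumes each listing has already been turned into a numeric feature vector (from facets, description text, or images; how is out of scope here), and it flags a listing when it is much closer to another category's centroid than to its own. The function name `find_miscategorized`, the `margin` threshold, and the toy vectors are all made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_miscategorized(items, margin=0.2):
    """items: list of (item_id, category, feature_vector).

    Returns (item_id, current_category, suggested_category) for every item
    whose similarity to some other category's centroid exceeds its
    similarity to its own category's centroid by more than `margin`.
    """
    by_cat = defaultdict(list)
    for _, cat, vec in items:
        by_cat[cat].append(vec)
    # Centroids include the bad items too; this works only while they
    # are a small minority of each category.
    centroids = {cat: centroid(vecs) for cat, vecs in by_cat.items()}

    flagged = []
    for item_id, cat, vec in items:
        own = cosine(vec, centroids[cat])
        best_cat, best = max(
            ((c, cosine(vec, cen)) for c, cen in centroids.items() if c != cat),
            key=lambda pair: pair[1],
            default=(None, -1.0),
        )
        if best - own > margin:
            flagged.append((item_id, cat, best_cat))
    return flagged


# Toy data: hypothetical 2-d feature vectors, one listing filed wrongly.
items = [
    ("guitar", "musical instruments", [1.0, 0.1]),
    ("violin", "musical instruments", [0.9, 0.0]),
    ("piano", "musical instruments", [1.0, 0.2]),
    ("flute", "musical instruments", [0.95, 0.05]),
    ("hose", "musical instruments", [0.05, 1.0]),  # mislabeled on purpose
    ("rake", "gardening", [0.1, 1.0]),
    ("trowel", "gardening", [0.0, 0.9]),
    ("shears", "gardening", [0.2, 1.0]),
]
flagged = find_miscategorized(items)
print(flagged)
```

On this toy data only the hose gets flagged, with "gardening" as the suggested category. Real data would need better features and a tuned threshold, but the point stands: the existing labels don't have to be clean for the odd ones out to be visible.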



