Machine learning wouldn't help there, because it will learn from the existing ca...

exogen · on Oct 19, 2016

(1) Why would they train on the existing categorizations? You train on a smaller, known-good dataset.

(2) Outlier detection is a thing, even if they included the bad data. It would be trivial to detect that a small percentage of things categorized as "musical instruments" have facets/descriptions/images that are extremely dissimilar to everything else in that category and very similar to things in a completely unrelated category.
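
To make (2) concrete, here is a rough sketch of one way it could work. It is not anyone's actual pipeline. It assumes each listing is just a (category, description) pair, uses bag-of-words cosine similarity as a stand-in for real facet/image features, and the names `find_suspects`, `STOP_WORDS`, and the sample listings are all made up for illustration. An item is flagged when it looks more like another category's centroid than like the rest of its own category.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use something larger.
STOP_WORDS = {"a", "an", "and", "for", "the", "with"}

def vectorize(text):
    """Bag-of-words term counts, ignoring stop words."""
    return Counter(t for t in text.lower().split() if t not in STOP_WORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_suspects(listings):
    """Return (index, labeled_category, closer_category) for likely mislabels."""
    vecs = [(cat, vectorize(desc)) for cat, desc in listings]

    # Sum the term counts per category to get a (scaled) centroid.
    totals = defaultdict(Counter)
    for cat, vec in vecs:
        totals[cat].update(vec)

    suspects = []
    for i, (cat, vec) in enumerate(vecs):
        scores = {}
        for other, total in totals.items():
            # Leave the item out of its own category's centroid so it
            # cannot vouch for itself.
            centroid = total - vec if other == cat else total
            scores[other] = cosine(vec, centroid)
        best = max(scores, key=scores.get)
        # Flag only when another category is strictly closer than the label.
        if scores[best] > scores[cat]:
            suspects.append((i, cat, best))
    return suspects

# Made-up sample data: one phone case filed under musical instruments.
listings = [
    ("musical instruments", "acoustic guitar with steel strings"),
    ("musical instruments", "electric guitar with amplifier"),
    ("musical instruments", "digital piano with weighted keys"),
    ("musical instruments", "leather phone case with card slot"),
    ("phone accessories", "silicone phone case"),
    ("phone accessories", "phone charger cable"),
    ("phone accessories", "leather phone case with magnetic clasp"),
]

for index, labeled, closer in find_suspects(listings):
    print(f"listing {index}: labeled {labeled!r}, looks more like {closer!r}")
```

This only works while the bad data stays a small minority of each category; if mislabeled items dominate, they drag the centroid toward themselves and stop looking like outliers.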