Recently, I came across some threads asking why data labeling is difficult. (I have my biases as an engineer.) In my opinion, it's because labeling is essentially determining truth, and truth requires context, interpretation, and domain knowledge. Sometimes it's easy (with caveats like dataset bias, labeler bias, and taxonomy bias). But for more complex labels, truth is neither easily abstracted nor tractable.
To me, it boils down to misunderstanding what the technology can do. If you are trying to build a model that labels people as "unsuccessful" based on a picture (an example from the article), of course you are setting yourself up for failure. If you're looking for a model that tells you whether a manufactured part has a defect, you have a good chance of succeeding. The real question I have is why people ever think ML should be used for judgement calls.