
I feel like we're missing the popular vocabulary to describe some of these recurring problems.

The algorithms reflect and amplify biases in the data. They also go into feedback loops once they affect the reality being represented in data.

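Imagine that an algorithm selects individuals at an airport for search. Some of the people it selects turn out to be smuggling contraband and are caught. Those catches go back into the dataset, while smugglers among the people it never selects go unrecorded, so the original bias reinforces itself. We have already seen this at play in social media, recommendation engines, fraud detection, etc.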
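To make the airport example concrete, here is a rough toy sketch, not a model of any real screening system. Everything in it is an assumption for illustration: two hypothetical groups with the same true smuggling rate, a made-up starting skew in recorded catches, and a hypothetical policy that allocates searches in proportion to past recorded catches. It uses expected values rather than random draws so the result is repeatable.

```python
def simulate_feedback(rounds=10, searches_per_round=100,
                      true_rate=0.05, initial_catches=(6.0, 4.0)):
    """Expected-value toy model: searches follow past catches.

    All parameters are hypothetical. Both groups smuggle at the same
    true_rate; the only difference is the skew in initial_catches.
    """
    catches = list(initial_catches)
    history = []
    for _ in range(rounds):
        total = sum(catches)
        # Allocate searches in proportion to catches recorded so far.
        searches = [searches_per_round * c / total for c in catches]
        # Only people who are searched can be caught and recorded.
        catches = [c + s * true_rate for c, s in zip(catches, searches)]
        history.append(tuple(catches))
    return history


history = simulate_feedback()
final_a, final_b = history[-1]
share_a = final_a / (final_a + final_b)
print(f"group A share of recorded catches: {share_a:.2f}")
print(f"gap in recorded catches: {final_a - final_b:.1f}")
```

Under these assumptions the share of recorded catches attributed to group A never moves off its starting skew, and the absolute gap keeps widening, even though both groups offend at the same rate. The data looks like it confirms the policy because the policy decided where the data came from.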

The future is worrying. Technologies like these work as ratchets: once deployed, they are rarely rolled back. Unless there are extraordinary counter-reactions, they will be "affecting their own datasets" en masse very soon.

The term I've heard for this phenomenon is "mathwashing."

The old chestnut for this is "lies, damned lies, and statistics."

