OP here. We address this argument in detail in our paper, and we're deeply skeptical of it. See the sections titled "Challenges in addressing bias" and "Awareness is better than blindness".
Here's the short version:
We view the approach of "debiasing" word embeddings (Bolukbasi et al., 2016) with skepticism. If we view AI as perception followed by action, debiasing alters the AI’s perception (and model) of the world, rather than how it acts on that perception. This gives the AI an incomplete understanding of the world. We see debiasing as "fairness through blindness". It has its place, but also important limits: prejudice can creep back in through proxies (although we should note that Bolukbasi et al. (2016) do consider "indirect bias" in their paper). Efforts to fight prejudice at the level of the initial representation will necessarily hurt meaning and accuracy, and will themselves be hard to adapt as societal understanding of fairness evolves.
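For readers who haven't seen the Bolukbasi et al. approach, here is a rough, minimal sketch of what "debiasing the representation" means in practice: removing from each word vector its component along a learned gender direction. The vectors below are toy illustrations; the actual method derives the gender subspace via PCA over many definitional pairs (e.g. "he"/"she") and includes neutralize/equalize steps that are omitted here.

```python
import numpy as np

def hard_debias(vec, bias_direction):
    # Project out the component of `vec` along the (normalized) bias direction,
    # leaving the rest of the vector unchanged.
    b = bias_direction / np.linalg.norm(bias_direction)
    return vec - np.dot(vec, b) * b

# Toy 3-d vectors for illustration only; real embeddings are ~300-dimensional
# and the bias direction comes from many definitional pairs, not one.
he    = np.array([ 1.0, 0.2, 0.0])
she   = np.array([-1.0, 0.3, 0.1])
nurse = np.array([-0.6, 0.5, 0.9])

gender_direction = he - she
print(hard_debias(nurse, gender_direction))  # nurse vector with the gender component removed
```

In the paper's terms, this edits what the model "perceives", rather than constraining how a downstream application acts on that perception.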
I agree with the suggestion to de-bias the application and not the representation itself.
Recently I was using a version of ConceptNet Numberbatch (word embeddings built from ConceptNet, word2vec, and GloVe data that perform very well on evaluations) as an input to sentiment analysis. Its input thus happens to include a crawl of the Web (via GloVe) and things that came to mind as people played word games (via ConceptNet). All of this went into a straightforward support vector regression with AFINN as training data.
You can probably see where this is going. The resulting sentiment classification of words such as "Mexican", "Chinese", and "black" would make Donald Trump blush.
I think the current version is less extreme about it, but there is still an effect to be corrected: it ends up with slightly negative opinions about most words that describe groups of people, especially the more dissimilar they are from the American majority.
So my correction is to add words that describe groups of people to the training data for the sentiment analyzer, with a lot of weight, constraining their output to 0.
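For concreteness, here is a minimal sketch of that pipeline and correction in scikit-learn terms. The `embeddings` and `afinn` dictionaries below are tiny toy stand-ins for ConceptNet Numberbatch and the AFINN lexicon, the identity-term list and the sample weight of 10 are arbitrary illustrations of "a lot of weight", and none of this is the commenter's actual code.

```python
import numpy as np
from sklearn.svm import SVR

# Toy stand-ins for the real inputs (Numberbatch vectors and the AFINN lexicon).
rng = np.random.default_rng(0)
vocab = ["good", "excellent", "bad", "terrible", "mexican", "chinese", "black"]
embeddings = {w: rng.normal(size=50) for w in vocab}              # word -> vector
afinn = {"good": 3, "excellent": 5, "bad": -3, "terrible": -5}    # word -> sentiment score

# Regression training data: embedding vectors in, AFINN scores out.
words = [w for w in afinn if w in embeddings]
X = [embeddings[w] for w in words]
y = [float(afinn[w]) for w in words]
weights = [1.0] * len(words)

# The correction: add words that describe groups of people with a target
# sentiment of 0 and a large sample weight, so the model treats them as neutral.
identity_terms = ["mexican", "chinese", "black"]  # illustrative, not exhaustive
for term in identity_terms:
    X.append(embeddings[term])
    y.append(0.0)
    weights.append(10.0)

model = SVR(kernel="linear")
model.fit(np.array(X), np.array(y), sample_weight=np.array(weights))

def word_sentiment(word):
    """Predicted sentiment for a single word, 0.0 if we have no vector for it."""
    if word not in embeddings:
        return 0.0
    return float(model.predict(embeddings[word].reshape(1, -1))[0])

print(word_sentiment("mexican"))
```

The design point is that the correction happens in the application (the sentiment regressor), leaving the underlying embeddings untouched.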
I'm not convinced by your skepticism about correcting prejudiced bias. Debiasing certainly gives the AI a different understanding of the world than the original (biased) language dataset does, but it's not necessarily less complete - or less accurate. After all, any one corpus is incomplete, and has biases based on the items that were chosen for it - which are likely to reflect the biases of the past, and of the person choosing the corpus. It may not be a "complete" or "accurate" reflection of today's world - let alone the future. So it's not at all clear to me that efforts to undo the bias will necessarily make it less "accurate".
> Debiasing certainly gives the AI a different understanding of the world than the original (biased) language dataset does, but it's not necessarily less complete - or less accurate.
If you're translating from an ungendered language and have to choose, the only way you're going to get anything sensible is from context and common usage. And that is going to choose "she is a nurse", because an algorithm that can deduce that fathers are most likely male can also deduce that nurses are most likely female. Without that you get bad translations like "she is a father", "he is a fine ship", and "John is her own person."
> "She is a nurse" is also not a bias. It's a prior ...
Assuming that lower-status professions are female and higher-status professions are male ("he is a doctor") when translating ungendered words is indeed a bias.
> the system will be right 93% of the time.
And "this person is a doctor, that person is a nurse" will be right 100% of the time.
It's a bias in the sense that it accurately reflects a fact you dislike. It's not a bias in the statistical sense, namely something that causes the answer to be wrong systematically in a particular direction. See my other post here discussing the distinction.
The phrase "this person is a doctor" has a different meaning than "she is a doctor" - "she" and "he" refers to (I'm probably messing up the terminology here) contextually implicit person. "This person" does not.
Notice how you sway the argument in your favour by using words with negative connotations like "fairness through blindness" and "hurt meaning and accuracy". Nobody would want to deliberately blind or hurt something, would they? How about "rebalance", "recalibrate", or "re-correct"?
A concrete analogy:
1) I have a meter measuring stick, but I discover that it was made wrong: it is actually 2mm shorter than advertised. Every time I make a measurement with it, I have to add 2mm to the result. Would it not be better to use an accurate stick and not have to continually compensate?
With your analogy, that would assume we know exactly how long a meter is. Here it's more like: "We know that we are wrong, but we don't know the exact right answer." Also, language shifts and biases are not constants. And then you have the issue of a corpus crafted to manipulate the learning algorithm itself.
Yes! We address this in the section "Implications for understanding human prejudice".
The simplicity and strength of our results suggests a new null hypothesis for explaining origins of prejudicial behavior in humans, namely, the implicit transmission of ingroup/outgroup identity information through language. That is, before providing an explicit or institutional explanation for why individuals make decisions that disadvantage one group with regards to another, one must show that the unjust decision was not a simple outcome of unthinking reproduction of statistical regularities absorbed with language. Similarly, before positing complex models for how prejudicial attitudes perpetuate from one generation to the next or from one group to another, we must check whether simply learning language is sufficient to explain the observed transmission of prejudice. These new null hypotheses are important not because we necessarily expect them to be true in most cases, but because Occam’s razor now requires that we eliminate them, or at least quantify findings about prejudice in comparison to what is explainable from language transmission alone.
Direct link to our paper: http://randomwalker.info/publications/language-bias.pdf