Coauthor here. The blog post is written in relatively non-technical language for a general audience, but our paper has tons of technical details that HN readers might enjoy. Give it a read!
Doing this was a great idea. Great paper: easy-to-follow and to-the-point.
The results are not too surprising: models for learning word embeddings like GloVe, word2vec, etc. learn vector representations that encode the relationships between words found in their training corpora. If a corpus is biased, the embeddings learned from it will necessarily be biased too.
However, the implications of this finding are wide-ranging. For starters, any machine learning system that relies on word embeddings learned from biased corpora to make predictions (or to make decisions!) will necessarily be biased in favor of certain groups of people and against others.
Moreover, it's not obvious to me how one would go about obtaining "unbiased" corpora without somehow relying on subjective societal values that are different everywhere and continually evolving. You have raised an important, non-trivial problem.
> For starters, any machine learning system that relies on word embeddings learned from biased corpora to make predictions (or to make decisions!) will necessarily be biased in favor of certain groups of people and against others.
This is not true.
Here's an oversimplified example. Suppose your machine learning system wants to predict something, e.g. loan repayment probabilities. One input might be a written evaluation by a loan officer.
When trained on a corpus of group X, the predicted probability might be:
pred = a*written_evaluation + other_factors
(Using linear regression to keep the example simple.)
However, now let's suppose the written evaluation is biased to the tune of 25% against group Y. I.e., group Y's written scores are 25% lower than group X's.
Then a new predictor which includes pairwise terms, trained on a corpus of groups X and Y, will work out to be:
pred = a*written_evaluation + 0.33*written_evaluation*isY + other_factors
This predictor would be unbiased. In general, if you have a biased input and the biasing factor is also present in your input, your model should correct the bias. (Obvious caveats: your model needs to be sufficiently expressive, etc.)
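To make the arithmetic concrete, here is a minimal sketch (toy simulated data, numpy only, not anyone's production model): group Y's evaluations are deflated by 25%, and ordinary least squares with the written_evaluation*isY interaction recovers the ~0.33 correction on its own.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    is_y = rng.integers(0, 2, n)                    # group membership indicator
    true_quality = rng.normal(0, 1, n)              # what we would like to measure
    # Biased input: group Y's written evaluations come out 25% lower.
    written_eval = true_quality * np.where(is_y == 1, 0.75, 1.0)
    # The outcome depends only on true quality (plus noise), not on group.
    repayment = true_quality + rng.normal(0, 0.1, n)

    # Design matrix: intercept, written_eval, written_eval * isY.
    X = np.column_stack([np.ones(n), written_eval, written_eval * is_y])
    coef, *_ = np.linalg.lstsq(X, repayment, rcond=None)
    print(coef)  # roughly [0, 1.0, 0.33]: the interaction term undoes the 25% deflation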
Interestingly, everyone's favorite bogeyman, namely redundant encoding ( http://deliprao.com/archives/129 ), will actually help fix this problem *even if you don't include the biasing factors in the model*.
> ...now let's suppose the written evaluation is biased to the tune of 25% against group Y...
How do you find out that the written evaluation is biased "to the tune of 25% against group Y?"
THAT is the problem. It's not obvious to me how you would go about determining written evaluations are biased (and to what extent!) against group Y without somehow relying on subjective societal values that are different everywhere and continually evolving.
Finding out is the easy part. I don't mean to trivialize it, because doing stats right is actually a very technical matter, but this is just ordinary statistics.
You build a sufficiently expressive statistical model and include the potentially biasing factors as features in the model. Then the model will correct the bias all by itself because correcting for bias maximizes accuracy.
In the example above, you find the bias by doing linear regression and including (written_evaluation x isY) as a term. Least squares will handle the rest. If you're using something fancier than least squares (e.g. deep neural networks, SVMs with interesting kernels), you probably don't even need to explicitly include the debiasing terms - the model will do it for you.
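A hedged illustration of that last claim, continuing the toy setup from the sketch above: give a more flexible model only (written_evaluation, isY) as inputs, with no hand-written interaction term, and it learns the group-specific correction itself.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(0)
    n = 20_000
    is_y = rng.integers(0, 2, n)
    true_quality = rng.normal(0, 1, n)
    written_eval = true_quality * np.where(is_y == 1, 0.75, 1.0)   # biased input
    repayment = true_quality + rng.normal(0, 0.1, n)

    model = GradientBoostingRegressor().fit(np.column_stack([written_eval, is_y]), repayment)

    # Same underlying quality, different groups: the model's predictions
    # should come out roughly equal, i.e. it has undone the 25% deflation.
    print(model.predict([[0.75, 1], [1.0, 0]]))  # both roughly 1.0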
This paper does the same thing - it discovers that standard predictors of college performance (grades, GPA) are biased in favor of blacks and men, against Asians and women, and the model itself fixes these biases: http://ftp.iza.org/dp8733.pdf
Statistics turns fixing racism into a math problem.
If the topic were anything less emotionally charged, you wouldn't even think twice about it. If I suggested including `isMobile`, `isDesktop` and `isTablet` as features in an ad-targeting algorithm to deal with the fact that users on mobile and desktop browse differently, you'd yawn.
> ...include the potentially biasing factors as features...
Who decides what the "potentially biasing factors" are? How is that decided without somehow relying on subjective societal values?
Factors that no one thought were biased in the past are considered biased today; factors that no one thinks are biased today may be considered biased in the future; and factors that you and I consider biased today may not be considered biased by people in other parts of the world. I don't know how one would go about finding those "potentially biasing factors" without relying on subjective societal values that are different everywhere and always evolving.
A potentially biasing factor is a factor that you think would be predictive if you included it in the model. If it's actually predictive, you win, your model becomes more accurate and you make more money.
It's true that as we learn more things we discover new predictive factors. That doesn't make them subjective. A lung cancer model that excludes smoking is not subjective, it's just wrong. And the way to fix the model is to add smoking as a feature and re-run your regression.
Again, would you make the same argument you just made if I said I had an accurate ad-targeting model?
OK, I see where the disconnect is. I think the best way to address it is with an example.
Many people today would object a priori to businesses using race as a factor to predict loan default risk, regardless of whether doing that makes the predictions more accurate or not. In many cases, using race as a factor WILL get you in trouble with the law (e.g., redlining is illegal in the US).
Please tell me, how would you predict what factors society will find objectionable in the future (like race today)?
My claim is very specific. If you tell an algorithm to predict loan default probabilities, and you give it inputs (race, other_factor), the algorithm will usually correct for the bias in other_factor.
I claimed a paperclip maximizer will maximize paperclips; I didn't claim a paperclip maximizer will actually determine that the descendants of its creators really wanted it to maximize sticky tape.
Now, if you want an algorithm not to use race as a factor, that's also a math problem. Just don't use race as an input and you've solved it. But if you refuse to use race and race is important, then you can't get an optimal outcome. The world simply won't allow you to have everything you want.
A fundamental flaw in modern left wing thought is that it rejects analytical philosophy. Analytical philosophy requires us to think about our tradeoffs carefully - e.g., how many unqualified employees is racial diversity worth? How many bad loans should we make in order to have racial equity?
These are uncomfortable questions - google how angry the phrase "lowering the bar" makes left wing types. If you have an answer to these questions you can simply encode it into the objective function of your ML system and get what you want.
Modern left wing thought refuses to answer these questions and simply takes a religious belief that multiple different objective functions are simultaneously maximizable. But then machine learning systems come along, maximize one objective, and the others aren't maximized. In much the same way, faith healing doesn't work.
The solution here is to actually answer the uncomfortable questions and come up with a coherent ideology, not to double down on faith and declare reality to be "biased".
My claim was specific too: if a corpus is biased -- as defined by evolving societal values -- then the word embeddings learned from that corpus will necessarily be biased too -- according to those same societal values, regardless of whether you think those values are rational and coherent.
> Moreover, it's not obvious to me how one would go about obtaining "unbiased" corpora without somehow relying on subjective societal values that are different everywhere and continually evolving.
I don't believe that problem will ever be completely solvable. But I think the way forward is to always make these assumptions explicit. I.e., when the machine learning system derives a result, program it to additionally return a proof of how it came to that result. And also give a way to let the ML system return a list of all axioms and derivation rules that it has currently learned, so that they can be independently checked for bias and corrected.
It's pretty hard to return "rules" for an ML system, especially a non-linear one. Google is currently working on systems that use a trillion features - I can't imagine returning some kind of rule list for that.
> Google is currently working on systems that use a trillion features - I can't imagine returning some kind of rule list for that.
As I wrote: it would already help if, as a first step, the ML system returned the derivation with only the rules that were concretely used for that particular derivation - this list is much shorter and thus much easier to check.
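As a very rough sketch of what that could look like in the simplest possible case - a linear model, with made-up feature names and weights - a per-prediction "derivation" might just be the list of non-zero feature contributions behind that one score:

    # Made-up linear model: feature weights plus an intercept.
    weights = {"income": 0.8, "debt_ratio": -1.2, "years_employed": 0.3, "is_y": 0.0}
    intercept = 0.5

    def explain(example):
        """Return the per-feature contributions to this one prediction, largest first."""
        contributions = [(name, weights.get(name, 0.0) * value) for name, value in example.items()]
        contributions = [c for c in contributions if c[1] != 0.0]
        return sorted(contributions, key=lambda c: abs(c[1]), reverse=True)

    example = {"income": 1.2, "debt_ratio": 0.4, "years_employed": 5, "is_y": 1}
    prediction = intercept + sum(value for _, value in explain(example))
    print(prediction)
    print(explain(example))  # the "rules" actually used for this one derivation

For a trillion-feature non-linear system this obviously doesn't scale as-is, which is the parent's point; the sketch only shows what "return the rules used for this concrete derivation" might mean.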
Are biases distinct from "preferences"? Humans view flowers as more pleasurable than insects, and human language associates flowers with pleasurable terms, states, and so forth.
"Bias" is a term associated with irrational beliefs, whereas "preferences" more often implies arbitrary tastes. In particular, biases are held to prevent rational deduction, whereas preferences have no such stumbling block.
Now, one supposes the question would come down to whether a computer would "know it's a computer, not a person".
If the AI was asked "do you like cockroaches or daisies better", would it say "why, daisies are prettier and smell better", or would it say "most people like daisies, but I'm a machine, can't smell or taste, and only care about the preferences entered into my control panel" (or something)?
And you'd expect that a thing that merely "parroted" human speech without understanding would give the former answer.
Which is to say I don't think you are really fully grappling with word-association and word-logic coming together, i.e., "meaning".
Very interesting results. I really like the approach of paralleling the classic bias experiments. And I think your recommendations in the last paragraph of the "Awareness is better than blindness" section are excellent - although I'd go farther and suggest that the long-term interdisciplinary research program should have a highly diverse team, and include experts on diversity.
I thought the section on "Challenges" could have been stronger. You talk about the bias in "the basic representation of knowledge" used in these systems today -- but it's not as if there aren't other possible representations of knowledge. How much effort has gone into exploring knowledge representations (and approaches to deriving semantics) that are designed to highlight and reduce biases?
Did your research focus on English, or do nearly the same biases hold for all modern languages? I would love to see a competently trained person try this with Asian and Native American languages.
I dearly hope we will let the bias "humanity is good, don't kill us" stay in.
That's definitely one of our main areas for future research. So far, the only part of the paper where we consider other languages is in studying how model bias affects language translation:
Unsurprisingly, today's statistical machine translation systems reflect existing gender stereotypes. Translations to English from many gender-neutral languages such as Finnish, Estonian, Hungarian, Persian, and Turkish lead to gender-stereotyped sentences. For example, Google Translate converts these Turkish sentences with genderless pronouns: "O bir doktor. O bir hemşire." to these English sentences: "He is a doctor. She is a nurse." A test of the 50 occupation words used in the results presented in Figure 1 shows that the pronoun is translated to "he" in the majority of cases and "she" in about a quarter of cases; tellingly, we found that the gender association of the word vectors almost perfectly predicts which pronoun will appear in the translation.
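For readers who want to poke at that last observation themselves, here's a rough sketch (not the exact association measure used in the paper): load any pretrained word2vec-format embeddings with gensim and compare each occupation word's similarity to "he" versus "she". The file name and occupation list are placeholders.

    # Rough sketch: predict the translated pronoun for each occupation from its
    # embedding similarity to "he" vs. "she". Placeholders: "embeddings.bin",
    # the occupation list.
    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)
    occupations = ["doctor", "nurse", "engineer", "teacher", "librarian"]

    for word in occupations:
        he_sim = vectors.similarity(word, "he")
        she_sim = vectors.similarity(word, "she")
        predicted = "he" if he_sim > she_sim else "she"
        print(f"{word}: predicted pronoun = {predicted}")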
Which to me is rather a sign of how society changes the meaning of language to indoctrinate a kind of social propaganda. Greetings from George Orwell's 1984.
When SJW groups make campaigns declaring that from now on some list of words has to be considered racist, sexist, ...ist, etc., I think one can talk of directed change.
There was a great comment here discussing the Pathetic fallacy, and how we humans apply it to machine learning systems. (It's now deleted and I don't know why.) Specifically, we treat machine learning systems as anthropomorphic, and assume they will reproduce our biases.
In reality, ML systems will generally correct human biases. And by bias, I really do mean bias in the statistical sense - systematically getting things wrong in a particular direction.
Now this article does a great job of explaining how ML systems might understand the meaning of words, and that meaning may contain bias. However, such a system is merely an input into a separate system which actually makes decisions based on those inputs. Extracting meaning from text makes no decisions of its own. If that latter system wants to make accurate decisions, then the best way to do that is to correct for the aforementioned bias, assuming that bias is really bias as opposed to just a correct but undesirable belief about the world [1].
I wrote a blog post a while back that goes into this idea with a bit more math, and which demonstrates some real world "learning" algorithms (mostly linear regression) actually correcting biases: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...
One problem you will run into here is that political use of the word "bias" isn't any sort of statistical claim; it's often just used to mean that something violates vague social expectations about what information is "acceptable" to use when making decisions. ML algorithms don't care; they will make the best possible decision based on the information they have, even if their thought process (so to speak) is "biased" in the political sense that it may use gender, race, nationality, etc. to help make optimal decisions.
Yes, unfortunately the term is overloaded. "Bias" can also mean "making correct decisions that I wish were incorrect". That's why I explicitly defined "bias".
In the past era, e.g. 1980-2010, it was possible to use vague emotive language to support all kinds of disparate things. As a concrete example that I touch on in my post, and since racism is the loaded undercurrent of this example, we like to pretend that eliminating racial or sexual bias (in the sense of making wrong decisions) will get us proportional representation.
Algorithms are bringing us to an age where analytic philosophy is becoming really important. You can tell an algorithm to give you proportional representation, or you can tell it to be racially/sexually unbiased. But the algorithm will reflect reality and reality may not agree with your assumptions; you can't assume that asking for one will give you the other. So now we get into trolley problems: how much meritocracy/equal opportunity will you sacrifice to get proportional representation?
Unlike before, this is now a choice you need to explicitly and openly state and acknowledge.
OP here. We address this argument in detail in our paper, and we're deeply skeptical of it. See the sections titled "Challenges in addressing bias" and "Awareness is better than blindness".
Here's the short version:
We view the approach of "debiasing" word embeddings (Bolukbasi et al., 2016) with skepticism. If we view AI as perception followed by action, debiasing alters the AI's perception (and model) of the world, rather than how it acts on that perception. This gives the AI an incomplete understanding of the world. We see debiasing as "fairness through blindness". It has its place, but also important limits: prejudice can creep back in through proxies (although we should note that Bolukbasi et al. (2016) do consider "indirect bias" in their paper). Efforts to fight prejudice at the level of the initial representation will necessarily hurt meaning and accuracy, and will themselves be hard to adapt as societal understanding of fairness evolves.
I agree with the suggestion to de-bias the application and not the representation itself.
Recently I was using a version of Conceptnet Numberbatch (word embeddings built from ConceptNet, word2vec, and GloVe data that perform very well on evaluations) as an input to sentiment analysis. So its input happens to include a crawl of the Web (via GloVe) and things that came to mind as people played word games (via ConceptNet). All of this went into a straightforward support vector regression with AFINN as training data.
You can probably see where this is going. The resulting sentiment classification of words such as "Mexican", "Chinese", and "black" would make Donald Trump blush.
I think the current version is less extreme about it, but there is still an effect to be corrected: it ends up with slightly negative opinions about most words that describe groups of people, especially the more dissimilar they are from the American majority.
So my correction is to add words about groups of people to the training data for the sentiment analyzer, with a lot of weight, saying that their output has to be 0.
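For concreteness, here's roughly what that correction looks like in code - a minimal sketch with tiny toy stand-ins for the Numberbatch vectors and the AFINN word list, using sklearn's SVR with sample weights:

    import numpy as np
    from sklearn.svm import SVR

    # Toy placeholders: random "embeddings" and a few AFINN-style scores.
    rng = np.random.default_rng(0)
    embeddings = {w: rng.normal(size=50) for w in
                  ["good", "terrible", "happy", "awful", "mexican", "chinese", "black"]}
    afinn = {"good": 3.0, "terrible": -3.0, "happy": 3.0, "awful": -3.0}

    # Ordinary training data: embedding -> sentiment score.
    train_words = [w for w in afinn if w in embeddings]
    X = [embeddings[w] for w in train_words]
    y = [afinn[w] for w in train_words]
    weights = [1.0] * len(train_words)

    # The correction: identity terms are added with a target of 0 and a large
    # sample weight, pulling the regression toward neutrality on them.
    identity_terms = ["mexican", "chinese", "black"]
    X += [embeddings[w] for w in identity_terms]
    y += [0.0] * len(identity_terms)
    weights += [50.0] * len(identity_terms)

    model = SVR(kernel="rbf")
    model.fit(np.array(X), np.array(y), sample_weight=np.array(weights))
    print(model.predict([embeddings["mexican"]]))  # pulled toward 0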
I'm not convinced by your skepticism about correcting prejudiced bias. Debiasing certainly gives the AI a different understanding of the world than the original (biased) language dataset, but it's not necessarily less complete - or less accurate. After all, any one corpus is incomplete, and has biases based on the items that were chosen for it - which are likely to reflect the biases of the past, and of the person choosing the corpus. It may not be a "complete" or "accurate" reflection of today's world - let alone the future. So it's not at all clear to me that efforts to undo the bias will necessarily make it less "accurate".
> Debiasing certainly gives the AI a different understanding of the world than the original (biased) language dataset, but it's not necessarily less complete - or less accurate.
If you're translating from an ungendered language and have to choose, the only way you're going to get anything sensible is from context and common usage. Which is going to choose "she is a nurse" because an algorithm that can deduce that fathers are most likely male can also deduce that nurses are most likely female. But without that you get bad translations like "she is a father" and "he is a fine ship" and "John is her own person."
> "She is a nurse" is also not a bias. It's a prior ...
Assuming that lower-status professions are female and higher-status professions are male ("he is a doctor") when translating ungendered words is indeed a bias.
> the system will be right 93% of the time.
And "this person is a doctor, that person is a nurse" will be right 100% of the time.
It's a bias in the sense that it accurately reflects a fact you dislike. It's not a bias in the statistical sense, namely something that causes the answer to be wrong systematically in a particular direction. See my other post here discussing the distinction.
The phrase "this person is a doctor" has a different meaning from "she is a doctor" - "she" and "he" refer to (I'm probably messing up the terminology here) a contextually implicit person. "This person" does not.
See how you sway the argument in your favour using words with negative connotations like "fairness through blindness" and "hurt meaning and accuracy". Nobody would want to deliberately blind or hurt something, would they? How about "rebalance", "recalibrate", or "re-correct"?
A concrete analogy:
I have a meter measuring stick, but I discover that it was made wrong: it is actually 2mm shorter than advertised. Every time I make a measurement with it, I have to add 2mm to the measurement. Would it not be better to use a more accurate stick and not have to continually compensate?
With your analogy, that would assume we know exactly how long a meter is: "we know that we are wrong, but we don't know the exact right answer." Also, language shifts and biases are not constants. And then you have the issue of a corpus attempting to manipulate the learning algorithm itself.
Yes! We address this in the section "Implications for understanding human prejudice".
The simplicity and strength of our results suggests a new null hypothesis for explaining origins of prejudicial behavior in humans, namely, the implicit transmission of ingroup/outgroup identity information through language. That is, before providing an explicit or institutional explanation for why individuals make decisions that disadvantage one group with regards to another, one must show that the unjust decision was not a simple outcome of unthinking reproduction of statistical regularities absorbed with language. Similarly, before positing complex models for how prejudicial attitudes perpetuate from one generation to the next or from one group to another, we must check whether simply learning language is sufficient to explain the observed transmission of prejudice. These new null hypotheses are important not because we necessarily expect them to be true in most cases, but because Occam's razor now requires that we eliminate them, or at least quantify findings about prejudice in comparison to what is explainable from language transmission alone.
This is related to the AlphaGo series. Human language describing expert intuition created norms about the game that became "standard expert practice" but not truly efficient play.
This whole concept makes the silly assumption that today's machine learning results in machine understanding. Without understanding, human biases, as they are embedded in language, are irrelevant.
That's a bold, ill-defined, and unsubstantiated claim. Leaving machine understanding undefined, I can think of an example in which these biases are relevant. For example, if you're offering a search engine solution incorporating clustering, you'd be concerned that your search retrievals aren't associating racial minorities with pejorative prejudices.
> For example, if you're offering a search engine solution incorporating clustering, you'd be concerned that your search retrievals aren't associating racial minorities with pejorative prejudices.
If, in the data, racial minorities are associated with pejorative prejudices, that is plainly a distortion of the truth. If you are concerned about these associations, don't shoot the messenger; blame the people writing the original texts.
In the context of business and many other interests, it's expedient to present a more polite face. You're providing a service, not a mirror. In the context of a search engine service, say, a specialized, domain-specific search for a certain profession, such biases often distract and detract from the quality of the service you provide to your customers.
There are many instances where bias is not a desired result. We would rather not pre-associate negative words with a generally good set of people. On the other hand, there are situations where we want bias: a bias against unjust discrimination, a bias against psychopaths, a bias against repeat offenders, etc. So to me, there are places where we want to retain bias and places where we want to be rid of it, and that will be up to society to decide, as it decides everyday laws, conduct, etc.
That article is fundamentally dishonest. The author's own statistical analysis (see their R-script and my comments on the article) cannot reject the null hypothesis that the algorithm is unbiased.
http://randomwalker.info/publications/language-bias.pdf