Google Photos AI still can't label gorillas after racist errors (theregister.com)
43 points by LinuxBender on May 30, 2023 | 68 comments



I agree that blocking the feature is the smartest thing to do if people are going to get offended. It's idiotic to call it "racist", it's a shitty classifier, and it's incredibly irresponsible (and probably racist) as well as dismissive of actual racism to pretend that's racist.

At the same time, I'm surprised they can't get this classifier to work. It doesn't seem like a very challenging problem (I'd consider myself a computer vision expert). I wonder if they're over-reacting and just deciding to zero any residual risk by not allowing that classification anymore.

Identity politics aside, it would be an interesting study to try and break a man/gorilla classifier. Like take a picture of a man, say in a jungle setting and showing teeth or with a furry hat on, and see what the actual failure modes are. Regularly occurring misclassifications are a useful window into how a model operates.
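
If anyone wants to poke at this themselves, a minimal sketch of that kind of probing is below, using an off-the-shelf pretrained ImageNet model from torchvision. The `probe_images` folder is a hypothetical stand-in for whatever jungle/fur-hat test photos you assemble; logging the top-5 labels per image is usually enough to see which inputs drift toward primate classes. This is not Google's classifier, just a generic one you can interrogate.

    # Minimal probing sketch (assumes torchvision is installed and a local
    # folder of hand-picked test photos exists; not Google's classifier).
    from pathlib import Path

    import torch
    from PIL import Image
    from torchvision.models import resnet50, ResNet50_Weights

    weights = ResNet50_Weights.DEFAULT
    model = resnet50(weights=weights).eval()
    preprocess = weights.transforms()
    categories = weights.meta["categories"]  # ImageNet-1k class names

    for path in sorted(Path("probe_images").glob("*.jpg")):  # hypothetical folder
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            probs = model(img).softmax(dim=1)[0]
        top = probs.topk(5)
        labels = [(categories[int(i)], round(float(p), 3))
                  for p, i in zip(top.values, top.indices)]
        print(path.name, labels)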


> At the same time, I'm surprised they can't get this classifier to work. It doesn't seem like a very challenging problem

Why are you assuming that they can't get the classifier to work? There's no evidence to that effect in the article.

> I wonder if they're over-reacting and just deciding to zero any residual risk by not allowing that classification anymore.

Why do you describe that as over-reacting? It seems like the appropriate level of reaction given there is no upside for getting it right and a massive downside for getting it wrong.

It doesn't even need to be a top-down edict, it's just the way the incentives will work out for everyone involved.

Let's say that you're an engineer working on that team, and find that there's a bunch of terms added to a blocklist a decade ago and still there. Are you going to just remove them, on the assumption that the classifications are correct? Of course not. How much validation work are you willing to do to convince yourself and others that the results are 100% correct? Is that really going to be the most interesting or impactful work you can do this quarter?

Similarly you'd find that whoever needs to approve such a removal would have incentives very strongly biased toward not approving the removal of terms from a blocklist. If everything goes well, they get no credit. If something goes wrong, they get the blame. It's possible to push through that bias toward inaction, but it requires it to be a change that somebody really wants to make happen.


> It's idiotic to call it "racist", it's a shitty classifier, and it's incredibly irresponsible (and probably racist) as well as dismissive of actual racism to pretend that's racist.

It might be useful to understand why it's offensive... Likening black people to apes/gorillas/monkeys is a racist trope with a long history. It's part of casting black people as less than human, which has been used to justify their enslavement and denial of human rights.

I think it's very likely the classifier was not constructed to be racist, but once you see that it is producing racist tropes, it is racist to continue to let it do so.


This. Once a bias is known it must be corrected. Allowing it to stand is basically saying "we don't care about black consumers, they can't use this feature, and we will continue to insult them by accident". At which point it's no longer an accident; it's a known racially discriminatory software feature...

It's like hiring someone to do plumbing. A few customers invite them into their home and report that the new plumber said racist stuff. As a business owner you do what after this?


Comically, the most potentially racist aspect (and I agree that having a bad classifier classify people as gorillas isn’t racist) is that Google hasn’t prioritized it enough in 8 years to fix.

And this seems to fall into the sad, overly pervasive racism of people not caring vs the more explicit “a human calling someone a gorilla is racist” category.

But also, it’s not like there’s many people who need to classify gorillas in photos so it likely doesn’t get brought up that much.


> is that Google hasn’t prioritized it enough in 8 years to fix.

I understand that there's nothing to fix. It doesn't label anything as gorillas anymore.

Why waste effort in bringing it back, when another mislabeled photo will only serve for people to scream "racism" on social media? Better to not poke that particular beehive, I reckon.


The bug is in the classifier, in the sense that I would want high accuracy in classification. I doubt this is the only error it makes, just the most visible. So Google isn't prioritizing accurately classifying black people; they just hard-coded it to never classify gorillas. Presumably because their classifier is still wrong enough that it would think black people are not people.

If they cared about this problem, rather than just treating it as a "beehive" to be avoided, they would fix it. But they don't actually care about accuracy in classifying people of some races, so they just avoid it.

Not super racist, but definitely shows they don’t care about this type of bug in that the affected populations aren’t important to them. I understand not caring about gorillas, but I would expect they should be very interested in accurately classifying black people.


> Presumably because their classifier is still wrong enough that it would think black people are not people.

That is a very weird interpretation.

The classifier is likely accurate enough to be useful, with many edge cases where things are mislabeled. Edge cases might be difficult to fix without causing other issues, or perhaps the technology to be more accurate is just not there yet.

One of those edge cases causes a massive shitstorm on social media, where many people would love to scream "racism" at it.

So we have two possibilities:

1) Put in an indeterminate amount of effort to make things more accurate. It won't bring any benefit (as the current classifier is accurate enough), and if you keep mislabeling black people as monkeys even in rare cases, people will still yell on social media. Beehive indeed.

2) Hardcode gorillas out of the classifier. This solves the issue with minimal effort and causes no shitstorm, besides some annoying people grumbling that it is still racist, but with no substance to their claims.

The choice, to me, is obvious.
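
Option 2 above amounts to a post-processing filter on the classifier's output. A minimal sketch of what that might look like (hypothetical label set and API, not Google's actual code):

    # Hypothetical post-processing filter: blocked labels never reach the user,
    # regardless of the classifier's confidence in them.
    BLOCKED_LABELS = {"gorilla", "chimpanzee", "monkey", "ape"}  # assumed blocklist

    def filter_predictions(predictions):
        """predictions: list of (label, confidence) pairs from the classifier."""
        return [(label, score) for label, score in predictions
                if label.lower() not in BLOCKED_LABELS]

    print(filter_predictions([("person", 0.61), ("gorilla", 0.35)]))
    # -> [('person', 0.61)]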


Let's say the classifier gets it right 999,999 times out of a million. It's about as fixed as it will ever be. Why would Google bring it back when there will be dozens of instances of them getting it wrong a year? It is mostly downside with very little upside.
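
A quick back-of-envelope, with entirely assumed volumes (Google doesn't publish these numbers), shows why even that error rate still produces a steady trickle of visible incidents:

    # Back-of-envelope with assumed numbers; only the 1-in-a-million rate
    # comes from the comment above, the rest is purely illustrative.
    error_rate = 1 / 1_000_000                 # hypothetical 99.9999% accuracy
    photos_labeled_per_year = 10_000_000_000   # assumed volume at Photos scale
    misfires = error_rate * photos_labeled_per_year
    noticed_and_reported = misfires * 0.01     # assumed: 1% get seen and shared
    print(misfires, noticed_and_reported)      # 10000.0 100.0, i.e. "dozens" a year is easy to hit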

You can actually find things Google probably thinks are gorillas and monkeys by searching for "zoo".


I'm on board with assuming that the intent of Google / the engineers who built the classifier is not racist. However the outcome — labelling a black person as a gorilla — certainly is racist. What makes you think otherwise?


The reason I don’t think the actual act of the program bugging is racist is I imagine it’s just some stupid rule based on shape (primates have a similar shape) and skin color and isn’t smart enough to distinguish humans vs gorillas vs Gumby toys with black/grey skin closer to a gorilla.

I think intent is important for labeling something racist and a function doesn’t have intent. And it doesn’t seem like the programmer had intent.

So I agree that a human seeing a person and labeling them as a gorilla is racist; it's because the human is making an inappropriate value judgement.


> The reason I don’t think the actual act of the program bugging is racist is I imagine it’s just some stupid rule based on shape (primates have a similar shape) and skin color and isn’t smart enough to distinguish humans vs gorillas vs Gumby toys with black/grey skin closer to a gorilla.

But this is AI -- so the stupid rule is not pre-programmed, but rather curve-fit to the data (uh, "learned").

So ultimately it's a matter of (1) failing to find the right training data (or procedure) and (2) more fundamentally, choosing not to correct the problem after 8 years.


I was generalizing the bug, but my basis is the assumption that the programmer didn't make some conscious or unconscious racist decision, just did something basic like "match shapes and colors", and the training data had a bunch of gorillas for one reason or another.

I think this gets fixed by better training data with more pictures of really dark-skinned people, supervised with the correct "person" labels, so the matching doesn't think those people are closer to gorillas.
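
One common mitigation along those lines is simply reweighting the training sampler so under-represented groups contribute as much to the loss as the majority. A minimal sketch, assuming each training example carries some group metadata (e.g. an offline skin-tone estimate; that metadata is an assumption here, not something Photos exposes):

    # Rebalancing sketch with hypothetical group metadata and a dummy dataset.
    from collections import Counter

    import torch
    from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

    group_labels = ["light"] * 9000 + ["dark"] * 1000   # assumed 9:1 imbalance
    counts = Counter(group_labels)
    sample_weights = torch.tensor([1.0 / counts[g] for g in group_labels])

    # Dummy tensors standing in for (image, label) pairs.
    dataset = TensorDataset(torch.randn(10_000, 8),
                            torch.zeros(10_000, dtype=torch.long))
    sampler = WeightedRandomSampler(sample_weights,
                                    num_samples=len(sample_weights),
                                    replacement=True)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)
    # Batches now draw from "light" and "dark" roughly equally on average.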

Comically/sadly, we'll know we've gotten closer to fixing the training sets to be more inclusive when Google starts labeling gorillas as people.

I think there are some systemic reasons why there aren't more diverse populations in training data. And those are more society issues than AI issues (i.e., rich people are more represented, rich people are disproportionately of certain races, therefore those races are more represented).

And finally, I've worked in software where people just test what they are and what they know, so I've seen so many test plans that are too simple and only test the programmer's own DOB and address. This doesn't mean it's racist because all the programmers are Asian males. It just means the quality review wasn't thorough enough to include proper test conditions.

I might be inappropriately conflating software bugs from different areas but this is what makes me think “stupidity or weakness more likely than racism.”


How is that racist? Because you’re projecting a racist comment onto it, a classifier? That does not make sense to me


The racism comes from the fact non-white people were not properly considered when the model was developed and trained. This comes up time and time again in AI, ranging from face ID that only works on white people, to porn classifiers that associate black people with NSFW images.


No matter how good a classifier is, or how well trained on good data with good representation of all skin colors, it's going to misclassify people and things periodically, and it's definitely going to misclassify black people as gorillas more often than other races.


But white people get misclassified as animals by the classifier too. Typically white people aren't misclassified as gorillas but as other animals. So I don't think the cause is as simple as non-white people not being considered during training.


It classified 80 photos of the same black person as 'gorilla'; I have not heard of that happening with white people.


I saw lots of examples of white children being classified as seals


Have a link to an example or two? I can't find any after a few minutes of searching.


I was working at Google at the time, so the examples I saw were all in internal documents.


Of course.


> The racism comes from the fact non-white people were not properly considered

Is this the case? Do we even know for a fact that only non-white people were mislabeled as anything else?

Or are we just, you know, throwing out baseless speculation as fact?


If a system consistently misclassifies black persons far more than white persons -- and does so in a way that's obviously provocative and offensive -- then by definition it's racist in its effect (regardless of intent). The fact that the smartest company in the world cannot seem to get a handle on this problem after 8 years is also not unreasonable grounds to suspect that something's up.

Like that they don't appreciate the gravity of the problem, for example.


These are dangerous grounds to discuss, but I don't think it's racist (colloquially) at all. If gorillas were like yetis and covered in white fur, and it started labeling anglos as gorillas, it wouldn't be racist either. Racism (colloquially) comes from bad people's intentions. Who would've thought that a classifier might confuse humans with a creature that is very similar to us and has a color that matches some humans.

What would be racist from this outcome is if it kept doing this and no one did anything. Clearly it hurts people's feelings, and that is a very valid issue. Google's option to just nuke it is a great start until they can hammer out the kinks.


> Racism (colloquially) comes from bad people's intentions.

Racism can also be measured by its effect, regardless of intent.

> What would be racist from this outcome is if it kept doing this and no one did anything.

After 8 years, that seems to be precisely what's happening.


Isn't the point of the article that it just refuses to recognize gorillas outright? That prevents exactly what you're talking about. And I made that point in my post. It is hurtful, so Google preventing Google Photos from classifying anything as a gorilla is a good bandaid. Some things are just too risky to solve for little gain.


Racism isn't an objective order existing in the universe separate from us. It's part of human experience and exists where humans experience it.

Given the recent history of equating black people with non-human primates, and using that to deny them rights & full participation in society, making this error is going to be experienced as racist. It's not a matter of individual malice or taxonomic classification, but of history and social relations.


I think we can all agree that the classifier is horribly broken.

But it seems like if nobody is working on this, how will we ever fix this gaping hole in image classifiers? And don't we want to fix it? To fix it, researchers will continue to get it wrong until they get it less wrong and more right, but they can only iterate if each misstep doesn't trigger a massive backlash. It seems like being stuck between a rock and a hard place.

I am rhetorically asking: wouldn't we have to allow researchers to iterate on this problem to fix it? That simply won't happen until we are able to give them leeway, understanding that this is an incrementally improving model. Otherwise what we have is just a sledgehammer solution (banning all primate classifications) which never actually addresses the problem: that these models do have a race-based bias (probably in their input datasets).


I'm simply answering the question of how it is racist, not currently trying to tackle the appropriateness of fixing the racism or the technical hurdles involved in that. It's outside my expertise and not particularly relevant to the comment I was responding to.


I suppose this could be an example of Popper’s third world.


Because racism is about harm, not an estimation of a thing's motivations and prejudices -- which is why it's still racist even if you didn't mean it or didn't know. It doesn't actually require a mind at all. Anything that confers, amplifies, or perpetuates harmful stereotypes or negative associations with people of a specific race is racist.

The thing you're calling racism is actually hate speech as it's typically defined in law.


Presumably, the GP considers intent to be the only relevant factor in determining if something is racist.


Leaving the current situation aside, it's an interesting philosophical point. In law you have the concept of "mens rea" https://en.wikipedia.org/wiki/Mens_rea


Intent WAS a factor here. There was no intent to consider anyone other than white people when the model was trained.


Define racism?


That's a good question. The ML bias here isn't necessarily due to one prejudiced person doing this on purpose. It's connected to systemic racism, where history and culture added up to a status quo that is biased.

The fact is that training sets usually contain many more white men than black women, especially if they're just scraped off the web. People who guided the training may have just used datasets that reflect their own culture and the demographics of their own country, and didn't see a problem with that. The opposite would have been seen as "pandering to diversity" in their country, so they've ended up with a biased dataset and a biased algorithm.


> It's idiotic to call it "racist", it's a shitty classifier, and it's incredibly irresponsible (and probably racist) as well as dismissive of actual racism to pretend that's racist.

To quote from the article:

>> The company was criticized when a software developer, Jacky Alciné, found the image recognition system deployed in its Photos app in 2015 had mistakenly labelled a photo of him and his friend as gorillas.

People who are black have every right to complain about Black people being classified as gorillas, particularly as that comparison is a very common act of discrimination - most infamously in soccer, where e.g. Vinícius Júnior is routinely subjected to monkey sounds and chants [1] and the President of the Spanish Agents association literally told him to "stop playing the monkey" as a goal celebration [2].

[1] https://www.cbssports.com/soccer/news/vinicius-junior-faces-...

[2] https://www.dailymail.co.uk/sport/sportsnews/article-1121907...


There are efforts to make more accurate classifiers (2021), which seem to come down to avoiding dataset biases, but for them to work, they essentially have to group humans into racial subsets (of which 'gorilla' and 'chimpanzee' might be discrete entities, which is evolutionarily interesting at least, as humans/chimps/gorillas are more closely related to each other than to any outgroup).

https://www.mdpi.com/2227-7390/9/2/195

> "Studies have shown that due to the distribution of ethnicity/race in training datasets, biometric algorithms suffer from “cross race effect”—their performance is better on subjects closer to the “country of origin” of the algorithm."

However, these classifiers are also problematic, as racial identification is a bit thorny politically. It's comparable to gender identification classifiers in the context of trans politics (as, regardless of personal identification, there are some classifier-accessible differences between the vast majority of female and male faces, although I suppose there's an androgynous middle ground).

To go out on a limb a little, imagine an 'Aryan vs. Jewish' classifier - who wants to be associated with that? Lots of downsides for any commercial outfit, certainly, and few upsides.
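
For what it's worth, the "cross race effect" the quoted paper measures usually just means computing accuracy per demographic subgroup instead of one aggregate number. A minimal sketch with made-up data and hypothetical group labels:

    # Per-group accuracy sketch (toy data, hypothetical group labels).
    from collections import defaultdict

    def per_group_accuracy(records):
        """records: iterable of (group, predicted_label, true_label)."""
        correct, total = defaultdict(int), defaultdict(int)
        for group, pred, truth in records:
            total[group] += 1
            correct[group] += int(pred == truth)
        return {g: correct[g] / total[g] for g in total}

    # Aggregate accuracy (0.89) hides the gap between the two subgroups.
    data = ([("A", "person", "person")] * 98 + [("A", "cat", "person")] * 2 +
            [("B", "person", "person")] * 80 + [("B", "gorilla", "person")] * 20)
    print(per_group_accuracy(data))  # {'A': 0.98, 'B': 0.8}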


> it's incredibly irresponsible (and probably racist) as well as dismissive of actual racism to pretend that's racist.

Because of the entanglement of people who harbor individual racial animus and the systemic factors that lead to disparate outcomes for people of different races, we are in a bad situation vis-à-vis language, but I think this isn't quite right.

I agree there's no reason to believe the engineers involved had any prejudice (i.e. individualized animus against non-white people). It's the result of under-testing or corner cases or whatever. However, the system is definitely racist.

It's racist in the sense that it has an adverse impact on people of different races (white people will not be mis-detected as gorillas), and it's racist in the sense that there must have been racial imbalances in development (one struggles to imagine this being launched if all humans were being detected as gorillas, for instance). This doesn't require any prejudice on the part of the people making it - it's just a fact that we can observe that it has disparate racial impacts, and that choices in its development have led to a product that is unevenly useful to people of different races. It's also clear because facial recognition systems perform worse across all measures on people with darker skin (not just on this metric) - it's not like the accuracy rate is the same but the mistakes are different.

Also, of course, yes - people will point out that different skin tones interact differently with photographic technology and that algorithm designers are fighting an uphill battle. It's true! We have historically been better at solving the challenges of capturing images of light-skinned people than dark-skinned ones. That's racist too! It's hard to not build a racist system when your tools already have racial bias. Again - pointing out the racial bias in this system is not an indictment of the people who built it. It's an indictment of the world it emerged out of, and a bookmark to come back to when you see people using the outputs of the system.


My guess is that the model does work, but that verifying that it works is just way too expensive for a tiny benefit. Especially if the model is constantly evolving.


> It's idiotic to call it "racist", it's a shitty classifier

This is sort of like telling residents of a historically black neighborhood which just so happens to have been exposed to drinking water with dangerously high levels of lead since time immemorial (despite decades of complaints) that it's just "shitty pipes", and that it's "idiotic" for anyone to think this state of affairs could possibly have something to do with racism.

I agree that the matter is quite nuanced, and there was no racist intent. However it's far from "idiotic" to address these issues directly. If anything it seems a bit obtuse not to address them.

Where there's smoke, after all, there's usually some form of fire.


I think the lead pipes example is different because both the neighborhood structure and the lead are results of the same intentional racism.

The classifier is not the result of any racist action on behalf of the programmer.

It's more like blaming a hospital administrator when people of a discriminated race have bad outcomes at their hospital, when the outcome matches the systemic rate of all the other hospitals. The hospital outcome isn't racist, because the systemic issues that affect the outcomes of the hospital's patients have nothing to do with the decisions of the hospital.


Not necessarily -- recall that lead service lines were extremely widely used in the U.S. in all neighborhoods, black and white. So in fact it's more likely that the pipes were first put into the ground without any racist intent at all.

It's the fact of this problem not being remediated in certain neighborhoods that causes people to ask, very legitimately, how this came to pass.


I think it's idiotic and racist to dismiss people's offense at being labeled as an animal with a long history of being used as a derogatory racial insult.


> I wonder if they're over-reacting

There doesn't seem to be any benefit to solving this problem, so how is it an overreaction given the negative press?


There's no benefit to you personally maybe? I'm sure there's a lot of black people who would like to be correctly identified instead of being labeled gorillas.


They mean there's no benefit to correctly labelling gorillas, so they keep that turned off completely.


They aren't labelled gorillas. Google's API won't label gorillas.


[flagged]


That would at least be an over-specific class though, as most colonialists are people. The issue here is that it's often not seeing people as people, and identifying them as animals. It would be more like if it ID'd white folks as cows or such.


White folks = cows isn't a fair comparison, there is a strong racist history of comparing black people to monkeys/gorillas that increases the social harm.


Sure. There's not a good comparison. But even if we're just talking about differences in training model issues, a portion of the underlying issue is that the classification is so wrong that it doesn't see some people as even being human.

The fact that it's specifically gorillas is a level on top of that, but it would probably still be bad/a problem if it had the other common computer vision issue where it just doesn't see people with darker skin tones at all.


If it labeled me as a pig I’d understand it was a misclassification. As a colonizer I’d understand it was clearly someone injecting bias.


I would take being called a gorilla as a compliment. They are jovial, endearing, and typically well tempered. Being called a human, on the other hand, I would almost consider an insult, since we start massive wars, raze cultures, and poke fun at each other in ways that are racist. Kinda bad when animals are better role models.

But really it is history that is to blame. Gorillas are great. If you called anyone a lion or lioness it would be perceived as a compliment; gorillas don't deserve their negative connotation.


I think I could write a classifier that would accurately separate out gorillas and humans in a weekend, given a dataset. But I guess, as it's pointed out here, there's no upside to getting it right and massive downside to getting it wrong.
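
For the narrow binary problem described here (gorilla vs. human, with a labeled dataset already in hand), the standard weekend recipe would be transfer learning: freeze a pretrained backbone and fit a small head. A minimal sketch, assuming a hypothetical folder layout of data/human and data/gorilla:

    # Transfer-learning sketch (assumed layout: data/{human,gorilla}/*.jpg).
    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision.models import resnet18, ResNet18_Weights

    weights = ResNet18_Weights.DEFAULT
    model = resnet18(weights=weights)
    for p in model.parameters():        # freeze the pretrained backbone
        p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, 2)   # new head: human vs gorilla

    dataset = datasets.ImageFolder("data", transform=weights.transforms())
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(3):              # a few passes suffice for a toy binary task
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()

The isolated task really is about that easy; as the reply below notes, the hard part is making it coexist with thousands of other classes in a production system.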


I suspect the challenge is not in getting that particular feature to work in isolation (which I agree seems trivial). But rather, in getting it to work in a way that integrates with the 10,000 other decision boundary problems that a large-scale AI system needs to be able to navigate.

That is to say: not so challenging that they couldn't have gotten it wrapped up by now, if they cared enough to do so. But not necessarily trivial, either.


Both gorillas and black people have dark skin and reasonably flat features.

The lack of contrast can make things hard to evaluate, especially if you're dealing with SDR content in a format that struggles to accurately encode darker colours.

It's not racism, it's a lack of information to the evaluator and a failure of communication by the company.

If they had actually put out a statement regarding the technical issues around the problem, rather than trying to bury it in order to minimise the backlash, we could be having some far more reasonable and level-headed discussions.


Also a lot of the training set for “human” is on weird mutant albino humans who are over represented in images for historical reasons. It’s not offensive to look more like our relatives than the weird mutant albino alien looking creatures do, is it?


I read the entire wikipedia page on racism to see why people call a bug in an experimental product feature, based on bad classifiers and data, a form of racism.

It's unfortunate. And I can see why it worries people. Because on the surface it looks like a systemic problem. But is it really?


Perhaps the existence of the bug is not racism, but failing to find, fix, or disable a feature that makes offensive classifications is racist.


This is a valuable lesson in what AI can do for society: act as a mirror to show us systemic biases that exist that people try to pretend don't exist.

I think most here would understand systemic bias if it were reflected in a distributed system. Imagine a classifier that, through no explicit intent of its designer, always said that some user login was malicious or some binary was malware or the like, even when it was not. And all those users or binaries had something apparent in common, and were all being misclassified. That would be systemic bias, and nobody would blink an eye at a call for it to be corrected.

Here we see systemic racial bias, probably not because of individual racial animus on the part of some Google engineer, but because society has systemic racial bias and the classifier is just reflecting that because it was trained on public data.

We should look at incidents like this differently: not as yet another chance to argue about what Google should or shouldn't do, but as a chance to see that a socially-neutral AI that does not have bias built into it is learning racial bias from society.


There is a rigorous definition of bias, and that isn't it. That is called 'variance'. It doesn't matter if there is structure to the variance. It doesn't matter if the model is consistently wrong for one type of input. It is not bias unless it has been specifically and intentionally programmed to predict something other than the result from the maximum likelihood estimator given the data. A trickier type of bias is in the dataset itself, but once again, this would require a specific effort and intention to produce that bias by modifying the collection or filtering procedure. Nothing of the sort has been alleged, so you have provided no evidence of bias, 'systemic' or otherwise.

Come up with a different word. Maybe “non-uniform variance”.
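
For reference, the textbook decomposition that this terminology comes from, for the expected squared error of a learned predictor \hat{f} at a point x (with irreducible noise variance \sigma^2):

    \mathbb{E}\big[(y - \hat{f}(x))^2\big]
      = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
      + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
      + \sigma^2

Whether a consistent error on one subpopulation belongs in the bias term or counts as structured variance is exactly the point these two comments disagree about.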


> the classifier is just reflecting that because it was trained on public data

Are you suggesting that the cause here is that there are a lot of racist images on the internet visually depicting black people as gorillas?

I assume that is not the understanding of most people here who are questioning what makes this racist, because if that is indeed the cause then I agree it is obviously racist.

Is it the cause though? I would be interested to see some evidence or rationale, because I don’t think I have ever seen an image like this outside of perhaps a handful of very old hand drawn propaganda images (which presumably would not affect the classifier much due to being overwhelmed by the vast number of images of real people and real gorillas).


We need to be endlessly vigilant about how important it is to be able to run AI systems ourselves and use the opportunities they provide, without the oversight of HR or PR staff.


Even if there's a perfectly reasonable and rational explanation for why classifiers like this make misclassifications like this, the fact that this sort of misclassification continues to be present in a widely used public-facing model is racist. Furthermore, attempts to dismiss the above as not being racist because "it's technical in nature" are also racist, because they blatantly ignore the harm that comes from continuing to parrot a well-known racial caricature.


This proves my point. Google never had any lead in AI. All they have and will have is PR stunts.


So is this a "racist data" thing or a "bad recognition" thing? Do pictures of white people get mislabeled as something else in ways pictures of other ethnicities don't?


Can't or won't



