Those are most likely due to the system prompt, which tries to reduce bias (but ends up introducing bias in the opposite direction for some prompts, as you can see), so I wouldn't expect to see that happen with an open model where you can control the entire system prompt.
Because I think it would be kinda hilarious: trying to make people believe they are very progressive by biasing the model to such an extreme, while in the real world nothing changes. Also because I believe the model is the result of a kind of white guilt mentality that some people seem to have; one person who led the development of Gemini, a white man, tried to defend it on Twitter yesterday.
Of all the very very very many things that Google models get wrong, not understanding nationality and skin tone distributions seems to be a very weird one to focus on.
Why are there three links to this question? And why are people so upset over it? Very odd, seems like it is mostly driven by political rage.
Exactly. Sure, this particular example is driven by political rage, but the underlying issue is that the maintainers of these models are altering them to conform to an agenda. It's not even surprising that people choose to focus on the political rage aspect of it, because that same political rage is the source of the agenda in the first place. It's a concerning precedent to set, because what other non-political modifications might be in the model?
Well, every model is altered to conform to an agenda. You will train it on data, which you have personally picked (and is therefore subject to your own bias), and you'll guide its training to match the goal you wish to achieve with the model. If you were doing the training, your own agenda would come into play. Google's agenda is to make something very general that works for everyone.
So if you're trying to be as unbiased as humanly possible, you might say, just use the raw datasets that exist in the world. But we live in a world where the datasets themselves are often biased.
Bias in ML and other types of models is well-documented, and can cause very real repercussions. Poor representation in datasets can cause groups to be unfairly disadvantaged when an insurance premium or mortgage is calculated, for example. It can also mean your phone's ML photography system doesn't expose certain skin colors very well.
Even if it were trained with a statistically representative dataset (e.g. about 2/3 of the US is white), you want your model to work for ALL your customers, not just 2/3 of them. Since ML is heavily statistical, your trained model will see "most of this dataset is white" and the results will reflect that. So it is 100% necessary to make adjustments if you want your model to work accurately for everyone, and not just the dominant population in the dataset.
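To make that concrete, here's a minimal sketch (purely illustrative, not anything Google has described) of one standard correction for dataset imbalance: "balanced" class weights, which make errors on under-represented groups cost more during training. The toy labels are invented for the example.

    # Illustrative only: reweight classes so the model can't just learn the majority.
    import numpy as np
    from sklearn.utils.class_weight import compute_class_weight

    # Made-up labels where one group dominates the training data.
    labels = np.array(["majority"] * 670 + ["minority_1"] * 180 + ["minority_2"] * 150)

    weights = compute_class_weight(class_weight="balanced",
                                   classes=np.unique(labels), y=labels)
    # Minority classes get the larger weights (~1.85 and ~2.22 here), so errors
    # on them cost more; the model can't win by only getting the majority right.
    print(dict(zip(np.unique(labels), weights.round(2))))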
Even if we aren't using these models for much yet, a racist AI model would seriously harm how people trust and rely on these models. As a result, training models to avoid bias is 100% an important part of the agenda, even when the agenda is just creating a model that works well for everyone.
Obviously, that's gone off the rails a bit with these examples, but it is a real problem nonetheless. (And training a model to understand the difference between our modern world and what things were like historically is a complex problem, I'm sure!)
I'm pretty sure that this whole story with Gemini and now this has already seriously harmed how people trust and rely on those models way more than any implicit biases from the training data.
Is it intentional? You think they intentionally made it not understand skin tone distribution by country? I would believe it if there was proof, but with all the other things it gets wrong it's weird to jump to that conclusion.
There's way too much politics in these things. I'm tired of people pushing on the politics rather than pushing for better tech.
> Is it intentional? You think they intentionally made it not understand skin tone distribution by country? I would believe it if there was proof, but with all the other things it gets wrong it's weird to jump to that conclusion.
Yes, it's absolutely intentional. Leaked system prompts from other AIs such as DALL-E show that they are being explicitly prompted to inject racial "diversity" into their outputs even in contexts where it makes no sense, and there's no reason to assume the same isn't being done here, since the result seems way worse than anything I've seen from DALL-E and others.
I mean, I asked it for a samurai from a specific Japanese time period and it gave me a picture of a "non-binary indigenous American woman" (its words, not mine) so I think there is something intentional going on.
Ah, I remember when such things were mere jokes. If AI 'trained' this way ever has a serious real world application, I don't think there will be much laughing.
Google has been burnt before, e.g. classifying black people as gorillas in 2015, so I can understand their fear when they have so much to lose, but clearly they've gone way too far the other way and are going to have to do a lot to regain people's trust. For now, Gemini is a plaything.
What group are you talking about? In any case, your account appears to be freshly made, and you are indeed trolling around. Many gray comments and so on. What happened to your previous account I wonder?
I think it's great that some consideration was given by Gemma to the 2.3 million Norwegian immigrants. However, it was very consistent, 100% of the time, in which kind of Norwegians it decided to show, regardless of the prompt.
In fact it was quite adamant regardless of the time period or geography.
Rather mysteriously, if you try it now, as opposed to when it came out, the results only show non-immigrant Norwegians. So is it wrong now? Because now it has switched to exclusively ignoring the 4.5 million immigrants and only showing me the boring OG Norwegians.
I for one am outraged that the 8.9 million people-of-color Norwegian immigrants are presently underrepresented by Google. There is a serious risk of misleading people.
Cut down on the grandstanding, maybe. It's clear from its descriptions and what we know now that they just carelessly added "diverse ethnicities and genders" or whatever to prompts across the board, to compensate for a model that otherwise would clearly have defaulted to just spitting out pictures of white people for most prompts. That's not part of some nefarious agenda to destroy Truth and history; it's literally just them trying to cover their asses, because Google has a history of accidental racism (e.g. the "tagging Black people as gorillas" incident a while back).
Pretending that a shoddy AI image generator with a blatant inability to produce consistent output is a "serious risk" is ridiculous. The thing wasn't even able to give you a picture of the Founding Fathers that didn't look like a Colors of Benetton ad. I struggle to imagine what tangible risk this "misinfo" would pose. Norwegians being the wrong color? And what harm does that do? Bad assumptions about the prevalence of sickle cell anemia?
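For what it's worth, the careless prompt rewriting described above doesn't require anything exotic. A hypothetical sketch of that kind of blanket augmentation (the suffix text and function names are invented for illustration, not a leaked Google prompt) would look roughly like this:

    # Hypothetical illustration of blanket prompt rewriting, as described above.
    DIVERSITY_SUFFIX = ", depicting people of diverse ethnicities and genders"

    def augment_prompt(user_prompt: str) -> str:
        # Bolt the suffix onto every image prompt, with no check for whether it
        # contradicts the historical or geographic context of the request.
        return user_prompt + DIVERSITY_SUFFIX

    print(augment_prompt("a samurai from Sengoku-era Japan"))
    print(augment_prompt("a 19th-century Norwegian farming family"))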
E.g.
https://x.com/debarghya_das/status/1759786243519615169?s=20
https://x.com/MiceynComplex/status/1759833997688107301?s=20
https://x.com/AravSrinivas/status/1759826471655452984?s=20