The rumour that IBM is dropping “all facial recognition work” is unsubstantiated, despite making its way into industry headlines.
Krishna’s letter is here[0]. IBM will cease to sell related products and services. One might speculate that they will resume sales once strong regulations and limitations are in place.
>>> IBM no longer offers general purpose IBM facial recognition or analysis software. IBM firmly opposes and will not condone uses of any technology, including facial recognition technology offered by other vendors, for mass surveillance, racial profiling, violations of basic human rights and freedoms, or any purpose which is not consistent with our values and Principles of Trust and Transparency. We believe now is the time to begin a national dialogue on whether and how facial recognition technology should be employed by domestic law enforcement agencies.
Even if the phrasing is limited to "general purpose" and "software", it still seems a pretty strong stance to me.
I'm starting to like them more and more. POWER as the only unbackdoored modern CPU, and now this. Seems to be a big marketing push, in the right direction.
IBM recently laid off many of the POWER staff in their last round of layoffs (OpenPOWER is effectively down to a skeleton crew, and POWER proper lost people too). Speculation is that they are on the path to selling off the POWER design group.
I think they are referring to it not having Intel ME or TrustZone (which is present in AMD chips and ARM chips), which could allow the processor manufacturer to run code that you don't know about. Of course, there are no Power systems for consumer use, so we're kind of stuck with Intel/AMD (until Apple releases those ARM Macs, which probably won't be able to boot Linux :( ).
Northrop Grumman has been developing very sophisticated, state-of-the-art, facial recognition technology for the last 15 years. There are several USA defense contractors that have this technology.
> Northrop Grumman has been developing very sophisticated, state-of-the-art, facial recognition technology for the last 15 years.
Yikes, hopefully you're not suggesting that the length of time is a virtue? This may be a stereotype but I thought Northrop Grumman was one of those "attach" defense contractors that do work for years and years but never actually deliver any software.
Neural networks were being used in image recognition (if not facial recognition) at least as early as 1983 (I'm not in the field, so I just did 5 seconds of research and found something related) - https://ieeexplore.ieee.org/document/6313076
"A recognition with a large-scale network is simulated on a PDP-11/34 minicomputer and is shown to have a great capability for visual pattern recognition. The model consists of nine layers of cells. ..."
another paper from 1990 using neural networks for facial sex identification:
and software facial recognition itself dates to the 60s.
despite what the 'data science and AI and by AI I really only mean the subset of AI referring to ML' hypetrain tells you, these things are not new, the processors are just faster, and the tools are more accessible, which means the relevant application space is larger.
> these things are not new, the processors are just faster, and the tools are more accessible, which means the relevant application space is larger.
The core of my argument is that most of contemporary ML isn't new or different in an "algorithmic breakthrough" sense, but in that we've massively sped up GIGO. An AI researcher from the 60s/70s teleported to today isn't going to have a rough time figuring out what's going on; they are going to whistle in shock at how massively bloated our "sample sets" and data corpora have grown, how fast we're regularly churning through that data, and how much of it we are leaving unsupervised or "self-supervised", with maybe some questions as to what we did with our (lack of) QA/QR processes along the way.
I was studying RNN-driven image classification 12 years ago. No doubt defense companies were waayyyy ahead of a master's-degree course at a university outside the US.
^W denotes Ctrl-W, the shortcut to delete the previous word, originally from Berkeley (thanks, Wikipedia!) Unix-style terminals (and ^H denotes the control code for backspace, deleting the previous letter). The Jargon File's example shows how these are used for humorous effect: '"Be nice to this fool^H^H^H^Hgentleman, he's visiting from corporate HQ." reads roughly as “Be nice to this fool, er, gentleman...”, with irony emphasized.' (http://www.catb.org/~esr/jargon/html/writing-style.html)
I'm confused by this comment. The YF-23 was not a successful project, but its successful competitor is well known to the public. Furthermore, there are successful Northrop projects that are known to the public, for instance the B-2. And the B-2 was shown to the public years before the first operational one was delivered to the Air Force.
There are a number of cynical comments here about how they weren't making money on the technology and are just announcing this for PR reasons. Well, maybe, but isn't that sort of cynical response even worse?
I'm rabidly against the use of facial recognition on unwilling subjects, whether it's a government actor (by far the most oppressive use) or a corporate actor. I'm rabidly against public space cameras. I want to see this technology die and never return.
We all love to pounce on companies for doing things we don't like. Why don't we celebrate this as the victory it is? Of course there's a PR component here. Why wouldn't they make an announcement? Why wouldn't they do it now when the audience might be more receptive to the idea? The fact that they weren't making money off it is unlikely to be the only reason they're canceling it. IBM plays the long game, and there's absolutely a market for this technology. A huge and profitable market. They could have kept at it and turned a profit.
So, yeah, they're trying to make some hay, but not every corporate action is purely cynical and evil. Let's appreciate that they've made a positive change, and let's hope that it increases awareness of a horrible technology, and puts pressure on the more egregious actors like Amazon and the defense industry. We don't have to pat IBM on the back, but we can cut them some slack.
Amen. Even if they're doing it for purely selfish reasons, it's still great that they used it as an opportunity to set an example. And I hasten to add, it's not exactly an unmitigated marketing win. It'll certainly raise some eyebrows in some circles to see IBM taking a moral stance on something.
It's just software; if an actor wanted to use facial recognition, they could write it themselves. Isn't the correct attack vector regulation rather than having a company pull its products? If IBM isn't doing it, someone else will, and it'll just be less competition, meaning a better-funded product for the industry winner.
The primary customers for this technology are governments. They want it. They're not going to regulate it away from themselves. At best they'll regulate the far less dangerous civil use so as to pay lip service to concerns and amplify their "think of the children" misdirection. They can't write it as quickly or as effectively themselves. Governments don't write software. They pay IBM, Amazon, and Lockheed to do it. If most vendors grow a conscience (or fear consumer retribution), then no one writes it. Or it becomes prohibitively expensive. Or it doesn't work as well (which isn't necessarily always good, but is in this case, because it will reduce public support).
> They're not going to regulate it away from themselves.
Hum... speak for the US here. The EU already did that, Brazil (my country) already did that, and I think among our neighbors Chile and Argentina already did too.
Besides, the government software that works is all written by the government. Too bad 99% of government software is contracted out.
Assigning morality to facial recognition, a task most humans can do, is more than a bit bizarre in my opinion. I tend to operate on a "free extended thought" model when it comes to processing publicly available inputs.
I am certainly not oblivious to the abuses of it and it being a Morton's fork in practice - if it is inaccurate you get many innocents harmed by idiots treating a screening tool as a unique identifier. If it is 100% accurate you can trace anyone perfectly.
Personally, I see the cynicism as an acknowledgment that their apparent capabilities are lacking. Claiming the moral high ground over something you aren't capable of isn't exactly meaningful. A dyscalculic janitor boasting that their morals keep them from getting rich designing missiles for the military-industrial complex doesn't mean much, because they would be incapable in the first place. Assigning morality to it just gets silly.
> Assigning morality to facial recognition, a task most humans can do, is more than a bit bizarre in my opinion. I tend to operate on a "free extended thought" model when it comes to processing publicly available inputs.
The problem with that is that while in theory you could use a million people to do things like facial recognition on the scale allowed by technology, in practice this would be incredibly expensive, so it doesn't happen.
If I can do it to my neighbor, that is vastly different from being able to do it to my whole city. It has very different consequences and should be weighed on its own.
>The problem with that is that while in theory you could use a million people to do things like facial recognition on the scale allowed by technology, in practice this would be incredibly expensive, so it doesn't happen.
Rather, I would say that if you use a million people to do this, it would be morally abhorrent as well. Just look at Stasi informants in the DDR.
I will say that 3 years ago, when we tested all the major vendors' image recognition technology, they all failed spectacularly. And this wasn't some small project; we ran tens of thousands of our images through (out of over a million) and worked with several PhD students in our efforts.
A picture with a white college kid in the frame would get the output of "human". Put an African borrower in the frame and you got at best a failure to recognize a human, and at worst a reference to an animal.
I would hope the situation is much better now, but the bias (and just sheer inaccuracy) of the tools was readily apparent and we gave up on image recognition for the time being.
There are real technical challenges in producing photographs of both light and dark skin, but the focus on light-skinned subjects has led to an accumulation of technical solutions.
Computerized facial recognition has been progressing for something like 50 years, with each paper building on previous work. I suspect a parallel world where the dominant social group had dark skin would explore different technical avenues, focus on ways of recognizing faces that lean less on tone, etc. Obviously it's hard to say with what we know now.
Humans have difficulty recognizing cross-ethnic faces; however, I don't think dark-skinned people are worse at recognizing same-ethnicity faces than light-skinned people. To me that suggests our social history is more to blame than some fundamental quality of skin tone.
It goes even deeper than that. If you look at the history of photography, film, and television you'll see a bias towards white skin tones that standards have been built up around. Just looking at "leader ladies" or common reference images there isn't a bunch of diversity. In practice I've heard cinematographers and photographers bring it up around filming different ethnicities. It starts with changing the exposure, but means you have to spend more attention towards color reproduction.
This can extend beyond the film or image sensor to biasing lighting equipment, defaults, standards, and thresholds.
Looking through the literature you'll see things like, "the design decisions in the NTSC’s test images effectively biased the format toward rendering white people as more lifelike than other races"
I'm mostly familiar with the history in the US. I'm earnestly curious about how other countries accommodated these types of things as they adopted the technologies.
It's deeper than that. The chemicals used in analog color photography didn't have a dye profile that could capture people of color effectively until the 80s.
Even deeper: Kodak only started putting effort into getting the process right in the 70s, after customers complained about poor color reproduction[1] in wood grain and chocolate.
I recall being amused maybe 3 years ago when I took a picture of two friends, one with dark skin, one with light skin, and the camera chose to expose for the darker skin. He looked fine; the lighter-skinned friend looked like snow lit by headlights. Completely washed out.
The camera settings and background lighting make a big difference.
FYI: Wedding photographers deal with the contrast between white dresses and black suits/tuxedos every day, for the last century.
Some notes:
- the eye can handle the dynamic range, but there's no correct exposure for that photographic combination, so they play with lighting, exposure bracketing and dodging when printing
- it's recommended that brides wear off-white (like cream) or other colors of fabric if they're concerned about the photos
A RAW photo (raw input from the camera sensor, without processing) can often be used in post production to get more range out of a photo than would be possible with a JPEG. One of my desktop backgrounds I often use was actually quite overexposed due to the sun shining through trees. I was able to correct for it in Lightroom and it's now one of my favorite shots.
That's exactly what (exposure) bracketing is for. You take three photos at different exposures and merge them.
This is also how a lot of HDR photography is done.
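For anyone curious what the merge step looks like in practice, here's a minimal exposure-fusion sketch using OpenCV's Mertens merge (the filenames are placeholders, and Mertens fusion is just one of several ways to combine a bracket):

```python
# Minimal exposure-fusion sketch: merge three bracketed shots with OpenCV.
import cv2
import numpy as np

# Load the bracketed exposures (any number works; three is typical).
exposures = [cv2.imread(p) for p in ("under.jpg", "normal.jpg", "over.jpg")]

# Mertens fusion needs no exposure times and returns a float image in [0, 1].
merge = cv2.createMergeMertens()
fused = merge.process(exposures)

# Convert back to 8-bit for saving/display.
cv2.imwrite("fused.jpg", np.clip(fused * 255, 0, 255).astype(np.uint8))
```

Mertens fusion skips the true HDR radiance-map step, which is usually fine when the goal is a presentable photo rather than physically meaningful values.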
That's one way, but it's a lot of work for more than one photo. Going from adjusting settings to photoshopping is a huge step up in terms of labor and cost.
And that is why you pay a wedding photographer instead of just asking uncle Herbert to take some pretty pictures with his DSLR.
Mind you, the hobbyist uncle may produce great photos and may be a wizard with Lightroom and Photoshop. But a lot of hobbyists try to compensate for skill with gear, or, more charitably, may not even be aware of the related skills and possibilities.
So in the end the snapshots your friend Andy took with his 2 year old iPhone might look better (have better automatic processing) than those photos Herbert took with his expensive prosumer or even pro kit.
I learned that one convention group I'm a part of converged on very visible glossy white badge cards not just because it is easy to Sharpie names visibly on them (in classic "Hello my name is" style), but also at the request of the photography group, since a white card is a very useful meter reference visible in almost every photo of any group member.
Seeing how many great photos come out of the event in sometimes very strange lighting conditions (as you might expect from an event that includes everything from brightly lit outdoors events to indoor conference spaces and theaters to rock show-lit concerts, and everything in between), I'd imagine that little metering trick is doing wonders.
On my A7S2 I simply use the exposure correction wheel in such situations - or use the exposure series function so I can mess around with HDR later in Photoshop.
Yeah, I usually just use exposure comp too; it works pretty well most of the time, but candids when everyone's moving and the background is changing can be pretty hard without a fill flash to help out a bit. Thank god for ISO invariance though.
White people also have higher facial variance in general. I vaguely remember in university we had an assignment to generate “eigenfaces” or something and if you partitioned the faces by race the output of the SVD would be much wider for white people. This isn’t especially surprising when you consider the fact that white people have more light/dark contrast, more hair colors, more eye colors, etc. I think a lot of the “bias” complaints levied against algorithms like this are not bias at all, but just humans who are unhappy the world doesn’t quite live up to their idealized model.
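For anyone who hasn't seen the exercise being described: an eigenfaces computation is just PCA (equivalently, a truncated SVD) over a stack of face images. The dataset below (scikit-learn's LFW loader) and the component count are arbitrary stand-ins, and nothing in this sketch settles the variance claim one way or the other:

```python
# Rough eigenfaces sketch: PCA over a face dataset.
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA

faces = fetch_lfw_people(min_faces_per_person=20, resize=0.4)
X = faces.data  # each row is a flattened grayscale face image

pca = PCA(n_components=100, whiten=True).fit(X)
eigenfaces = pca.components_.reshape((100, *faces.images.shape[1:]))

# The explained-variance spectrum is one (crude) way to look at how spread
# out a given set of faces is in pixel space.
print(pca.explained_variance_ratio_[:10])
```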
When you sample from a smaller pool you will make uninformed statements like this. Black/Dark people of the world are not limited to Black people in Atlanta. I bet many people here do not know that there are Black people in this world, some who have naturally blonde hair and some who have blue eyes; google it.
Moreover capturing Black/Dark skin and features requires more accurate light metering & lighting because dark skin absorbs more light. There's a lot of variance in cheekbones, noses, and lips.
Humans' features in general are more complicated than you realize.
Ok, but that can still be true (blonde hair, blue eyes) while there is still much more variation in the white population than in the black population.
I'm curious how many white people there are on earth vs how many black people there are, and other races. A couple google searches didn't give me any easy finds
Black people can have blond hair and blue eyes also. It's common in Melanesians, but not unheard of in African Americans either. I had blonde hair when I was a baby, and genetically I'm 83% African (the average admixture across all Black Americans). An uncle of mine had blue eyes when he was born.
That’s because “white” and “black” are loose, shifting, ideological constructions with little basis in the scientific reality of human genetic variation. Many “white people” weren’t considered “white” until fairly recently and Africa actually has more human genetic variation than anywhere else.
I always cringe when reading about "race" in the US; the term really (intentionally?) gives the impression that there is a clear genetic demarcation between people based on skin color.
The US is the only country I am aware of that still uses this term; everywhere else uses something like ethnicity to indicate a different culture, origin...
The UN pushed from the 1950s to replace race with culture. Many people around the world now say "But he is from a different culture" instead of "But he is from a different race."
> little basis in the scientific reality of human genetic variation
This meme dates back to a loose claim made by R. Lewontin back in the 70s. In fact, you can very precisely and reliably recreate the "intuitive" human racial categorization using unsupervised algorithms, like doing multi-dimensional clustering over fixation indices. (It does not work using single-dimensional clustering, which is what Lewontin was talking about.)
Modern biologists usually talk in terms of clines rather than races, but this is just using the first derivative instead of the zeroth - you'll get the same result either way.
> Africa actually has more human genetic variation than anywhere else.
SNP diversity has ~nothing to do with phenotypic variance.
Of course the question here is recreate whose “intuitive racial categorization” because all of that is historically and culturally specific. Saying it’s possible for a computer to recreate these categorizations presumes that the categorization has some objective reality outside of this when they’re just a variable heuristic determined by all those inputs.
> all of that is historically and culturally specific
Not really - almost everyone can agree on "middle eastern/north african", "east asian", "south asian", "black african", "white", etc. If you force people to pick a single-digit number of major categories, they're probably going to come up with the same categories that k-means in fixation space would.
This is an evidence-free supposition, consistent with your pattern across this thread of making broad claims without anything to support them. You’ve provided no proof that k-means on a representative sample of phenotypic variation in the groups you cite would return this result.
That almost everyone can agree on these categories is also contrary to reality. For example, many of those you describe as East Asian consider themselves racially distinct both within their societies and from their nearby neighbors. Also, what major categories do mixed race people fall inside?
It's not my job to provide detailed proof on every HN post I make; I'm just pointing out something relevant, and if it interests you, you can go ahead and find where people have already done this. I think I've been specific enough that you can find this stuff on your own. This took me about 1 minute to find: https://www.discovermagazine.com/health/to-classify-humanity...
> many of those you describe as East Asian consider themselves racially distinct
That's why I specifically mentioned the number of racial categories involved. Obviously as the number increases you can have different clustering results.
> Also, what major categories do mixed race people fall inside?
Obviously not into any of them, if we're talking about a simple mechanical classifier with high separation.
The number of racial categories would itself be an arbitrary limit not corresponding to actual genetic variance, nor would classification under such limit capture said variance, and none of it would match up to the folk biology of racial categorization. This is the general problem with reasoning backwards from 19th century gobbledygook about human genetic variation instead of beginning with the genetics themselves.
It may not be “your job” to provide such evidence, but you’ve made a series of specific claims about things like the rate of phenotypic variance among different racial groups. If you don’t want to defend them, that’s your prerogative, but you also can’t expect them to be received as authoritative or remain free of challenge.
> Ok, but that can still be true (blonde hair, blue eyes) while there is still much more variation in the white population than in the black population.
Even the man who coined "Caucasian" as a racial category recognized that there was more physical variance among African populations and individuals than compared with Europeans.
You've made the one good point that I've seen in these responses, which is that I think all the pictures were sourced from the US. If black Americans are descended from a relatively narrow geographic region in Africa, that could lead to me underestimating phenotypic variance for black people in general. However, the problem would still exist when the technology is deployed in America.
> some who have naturally blonde hair and some who have blue eyes
I know they exist, but we are talking about statistical properties of entire populations, and these people are very rare.
> Humans' features in general are more complicated than you realize.
It's not about what I realize - it's about what can be mechanically detected.
What can be mechanically detected is limited to how the data is collected.
You missed this part: "Moreover capturing Black/Dark skin and features requires more accurate light metering & lighting because dark skin absorbs more light."
I have to see the images used to train the ML model, to be certain, but based on my experience working in photography and programming, I believe it is more likely than not that they used essentially poor quality images for the training.
Moreover, after the model has been trained, to use the system effectively the facial recognition camera has to be set up to capture both light and dark skin, in the case of dark skin, it typically means not relying on available light alone indoors, an additional camera light must be provided.
The reality is if you want a facial recognition system that accurately detects dark skin it will cost a bit more to do it right.
White people have more facial variance than all the people categorized as not White people? That seems unlikely.
But if it were so, then it sort of invalidates your argument, because the bias complaints are that facial recognition algorithms often misidentify other races (not generally as white, but perhaps as not human).
Specifically, the scandal some years ago when black people were being identified as gorillas. It seems obvious to me that if non-white people have less facial variance, it should be easier to identify black people as black, not more difficult.
I thought it was a basic understanding that unfamiliar people, events, places, etc _do_ look alike, because before you get sufficient experience and exposure, you don't have enough skill to know what features are important to focus on and which ones carry no information.
It's a well-known thing that people have a harder time telling the difference, but I have not seen it mentioned that software would. For example, zebras are all uniquely identifiable by their stripes, and other zebras and computers could identify them, but I would have no hope.
It's true. But that doesn't mean we shouldn't put the effort in to learn, and especially put the effort into our software.
It's also worth considering what saying that implies to the people you're saying it to. Especially when said dismissively. I'm terrible at names, faces, voices, pretty much any way of telling people apart. But that's my problem, and I need to be careful not to push the burden of that problem onto the people around me.
If you say “i have trouble telling people of ___ race apart” that’s different from “you all look the same”. One is a statement of your own limitations, the other is saying your own limitation is actually someone else’s fault.
The first one CAN also be troubling if you have had ample exposure to become familiar, and are indirectly admitting that you did not believe it to be an important skill to learn.
Lots of white people also look the same. I grew up in a mostly Mexican environment; living in the Bay Area now, I find it difficult to tell lots of white people apart from each other.
I'd say it's an ignorant thing to say in the way you say it. Like you said in this post, you just don't know the features to look out for when you're not familiar with them
No, software people never do anything this controversial. They were just observing that they know of a universal law of nature as described in a recent Quillete article, and/or have second-hand knowledge of a particular quirk of a specific software that is definitely not just the reflection of its creators' myopic world-view.
You might be right in terms of hair or eye color (not too many gingers outside Europe), but Africa has a vast amount of genetic diversity, so I'd naively guess that variance along other feature dimensions would be correspondingly higher too.
If there are technical challenges which would produce biased results... perhaps the answer is to not release the product. Perhaps do more R&D, or more training on broader samples, until one has a better product to release?
I don't think they are releasing products to make the world a better place. After all, this tech is very dangerous. I don't think the ethics of racial bias are factored into the decision making at these companies.
This is one bias I think benefits folks with dark skin. I'd love to be in the group that doesn't get recognised by this terrifying tech.
That’s not necessarily what you want. “Not working” can manifest itself in false positive matches too. Cue the “all X people look the same” stupidity.
And the truth is, the danger zone is when the algorithm is somewhat OK but a few percent worse on the discriminated group. If it were obviously, stupidly bad (if, for example, any two black people were a positive match according to it), then people would disregard it. But if it's just slightly bad, that's harder to notice, and suddenly your loan/car rental is rejected because the security system says you look too much like a specific known fraudster. And of course they won't even tell you that that's the reason, because security.
I worked in imaging for medical diagnostic support and dark complexions are just plainly more difficult to analyse because of restricted illumination times and limited contrast. Sensors got much better in recent years though.
The indications in question were, on the other hand, extremely rare for black people. Another use case was the detection of vitiligo, which is significantly easier on black people. There was great demand for therapies because patients are stigmatized for different reasons, especially in Africa, where the symptoms can look similar to those of some deadly diseases.
There is a lot of discrimination in imaging, but it has less to do with race and more with the creation of binary images.
As for facial recognition: I see some use cases for convenience, but it would be far more effective to disallow widespread application because of the numerous unsolved problems. So I see the move from IBM positively, even if they continue their work. In the end I believe facial recognition to be possible without bias, but I don't have these propped-up security needs.
But the fact that the technology is considered viable given this failure mode is what reflects the bias. If a vendor has tech that can only distinguish white faces it will sell it as “detect your best customers and offer them special deals when they enter your store.” If the tech can only recognize black faces it will be sold as “recognize known criminals in the neighborhood and alert police when they enter your store.”
From what I understand, NIST has been running a live "competition" on faces of 9 million people for some time already [0]. They are trying to evaluate racial biases in particular [1]. Was there a reason why you decided to not rely on what NIST was putting out?
I've had this issue with AI's in the medical industry.
At best, you have AI's that can easily recognize pathologies that an average rad could recognize, but are useless when it comes to trying to recognize a pathology that 99.999% of rads would miss. Why? Probably because the system is biased towards pathologies that rads recognize. Why? Probably because that's what's in the learning set. I understand all that, but that makes the AI almost useless in a production setting simply because of the way many healthcare delivery networks are structured with respect to radiology.
At worst, you have AI's that seem to recognize pathologies that an average rad could recognize, but then inexplicably have a horrible miss on an obvious study that any first year radiologist would have caught while half asleep. It's the sort of miss that makes the astute observer wonder if the other vendor's AI, the one that didn't have any misses on your test set, was simply not fed an example study that triggered its blindspot? You start to wonder where the blindspot on that one was? You start to wonder does it even have a blindspot? How can you work around blindspots like this in general? Etc.
But here's the thing, it's never good when you're thinking about mitigations while you're still testing the AI. My first RSNA was over 20 years ago. To this day, we're still hearing the same promises, and the production testing, (it never fails), is still uncovering the same issues.
Now I would have thought that recognizing a human would be easier than trying to ascertain, say, calcification in a DX, but apparently these same issues crop up in a number of different applications of ML based technologies.
I understand that a training set, by the very definition of the word "set", does not contain everything. Obviously, there will be bias towards whatever is in the training set. Essentially, most techniques today, simply train AI's to tell us whatever we tell the AI's to tell us. But for this reason it should be unsurprising that these sorts of AI's work best in settings with well defined domain spaces and are challenged when the domain space is less well defined. (And in either case, these AI's will have an inescapable bias towards whatever was in the training set.)
What does a tumor look like? Is probably too broad a question to give these AI's. What does a stop sign look like? Is probably a question that these AI's could answer relatively reliably. (I hope?) You would have thought that "what does a human look like?" would be closer to "what does a stop sign look like?" But I guess it's not surprising to hear that it seems to trend towards "what does a tumor look like?" in practice.
I agree with you, but IMHO it's trying to solve the wrong problem. The real solution is identifying a major pain point (such as annotation, co-registration, etc) and solving a lot of the small annoying steps to make the radiologist better at his/her job.
Full disclosure: I am a cofounder at a startup automating chest X-ray reporting.
It is true that ML algorithms are almost always trained on radiologist labels on the same modality, and thus take in the reader biases. I also agree that some radiologists are better than others as you imply.
As a patient, one does not know who will read their film. IMHO we as an industry should aim not at beating 99.999% of radiologists. We should merely make products which consistently perform not worse than an average radiologist at a particular institution. It is always thrilling to outperform humans with your software, but at the end patient outcomes are what matters. Those are about consistent performance over a long period of time.
Demonstrating this consistent performance is the challenging part, but it is possible to prove it through sufficiently careful and lengthy prospective trials. That’s what we are focusing on, and I would love to see the other players in the industry do the same.
Is ML result considered as a first/second opinion, or as just 'a quick check' for reference only?
I believe that ML should not be taken in lieu of human opinion. The consensus, be it medical or legal, has to be explicitly human with all the responsibility attached.
Shifting the responsibility for the misses onto a faceless ML is only eroding trust in the professional opinions and cementing the biases.
I fully agree that treatment should be prescribed by a human doctor who can explain and answer questions. However, I would not agree that each of the data inputs to the treatment decision should be generated manually. That is already not the case.
This is the same argument that is often put forward in relation to accident rates for self driving vehicles. ML only needs to outperform humans.
The problem with this argument is that it glosses over the fact that in the tail, where the ML is making a wrong decision, sometimes a catastrophic one, the behaviour of the ML algorithm is not well understood. How can we deploy something in such safety-critical applications that we do not fully understand?
Excellent point. That is what the lengthy prospective study phase as well as periodic auditing after deployment are for.
Please also note that there are several important differences as compared to the automotive industry. First, one could argue that the task at hand is trivial as compared to the self driving car. We are operating in a heavily constrained setting with much better understood data inputs and a hundred-year history of medical professionals trying to classify and systematize them. Moreover, our task is not time-critical. It sometimes takes more than a week for such an image to be reported on.
Improving the training set is the way to go. Like others have said, maybe we should attempt to improve the contrast of darker faces before training (which won't be easy either).
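As a rough illustration of the kind of contrast pre-processing being suggested (not a claim that it fixes dataset bias on its own), something like CLAHE on the lightness channel is a common starting point:

```python
# Sketch: local contrast enhancement (CLAHE) on the L channel of a BGR image.
import cv2

def enhance_contrast(bgr_image, clip_limit=2.0, tile_size=8):
    """Apply CLAHE to the lightness channel and return the enhanced image."""
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tile_size, tile_size))
    return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)

# Usage: enhanced = enhance_contrast(cv2.imread("face.jpg"))
```

Whether this helps or hurts depends on the capture pipeline; it can't recover detail the sensor never recorded, which is the point made elsewhere in the thread about sensor dynamic range.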
That said, a non-diverse facial dataset in a diverse society like the US is simply useless. It doesn't help to just say the AI is suffering from a human bias and drop these projects entirely, unless they are being used for a malign purpose like what China does.
I would hope that on HN one would take the time to explore the technical issues before even suggesting bias (which implies human racial discrimination of some sort).
I am going to venture a guess that there's a large audience in the image processing/AI/ML world that lacks a fundamental understanding of image sensor and lens technology. I have never seen mention of concepts such as well capacity, quantum efficiency, noise floor, dynamic range, thermal noise, gamma encoding, compression induced errors, etc. in most work I have reviewed.
Sensors used in the general class of imagers found in these experiments are nowhere near adequate to capture the full dynamics of a lot of real life images. The lowlights (referring to the lower portion of the dynamic range of a camera, encoding, compression and image processing system) can be some of the most challenging portions of the dynamic range to get quality data.
The old idea applies: Garbage-in, garbage-out.
It should come as no surprise that algorithms trained on (likely) bad images with bad lowlight detail will fail to deal with people of darker skin. It's almost a given. One can't assume cheap cameras and the data sets produced with these cameras will see the world the way our eyes are able to. Not to mention the fact that we have something called "understanding" while classifier systems have no clue whatsoever what they are looking at, all they can do is put things in buckets and that's that. In other words, there is no inherent comprehension of what a human being might be versus a bear or a teapot. That's a major problem.
The answer isn't to give up. The answer is to understand and then go back and do it right. This isn't going to be cheap and it will likely require rethinking how we build and train these systems.
As a tangentially related data point, I have three German Shepherd dogs. Two are the traditional black and brown coloring. The third is 100% black. It is virtually impossible to take a good picture of him. In anything but the right lighting he shows up as a dark amorphous blob. For all the prowess of the mighty camera in an iPhone 10, you'd be hard pressed to use those images to recognize him as anything other than a blob on a dark couch.
I do have access to high performance images with far greater well capacity as well as 100% uncompressed data output. In that case there's usable data in the lowlights that, through gamma and LUT manipulation can be extracted. When you do that he quickly goes from looking like a blob to looking like a happy dog.
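The gamma/LUT lift being described is roughly the following (the gamma value is illustrative); the point is that it only helps if the sensor actually captured usable data down in the lowlights:

```python
# Sketch: lift the lowlights of an 8-bit image with a power-law lookup table.
import cv2
import numpy as np

def lift_shadows(image, gamma=0.5):
    """Apply a gamma LUT; gamma < 1 brightens dark regions the most."""
    lut = np.array([((i / 255.0) ** gamma) * 255 for i in range(256)], dtype=np.uint8)
    return cv2.LUT(image, lut)

# With a deep, low-noise capture there is real detail to pull up; with a noisy,
# heavily compressed 8-bit JPEG the same LUT mostly amplifies noise and artifacts.
```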
Anyone interested in learning more, I would highly recommend looking up Jim Janesick:
The popular phrase "he wrote the book" applies here. Jim's books on the subject of image sensor technology (science and design of sensors) are the reference work anyone in imaging studies. He designed so many sensors for space applications I am not sure he even remembers how many. I was fortunate enough to study CCD and CMOS sensor design under him a couple of decades ago.
ML has to start with good data. Inadequate sensors coupled with compression and other processing artifacts leads to bad data, a formula for failure.
You wrote a lot of text but missed the central point. Cameras that can capture dark skin exist. This is a problem that human researchers just shrugged off or ignored. You might say well, can't do anything about it, our training corpus is full of these bad images. Maybe we blame the camera manufacturers?
But that's still a cop-out. You can't use models with these problems. It gets you disasters like face unlock that doesn't work for black people. This happens even when the people building it aren't white: Samsung famously released blink detection that didn't work on a lot of East Asian faces. The products are broken because the white bias is the default and pervades everyone's thinking.
What I am saying --which is absolutely true-- is that there's a lot more data in the mid-range of a gamma-encoded image (which is 100% of the images produced by everyone doing this work) than in the lowlights. This means that the local dynamic range in those regions is different, which means that operations such as edge detection will be more accurate and could make the difference between something working and not.
Vision researchers should take a class or two in cinematography and photography, it would serve them well. Even the quality of the lens makes a difference. Most work I've seen out there uses cameras that barely pass as security cameras or webcams.
That said, yes, ML needs to work with crappy images and every single camera out there. My argument is that you are not going to be able to train using crap data. And the images in a data set would be crap if the data --the images-- were not acquired using cameras and techniques that provide enough data across various segments of the dynamic range.
Again, I gave the example of my black GSD for a reason. You are not going to be able to recognize him as anything other than a blob on a couch without a camera that can capture enough data at the low end of the dynamic range and a system trained with that data.
The fact that Samsung (or anyone else) failed means nothing. In order for that data point to be meaningful you'd have to have intimate knowledge of what they were doing and what capabilities they had, both in terms of science and engineering as well as the consumer hardware they developed.
I have competed against multi-billion dollar multinational corporations who, despite their financial prowess and scale, could not design their way out of a paper bag. They don't understand the problem, lack creativity and, most importantly, absolutely lack the passion necessary to solve it. Ten 9-to-5 engineers can't compete with a single engineer passionate enough to devote every waking hour to solving difficult problems. It doesn't matter how much money you throw at them, they just can't perform.
This doesn’t help when the variance in feature space of white people’s faces is objectively a lot wider than the variance of black people’s faces. You can have the best camera in the world, but it’s not going to change the fact that someone from Ireland looks way more different versus someone from Italy (in any mechanical basis) than someone from Cameroon looks versus someone from Malawi.
Objectively? You have to show some evidence; otherwise this just sounds like "y'all look alike to me".
To add to this: considering that genetic variation between peoples on the African continent is larger than the variation anywhere else, this is also very unlikely. I think it's a reasonable assumption that genetic variation and variation in physical features are somewhat correlated.
Yes, I wonder if this is because, in the political climate in the US, some people feel more empowered to make statements like this, or if the number of people holding these beliefs is actually increasing (at HN).
I think GP may be reacting based on common techniques of obscuring faces - i.e. ninjas wearing black masks, special ops putting dark paint on their face to obscure details. I think the effectiveness of those techniques leads people to think that darker colors are harder to see or make out. But that analogy breaks down because here we're talking about fully lit situations, right?
> genetic variation between peoples on the African continent is larger than the variation anywhere else
You are misinterpreting this, as does almost everyone else. Phenotypic variance is almost completely orthogonal to number of unique SNPs (which is what this generally refers to).
> genetic variation and variation in physical features are somewhat correlated.
This is not the case. You can have narrow population bottlenecks (reducing SNP diversity) followed by high variance in selective pressure. This is exactly what happened to early Europeans. You had a small initial population spread out to inhabit a large variety of ecosystems.
This is a farcical argument. Facial recognition apps aren’t looking to distinguish gross features like Irish vs Italian features (whatever those are). They can tell two brothers from Ireland apart from one another, they should be able to tell two brothers from Malawi apart too. Their mother probably can, so why the hell shouldn’t their phone?
Is this really the case though? Feature wise it's not just the color of the skin that matters, and populations across Africa can look very different from each other's even to the human eye.
The first mistake is to believe the definition of "bias" requires intent to harm. It doesn't. Any systemic error is "bias", even if it's just your thermometer always reporting 2 deg more than the actual temperature.
The second issue is believing some technical arcana are a valid excuse for selling products with such errors. It isn't! If your product reinforces entrenched discrimination, you either fix it or stop selling it.
Think of this from the perspective of the end consumer of the model: am I willing to buy new high-quality capture hardware for the target production environment, potentially discarding an existing investment?
In my experience, they rarely will even consider it.
Therefore, the training data should have the same technical shortcomings as what will be used in production.
> Therefore, the training data should have the same technical shortcomings as what will be used in production.
I think training and the application of the trained system are separable. Nothing is accomplished by training with data sets that lack data or detail. Inference or classification is impossible or deeply impaired by the lack of data.
As a hypothesis, the solution is likely to involve training one network with good data and a separate network to be the interface between the first network and the imperfect perceptual data in real-world applications.
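To make that hypothesis concrete, here is a very rough PyTorch sketch of the two-network idea (the names and the feature-matching loss are my own assumptions, not an established recipe): a recognizer trained on good data stays frozen, and a small adapter is trained to map degraded production images toward the domain the recognizer expects.

```python
# Sketch of a frozen recognizer plus a trainable "adapter" front end.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small conv net that tries to restore degraded images."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def train_step(adapter, frozen_recognizer, degraded, clean, optimizer):
    """One step: make the frozen recognizer's features on restored images
    match its features on the corresponding clean captures."""
    with torch.no_grad():
        target = frozen_recognizer(clean)
    loss = nn.functional.mse_loss(frozen_recognizer(adapter(degraded)), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Whether this actually beats simply retraining on representative data is an open question.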
At the end of the day AI/ML need to leave the world of classification behind and move on to the concept of understanding. This is not an easy task, yet it is necessary in order for these amazing technologies to truly become generally useful.
We don't show a human child ten thousand images of a thousand different cups in ten different orientations in order for that child to recognize and understand cups. The reason for this is that our brains evolved to give us the ability to understand, not simply classify. This means we need far fewer samples in order to have effective interactions with our physical world.
The focus on using massive data sets to train classification engines is a neat parlor trick, yet it will never result in understanding and is unlikely to develop into useful general artificial intelligence. The problem quickly becomes exponential and low quality data becomes noise. We need to develop paradigms for encoding understanding rather than massive classification networks that can't even match the performance of a dog in some applications. As I said before, this is a very difficult problem. I don't think we know how to do this yet. Not even sure we have any idea how to do it. I certainly don't.
When analyzing large business decisions, I find the most success when viewing through cynicism. Self-interest is after all the primary driver of most businesses.
Both things can be true at the same time. IBM is looking for a nice excuse and also with facial recognition people won't get away with minor crime when cops are not looking.
If every person who breaks into a car or sells something in the street is caught, this would be an insane surge in the prison population.
Don't rule out a third factor: in the facial recognition industry it is difficult to earn a profit. In general, an FR client already has an existing camera network they want to add FR to, meaning control over image quality is already gone. They also already have a security video management system, so whatever FR solution IBM proposes has to interoperate with that pre-existing video management software. So right there, two traditional sources of major profit are already occupied, and in order to play in this industry one's FR software needs to work with any pre-existing camera network(s) as well as interoperate with any pre-existing video surveillance software... Setting all that up is not cheap, and the customer is often low-tech and expecting magic...
Even more cynical: they’re not _actually_ dropping it from development—they hope players at the fore are pressured to drop it and they can swoop in saying they’ve fixed all the issues that affected others. Surprise!
I share your cynicism, but I think it's worth examining if IBM's cynical cop-out might also be true. Is it possible to develop facial recognition systems that couldn't be used for mass surveillance? Or, is it possible to develop mass surveillance systems that could only be used in the pursuit of moral (or at least morally neutral) goals?
I'm generally of the opinion that tools are intrinsically amoral; neither moral nor immoral. A hammer could be used to build an orphanage, or be used to hit orphans; it's an amoral tool that could be used for good or evil. But I think when tools become sufficiently powerful, that sort of perspective starts to break down. Weapons of mass destruction are good examples of this from the past century. Could there conceivably exist digital systems too powerful to consider morally neutral? Personally I think so, and I think facial recognition technology might qualify.
And it would be worthy of public outrage if a tool manufacturer were supplying hammers to child abusers for the purpose of child abuse. The fact that hammers can also be used to construct buildings is completely irrelevant.
This entire “tools are amoral” argument seems to deny the fact that in most cases we don’t need to judge the intrinsic morality of a particular tool, because we can just look at how the tool is actually being used.
That view is extremely popular for essentially every action made by any well-known company or individual.
Personally, in cases where there isn’t a specific explanation for why a particular decision has secretive motivations, I do find it a touch too cynical.
Well, the alternative would be that they have chosen to take a moral stand that they expect will cost themselves and their investors money. So there's a pretty basic reason for disbelieving that story.
Why is that the only alternative? Couldn’t the consequences of this particular action in fact be a good financial decision due to negative feelings toward face detection software held by some portion of IBM’s potential customers or the general public?
I don’t understand why it would be a problem if making a moral choice is also advantageous for other reasons. In fact, surely we should strive to arrange society in such a manner that these things align.
On the other hand, the more ethical companies will be the ones moving the slowest in that field, since there is a lot of work needed to make this technology safe, and they would also make less money since they would have to refuse some customers.
Not saying that this is what happened at IBM though.
They may be 5th in the cloud race, but the market is huge. I'll bet they're earning far more than you'd think. They have a few very, very large clients.
I could relate to that. I'm the type of CEO who wants to party in Ibiza with go-go dancers on my story, but wants to keep a welcoming and inclusive work environment, so I would be looking for even the thinnest controversy to resign and disappear with my golden handcuffs and consolidated shareholdings.
So far nobody has dug up all the little nuggets I've left in public records. Sucks
This headline reminded me of the recent "Chicago PD" episode called "False Positive" (S7E6) [1]. A new ID system is pushed into a resonant case by a police chief. The system's merits are touted as being 'strongly condemned by the ACLU'. Yet still in beta..., people are shown on screen as just a collection of dots. Virtual code lines that affect real lives.
A striking difference from what's happening in China, where facial recognition software made SenseTime the world's most valuable AI unicorn: "However, facial recognition does not seem to have been making the company much money, if any. To be fair the technology is really in its infancy and there are few applications where an enterprise vendor like IBM makes sense."
I think the cat is out of the bag. There's enough public datasets and published methodologies that are relatively simple to implement, that quite usable facial recognition software is within the bounds of an undergraduate homework project. Sure, IBM can probably make it more accurate, but nonetheless, if somebody wants to make a tool that does e.g. ethnic profiling, then they can do it without the help of IBM, the techniques for solving similar vision tasks are known and people who can do it are widespread.
Any interested government department where a manager wants this can hire a random graduate that can implement this for them, it could be literally be a one-man project with a trivial budget. There's no multimillion purchase pitch required, a regional niche department can do it in their own kitchen without involving the rest of the government using spare change they have to spend until the end of the fiscal year in order to not get the next year's budget reduced.
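To make the "undergraduate homework project" point concrete: with an off-the-shelf open-source library like face_recognition (dlib under the hood), a basic matcher is a handful of lines (filenames here are placeholders):

```python
# Minimal face-matching sketch using the open-source face_recognition library.
import face_recognition

known = face_recognition.load_image_file("known_person.jpg")
unknown = face_recognition.load_image_file("unknown_person.jpg")

# Compute 128-d encodings for the first face found in each image.
known_encoding = face_recognition.face_encodings(known)[0]
unknown_encoding = face_recognition.face_encodings(unknown)[0]

match = face_recognition.compare_faces([known_encoding], unknown_encoding)[0]
print("same person" if match else "different person")
```

Accuracy at scale is obviously another matter, but the basic capability is commodity at this point.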
I understand how ml can replicate existing cultural bias in recommendation systems or risk scoring systems, but how does bias work in the context of facial recognition?
Technically, much lower accuracy on African-American and Asian populations (in the US)[1]. More importantly, technical issues aside, the primary use case of facial recognition seems to be the policing of minority or vulnerable populations and the erosion of privacy more broadly, which tends to hit minorities the worst.
But isn't the bias mostly just because of the lack of data? Asian facial recognition works very well (or very badly, depending on your perspective) in China...where there is a ton of data on Asian facial structures, etc, for example.
It's not just a problem of lack of data, it's a problem of composition of data. If there's a signal that increases your accuracy for asian faces and decreases it for caucasian faces, in systems deployed in China the system weights will be adjusted one way, in systems deployed in Europe you'll get the other way, and Asians in Europe and Caucasians in China will get bad performance. Maybe people don't care about that, but wait until that performance difference is between black and white people in America.
I definitely had this issue when I was growing up in a Balkan country. I couldn't differentiate Asian faces, I remember having a lot of difficulty with this while watching movies. Fast-forward a decade, now I'm living in the US, and I can differentiate Asian faces as easy as White faces or Black faces. For me, it was all a matter of "training" (and by that I mean interacting with people with these facial features).
I see at least two main sources of bias: 1) your training data does not have enough people with dark skin or African American facial features; 2) your hardware and image processing pipeline relies on color contrast or skin brightness. One way to measure bias is to compare the average error rate across ethnic groups.
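That error-rate comparison is a few lines of NumPy; `labels`, `predictions`, and `groups` below are hypothetical arrays from some evaluation run:

```python
# Sketch: compare error rates across groups to surface disparities.
import numpy as np

def error_rate_by_group(labels, predictions, groups):
    """Return {group: error rate} so disparities across groups are visible."""
    labels, predictions, groups = map(np.asarray, (labels, predictions, groups))
    return {
        g: float(np.mean(labels[groups == g] != predictions[groups == g]))
        for g in np.unique(groups)
    }

# Example: error_rate_by_group(y_true, y_pred, ethnicity) -> {"A": 0.02, "B": 0.11}
```

Disaggregating false matches and false non-matches separately (as NIST does) is more informative than a single error rate, but this is the minimal version.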
I think the potential harm these systems could cause if they worked flawlessly might be far greater than the harm these systems could cause by malfunctioning.
You keep repeating that black faces have lower variance, but you never back this up. Personally I find it a preposterous thesis; I find the opposite to be true. I know for a fact that genetic diversity is highest among Africans.
By being so convinced of this without showing any proof, you are running the risk of proving the point at the root of this issue: we are not addressing our bias in building the system. Not only in our data, but in your case perhaps not even in our thinking.
I calculated it myself in a CV course in university. The same effect was observed by others. I didn't read it somewhere. This shouldn't surprise anyone - white people have tremendously higher variance in color alone (skin, hair, and eyes).
But now explain to me the difference between Bob's face and another human's face without referencing what another human's face looks like, or the difference between Bob's face in front of a white wall vs. a blue sky.
Basically, to train an ML system you'll need pictures of Bob and pictures of not-Bob.
If the pictures of not Bob don't include any humans then there's a good chance that every human face gets classified as Bob because they look a lot more like Bob than an airplane. If the pictures of human faces are only white people and Bob is black then it's quite possible all black faces you show the system after training get classified as Bob.
So the problem is mostly when deciding whether a face is likely one of the faces already in the system or if it is a new face that hasn’t been added yet?
So the problem is it's hard to give a definitive answer because there are many different types of facial recognition systems.
ML systems have a phase in which you are training the algorithm. When you add data in this phase, you are altering the algorithm rather than adding data to compare against. For facial recognition, that might mean learning things like: the distance between the eyes is useful for distinguishing faces, the wall behind a person is not, and so on. If this stage of the process goes badly, you may end up with a system that assumes common characteristics are more important than they are.
When you add faces after training, you are usually not altering the algorithm (some ML techniques do continue learning, but that approach has its own challenges); you are just adding faces for the algorithm to generate outputs for.
So for a naive face-lock algorithm on a phone, you may have a trained algorithm that compares two pictures of faces and outputs a number based on how certain it is that the faces are the same person. All that's happening when the user "trains" face lock is that a bunch of pictures of their face are added; then, when you go to unlock, the algorithm just compares the image from the camera to all of the stored images, and if any match closely enough, it unlocks.
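A sketch of that naive flow; the embedding function here is a random-projection stand-in so the example runs (in a real phone it would be the pre-trained model), and the threshold value is made up:

    import numpy as np

    THRESHOLD = 0.99  # hypothetical; tuned during development, not by the user

    _projection = np.random.default_rng(0).normal(size=(64, 16))

    def embed(image):
        # Stand-in for the trained model: image -> fixed-length vector.
        # This part is NOT changed when a user enrolls their face.
        return image.reshape(-1) @ _projection

    def similarity(a, b):
        # Cosine similarity between two embeddings.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    enrolled = []  # "training" face lock just appends embeddings here

    def enroll(image):
        enrolled.append(embed(image))

    def unlock(camera_image):
        probe = embed(camera_image)
        # Unlock if the live image is close enough to ANY enrolled image.
        return any(similarity(probe, ref) >= THRESHOLD for ref in enrolled)

    # Toy usage with random 8x8 stand-in "images":
    owner = np.random.default_rng(1).uniform(size=(8, 8))
    enroll(owner)
    print(unlock(owner + 0.01))  # a near-duplicate of an enrolled image unlocks

The law-enforcement hypothetical below is essentially the same comparison loop, pointed at a warrant database instead of the owner's enrolled pictures.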
Now let's consider the naive law-enforcement approach with a similar algorithm. We'll go with a hypothetical: you load in a bunch of outstanding warrants, and whenever your body camera captures an image that matches the image on a warrant, it dings and you can go to the cruiser and pull up the warrant for further scrutiny.
If the facial recognition software placed a large importance on, say, whether a person had an epicanthic fold (because that happened to be a distinguishing feature in the training data set), then if one wanted person was of Asian descent and had a big nose, all it might take for others of the same descent to get a high match with that person is to have a big nose, even if their other features are distinctly different. Adding more pictures of the wanted man wouldn't actually help this; in fact, it would make it more likely that more Asian people would be held up while the warrant was checked by a human.
Now, you can be smarter about the logic around the output from your ML algorithm, but to actually get better output to base that logic on, you need to train the ML better in the first place (or retrain it). That requires data, good data for what you're trying to classify, and actually figuring out what that data should contain and making sure it contains it is a huge task in itself. A publicly available corpus may not be as robust as this training requires, and there is little incentive for private companies to share robust data, as it can be considered the secret sauce in their offerings as well as a barrier to entry for competitors.
I don't think that's necessarily true. I don't know much about machine learning, but I'm pretty sure a common technique is to use a model trained on a huge amount of relevant data; that model outputs a fingerprint for a given input, and the fingerprints of two inputs can be used to estimate how similar those inputs are.
I had this story about HP webcam software in mind. [0] But I'd say it very much depends on what you are doing. I don't think it is an inherent shortcoming of the technology, but if you don't think about what you are doing, the result might create the wrong perception.
Yeah, I think the comments here are talking about both using the same terminology without clarifying, which was a bit confusing until I realized that was happening.
If this were the root problem, it wouldn't be that hard to solve, and a lot of people have the motivation to solve it and gather the justified social plaudits for doing so. Is it seriously the case that nobody has collected a more diverse face set, despite it being public knowledge for years that this is a problem? It doesn't take that long.
Never let a crisis go to waste. Why just cancel a program that's losing money when you can cancel the program and tell everyone it's because of your newfound belief in social justice?
Lots of AI tech isn't making money now, but will in the future after tons of investment. IBM is over a century old, and they've invested in a lot of tech like that. I understand the impulse to be cynical about a company's motivations for any social action, but this is honestly a good thing and we should be happy about that.
It is difficult to make a profit from facial recognition for several key reasons: a) FR customers tend to already have large camera networks they want to add FR to, so the profit from selling the customer cameras, and quality control over camera image quality, is gone; b) FR customers tend to already have video surveillance management software, and expect any added FR software to interoperate as a plugin or second-class application to that software; c) most or all of the cameras were never intended for FR use, but this new potential client has existing relationships for their security cameras and networking, so the FR vendor will need to work with (or for) the third-party company that already holds the video surveillance management contract. The customer's people operating the FR system day to day need to be trained in how to install FR-appropriate view cameras and to understand issues such as variance in illumination, indoor and outdoor lighting, the need for timed outside lights, and so on. Then, after all that, specialists have to tune the FR system for accuracy across all indoor and outdoor environments and across all time frames of monitoring. The logistics of each FR customer are unique; add in these other existing infrastructure hurdles, then mix in the uncertainty of the FR system's configuration and whether it will work with little to no technically trained staff. FR is very difficult to do successfully, and that equates to very difficult to do profitably.
The issue is real. But with one not-so-successful vendor out, what does it mean?
Also, with China involved, as with many technologies: if you stop doing it, would China give up as well? Wasn't it around 2016 when a coronavirus study joint with a US university was stopped due to concerns, but not stopped by China (the Wuhan lab), which gave them a lead? Or the human genetic HIV research on babies?
There is no good solution, but I don't think simply quitting is the answer for a research area with potential human-rights implications.
Only AFTER they helped China develop the technology to racially ID their Muslims. And IBM isn't alone; many companies have helped China to efficiently fill its death camps.
What you're saying might be statistically correlated, but that does not make it a useful discriminator. There is nothing inherent in black people that makes them commit more crimes; these are societal factors.
Imagine a white person had to pay 100x the scholarship because the richest people are all white.
Actually you can prove it and it has been proven. Genetics have a minor say in who a person is, and racial factor is equivalent to statistical error. For example there are adopted black children in very white populations, e.g. in Eastern Europe, where I live. These children are absolutely no different to kids here - and the white kids here know from experience that a person that grew up here (among our people) will be just like them. And the same goes for other ethnicities/nationalities that are considered different in personality, e.g. Arabs, Greeks, Italians, Spanish people... It's all about the upbringing.
I am not familiar with any scandals, and in the case of Ivy League schools I always ask: why that school? There are thousands of cheap (for everyone) high-quality schools. The fact that a few well-known schools are using race to distinguish high-income students is wrong (now you at least have some view into what racism is; imagine how bad it must be to be a minority), but it is not happening universally, as you suggest.
>Actually you can prove it and it has been proven. Genetics have a minor say in who a person is, and racial factor is equivalent to statistical error
That's not true, and you won't be able to produce a rigorous source; it's inconsistent with what we know about human physiology.
> The fact that a few well-known schools are using race to distinguish high-income students is wrong (now you at least have some view into what racism is; imagine how bad it must be to be a minority)
I am a minority, son of immigrants, and that's what makes this discussion particularly frustrating. This unrealistic perception of genes and culture is a purely western delusion.
>For example there are adopted black children in very white populations, e.g. in Eastern Europe, where I live. These children are absolutely no different to kids here - and the white kids here know from experience that a person that grew up here (among our people) will be just like them.
There are similar studies from the 70s-90s in the US which reach the opposite conclusion.[0] We can continue bending over backwards coming up with explanations that allow us to maintain our magnanimous worldview, or we can accept that what we know about genes and culture has negative implications for equality of outcome. This isn't a superiority or inferiority judgement either; the NFL is not overwhelmingly black because of discrimination.
The key is to have open data and not let China have the world's data while they close off their own IT and data. Shutting down an area and letting China lead is not the answer for human rights. You need to force them to join the world in a meaningful way. We cannot study photos and things like that the way we studied Soviet Union politics.
It is just too dangerous to walk away and let China win. Go all in and ensure the technology is used in an open and censurable manner.
They're just behind. And the ACLU "bias" study was thin and unscientific. Data sets and weights could have bias, but that bias can also be controlled. "Facial recognition" does not have bias.