Machine learning is fundamentally conservative (lareviewofbooks.org)
308 points by macawfish on Jan 4, 2020 | 200 comments



Teach AI using our behaviour, AI learns our behaviour. A bit like our children. I'm genuinely confused as to the alternative.

The objection seems to be based on the fallacy that technological progress equals social or political "progress". Why on earth would we expect AI decision making to display a lack of prejudice when human decision making is suffused with it?

The only people who expect technology to act like a benevolent god are the ones who have replaced their god with it. All technological progress does is to increase the power and influence of human beings. The progress the writer seems to want is socio-political, not technological.


> Why on earth would we expect AI decision making to display a lack of prejudice when human decision making is suffused with it?

Particularly when we consider what we mean by prejudice, which is presumably something like "making a decision on grounds which we deem it important to ignore". This is a very complex concept. It's a function of society, and changes with society. It's not something with a rigorous definition.

Obvious example: reasonable modern people know it's indefensible to make an engineering hiring decision on the grounds of ethnicity, regardless of whether there are any correlations associated with ethnicity. This is even enshrined in law in many countries.

To make a decision on the grounds of someone's qualifications and job experience, however, does not count as prejudice.

We should expect a machine learning system to act as a correlation-seeker (that is after all what it is designed to do), without a nuanced understanding of what prejudice means.

We've seen this issue crop up in the context of an AI having a say in parole decisions [0]. There's also a relevant discussion at [1].

[0] https://www.forbes.com/sites/bernardmarr/2019/01/29/3-steps-...

[1] https://news.ycombinator.com/item?id=13156153


> We should expect a machine learning system to act as a correlation-seeker [...], without a nuanced understanding of what prejudice means.

One likely outcome is that increased usage of ML will lay bare more gray-area correlations that force us to confront broader systematic prejudice in an uncomfortable way. Now whether we make material changes based on these discoveries or whether we leverage our significant ability for post-hoc rationalization to paper over the cognitive dissonance is another question entirely.


> more gray-area correlations that force us to confront broader systematic prejudice in an uncomfortable way

Or, perhaps we discover that correlations don't necessarily point to systematic prejudice, and continue to argue over which correlations are more indicative of prejudice than others.

Some point at the proportion of women in top software engineering jobs and claim prejudice in hiring. Others point at the hiring pool for top software engineering jobs and disagree. And around and around we go. Machine learning won't solve that.


> We should expect a machine learning system to act as a correlation-seeker

We should expect machine learning systems to dismiss accidental indirect correlations in favor of the variables that are directly correlated. There are plenty of algorithms that achieve that; it's only that gradient descent doesn't.

The fact that our AIs are becoming biased is a bug. It should be fixed.


> The fact that our AIs are becoming biased is a bug. It should be fixed.

In many cases, it's the data that is biased. In that case it's impossible to differentiate between bias that the AI should learn and bias that the AI shouldn't learn.

Let's assume we have a database of incidents where a person was found to have cannabis. This database has the following data items: a timestamp, the person's name, the person's ethnicity and the amount of cannabis that was found. Now assume further that black and white people have the same base rate of cannabis use (which according to the studies I found seems to be the case). The last thing we have to assume in this case is that this database was created by racist policemen who arrest more black people for cannabis consumption.

An AI trained using this data would assume a higher base rate of cannabis consumption by black people. It's impossible for this AI to differentiate between correlations it should learn (for example that people who used cannabis multiple times and were found to have much of it are worth looking at) and (untrue) correlations that it shouldn't learn (that black people have a higher base rate).
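A minimal sketch of that scenario (all numbers made up), showing how a model fit to the arrest records alone would infer different base rates even though the true rates are identical:

    # Hypothetical simulation: identical true base rates, biased arrest rates.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    ethnicity = rng.choice(["black", "white"], size=n)
    uses_cannabis = rng.random(n) < 0.10                    # same true base rate for both groups

    # Biased policing: cannabis users in one group are arrested three times as often.
    p_arrest = np.where(ethnicity == "black", 0.30, 0.10)
    in_database = uses_cannabis & (rng.random(n) < p_arrest)

    # The database contains only arrests, so the apparent "base rate" per group
    # reflects policing, not consumption.
    for group in ("black", "white"):
        print(group, round(in_database[ethnicity == group].mean(), 3))   # ~0.03 vs ~0.01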

The correct solution here is to use a dataset that is not biased, but it's hard to tell whether a dataset is biased.


The data can't tell you how the groups differ, since you can't tell the difference between criminal behavior and policing behavior. So you have to add some priors. The most progressive approach is to assume that there are no intrinsic differences between protected groups, and any difference in the data is the legacy of past discrimination.

You can add such a prior by adding a term to the loss function that penalizes any difference between the way the groups are treated. The math isn't hard, only the political decision of what is protected and what isn't.
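For instance, a minimal sketch (hypothetical names, plain numpy, not any particular library's API) of a logistic loss with such a penalty term; lam and the definition of the protected groups are exactly the political choices mentioned above:

    import numpy as np

    def penalized_loss(w, X, y, group, lam=1.0):
        p = 1.0 / (1.0 + np.exp(-X @ w))                    # predicted probabilities
        bce = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
        # Penalize any gap between the average score the two groups receive.
        gap = abs(p[group == 0].mean() - p[group == 1].mean())
        return bce + lam * gap

Minimizing this with any off-the-shelf optimizer pushes the average treatment of the two groups together, at some cost in raw accuracy.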


> assume that there are no intrinsic differences between protected groups, and any difference in the data is the legacy of past discrimination.

I don't see why we should assume that this would reflect reality.

If a law has racist roots, i.e. if it is written to target a particular ethnicity, then we should expect that certain ethnicities really do break that law more than others.


So... If you don't like the data, change the character of the data?

I get what you're getting at. If the data collection is biased, massage the model acting on the data to compensate. Got it.

...but if you're uncomfortable with the consequences of unleashing a Machine Learned system trained on your data as part of your decision making process from the get go, why in the heck would you want to anyway? There is no benefit to doing so when you realize that the system is only as good and as unbiased as the sum of the biases of the data architects, data populators, and data interpreters combined, and that extra loss factor you added is, in fact, another bias. An unacceptable one to the minds of many, no matter how disagreeable one may personally find it.

Better to just leave it in human hands to deal with on a case by case basis by that point. At least by doing that we stay ultimately in control of the overall decision making process.

People being in the loop is not a bug. Trying to solve social/political problems with poorly thought out applications of inexplicable technological systems is.


> An AI trained using this data would assume a higher base rate of cannabis consumption by black people

This is a very good point, but it's not enough to ensure your input data aren't reflecting existing biases in society. (I believe it's necessary, but not sufficient.)

In my other comment I gave the example of maternity leave. A woman who becomes pregnant won't be as productive that year. It's no-one's 'fault' that this is the case, and it doesn't reflect anyone's bias. It's still important to ensure, when making hiring decisions, that applicants are not eliminated on the grounds of 'high pregnancy risk'.


No, that doesn't sound right. It doesn't matter which particular algorithm you're thinking of. The issue here is at a different level.

Let's consider an AI that weighs in on hiring decisions. Let's consider what it might make of maternity.

Decades ago, it wasn't unusual for a prospective employer to ask a female applicant whether she was planning on getting pregnant. Society decided that wasn't acceptable, and made the practice unlawful (as well as forcing employers to offer paid maternity leave), even though, from a profit-seeking perspective, that is valuable information for the prospective employer to have.

Let's suppose an AI has been trained on a dataset of previous hiring decisions, and some measure of how much value those employees went on to deliver. Let's suppose it's also told the age and sex of the applicants. It might notice that female applicants within a certain age range, have a tendency to have a long period of zero productivity, some time after being hired. Having noticed this correlation, it would then discriminate against applicants matching that profile.

But wait, I hear you think: why did we provide it data on sex and age? Remove those data-points and the problem goes away. Not so. There could be non-obvious correlations acting as proxies for that information. The system might detect these correlations.

In a parole-board context, suspects' names and addresses could be proxies for ethnicity, which could correlate with recidivism rates. The 'correct' behaviour for a correlation-seeking AI is to detect those correlations, and then start to systematically discriminate against individuals of certain ethnicities, even when their crime and personal conduct is identical to another individual of a different ethnicity.

Back to our maternity leave example, then. It's an interesting exercise to come up with possible correlations. The ones that occur to me:

* Given names (if provided to the AI) generally tell you a person's sex

* Given names also fall in and out of fashion over the decades, so there's a correlation with age

* The topics a person chose to study at university, can correlate with the person's sex, as well as their age, as some fields didn't exist 40 years ago

* The culture a person is from, may impact the statistically expected number of children for them to have, and at what age. Various things could act as proxies for this.
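One rough way to audit for this (a sketch with made-up proxy features, not a real pipeline): try to predict the removed attribute from the remaining columns. If that works well, dropping the column did not actually hide the information:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 10_000
    sex = rng.integers(0, 2, n)                  # the attribute we intended to remove
    # Stand-ins for proxies like given name or field of study, loosely correlated with it.
    proxy1 = sex + rng.normal(0, 0.5, n)
    proxy2 = 0.7 * sex + rng.normal(0, 0.8, n)
    X = np.column_stack([proxy1, proxy2])

    clf = LogisticRegression().fit(X, sex)
    print("hidden attribute recoverable with accuracy:", clf.score(X, sex))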


Why are we providing the names and addresses as input to the neural network? Shouldn't they be anonymized?

Humans may also act on the same information. The fact that you are less likely to be called if you have an Arabic-sounding name in Europe is well documented.

At least with neural networks we can remove those variables.

I guess another problem could be that our current distribution is biased, so the system might give an advantage to new grads from university X because a lot of old employees are from it, which is an inherent disadvantage for people from traditionally black university Y.

But hiring based on alma mater is already a common practice.


What are the algorithms that can dismiss “indirect correlation”? How can you tell a direct from an indirect correlation in data?


> What are the algorithms that can dismiss “indirect correlation”?

Anything that doesn't get stuck on local maxima does.

> How can you tell a direct from an indirect correlation in data?

There is another variable, with stronger correlation, that explains the same change.
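Concretely, that is more or less a partial-correlation check. A toy numpy example, where z is the directly relevant variable and x only correlates with y through z:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 50_000
    z = rng.normal(size=n)              # directly relevant variable
    x = z + rng.normal(size=n)          # indirect proxy: related to y only via z
    y = 2 * z + rng.normal(size=n)      # outcome actually driven by z

    print(np.corrcoef(x, y)[0, 1])                      # clearly non-zero
    resid = y - np.polyval(np.polyfit(z, y, 1), z)      # remove z's contribution to y
    print(np.corrcoef(x, resid)[0, 1])                  # roughly zero

Once the stronger, direct variable is accounted for, the proxy has nothing left to explain.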


Citation please?


Prejudice is using a proxy to approximate something else. It's inherently inaccurate, like trying to find the fastest vehicle and then listing them by price tag.

The problem for AI is these proxies can seem to be very useful. Want to know how fast a vehicle is, well price seems like a really good approximation when you start. Except, the more information you have the more likely price is going to fool you.


The boundary between informed and uninformed decisions becomes blurred, doesn't it? Very interesting.


We are humans, living in a human society, with human values and purposes, and the associated processes and conflicts, at every point of social scale. Furthermore, we are a process, with a past and an unguessable future, at every point of social scale.

It is patently an act of incredible destruction to increasingly, globally and inescapably “control” important domains of human lives, relationships and society with extremely superficial centralized mathematical models of what humanity is.

There will be unintended consequences. They will be horrifying, and we may not even grasp what we've lost.


>> Why on earth would we expect AI decision making to display a lack of prejudice when human decision making is suffused with it?

According to the article, that is how AI decision making is presented by "technocrats everywhere". For instance:

Empiricism-washing is the top ideological dirty trick of technocrats everywhere: they assert that the data “doesn’t lie,” and thus all policy prescriptions based on data can be divorced from “politics” and relegated to the realm of “evidence.”

The article is presenting, and supporting, the opposite opinion to that of "technocrats everywhere".


Maybe the writer should be understood here as criticising the hype around “AI”. AI is hyped as somehow allowing us to transcend our current approaches and find new patterns in the data, but what it really does is make us dig our heels deeper into our existing methodologies.


> "Teach AI using our behaviour, AI learns our behaviour. A bit like our children. I'm genuinely confused as to the alternative."

The ultimate goal of teaching, however, is teaching how we arrived at a given behavior. And amongst the most prominent themes in this is dealing with new situations and adapting to change. Teaching is not about imitation, it's about transcending the example.


I feel like you are using the word transcend to suggest this very thing - that empirical data will reflect our political and cultural aspirations. We believe our children and our technology are both blank slates onto which we can pile our hopes for the future, but neither are. We are all children, and our technology is just a reflection of us.


Ever hear the phrase "truth from the mouth of babes"? ML systems are a lot like children, and like children, frequently state things that are both true and impolite. Sometimes, though, these things need to be said.


Conservative in this sense means something like 'resistant to deviation from established norms'. I think a lot of the headline-only readers take conservative to mean 'of the character of a specific political movement' which ironically seems more activistic than change resistant.


Perhaps you could call it "integral controller" (like the I in "PID controller")? Because systems that have a memory behave like ones, and we are definitely in a feedback loop with those systems.

And, from what I remember from my control theory classes, the integral part of a controller introduces lag, inertia, generally making the output more resistant to input changes.

(Also note that the "non-conservative" Proportional and Derivative components in a PID by definition don't learn - they react to input and its change.)
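To make the analogy concrete, a textbook PID update step looks roughly like this; only the integral term carries state between calls, which is where the lag and inertia come from:

    class PID:
        def __init__(self, kp, ki, kd):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0        # accumulated history of the error (the "memory")
            self.prev_error = 0.0

        def step(self, error, dt):
            self.integral += error * dt                      # I: remembers the past
            derivative = (error - self.prev_error) / dt      # D: reacts only to change
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative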


That's a pretty interesting corollary! I suppose I could extrapolate that in such a system, too much change over too little time would invite a strong conservative response / lag to future input.

Our 20 year old millennium has seen a tremendous rate of change in both technology and social norms. Perhaps politics has some similar inertial dynamics.


But isn't “resistant to deviation from established norms” at least partly what conservatism means politically?


On paper, yes, but in practice, there is significant lag time between a new norm being established and conservatives giving up the old one in favor of the new normal. For example, being against gay marriage (as opposed to it being a non-issue) is still seen as the conservative opinion despite pro-gay-marriage being the plurality opinion for eight years and an outright majority for six[0]. Further, since economic deregulation is seen as the conservative stance, being in favor of abolishing the minimum wage labels someone as a conservative, despite it being nationally established for seven decades.

[0] https://www.pewforum.org/fact-sheet/changing-attitudes-on-ga...


Mostly only in America does it have this association. I read it as a bias towards not changing, unless it's something clearly related to the US political system.


The article calls machine learning "conservative" because it only tells us what is, and not what should be. I don't think that's a useful framing. It's more accurate to say that, like all statistical techniques, machine learning is descriptive, not prescriptive. Not everything in the world has to be prescriptive.


The article is not addressing what machine learning can and cannot do but how individuals and organizations conceptualize what machine learning can and cannot do.

It, as the current top comment mentions, is a philosophical rather than technical argument.

It doesn't matter if all machine learning can do is describe the current state of things if it is presumed by those interpreting the output to provide insight into some great and universal truth.

Most science degree programs have a philosophy of science course as a degree requirement. As an engineering faculty member I have found huge resistance to an equivalent course becoming part of engineering or computer science degrees. Those degrees, instead, take an approach to scientific/technical knowledge that is less science and more religion.

The comments here are a reminder why such courses have a place.


It is not just a philosophy course which is missing from software engineering degrees. Software has real impact on individuals and societies that use them. Therefore, software engineers must necessarily be taught about the interface of psychology/sociology with software and the ethics surrounding this interface. The world would be a better place for it.

Please read this thread https://twitter.com/yonatanzunger/status/975545527973462016 "computer science is a field which hasn't yet encountered consequences."


> Not everything in the world has to be prescriptive.

Unfortunately we're still on the rise of the current hype cycle around Machine Learning, and prescriptive solutions are precisely what nearly every company that uses the phrase "machine learning" is promising.


I think recasting prescription from "so you should" to "and here are the inferences now available" can be reasonable. The issue is when the inferences from discovered patterns are invalid and implicit - as with predictive policing or follow-around ads on the web. An inference such as "this is almost certainly the component that is causing the most faults and will cost the least to replace" is reasonable and valid to me, and so a machine offering it would be valuable.


Maybe the pure science of ML is descriptive and not prescriptive. But if the rest of the world uses ML prescriptively in application, then in practice it's basically prescriptive.

It's a bit like dictionaries: they "learn" the current meaning of words. It gets annoying though when everyone starts running around policing the meaning of words based heavily on that dictionary: it would conservatively restrict the natural evolution of word meanings, when we know meaning of words do change over time, and sometimes should change over time.


We generally want ML to imitate us (or human generated data), thus we teach it to do so. But there is also generative ML (like GANs) and simulation-based ML (like AlphaGo), which can be more creative. There is nothing stopping us from letting agents evolve in a complex environment and be creative. It's just not commercially useful to do that yet. Doctorow writes like he doesn't understand much of the math behind ML, yet has strong opinions on it.

Every time a random number is involved in the training process (stochastic selection of mini batches, noise injection, epsilon-greedy action selection), a form of exploration is taking place, finding novel paths of solving the problem at hand. The mixing of noise with data is paradoxically useful, even necessary in learning.
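Epsilon-greedy selection, for example, is about as simple as that kind of exploration gets (illustrative sketch):

    import random

    def epsilon_greedy(q_values, eps=0.1):
        if random.random() < eps:
            return random.randrange(len(q_values))                    # explore: random action
        return max(range(len(q_values)), key=lambda a: q_values[a])   # exploit: best-looking action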


> ... finding novel paths of solving the problem at hand ...

This only applies to reinforcement learning (and similar). Classification problems (and similar) are fundamentally conservative, because you feed them with a finite set of preexisting training data that they must extrapolate from.


Sure, but no matter what, you can't derive an "ought" from an "is". At the end of the day we merely tell these algorithms what is the case. No matter what goes on inside, they cannot output moral prescriptions.


>Sure, but no matter what, you can't derive an "ought" from an "is".

The same applies to humans, no?


No it doesn't. When a human gets wronged by what "is" they can likely feel or imagine a better "ought".


They can feel or imagine an "ought", but the point of Hume is that they can't argue for it.


And yet humans do, all the time. It's similar to Hume's attack on causality. You can't show that A caused B, but yet we all act like it's the case, when B always follows A. Kant's critique comes next.


You don't think machines can learn to classify situations as good or bad for them?


> It's just not commercially useful to do that yet.

Some people are working on it : https://news.ycombinator.com/item?id=21728776 (dec 2019)


Examples are good and on point, the conclusion is not. When he tries to frame it all in some grand political/quasi-philosophical manner, it becomes outright wrong and stupid, but I won't argue with that part, because it won't be useful to anyone.

What I want to point out is that nothing he says should be attributed specifically to "machine learning". Machine learning is a set of techniques to make inferences from data automatically, but there is no implicit restriction on what the inferences should be. So machine learning is not "conservative" — almost all popular applications of it are. There is no inherent reason why an ML-algorithm should suggest the most similar videos to the ones you watched recently. The same way you can use (say) association learning to find most common item sets, you can use it to find the least common item sets with the given item, and recommend them instead. Or anything in between. But application designers usually choose the less creative option to recommend (understandably so) stuff similar to what you already got.

Sometimes it's ok: if the most popular thing to ask somebody to sit on nowadays is "my face" it's only logical to advise that; I see nothing wrong with this application. But many internet shops indeed could benefit from considering what a user has already bought (from this very shop), because it isn't likely he will want to buy a second similar, but different TV anytime soon. Or, when recommending a movie, you could try to optimize for something different than "the most popular stuff watched by people that watched a lot of stuff you did watch" — which is a "safe" thing to recommend, but at the same time not really interesting. Of course, finding another sensible approach is a lot of work, but it doesn't mean there isn't one: maybe it could be "a movie with an unusually high score given by somebody, who also highly rated a number of movies you rated higher than the average".
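To illustrate (hypothetical item vectors, not any real recommender's API): the same similarity scores can serve the safe query or its opposite, depending on which end of the ranking you read from:

    import numpy as np

    def recommend(item_vecs, liked_idx, k=5, mode="similar"):
        v = item_vecs[liked_idx]
        sims = item_vecs @ v / (np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(v))
        order = np.argsort(sims)             # ascending: least similar first
        if mode == "similar":                # the usual "safe" choice
            order = order[::-1]
        order = order[order != liked_idx]
        return order[:k]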


The point is that ML today is based on pattern recognition and memorizing a stationary distribution.

This stationary distribution is the source of the conservativeness and is central to the algorithms that we call "machine learning". ML always tries to replicate the distribution when it makes inferences, so it is fundamentally biased against changes that deviate from that distribution.

The Future and the Past are structurally the same thing in these models! They are "missing" but re-creatable links.

AI is the broader term, but in pop culture AI and ML are very much synonymous.


Well... no, not really. It is kinda hard to discuss in general, because it depends so much on the details: the application and the algorithms in question. But there is nothing inherently conservative about ML algorithms.

I see why you assume that "stationary distribution is the source of the conservativeness", so maybe I should clarify this point. It is kind of true in the most general sense: sure, when querying the stationary distribution we can only ever ask how things are in a timeless universe. How can anything new be obtained this way? The problem is that if we are this general, then the word "conservativeness" loses any meaning, since everything in the [deterministic] Universe can be framed as a question of "how things are", everything is conservative, nothing new can be obtained anywhere, ever.

And we don't even need to get this general for the word "conservativeness" to lose practical sense. When you ask another human for advice, all he ever does is, in essence, pattern recognition and querying his internal database of "how things generally are to the best of his knowledge". Yet you don't call every human advice ever "conservative": only the kind of advice that recommends the safest, most popular thing, the thing that everybody likes, pretty much ignoring the nuance of your personal taste. In fact, even then, you call it "conservative" only if you can notice that, which means that the recommendation isn't new for you personally (and by that criterion alone most humans would lose to a very simple, absolutely currently implementable music recommendation algorithm, since they probably know a much smaller number of "artists similar to what you like" than Spotify knows: the only thing Spotify has to do to win is not to recommend the most popular choice almost every time).

One more thing. I could probably convey to you that assuming "ML = conservativeness" is wrong much faster by invoking the example of reinforcement learning, since it is sort of intuitive: there is the obvious existence of "time" in that, you can imagine a "dialogue" where it adapts to what user wants using his feedback, etc. It is easy to see how it could behave "not conservatively". I intentionally avoided doing that, since it can lead to the false impression that RL is somehow different and less conservative than other algorithms. On the contrary, the point I'm trying to make is that every algorithm, even the purest form of memorizing the stationary distribution (like Naïve Bayes) is not inherently conservative. It all depends on what exactly you ask (i.e. how you represent inputs and outputs) and how you query the distribution you have (e.g. how much variability you allow, but not only that).

So, when you see the application that uses ML algorithm and behaves "conservatively", it isn't because of the ML algorithm, it is because of the application: it asks the ML algorithm wrong questions.


I felt the same way about this essay’s conclusion. It’s like he took all of the substance out of Ted Kaczynski’s arguments and repackaged it as a diatribe against ML. Except it’s not convincing in the slightest.


Thank you. “Conservativeness” is an inductive bias that is simple, often helpful, and more often used, but not fundamental.


While Cory is talking about a slightly different facet of conservatism, I found it quite ironic that he made a rather conservative statement himself:

>Nor is machine learning likely to produce a reliable method of inferring intention: it’s a bedrock of anthropology that intention is unknowable without dialogue. As Cliff Geertz points out in his seminal 1973 essay, “Thick Description,” you cannot distinguish a “wink” (which means something) from a “twitch” (a meaningless reflex) without asking the person you’re observing which one it was.

Was this mathematically proven? It's definitely an interesting statement, since a lot of "AI" systems try to predict intention and do a piss poor job of it, but to quote that the anthropological ancestors have proclaimed for eternity that a computer can never know even the slightest fraction of intention from just observation seems hypocritical.


This was one of the stupidest parts of a fairly weak essay. Of course you can distinguish a wink from a twitch without asking, otherwise we wouldn't use winks for surreptitious communication.


As an anthropologist you are, in theory, studying humans from within a shared cultural context, and it is easy to argue that a machine which hasn't even the slightest semblance of a shared cultural context would not be capable of distinguishing a twitch from a wink with 100% reliability. Not all human gestures are universal either, so even if you trained your AI to read faces it would likely be trained on a small subset of human cultures and would produce varying results. In regard to the greater question of knowing intention, I'd agree that algorithms do a much poorer job of guessing my intention than me directly communicating it, and I am annoyed that so much of our online communication is filtered by algorithmic social engagement that encourages shallow interactions.


This sounds like the classical "intelligence being something people have" argument.

It is futile to talk about 100% correctness in this field.

> ... machine which hasn't even the slightest semblance of a shared cultural context ...

Why is this universally true?

> ... it would likely be trained on a small subset of human cultures and would produce varying results.

As it has when you travel to foreign countries.

There is no such thing as understanding artificial intelligence. There is a task of understanding _intelligence_ and attempting to implement it without giving birth.


In Greece, if I show my index and middle finger with the nails facing towards my interlocutor, it means "two". In the UK, it means "up yours". In the UK, if I show my open palm with fingers extended, it means "five". In Greece it's a rude gesture that means "you are an idiot" (actually: a malakas).

And don't get me started on supposedly universal facial expressions like smiles. Every time I see a US citizen in a posed picture, I understand that they are supposed to be smiling, because it's a portrait etc, but what I perceive is an aggressive and half-mad rictus. [Edit: so as not to make this a thing about US people: British smiles I perceive as fake; Balkan smiles, as lewd and provocative; smiles of people from the subcontinent as forced and subservient; the only people whose smiles I recognise as smiles are Mediterraneans and Middle-Eastern folk, who smile like the people back home. But I always remind myself that the way people smile, or generally arrange their face, is just how they have learned to arrange their face to express certain feelings, which is a different way than I have learned to arrange my face to express the same feelings. I probably look weird to them, too. Actually- because I smile with my eyes more than my mouth, I've been told in the past that I'm rude because I "don't smile". Wot! But I...]

There's no use assuming cultural consistency of winks, nods, facial expressions and gestures. A wink is a wink, except when it isn't, and a twitch is a twitch except when it isn't.

So, yeah, the only way to know what someone means by a gesture or a facial expression is to ask. And even then you can't be sure.


Humans can misunderstand things, even with dialogue. Especially if there is an adversarial relationship.


Neither Geertz nor Doctorow are making an assertion about distinguishing winks vs. twitches, and the assertion they are actually making is the opposite of stupid.

They are pointing out that the system under observation, the system of interest (a person, a conversation, a society) is extremely complex and when it emits a signal it is not possible in the general case to know what that signal “means” without some understanding of the system’s state and processes.

We are not capable of directly interrogating, representing or computing over the true state and processes of these systems, so we must use heuristics. Humans develop “theory of mind”, the cognitive ability to model other people's cognitive state and process, including their human values and motivations. People with a better theory of mind, which makes more accurate, more fine-grained inferences in a wider variety of contexts, have a decisive social competitive advantage.

This doesn't even touch on “theory of relationship”... when you take a system of humans, what are their shared human values? How are value conflicts negotiated in groups at every point of scale? What is the current state of the negotiated compromises?

A dataset and a machine learning algorithm is an observer, inferrer and predictor of human and social systems whose actual state and processes are irreducibly large and invisible, and whose human values (and the domain of possible human values) are mysterious.

Even if we gloss over the symbol grounding problem, it is clear that machine learning systems cannot be expected to operate in a way that is aligned with the human values of individual or groups.

In the worst case they may operate to extinguish those values, and as an unintended consequence hugely diminish quality of life.

And maybe we won't notice, because surely there is an “Overton window” of imaginable possible lives, and ML has the potential to slowly and insidiously shift that window until “being happy and fulfilled” becomes “being in the centre-right of the main distribution” across the major ML models in society.

And yes, I do realize that similar normalizing forces have been present in society forever, operating with similar effect. (And I recognize their value in reducing transaction costs of all kinds). But to date they have been based on human-driven mechanisms which lacked the ability to centrally micro-observe and micromanage behaviour on a global scale. Alternative value systems could evade these pressures and develop, compete and co-exist, at every point of social scale.

But with the advent of the global Internet and sufficient computational throughput, ML has the potential to micro-observe and micromanage everyone everywhere. It can know more about us than we know about ourselves, while entirely missing the point of being human.


Lots of these things have been written about AI. Kasparov wrote a decade ago that poker would be out of reach of computers because bluffing was inherently human.

That statement was easy to refute mathematically, but I don't know about Geertz. That seems more vague.


>> Was this mathematically proven?

How would you prove that "mathematically"? Do you mean "scientifically proven"?


He specifically used the example of predictive policing. I have a question: is it racist to send police patrolling areas of high crime, if those areas have a majority ethnic demographic? Should that information be discarded, and all places patrolled equally? Should they be patrolled less? It seems like there is no winning. You're either wasting resources, ignoring a problem, or being perceived as racist.


The problems of appropriate policing in so-called "high crime" neighborhoods are well understood. Many academic studies, as well as popular non-fiction like Jill Leovy's Ghettoside and Chris Hayes' Colony in a Nation, have discussed the issue. To sum up the literature in a few sentences, the problem is that minority neighborhoods are both overpoliced and underpoliced. There are a lot of useless arrests for minor crimes like jaywalking which makes the residents of these neighborhoods hostile to the police. (These arrests are driven by the debunked theory of broken window policing.) Simultaneously, there's not enough effort put into solving serious crimes like murder. In this context, the actual effect of predictive policing is that it ends up doing more of the useless over-policing, which unfortunately makes these neighborhoods even worse.

So, how does this relate to your question? The point is that predictive policing is solving the wrong problem. What's needed are not more accurate neural nets predicting crime, but techniques for addressing the underlying sociological factors that cause crime.

Taking a step back and speaking broadly, Cory's point is that the focus on data and quantitative analysis is causing problems in two ways: (i) people are using quantitative methods to solve the wrong problems, and (ii) they seem to be oblivious to (and in some cases actively hostile to acknowledging) the harms being perpetrated by their methods. Both of these problems seem to be driven by a lack of understanding of well understood (but non-quantitative) social science literature.


Well, yes. That's the point. This form of AI will automate existing prejudices and misunderstandings. We're not really dealing with AI in any useful sense, but with automated bad policy biases, and the products and systems this generates should be labelled as such.

Corollary - Big Data is political, but it can hide behind the pretence of objectivity.

This shouldn't surprise anyone, but for some reason it does - perhaps because we're used to thinking of AI as the automation of the scientific method, when in fact it's just as likely to be the automation of all kinds of other things, some of which are nasty, stupid, and wrong.

A hypothetical smarter-than-human AGI may be able to talk back and say "You're being stupid because you're not modelling the correct problem, and so your attempted solution is wrong."

But that assumes hypothetical smarter-than-human AGI is also more-ethical-than-human - which unfortunately might be a bit of a reach.

Bottom line - the limits of AI are set by the limits of human political intelligence. And since most human political systems operate at pre-scientific policy levels - the exception being the abuse of social and personal psychology to gain power - this isn't encouraging.


> There are a lot of useless arrests for minor crimes like jaywalking which makes the residents of these neighborhoods hostile to the police. (These arrests are driven by the debunked theory of broken window policing.) Simultaneously, there's not enough effort put into solving serious crimes like murder.

Slightly off topic, but I think that the solution to this is some kind of quota and/or tier system for laws and policing. Basically, restrict the amount of policing that may be done for things like jaywalking and traffic violations while the violent crime rate is above a certain threshold.

I have been held at gun point twice in my life as part of armed robbery and hijacking and neither time did I even consider for a moment that the police would find the people who did it. But if I do 60 km/h in a 40 km/h zone they will follow me to the end of the earth to get me to pay them.


Broken windows policing is not "debunked". A lot of people would prefer for ideological reasons that it not work, but it does. Anyone who lived in NYC can tell you that.


Speaking of NYC specifically, a lot of cities across the US, as well as worldwide, experienced a major decrease in crime at the same time that NYC did. Most of the cities were not practicing broken window policing. That's one reason to be skeptical.

From an academic perspective, it is true that there is some debate about the efficacy of broken windows policing, but even the most supportive academic studies find only a small correlation between violations of "order" (like jaywalking and graffiti) and more serious crimes. There just isn't any evidence at all that the way to reduce serious crimes is by going after jaywalkers.


It depends.

Here are three different examples that would meet your description, with very different answers:

1) The X community has exactly the same, or lower, crime rates than the entire nation. However, nationwide anti-X sentiment means that despite this, most convicted criminals are from the X community. This makes it look like the regions where the X community live are high-crime regions.

This is “accidentally racist”. Researchers know about this problem.

2) The X community is more prone to crime, and as this is purely an example, it just is and there’s no need to justify that.

Extra police in this scenario is not racist, though I suspect anyone who jumps right in and assumes it to be true about the world might well be racist themselves.

3) A confounding variable, such as income or education, means that members of the X community are more likely to commit crimes than the general population, albeit at the same rate as the equivalent income/education/confounding subset of the general population.

In this case, while it would not be racist to send in more police, it would totally be racist if politicians applied unequal effort to solve the confounding variable within the general population as compared specifically to group X.


"Patrolling" is pretty vague, the concern is primarily about groundless stopping, searches, arrests and shooting of suspects. It is discrimination if e.g. groundless searches are done at higher rates on people with a certain ethnicity, since presumed innocent individuals of any ethnicity should be treated equally and with the same rights.

Stopping cars just because of the skin color of the driver is not "perceived as racist", it is racist, since you are treating people worse due to their race.


How do you objectively determine "areas of high crime"? That's a core problem. We don't have data on the actual incidence of crime, only data filtered through the justice system.


I suspect that there is indeed no winning, and that under such circumstances every possible course of action is likely to be perceived as racist by some number of people. This is because the population at large doesn't seem to be able to agree on what the word actually means (https://slatestarcodex.com/2017/06/21/against-murderism/).

Regarding ML based predictive policing specifically, it seems like a spectacularly bad (but efficient and statistically effective!) idea to naively apply a machine learning based classification approach to such a problem.


Sure, many of the examples ring true.

However, I would refine the claim "Machine learning is fundamentally conservative" to say "Reducing a distribution of predictions to only the most likely prediction is conservative."


Also similarity is conservative. There are alternatives.

For a product/content recommendation you could include the product that's maximally different, the second most similar, you could exclude products within the same category, you could choose randomly and weight by conversion rate. We need more creativity and experimentation here I believe.


> you could include the product that's maximally different, the second most similar, you could exclude products within the same category, you could choose randomly and weight by conversion rate

This is interesting. Has anyone tried that or knows of a website that uses this?


I have personally implemented discarding the top-n (small n, like 3-5) most similar results in a recommendation system as well as adding a totally random item in the results but it was a small scale system.

The top-n removal was a small but noticeable amount better. That was in part due to the specific content in the database as well as the audience, but it's a thing to try I believe.
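Roughly what that looks like (function and variable names are hypothetical, assuming a similarity-ranked candidate list has already been computed):

    import random

    def diversified(ranked_candidates, catalog, n_skip=3, k=10):
        picks = ranked_candidates[n_skip:n_skip + k - 1]   # drop the top-n near-duplicates
        picks.append(random.choice(catalog))               # mix in one wildcard item
        return picks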


The exploration versus exploitation tradeoff is well studied in bandit algorithms and reinforcement learning in general.


How would Amazon determine what's maximally different from a tv or pair of shoes? Is there a consensus way that humans determine maximum difference?


This is a design question. One way: frame it mathematically; map products into a space and use distance as a proxy for difference.

Another way: think of common human categorizations. Opposites are usually defined in terms of these categories.


I think it is more fundamental that models are trained using existing data. An autosuggestion will suggest things people have written before. Providing a larger number of predictions doesn't change that this is inherently conservative.


Re: "I think it is more fundamental that models are trained using existing data."

Yes, I agree, but this does not have to be the final word. There is a lot of room for design. For example, an autosuggestion tool can trade off between exploration (randomized recommendations) and exploitation (pure collaborative filtering).


It isn't /fundamentally/ conservative, it is just typically programmed to choose the most conservative (highest probability) predictions. You could integrate a liberal aspect by fuzzing the decision process to choose from lower probability predictions.

More creativity, and ability to escape local minima, but at some cost when dealing with 'typical' cases and when making particularly damaging mispredictions.
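One common way to fuzz exactly that choice is temperature sampling over the model's scores instead of always taking the argmax (illustrative sketch; higher temperature = less conservative):

    import numpy as np

    def sample_prediction(logits, temperature=1.0, rng=np.random.default_rng()):
        z = np.asarray(logits, dtype=float) / temperature
        p = np.exp(z - z.max())               # numerically stable softmax
        p /= p.sum()
        return rng.choice(len(p), p=p)        # sample instead of taking argmax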


I think the point is rather that you can't get a more useful prediction by choosing a lower-probability prediction unless you have AGI. Only an AGI could tell that you're not in the mood for "Hey" to be followed by "darling", and only a superhuman AGI could realistically compensate for human bias in data sets.


Without AGI there are still cases when the lower-probability prediction will be better, and will lead to escaping a local minimum. I'd argue that the potential benefits of calibrating that axis dynamically exist with or without AGI.


Are you describing the explore/exploit tradeoff or simulated annealing in this case?


Fuzzing will just give you more random predictions, it doesn't remove any inherent biases in the training data.


Mom: If all your friends jumped off a bridge, would you, too?

ML Algorithm: yes!!


If all my friends are jumping off a bridge, it's probably bungee jumping.

I mean they could have all simultaneously gone insane, or be infested by alien mind parasites, but going bungee jumping is the much more likely reason.


Maybe the bridge was on fire?


https://www.xkcd.com/1170/

It's not often that an XKCD is such a good answer.


This is a problem in almost every academic field right now. Peoples’ lack of sophistication with understanding the mathematical/philosophical constraints of their tools is incredibly scary.

For example, people throw around the low labor productivity stat all the time to prove that no automation is happening, not realizing that GDP is the numerator of that stat. Well, GDP is a pretty terrible gauge of who is benefitting from technological innovation, as it is distributed incredibly unevenly, probably more so than income even. The problem with automation isn’t that it’s not happening, it’s that only a very small number of people are capturing the wealth that it is generating. Also, it is generating a small number of professional jobs, but mostly the jobs it is generating suck (Uber driver, AMZN warehouse worker, etc).


Likewise, looking only at nominal wage increases is a “pretty terrible” way of gauging who benefits from increased wealth. If improvement in productivity (automation) results in lower prices (or better or more goods for the same price), that increases the purchasing power of a worker’s earnings, even if their wage has not risen.

In other words, a worker is richer if they can buy more and better stuff even without any rise in their wage.

The fact that everyone can afford a smartphone, including those with jobs that “suck”, does not reconcile with “only a very small number of people are capturing the wealth that it is generating.“


This argument forgets that where people spend most of their money (housing, education, healthcare), costs are rapidly rising in real terms and have not followed Moore's law in usefulness.

Yes technology has made cool gadgets for consumption, but that isn't what wealth is about. From Wikipedia: "Wealth is the abundance of valuable financial assets or physical possessions which can be converted into a form that can be used for transactions."

The average worker hasn't seen a real pay rise in 40 years, and their ability to build wealth through savings has been greatly diminished.


There is no data that supports the assertion that the average worker hasn't seen a pay increase in 40 years. Increases may not have been as great, but there have been increases. Whether you look at Fed numbers, CBO numbers, or Census numbers, all show an increase in real income.

https://en.wikipedia.org/wiki/Household_income_in_the_United...

Between 1979 and 2011, gross median household income, adjusted for inflation, rose from $59,400 to $75,200, or 26.5%. This compares with the Census' growth of 10%.[19] However, once adjusted for household size and looking at taxes from an after-tax perspective, real median household income grew 46%, representing significant growth.[20]

https://en.wikipedia.org/wiki/Household_income_in_the_United...


CPI isn't being measured properly. What if all of these numbers that you cite have been gamed by a bureaucracy desperate not to advertise its failures? Every time I dig into a government stat I end up finding out how badly it misses the mark. This whole article is about how we trust our own data too much, and your argument is, "But that's not what the data says".

Instead of looking at nominal income (which was a smaller part of people's wages decades ago), look at real wage growth (https://www.pewresearch.org/fact-tank/2018/08/07/for-most-us...). If we take into account all the benefit cuts, the fact that college has become a necessity, that pensions are dead, that healthcare inflation is insane, and that real housing inflation has been substantial, then you can begin to understand the situation. Income is only part of the equation. Yes, income has gone up, but we traded it for the mother of all benefit cuts.


It's actually simpler than that, median household income != median wage. Here's the chart the parent wants. Simple explanation is that there are more dual income households.

https://aneconomicsense.files.wordpress.com/2015/02/going-fr...


You either use the data, or you are just telling stories to fit your worldview. You carefully avoid stating it, but even the data in your Pew link shows a pay increase, albeit small.


Or the third option, you do the hard work of digging into the data yourself to see what it actually says, rather than reiterating what the bureaucracy tells you it means.


So basically, improvements in productivity are only benefitting people in areas where productivity has improved? That seems more like a tautology to me than something which can be fixed by gnashing teeth about the distribution of wealth


How did you get from

> The average worker hasn't seen a real pay [rise] in 40 years

to

> So basically, improvements in productivity are only benefitting people in areas where productivity has improved?

?

Worker productivity has increased enormously in the last decades.


I didn't? Obviously, I got that from the "housing, education, healthcare, costs are rapidly raising in real terms and have not followed Moore's law in usefulness" part...


Improvements in productivity are only benefitting people who own a critical mass of capital and can therefore dictate where the return on productivity gains go.


Incorrect about workers not earning more. The problem is that household sizes are shrinking.

https://www.minneapolisfed.org/article/2008/where-has-all-th...


It depends on what goods you are looking at. Sure, people can "afford" an iPhone but where I live a modest house costs around 14x median full-time male salary. The wealth inequality is from a small number of people capturing all this wealth.

40% of Australia's population live in either Sydney or Melbourne, and these two cities are among the top 5 least affordable in the world [1].

1 - https://www.afr.com/companies/financial-services/australian-...


And yet, they can afford to live in two cities that are among the top 5 least affordable in the world. Is that a sign of them being poor or rich?


Numbers don't match though. It looks like there is physically not enough money to buy everything that is sold. For example:

- There is a ballpark estimate that the sum of all physical money in the world is $90.4 trillion; let's distribute it equally to all 7.53 billion people: we get $12k per person. This is not enough to buy an apartment in most parts of the world.

- There is an estimate that the 1% has $303.9 billion; let's distribute it equally to the US population of 327.2 million people: we get $928 per person. Barely enough to buy a status smartphone.
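For reference, taking those totals at face value, the per-person figures work out as:

    \frac{\$90.4 \times 10^{12}}{7.53 \times 10^{9}} \approx \$12{,}000
    \qquad
    \frac{\$303.9 \times 10^{9}}{327.2 \times 10^{6}} \approx \$929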

So I don't see how wealth distribution can solve anything. Even if we equalize wealth and prices at the same time, the free market will quickly spiral down to similar pricing that we have today (things are barely purchasable, or not at all).


> So I don't see how wealth distribution can solve anything. Even if we equalize wealth and prices at the same time, the free market will quickly spiral down to similar pricing that we have today (things are barely purchasable, or not at all).

Could you explain how you got to that conclusion? It does not seem logical to me at all.

Current prices are in part a result of high inequality: it makes economic sense to sell a house for 10x the median income because there are still groups of people so high above the median that they can afford to pay that price - and of course as a business you'd very much like to sell to those people if in any way possible.

In a society with more equal income, the highest price for which you could find a buyer would likely be a lot closer to the median.


This, in a nutshell, is why I (a highly paid software developer in NYC) would personally benefit from aggressive action on income and wealth inequality. There are enough people in this city who make tens or hundreds of times the salary I make that landlords and sellers are happy targeting them, and I'm competing against the prices they can pay. As a thought experiment: cap my salary to $50K/year and cap everyone else's to $50K/year too and market forces will quickly make me a much more competitive buyer.

I wouldn't even be sad about the money I'd no longer be making, because right now a good chunk is going into rent, and an even larger chunk is going into savings so I can buy a home at NYC prices. So it's not like I get to spend it on other things. If I didn't have the money but I also didn't have to spend that money on housing, I'd be equally well off.

(And all of this is leaving aside that other people would be better off, which is net good for society, etc. etc. Even the first-order effects alone are selfishly valuable.)


I live in NYC too and am in the same situation as you and would benefit from aggressive action on income and wealth inequality too. You're absolutely right in your thinking: income inequality makes the money aggressively flow towards the ones who own most of it.


I'm not sure where your numbers come from, but you're off by 2 orders of magnitude (are you talking income?). According to Wikipedia the wealth of the top 1% is $29 trillion (https://en.m.wikipedia.org/wiki/Wealth_inequality_in_the_Uni...).


The wealth of the top 4 richest people is ~300 billion, he's way off.


I took it from [0]. It does seem like the numbers are very approximate everywhere, but even then, let's take $29 trillion and distribute it between a population of 327 million: we get $88.6k per person, still not enough to buy housing in most parts of the US.

- [0] https://www.businessinsider.com/the-1-percent-dont-know-what...


Only if you think of single-occupancy homes. A married couple with 2 kids would have around 350k USD, peanuts for California but enough to buy a home for the family in many places in the US.


You don't understand what money is. To a first approximation, money is an intermediary unit for converting and exchanging value; it's not a large proportion of wealth by itself.

The world could make do with very little money if it could move faster. Quantity of money times velocity of money times unit of time gives you the total monetary transactions over a time period; if you increase the velocity, then you can decrease the quantity, and nothing else would change.

https://en.wikipedia.org/wiki/Velocity_of_money


Why just physical money?


I have no idea why or what exactly physical money is, since money mostly isn't physical, but it seems to me that it might not disrupt society too much if we redistributed all of the stock in companies in the S&P 500 equally. We already have widely diversified ownership, with index funds and pension funds, so why not take it all the way?

However, the total value of the S&P 500 (which is probably about half the world stock markets value) is about $26 trillion, so that is about $3400 per person or maybe double or triple for a family. And the dividend of 1.75% means an income of $60/year if you hold on to the stock.

So I think doing this redistribution would result in everybody to whom $3400 is a lot immediately selling (assuming they were allowed to). I'm not sure what the distribution of capital would end up being. Someone would buy, say "Barren Wuffet" who fortunately had all their investments in cash, bonds, and small cap stocks. So ownership would end up being concentrated again, but maybe not as concentrated.

Anyone who has a lot of money saved for retirement in a large-cap fund would be unhappy, but why should you be better off than an average citizen of the world?

It seems like a fine experiment to me.

Edit: but just to spell it out, obviously politically impossible given the consequences for first world people who are relatively privileged but don't think of themselves that way. The median IRA balance (note, not the average that is skewed by the rich) is somewhere around $20-30K, so redistribution would affect those people like a 90% market crash.


s/physical/virtually-tangible/g


... that increases the purchasing power of a worker’s earnings, even if their wage has not risen.

If your boss said you're not getting a pay rise this year but that's OK because you'll still be able to buy more would you accept that as a fair and reasonable argument?

No?

Don't use it as an argument to keep poor people's wages low then.


>In other words, a worker is richer if they can buy more and better stuff even without any rise in their wage.

I don't see everyday life improving several billion times over since 1970, even though the number of switches I control in my laptop is several billion times (at least) larger than the number of switches that existed in the world in 1970.


Quality of life is the only thing really worth having. The wealthy have the option to spend their time however they like, great healthcare, a big home etc. All the smartphones and gadgets in the world can't make up for that. Free time and economic security are still things very few people have.


Gadgets aren't a large proportion of most people's expenditure.


But people do spend an extremely large portion of their lives on them


Sorry, but I see a lot of people, kids especially, with expensive smart phones because of status not because they can afford one.


The fact that smartphones, which didn't exist decades ago, are now affordable to nearly everyone contradicts the assertion that productivity gains are captured by a small class/group, and demonstrates why looking only at nominal measures of income gives an incomplete picture of increasing productivity and wealth disparity.


>contradicts the assertion that productivity gains are captured by a small class/group,

It only contradicts the claim that they are fully captured by a small class. It says nothing about the relative amounts, and it doesn't change anything about the wealth disparity picture: those smartphones are cheaper for everyone, but one group has a lot more to spend AND gets the cheaper smartphones too.


No.

Because wealth is defined by claims on other people's productivity, not by the ownership of one particular consumer good.

If you'd said "Everyone now gets a significant proportion of their income from investments and uses some of it to buy a smartphone" you might have had a point.


I think there is a common misconception when it comes to judging Karl Marx's perspective on capitalism: he didn't say that a worker in 2020 would be absolutely worse off than a worker in his day if capitalism continued. Quite the opposite: he said workers would be better off than in his day. What he did predict, however, is that the rich would be even better off still.

If you look at how much of society's gains go to which wealth bracket, and how much each bracket actually does for them, this gap has widened considerably over the last century and has become a real threat to Western values of liberty and democracy, especially because the rules of civilization and democracy no longer seem to apply to those on top.

One doesn't really have to be a Marxist to want to go back to a reasonable wealth/wage gap. One doesn't have to be a Marxist to demand that the rich pay their taxes and follow the rules. I'd argue that every conservative in the true sense of the word should stand behind this as well.


There’s also a booming market for high interest micro consumer loans though.

Many cannot afford these smartphones and gadgets but will take out loans, or buy them on payment plans that look cheap on the surface.

Loans from pretty much fully automated "banks" operating online only. Send an SMS and get $500 to buy that phone!

Further increasing the gap in the process...


People buy smartphones instead of CDs, movie tickets, and new cars.


It's a bit different:

In the 1990s and early 00s, automation happened mostly in manufacturing. The low productivity growth stats that stupid (mainstream) "economists" like Krugman cite to argue against people like Yang or Musk are measured across all labor. But manufacturing makes up less than 1/5 of the US economy. If you look at labor productivity in the manufacturing sector alone, it's congruent with the automation argument [1].

Productivity numbers are lagging because investments in new tech need time to bear fruit. Automation of the service sector has already started and will show in the numbers soon enough. Given that the service sector is four times the size of manufacturing in the US, it will be even better for GDP and even worse for people if again nothing is done and people keep listening to those intellectual frauds who call themselves "economists."

_______________________

[1] https://www.bls.gov/lpc/prodybar.htm


Reminds me of the thought experiment where I can duplicate myself with a robot. Who would benefit? Me, taking an eternal vacation while my robot does my job for me, or my employer, who lays me off and uses the robot instead?

Clearly, if the latter happens, and we are on that path right now, we are bound for a dystopian future.

Old economic models don't work anymore; we need radical changes. If we don't plan them, they will come in the form of either revolution or brutal repression (a la China).


When you extend your thought experiment to the CEO and the management of your company, shouldn't they be replaced by robots too?


And what happens then?

In the limit the only beneficiaries are shareholders, who now own all the wealth and capture all of the productivity.

You then have a choice: either a neofeudal dystopia where a small percentage of the population lives like emperors while everyone else is relegated to a shanty economy where they fight over scraps, or you stop the nonsense and set up a humane economy where all productivity is redistributed and humans automatically have economic freedom as a birthright, without having to work for it or inherit it.

Currently we're on our way to the first option. But the second is the only viable long-term outcome - because in the limit the wealthy will continue to play status games with wealth, and the circles of distribution will become smaller and smaller until eventually literally everything is owned by a small handful of individuals and - perhaps - their immediate families.


>Currently we're on our way to the first option. But the second is the only viable long-term outcome - because in the limit the wealthy will continue to play status games with wealth, and the circles of distribution will become smaller and smaller until eventually literally everything is owned by a small handful of individuals and - perhaps - their immediate families.

That's been the case for most of human history. I don't see why it isn't a viable option again. High technology does not mean a humane society. For the majority of the last 10,000 years it was pretty much the reverse: egalitarianism meant primitive conditions, while wealth was built on starvation and deprivation.

Of course, one need only read what happened to monarchs for most of that period. Being king is good; less so when you become furniture for the next king.


Because, at the same time this is happening, the capacity for a single actor to wreak catastrophic destruction or violence is increasing. If that "viable option" increases the number of discontented people who increasingly have access to WMDs (or who support actors who do, given enough of that support), you increase the likelihood that someone has the motive to blow everything sky high.

Also, you're wrong. Egalitarianism has meant a lack of luxury relative to what was possible at the time, but a better baseline for the population as a whole.


The majority of people are disarmed and happy to be defenseless because "Won't someone think of the murdered children in schools!".

>Egalitarianism has meant a lack of luxury relative to what was possible at the time, but a better baseline for the population as a whole.

[[citation needed]]

The population of China was larger than that of any other country at the time, and the majority lived in poverty so great that the Europeans who visited were shocked by it.


It's already happening with index funds in the financial space.


Amazon does not employ warehouse workers; they are "associates". At Starbucks, the waitress role is named "partner". A brilliant piece of doublespeak.


I am open-minded to the idea that "Machine Learning" is conservative in the manner Doctorow describes.

However, I do wish we would not use the word "conservative" as an epithet. I think it's quite likely that "conservative" is exactly what we should be looking for in algorithms and prediction engines.

The fact that their properties are misused and infused with reactionary (not necessarily conservative) biases by humans should not make us attach morally negative properties to being conservative.

FWIW, my conservatism leads me to be suspicious of employing these tools in the first place ...


When you have a dumb machine that simply solves for maximum likelihood, using past data to predict the future is what you’re going to get. Why is this surprising?

I don't understand what he's arguing for, exactly. That we should suddenly have AGI that understands intent and disentangles causation from correlation? I don't think anyone in the machine learning community would argue against that, but the question is how.

What’s new here?


I don't think Cory is writing for the machine learning community.

Cory is trying to inform people that AI and ML aren't a magical solution to all our wasteful systems because in some cases we need to throw away the system and start again, and AI can't tell us that. It can only tell us the best way to run the current system. That's what he means when he says AI is conservative.

It's not surprising at all if you understand AI but most non-tech people don't.


It has the general flavor of people who criticize something hard without offering solutions, a style that has become increasingly popular IMHO. This also isn't a new angle, even for non-expert audiences. It bears repeating for those audiences, but I wish he'd at least point to others who have already echoed such concerns.


Paragraph two cites the inspiration for this piece, a 2017 article by Molly Sauter: https://reallifemag.com/instant-recall/

And he points to three other sources right in the first paragraph.


But... it's Cory Doctorow. He's not just some random writer; he's one of the best and most insightful technology commentators around.


Did you think this or the other works mentioned are written for the machine learning community? They are written so the rest of society can understand what is happening despite mountains of hype to the contrary and the marketing of ML as magic.


AlphaZero is not conservative at chess or Go. It doesn't have to have seen a position before to evaluate it.

You can always train a model to reject a class just as easily as you train it to accept a class. So train it to reject a common class and accept a mutant and it will function more like an evolutionary algorithm that protects against bad luck like bacteria with antibiotics.

I am way more optimistic for AI I guess.


There are systems of algorithmic inference that are not conservative. Constraint satisfaction and logical deduction can create novel insights.


...also, as soon as you bring a network or social element into it, it may no longer be this conservative/self-reinforcing thing. For example, if the algorithm behind Spotify were to identify a "music taste clone" of yours somewhere in the world, it could present you with music you've never heard of that your clone has liked, and vice versa. So you actually start discovering new stuff that you end up liking.

Furthermore, there is a psychological element at play around mirroring back your own intelligence at you (see Eliza / Rogerian Psychotherapy) in a way that will lead you to new thought.
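
A minimal sketch of the "taste clone" idea in Python, using nearest-neighbour similarity over a toy like-matrix (the names and data are invented; real recommenders are far more elaborate):

    import numpy as np

    users = ["you", "alice", "bob"]
    tracks = ["t1", "t2", "t3", "t4", "t5"]
    # 1 = liked, 0 = never heard / no signal (toy data)
    likes = np.array([
        [1, 1, 0, 0, 0],   # you
        [1, 1, 1, 1, 0],   # alice
        [0, 0, 0, 1, 1],   # bob
    ])

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    sims = [cosine(likes[0], likes[i]) for i in range(1, len(users))]
    clone = 1 + int(np.argmax(sims))                         # your closest "taste clone"
    new_for_you = np.where((likes[clone] == 1) & (likes[0] == 0))[0]
    print(users[clone], [tracks[i] for i in new_for_you])    # alice ['t3', 't4']

The recommendations ('t3', 't4') are tracks you have no history with at all, which is exactly the non-self-reinforcing behaviour described above.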


Nor is machine learning likely to produce a reliable method of inferring intention: it’s a bedrock of anthropology that intention is unknowable without dialogue. As Clifford Geertz points out in his seminal 1973 essay, “Thick Description,” you cannot distinguish a “wink” (which means something) from a “twitch” (a meaningless reflex) without asking the person you’re observing which one it was.

I see this and other examples of "explainability" offered from time to time as proof that humans are not a "black box."

However it rests on two faulty assumptions.

1. The explanation will be truthful

2. The explainer can always reliably describe the actual cause of their own actions

For the purposes of theory, you could explain away #1. However, a minute of introspection will make you realize that you would fail at #2 the vast majority of the time, using storytelling and retroactive explanations to account for your behavior.


> Search for a refrigerator or a pair of shoes and they will follow you around the web as machine learning systems “re-target” you while you move from place to place, even after you’ve bought the fridge or the shoes

> ...

> This is what makes machine learning so toxic.

I'm saddened to learn about our toxic contributions to society and can't wait to hear about alternative mind-reading approaches for fridge recommendation in the next article.

Words like 'conservative' and 'toxic' are misleading because they imply that there are better alternatives that are not being chosen. Far better are the terms used by commenters here, such as 'descriptive'. That the article is not written for machine learning practitioners makes it even more misleading.


I am not a fan of this article. I have seen many critiques in this vein, this is just another car in the train. None of them have quite reached the point of confronting what is bothering many: What will we do when machine learning (or science, or anything, really) comes to a conclusion we find unpalatable, for whatever reason?

It could be any conclusion, not just those conservatives dislike. Using myself as a target, what if we eventually have enough sampling data to show that people of Irish extraction are more prone to alcoholism, and people of Scottish extraction tend to be statistically more thrifty? (This suggests that I would be a cheap drunk). How will we cope with that?

It is true that ML is prone to some black-boxiness, but the same issue applies to any statistical extraction. We might very well use other methods besides ML to confirm the correlates once they are suggested.

Will we simply put in a hard override to get the answers we want to get, the answers we find comfortable? History shows we have seen whole governments subscribe to this idea before. Ignore the results, publish what pleases.

I've no easy answers here, but my guess is that history will repeat itself.


Will we simply put in a hard override to get the answers we want to get, the answers we find comfortable

Yep. That's already happening. See Google's papers on "debiasing AI" (if you look at their example goals, they mean biasing it away from what it learned and towards what Googlers wish was true).


> What will we do when machine learning (or science, or anything, really) comes to a conclusion we find unpalatable, for whatever reason?

What makes you think it hasn't already happened? Numerous times?

Biological differences between sexes? IQ differences between differentiable subsets of people? The gender equality paradox?

We already know what will happen when findings are unpalatable. We'll think of ways to explain why they don't matter and can't possibly be right, or that we weren't measuring the right thing in the first place. And these are just the more obvious cases!


>or that we weren't measuring the right thing in the first place.

Which has been true, actually. For example, human facial recognition systems that can't properly distinguish dark-skinned faces. What is your conclusion in those cases? That dark-skinned people don't have faces? All look the same? Aren't human? Or is it more likely that the system is flawed in some way? Is that flaw in line with existing biases? What then is the chance that the flaw is BECAUSE of our existing biases, in some way or fashion?

From what I've read on HN, many people involved in developing ML technology seem to be overly concerned with their systems spitting out "uncomfortable truths", and less concerned with flaws in their data capture, training, or system design processes. We're putting the cart before the horse, and the giddiness with which it's being done is troubling in and of itself.


Sure--nobody's arguing that we're always measuring the right thing.

It may be more relevant in developing ML technology because readers here are the people who have to (at some point) decide whether they've found an uncomfortable truth or falsehood. Or that it's potentially biased, but may or may not be relevant. If I'm doing this at work, I may need to be able to explain to my boss whether or not what we're doing is going to be called racist or sexist, and that's regardless of whether what's being done actually is racist or sexist.

It's easy if you're using ML to detect whether an image contains a bird or train. And it's easy when you point to an obvious example with bad data (e.g. inability to differentiate dark faces). But that's not all cases. Sometimes it's good data and the findings are unpalatable. That was the root of the question.


We say "unpalatable," but what we really mean is "politically incorrect." There are segments of our society which would be overjoyed to have such "concrete" evidence of their terrible beliefs. Therefore, it is important to observe OP's thesis: that ML "findings" are conservative; that they tend to reproduce preexisting biases; and that we should be interrogating the input data, systems, and output of ML-applied tasks that return "controversial" findings.

I agree that it's important to be aware of the potential for these sorts of findings, but I continue to disagree that our takeaway should be an impetus to gird ourselves for the backlash when The Machine spits out, "Black people are dumb," rather than an impetus to observe the bedrock notion of scientific inquiry (you know, skepticism).

>It's easy if you're using ML to detect whether an image contains a bird or train. And it's easy when you point to an obvious example with bad data (e.g. inability to differentiate dark faces). But that's not all cases. Sometimes it's good data and the findings are unpalatable. That was the root of the question.

I think the "Newspaper Error Rule" (or, I guess, the "Cockroach Rule") applies here: if you recognize one when you're looking, how many are you missing when you're not, particularly when you're in less familiar territory? The less obvious examples are the ones we should be more wary of. And if your boss is expecting a quick, cut-and-dry answer to questions that have vexed society for centuries, just because we've applied a sophisticated (but still limited) statistical model to it... I'm just saying, maybe you don't want to go down in history as the 21st century's version of a phrenologist.


I think we mostly agree on premise but are disagreeing on emphasis.

> I agree that it's important to be aware of the potential for these sorts of findings, but I continue to disagree that our takeaway should be an impetus to gird ourselves for the backlash when The Machine spits out, "Black people are dumb," rather than an impetus to observe the bedrock notion of scientific inquiry (you know, skepticism).

There is no shortage of pressure to avoid ML conclusions like "black people are dumb." We are already girded there. This isn't the first HN article about recognizing ML biases, and it's not like we're wading through headlines about how racial stereotypes are justified due to some all-knowing ML algorithm. We know we should be interrogating biases.

That said, it will always be possible to dismiss any evidence-based science by claiming there are unfalsifiable flaws with data or methods. That was the point of the comment I responded to. Outside the context of ML, this already happens, so we can already predict how it will happen with ML. And it is.

Nobody's saying, "just trust the machine if it says black people are dumb." I'm saying, "if the machine says sickle-cell anemia is most common among those with sub-Saharan African ancestry, it's still worth checking your methods and data, but it shouldn't be ignored just because unfalsifiable reasons why the analysis could be wrong can be postulated."


The tendency to say, "Don't worry, we've got it covered," is precisely where my worry stems from. It is better than not having any awareness of the potential problems, to be sure, but only so much so.

Your last analogy is an example of this. The link between sickle cell anemia and ancestry is a causal link that could be shown conclusively with a genetic test for a SNP. Finding that gene may have been an involved effort undertaken with a great many resources, but that link was established with near-perfect data and a clear conclusion.

The most promising applications for ML are situations at the exact opposite end of the spectrum: problems where data is imperfect, noisy, or incomplete, and where conclusions are not certain but merely likely, to some measure of confidence.

We want to use ML to reach answers approaching conclusive when a solid conclusion is impossible, unlikely, or would take too long to obtain by conventional means. And we won't know how that conclusion was reached. So, for conclusions that could spark wars, genocide, and untold suffering, maybe we need to be more serious about our vigilance with regard to what we can control.

To put it bluntly, these concerns people have seem akin to NASA being more worried about handling post-catastrophe PR and internal messaging than keeping the shuttle from exploding in the first place.


Your perspective doesn't lead to the discovery of the true causal link for sickle cell anemia, though.

It's what leads us to pretend there can't be any unpalatable differences between ethnic groups, to cling to the presumption that maybe the health care system is just too racist, and to avoid looking deeper because we think we've already decided what the answer is (racism). Meanwhile, real people are suffering because they aren't being treated as effectively as they could be.

We don't know what we don't know, so it's impossible to tell how many people are suffering or for what reasons.


We know what racism begets. We've seen the way that it has been used to warp inquiry, ethical guidelines, and clinical outcomes, often unintentionally and through processes that diffuse blame to below the level of individual bigotry. It would be nice if we lived in a world without a track record of pursuing the ramifications of assumed "unpalatable differences," such that what you're suggesting would be sound, but we don't. We, at this moment, are trying to climb out of a hole of ignorance - in medicine, in social science, in economics, in much of the quantifiable and model-able world - dug by presumptions of "unpalatable differences."

So, I say again, the priorities as you've described them are out of whack. We are much more at risk of jumping to damaging conclusions than we are of missing helpful breakthroughs. Our vigilance should be tuned accordingly.


>We already know what will happen when findings are unpalatable. We'll think of ways to explain why they don't matter and can't possibly be right, or that we weren't measuring the right thing in the first place. And these are just the more obvious cases!

Exactly. Whether "conservative" or "progressive" in one's politics, the last decade or so has shown that when faced with a value contrary to one's own, or a bias that contradicts one's own currently held biases, one will simply dismiss the contradiction and likely hunker down on those preexisting values or biases.

In the current era that is manifesting itself as who can shout the loudest and most outrageously on social media or new-media, facts and reason be damned.


>> Empiricism-washing is the top ideological dirty trick of technocrats everywhere: they assert that the data “doesn’t lie,” and thus all policy prescriptions based on data can be divorced from “politics” and relegated to the realm of “evidence.”

Well, data doesn't lie, because data doesn't _say_ anything. People interpret data, and they do so according to their own inherent biases. And if the data is already biased (i.e., gathered according to people's biases), its interpretations end up far, far away from any objective truth.


The author writes:

>Machine learning systems are good at identifying cars that are similar to the cars they already know about. They’re also good at identifying faces that are similar to the faces they know about, which is why faces that are white and male are more reliably recognized by these systems — the systems are trained by the people who made them and the people in their circles.

But this seems like an absurd claim. Surely it's more likely that darker skin reflects less light than paler skin, so features are harder to detect on darker-skinned people. This explains why facial recognition works better on white people than on black people. Likewise, women are more likely to wear makeup than men and are therefore more likely to have a visually different face, which could cause problems for facial recognition.

To assume that dark skinned people aren't involved in creating or testing facial recognition, or that women aren't, seems bizarre and vaguely racist/sexist. It's also a strange and I think poor assumption to think that the different accuracy rates for facial recognition on different demographics are due to a bunch of white dudes being too dumb to train their models on the faces of anyone but themselves.

The author also writes about how recommendation systems are "conservative" but to me it actually seems the opposite. Recommendation systems get you to stuff that you haven't tried before. They take what they know about you and progress you to new content or products.

For example, I used to listen to a lot of country music on YouTube. Over time my recommendations evolved to include Irish folk music, which I quite like but would never have discovered without YouTube gradually testing and expanding my musical tastes.


To assume that dark skinned people aren't involved in creating or testing facial recognition, or that women aren't, seems bizarre and vaguely racist/sexist.

It's unlikely anyone making a facial recognition system would choose to release a version that didn't work, at the very least, for everyone on the team.

If the team is only white men, then you can see why they might accidentally release something that only works on white men's faces. If there were a woman or a PoC on the team, they'd have someone saying "Hang on, it doesn't work on me!"

The important thing to learn here is that any system is only ever as good as the data used to test it. If your test data sucks then the system you build is also going to suck.


This might be a valid theory if facial recognition was a single niche product made by unsophisticated or under resourced developers. That's incredibly far from reality though. There are many well funded groups of serious professionals and academics working on facial recognition with data sets of millions of faces.

It's just objectively not correct and not reasonable to think that the reason facial recognition works better on some demographics than others is because nobody thought of training models on those demographics. If you believe that is the case, then do you think people working on facial recognition reading this blog post are slapping themselves on the forehead saying "Of course. Train the models on someone besides the dev team!"?


I was talking about testing the application rather than training the model. You can train a ML model on a diverse range of faces, but if, for example, an application that uses the model doesn't calibrate the camera properly it still won't work for darker faces.


I don't understand.

The different accuracy of facial recognition by demographic group is common across multiple facial recognition systems. It's highly unlikely that all of the affected systems owe their inaccuracy to a post-training configuration issue.

To me it seems more like you are arguing for the sake of arguing rather than that you actually believe in your point. You are just trying to conceive of imaginative possibilities that preserve the "dumb facial recognition developers forgot non-white people exist" interpretation. The problem is that your imagined possible alternatives are not plausible explanations.


I gave a single example of why an app might fail at facial recognition in some cases. Obviously I wasn't suggesting that would be the case for every app that doesn't work. Suggesting I was is a strawman of ridiculous proportions.

I also didn't say anything about ML training data in the post you replied to. My point was about facial recognition app developers not testing with a diverse range of inputs, which is why we see apps failing in public when people try to use them with inputs the app hasn't been tested against. If the apps were well tested those cases would have been caught before the public ever saw them.

My original point stands - if your testing sucks then it's very likely that your app will suck too. There will be problems with it that you don't know about, and the users will find them very quickly.


Do you have any data showing facial recognition doesn't work well on black people? My understanding is that this is an urban legend - it's not true, the algorithms work fine assuming a decent quality camera and video (same for everyone).


Also, aren't facial recognition algos trained on broadly available data sets and NOT on the bunch of developers who are working on it? A sample of on average five people seems to be a dumb way to train facial recognition - whether diverse or not.


Also, aren't facial recognition algos trained on broadly available data sets...

They should be, yes, but those training sets can have biases too, and if you use them your app will have the same biases. There are efforts to build diverse data sets for this exact problem - https://www.theverge.com/2018/6/27/17509400/facial-recogniti...

A sample of on average five people seems to be a dumb way to train facial recognition - whether diverse or not.

I was talking about testing rather than training. You can train your facial recognition model on a diverse range of faces, but if you only test it on a few faces you don't know if it works for diverse people.

Teams of developers testing with their own data (e.g. their own faces) and nothing else are annoyingly common.


The incentives of engineers building facial recognition have nothing to do with whether or not the model can recognize the particular faces of the team members. I don't believe the argument that the reason facial recognition is racially biased is the race of the team members. The reason I don't believe this is that the issue has been written about so often that many engineers will have built models with training data sets equalized by race to eliminate this bias, and yet the difference persists.


Regarding light reflection on facial features and so on, it is worth noting that even _color film_ was already strongly biased towards light-skinned humans [1]

Also, the argument that the group of people gathering and tagging the datasets, feeding them to the algorithms, and tweaking the algorithms is by definition truly diverse is... well, have you looked at your average tech campus lately?

[1] https://petapixel.com/2015/09/19/heres-a-look-at-how-color-f...


Do you have a harder time recognising black or women's faces? If you, as a human, have no such problem, then your argument fails.

Or, put another way: then don't use techniques that depend on lighting reflection. The fact that you use techniques that need high lighting reflection means you only care about the results on white faces and/or only tested fitness on white faces... which is the piece's main argument.


When I am recognizing a face, that face is attached to a body and in a particular context (e.g. my office, or my home, etc). There are a myriad of different clues, like tone of voice, body language, scent, etc. I am not given a single still image but a video with many frames. Furthermore, the number of people I can recall on sight is probably single digit thousands. All of these are very different for facial recognition software.

If I was trying to recognize a huge number of different people, who I had only ever known from still images, and all I had to recognize them were still images then yes, of course, it would be easier to recognize white men than black women for the reasons I just described. I actually don't understand how you can doubt that.

Imagine two images of a black woman. In one she has makeup and in the other she doesn't. Do you doubt that these two images are more different, and therefore harder to identify as the same person, than two pictures of a man, neither of which have makeup?

Do you genuinely think that the difference in facial recognition accuracy is because models aren't trained on dark skinned or female people?

I'm also at a loss as to how you could use any kind of camera or image that didn't depend on the reflection of light.


> I'm also at a loss as to how you could use any kind of camera or image that didn't depend on the reflection of light.

You're conflating "high lighting reflection" with "lighting reflection." Lighter faces reflect more light, but if you calibrate your instruments properly you can get just as much information out of a picture with less reflected light.
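
To illustrate, a crude contrast stretch in numpy (a stand-in for proper exposure/calibration; real imaging pipelines are far more sophisticated) recovers the relative detail in an underexposed patch as long as the sensor hasn't clipped it:

    import numpy as np

    rng = np.random.default_rng(0)
    # toy "dark face patch": signal confined to a narrow, dark band of pixel values
    dark_patch = rng.integers(10, 40, size=(8, 8)).astype(float)

    # linear stretch to the full 0..255 range; relative differences are preserved
    lo, hi = dark_patch.min(), dark_patch.max()
    stretched = (dark_patch - lo) / (hi - lo) * 255

    # contrast (std dev) scales by 255 / (hi - lo); the structure was there all along
    print(dark_patch.std(), stretched.std())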


Boy, our eyes sure are racist for using light reflection.

Also, obviously the difference between the dynamic range of cameras and that of our eyes plays a major part in the technical difficulties of multi-racial facial recognition as well?


Here's the Molly Sauter essay that this article takes core inspiration from: https://reallifemag.com/instant-recall/


The opposite would also be described as terrifying: machine learning making up the future of humanity. I'm not sure the author's point is particularly important. ML learns from the past; it doesn't have enough personality or 'intent' to be a 'conservative'. On average it will keep doing what people did. Also, not all ML systems do: an RL system may be trained with another objective, e.g. radicalism.


Conservative, or just not yet fully context-aware?


Can we stop trying to place complex topics and people and positions onto a one-dimensional scale?


Why is the HN title so different from the article's title?


(a) The submitter changed the title when submitting.

(b) A mod changed the title after it was submitted.

(c) The site changed the article title after it was submitted.

I suspect (b), given the baity-ness of the article title: "Our Neophobic, Conservative AI Overlords Want Everything to Stay the Same", in accordance with the guidelines:

> "please use the original title, unless it is misleading or linkbait; don't editorialize."

https://news.ycombinator.com/newsguidelines.html

If you think a title has been changed inappropriately (or should be changed), you can let the mods know directly using the Contact link in the footer.


Let the mods know publicly, by posting comments. Don't trust the mods to be genuine in private conversations.


The mods don’t see every comment, so if the goal is to let the mods know, email is the most reliable way. Of course it’s up to them how to act on it. Just commenting isn’t going to improve anything, as they legitimately might not see it. It just increases the likelihood of inaction.


I took the quote from the article. The LA Review of Books Twitter account used the same quote when they shared this article there.


I was hoping for a roast of AI similar to when he beat the semantic web into a pulp last time... now I wonder if the semantic web messed him up too much in revenge.


> Data analysis is as old as censuses of the tax collectors of antiquity — it’s as old as the Book of Numbers! — and it is unquestionably useful. But the idea that we should “treasure what we measure” and a reliance on unaccountable black boxes to tell us what we want and how to get it, has delivered to us automated systems of reaction and retreat in the guise of rationalism and progress. The question of what the technology does is important, but far more important is who it is doing it for and who it is doing it to.

See "The Cult of Information" by Theodore Roszak.


I think the idea of ML as "unaccountable black boxes" is a bit of a bait-and-switch from the problem being described. The problem is that ML has no political biases; it just minimizes some loss function on the data you give it. So it can't correct for implicit biases in how that data was collected, or for how the model's outputs are used.

If you fitted a decision tree to predict recidivism risk, it would be extremely easy to interpret. But if black men are rearrested more often in your dataset, then black men will likely have a higher predicted risk on average—no matter the causes of that feature of your dataset.
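
A minimal sketch of that point with synthetic data (everything below is invented purely to show the mechanism, not a real analysis): even a fully interpretable tree faithfully reproduces whatever skew is baked into its labels.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    n = 10_000
    group = rng.integers(0, 2, n)                      # synthetic binary "group" flag
    # label skewed by construction: group 1 is rearrested twice as often
    rearrested = rng.random(n) < np.where(group == 1, 0.4, 0.2)

    tree = DecisionTreeClassifier(max_depth=2).fit(group.reshape(-1, 1), rearrested)
    risk = tree.predict_proba([[0], [1]])[:, 1]
    print(risk)   # roughly [0.2, 0.4]: transparent, and exactly as biased as its data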


Your example demonstrates a political bias in a dataset that can lead to a biased ML model. This is what some people mean when they say ML can be biased.


If I slap my friend Tom once a day, train an ML model to detect which of my friends is going to be slapped next, and then find out it always predicts Tom, the model isn't biased against Tom: the model is correctly showing me my own anti-Tom bias.

I don't get to stand there and blame the model when I'm the one doing the slapping.
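
The same point as a toy in Python (data obviously invented): a "model" that only counts past slaps will, correctly, keep predicting Tom.

    from collections import Counter

    past_slaps = ["Tom"] * 365              # the biased behaviour being logged
    model = Counter(past_slaps)             # frequency "model" fit to past data
    print(model.most_common(1)[0][0])       # Tom: the model mirrors the slapper's bias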


For us technologists, yes, the distinction between "AI bias" and bias in the data is clear. The point, however, is that for the general public "AI" is the whole thing, and the public has absolutely no say in (and perhaps even no knowledge of) the data; nevertheless, technocrats will argue that "data doesn't lie".

Edit: the auto correct had written "data doesn't like"


It's not just biased data, though; it's an objective function optimising a biased metric.

We've picked a metric, recidivism rate, that is believed to be inherently biased because cops arrest a lot of protected minorities. The model has correctly predicted that cops will arrest a lot of protected minorities. The general public has then turned around and shot the messenger rather than hold cops accountable for all that arresting they're doing.


Technologists shouldn’t try to dumb things down for the general public; we should try to state as clearly as possible where the problem lies and how it might be mitigated. In this case, we need to make it clear that what’s called “AI” is just a new kind of statistical tool, and like all statistical tools it’s only as good as the data it’s provided and the human interpreters of its outputs.

Ironically, I think “conservative” is both an excellent descriptor of function approximators—they tend to conserve whatever bias they’re provided with—and a terrible word to use for popular writing, since it’s so easily confused with political conservatism (even though e.g. no politically conservative “AI” would autosuggest “on my face” as a completion of “can you sit”).


This headline alone seems like a shark jumping moment.


And I for one welcome our neophobic, conservative AI overlords! /s


Doctorow is a smart guy, which is why the weasel words he uses seem disingenuous. For example, "machine learning is fundamentally conservative, and it hates change".

Mate, it doesn't hate anything. It's math.


There are plenty of people arguing that logic itself is a white-oppression mechanism, so I'm not surprised.

This is a moral / philosophical problem not technological one.

We have lots of things opposing each other:
- democracy and majority vote vs minorities
- solving 80% of problems efficiently vs ideal utopia

The world itself is biased, we are all biased, we all have different moral values. There is no definitive moral rule with which we could compare anything and easily say it's good or bad.

Conservatism isn't inherently bad just like openness to new ideas isn't. We usually need the right proportion of both to go forward.


He wasn't being literal...


So? He is still attaching emotional value to a technology. You could say the same thing with "machine learning doesn't handle change robustly". Same meaning, different emotion.


Math isn't technology, and like you said yourself, AI is just math. Attaching emotional value to math is done incredibly frequently for all sorts of reasons.

Attaching emotional value to technology is also fine. "Machine learning is fundamentally conservative" gets the point across better than "machine learning doesn't handle change robustly," which is vague and doesn't give a person looking for something to read any reason why that would matter.


> Attaching emotional value to math is done incredibly frequently for all sorts of reasons.

Writing alarmist articles for clicks being one of those reasons.

> Attaching emotional value to technology is also fine.

Why??? Doing this with other things would be ridiculous. For example, instead of "The poor are disproportionately affected by climate change" a journalist could write "The climate hates the poor". Which subtly plays down the human cause in all of this.

> "Machine learning is fundamentally conservative" gets the point across better

It demonstrably doesn't. If you just read this thread, you'll find enough people discussing whether "conservative" has a political meaning or just a "deviation from the norm" meaning.


"doesn't handle change robustly" is a good point. Doctorow doesn't just lack a good headline, he fails to really make this point. Data bias and the context-free nature of text auto-suggest are worth discussing, though.


Totally agree with you here, I knew the comment I agreed with would be buried down here.

A good TL;DR of this article would be:

Computers are not yet sentient life forms capable of understanding the real wants and needs of human beings. Current-day recommender systems and predictive-text systems are just very souped-up statistics that don't really have much context or understanding of the world and their role in it.



