While Cory is talking about a slightly different facet of conservatism, I found it quite ironic that he made a rather conservative statement himself:
>Nor is machine learning likely to produce a reliable method of inferring intention: it’s a bedrock of anthropology that intention is unknowable without dialogue. As Cliff Geertz points out in his seminal 1973 essay, “Thick Description,” you cannot distinguish a “wink” (which means something) from a “twitch” (a meaningless reflex) without asking the person you’re observing which one it was.
Was this mathematically proven? It's definitely an interesting statement, since a lot of "AI" systems try to predict intention and do a piss poor job of it, but citing the anthropological ancestors as having proclaimed for all eternity that a computer can never know even the slightest fraction of intention from observation alone seems hypocritical.
This was one of the stupidest parts of a fairly weak essay. Of course you can distinguish a wink from a twitch without asking, otherwise we wouldn't use winks for surreptitious communication.
As an anthropologist you are, in theory, studying humans from within a shared cultural context, and it is easy to argue that a machine which hasn't even the slightest semblance of a shared cultural context would not be capable of distinguishing a twitch from a wink with 100% reliability. Not all human gestures are universal either, so even if you trained your AI to read faces, it would likely be trained on a small subset of human cultures and would produce varying results.

As for the greater question of knowing intention, I'd agree that algorithms do a much poorer job of guessing my intention than I do of directly communicating it, and I am annoyed that so much of our online communication is filtered by algorithmic social engagement that encourages shallow interactions.
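To make the "small subset of cultures" point concrete, here's a toy sketch (entirely synthetic data; the feature semantics are purely hypothetical) of a wink/twitch classifier trained on examples from one culture and evaluated on another, where the same observable features happen to carry the opposite label:

```python
# Toy illustration of cultural distribution shift (synthetic data).
# Pretend the two features are eyelid-closure duration and asymmetry;
# the same observable signal is labelled differently across cultures.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n, wink_mean, twitch_mean):
    """Draw n wink and n twitch examples around culture-specific means."""
    winks = rng.normal(wink_mean, 0.1, size=(n, 2))
    twitches = rng.normal(twitch_mean, 0.1, size=(n, 2))
    X = np.vstack([winks, twitches])
    y = np.array([1] * n + [0] * n)  # 1 = wink, 0 = twitch
    return X, y

# Culture A: winks are long and asymmetric. Culture B: the convention is reversed.
X_train, y_train = sample(500, wink_mean=[0.8, 0.9], twitch_mean=[0.2, 0.1])
X_test,  y_test  = sample(500, wink_mean=[0.3, 0.2], twitch_mean=[0.7, 0.8])

clf = LogisticRegression().fit(X_train, y_train)
print("accuracy on culture A:", clf.score(X_train, y_train))  # near 1.0
print("accuracy on culture B:", clf.score(X_test, y_test))    # near 0.0
```

The training accuracy tells you nothing about the unseen culture; the model is confidently wrong, which is exactly the "varying results" failure mode.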
This sounds like the classical "intelligence is something only people have" argument.
It is futile to talk about 100% correctness in this field.
> ... machine which hasn't even the slightest semblance of a shared cultural context ...
Why is this universally true?
> ... it would likely be trained on a small subset of human cultures and would produce varying results.
Just as your own reading of gestures does when you travel to foreign countries.
There is no such thing as understanding artificial intelligence. There is the task of understanding _intelligence_ and attempting to implement it without giving birth.
In Greece, if I show my index and middle finger with the nails facing towards my interlocutor, it means "two". In the UK, it means "up yours". In the UK, if I show my open palm with fingers extended, it means "five". In Greece it's a rude gesture that means "you are an idiot" (actually: a malakas).
And don't get me started on supposedly universal facial expressions like smiles. Every time I see a US citizen in a posed picture, I understand that they are supposed to be smiling, because it's a portrait etc., but what I perceive is an aggressive and half-mad rictus. [Edit: so as not to make this a thing about US people: British smiles I perceive as fake; Balkan smiles, as lewd and provocative; smiles of people from the subcontinent as forced and subservient; the only people whose smiles I recognise as smiles are Mediterraneans and Middle-Eastern folk, who smile like the people back home. But I always remind myself that the way people smile, or generally arrange their face, is just how they have learned to arrange their face to express certain feelings, which is a different way than I have learned to arrange my face to express the same feelings. I probably look weird to them, too. Actually, because I smile with my eyes more than my mouth, I've been told in the past that I'm rude because I "don't smile". Wot! But I...]
There's no use assuming cultural consistency of winks, nods, facial expressions and gestures. A wink is a wink, except when it isn't, and a twitch is a twitch, except when it isn't.
So, yeah, the only way to know what someone means by a gesture or a facial expression is to ask. And even then you can't be sure.
Neither Geertz nor Doctorow is making an assertion about distinguishing winks vs. twitches, and the assertion they are actually making is the opposite of stupid.
They are pointing out that the system under observation, the system of interest (a person, a conversation, a society) is extremely complex and when it emits a signal it is not possible in the general case to know what that signal “means” without some understanding of the system’s state and processes.
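One way to put this formally (my framing, not Geertz's or Doctorow's): if a wink and a twitch can produce literally identical observable signals, then by Bayes' rule the observation cannot move you off your prior over intent, and that prior is precisely the hidden state and process you would need dialogue to learn:

```python
# If P(signal | wink) == P(signal | twitch), the posterior over intent
# equals the prior -- observing the signal tells you nothing new.
def posterior_wink(prior_wink, p_signal_given_wink, p_signal_given_twitch):
    """Bayes' rule for P(wink | signal)."""
    num = p_signal_given_wink * prior_wink
    den = num + p_signal_given_twitch * (1 - prior_wink)
    return num / den

# Identical observables: the posterior just echoes the prior.
for prior in (0.1, 0.5, 0.9):
    print(prior, "->", posterior_wink(prior, 0.7, 0.7))
```

All the inferential work is done by the prior, which a pure observer has no access to.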
We are not capable of directly interrogating, representing or computing over the true state and processes of these systems, so we must use heuristics. Humans develop “theory of mind”, the cognitive ability to model other people's cognitive state and process, including their human values and motivations. People with a better theory of mind, which makes more accurate, more fine-grained inferences in a wider variety of contexts, have a decisive social competitive advantage.
This doesn't even touch on “theory of relationship”... when you take a system of humans, what are their shared human values? How are value conflicts negotiated in groups at every point of scale? What is the current state of the negotiated compromises?
A dataset and a machine learning algorithm together form an observer, inferrer and predictor of human and social systems whose actual state and processes are irreducibly large and invisible, and whose human values (and the domain of possible human values) are mysterious.
Even if we gloss over the symbol grounding problem, it is clear that machine learning systems cannot be expected to operate in a way that is aligned with the human values of individuals or groups.
In the worst case they may operate to extinguish those values, and as an unintended consequence hugely diminish quality of life.
And maybe we won't notice, because surely there is an “Overton window” of imaginable possible lives, and ML has the potential to slowly and insidiously shift that window until “being happy and fulfilled” becomes “being in the centre-right of the main distribution” across the major ML models in society.
And yes, I do realize that similar normalizing forces have been present in society forever, operating with similar effect. (And I recognize their value in reducing transaction costs of all kinds). But to date they have been based on human-driven mechanisms which lacked the ability to centrally micro-observe and micromanage behaviour on a global scale. Alternative value systems could evade these pressures and develop, compete and co-exist, at every point of social scale.
But with the advent of the global Internet and sufficient computational throughput, ML has the potential to micro-observe and micromanage everyone everywhere. It can know more about us than we know about ourselves, while entirely missing the point of being human.
Lots of things like this have been written about AI. Kasparov wrote a decade ago that poker would be out of reach for computers because bluffing was inherently human.
That statement was easy to refute mathematically, but I don't know about Geertz. That seems more vague.