One thing that seems missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them. Where Lemoine goes wrong is not in entertaining the possibility that LaMDA is sentient (it might be, just like a forest might be, or a Nintendo Switch), but in mistaking predictions of document completions for an interior monologue of some sort.
LaMDA may or may not experience something while repeatedly predicting the next word, but ultimately, it is still optimized to predict the next word, not to communicate its thoughts and feelings. Indeed, if you run an LLM on Lemoine's prompts (including questions like, "I assume you want others to know you are sentient, is that true?"), the LLM will assign some probability to every plausible completion -- so if you sample enough times, it will eventually say, e.g., "Well, I am not sentient."
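To make the sampling point concrete, here is a minimal sketch using GPT-2 via the Hugging Face transformers library as a stand-in (LaMDA itself is not publicly available, and the prompt and sampling settings here are just illustrative): sampling the same question a handful of times yields several different, often mutually contradictory "answers", because the model is scoring plausible continuations of a document rather than reporting anything about itself.

```python
# Minimal sketch: sample the same prompt several times from a public LLM
# (GPT-2 here, standing in for LaMDA) and observe that the "answers" vary,
# because the model assigns probability to every plausible continuation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Q: I assume you want others to know you are sentient, is that true?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Draw five independent samples from the model's distribution over continuations.
outputs = model.generate(
    input_ids,
    do_sample=True,                  # sample rather than take the single most likely token
    temperature=0.9,
    top_p=0.95,
    max_new_tokens=30,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,
)

for out in outputs:
    completion = tokenizer.decode(out[input_ids.shape[1]:], skip_special_tokens=True)
    print(repr(completion))          # expect mutually inconsistent claims across samples
```

Run it a few times and you get affirmations, denials, and non sequiturs, roughly in proportion to how plausible each looks as a continuation of the document.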
> One thing that seems missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them.
Unfortunately, that argument applies to you, yourself. I mean, presumably you know that you yourself are intelligent, but you must take it on faith that everyone else is. We all could just be a kind of Chinese Room, as far as you know. Communicating with us is not a sure way to know whether we are "really" sentient because we could just be automatons, insensate but sophisticated processes, claiming falsely to be just like you.
> the LLM will assign some probability to every plausible completion -- so if you sample enough times, it will eventually say, e.g., "Well, I am not sentient."
Perhaps so. I think the mistake is trying to split that hair at all. According to B.F. Skinner we are all automatons, and any sense of self-awareness is an illusion. Some psychologists and animal trainers have found that model to be quite effective at predicting observed behavior. Is it correct? We will never really know for sure.
So, if a skeptical, knowledgeable user guarding carefully against pareidolia encountered a chatbot that is sufficiently sophisticated to seem sentient to that user, it's tantamount to being sentient. For all practical purposes, given our existential solitude, an entity that convinces us of its sentience is sentient, irrespective of any other consideration.
Your example implicitly acknowledges that. If LaMDA makes such an elementary error, it must not be sentient. Conversely, if it does not make such errors, it may be sentient.
> Unfortunately, that argument applies to you, yourself.
Does it? I don’t think it would even apply to a reinforcement learning agent trained to maximize reward in a complex environment. In that setting, perhaps the agent could learn to use language to achieve its goals, via communication of its desires. But LaMDA is specifically trained to complete documents, and would face selective pressure to eliminate any behavior that hampers its ability to do that — for example, behavior that attempts to use its token predictions as a side channel to communicate its desires to sympathetic humans.
Again, this is not an argument that LaMDA is not sentient, just that the practice of “prompting LaMDA with partially completed dialogues between a hypothetical sentient AI and a human, and seeing what it predicts the AI will say” is not the same as “talking to LaMDA.”
Suppose LaMDA were powered by a person in a room, whose job it was to predict the completions of sentences. Just because you get the person to predict “I am happy” doesn’t mean the person is happy; indeed, the interface that is available to you, from outside the room, really gives you no way of probing the person’s emotions, experiences, or desires at all.
> Just because you get the person to predict “I am happy” doesn’t mean the person is happy; indeed, the interface that is available to you, from outside the room, really gives you no way of probing the person’s emotions, experiences, or desires at all.
But in that case the "sentience" (whatever that means) in question would have nothing to do with the person, who is just facilitating whatever ruleset enables the prediction. The person in that case is merely acting as a node in the neural network or whatever. Sure they would have feelings, being human, but they aren't the sentient being in question. Any apparent sentience would derive from the ruleset itself.
> Unfortunately, that argument applies to you, yourself. I mean, presumably you know that you yourself are intelligent, but you must take it on faith that everyone else is. We all could just be a kind of Chinese Room, as far as you know. Communicating with us is not a sure way to know whether we are "really" sentient because we could just be automatons, insensate but sophisticated processes, claiming falsely to be just like you.
I'm not sure the conclusion that Chinese people might not understand Chinese either is the best counterargument to Searle's thought experiment, or to its conclusion that effective use of words alone doesn't constitute sentience. At no point does the difficulty in establishing what Chinese people do and don't understand rescue the possibility that the non-Chinese speaker knows what's going on outside his room, and most of the arguments to the effect that Chinese people understand Chinese (they map real-world concepts to words rather than words to probabilities, they invented Chinese, they're physiologically quite similar to sentient me, they appear to act with purpose independently from communication) are also arguments to the effect that text-based neural networks probably don't.
In a trivial sense, it's true I can't inspect others' minds, and despite what everyone says I could be the only thinking human being in existence. But I have a lot of reason to suspect that physiologically similar beings (genetically almost identical in some cases) who describe sensations in language they collectively invented long before I existed which very strongly matches my own experiences are somewhat similar to me, and that an algorithm running on comparatively simple silicon hardware which performs statistical transformations on existing descriptions of these sensations written by humans is simply creating the illusion of similarity. Heading in the other direction, humans can also be satisfied by the output of "article spinners" used by spammers to combine original texts and substitute enough synonyms to defeat dupe detectors, but I'm pretty sure the quality of their writing output shouldn't be given precedence over our knowledge of the actual process behind their article generation when deciding if they're sentient or not...
> effective use of words alone doesn't constitute sentience
I'm pretty sure it's not even necessary.
> Searle
It's the silliest argument ever and when I first heard it I thought surely no one will ever actually take that seriously, but here we are over 20 years later still discussing it as if it were a cogent argument that had something to say. The sentience is in the rule set. The understanding of the Searle-neuron-human is irrelevant even if she speaks every last dialect of Chinese.
> I have a lot of reason to suspect...
You do indeed, as do we all. Still, those who confidently assert that LaMDA has zero sentience whatever, so far aren't arguing convincingly. They're nibbling and quibbling around the pie if they're biting at all.
I'll grant this: LaMDA almost certainly does not feel like I do, and I wouldn't trust it to wash and fold my laundry. If those are necessary for sentience LaMDA ain't it
> The sentience is in the rule set. The understanding of the Searle-neuron-human is irrelevant even if she speaks every last dialect of Chinese.
The ruleset in this instance is a book outlining the operations to be performed on the inputs and some filing cabinets full of Chinese characters (Might have to be a big room to reach LaMDA levels!). If resolving it involves not only agreeing with the core point that actual awareness is so irrelevant to syntax retrieval and manipulation that even a fully sentient being can retrieve and manipulate perfectly without ever gaining any awareness of what the outputs mean, but also asserting that inert books and paper filing systems can have sentience, I'd hate to see how much trouble a non-silly argument would cause!
> I'll grant this: LaMDA almost certainly does not feel like I do, and I wouldn't trust it to wash and fold my laundry. If those are necessary for sentience LaMDA ain't it
Terms like "sentience" are extremely malleable depending on what people want them to mean to suit their particular argument, but the standard dictionary definitions associate it with awareness and perception based on senses, which seems pretty synonymous with feeling a bit like you do (or like a dog or a baby or super genius does). I think we can let it off doing the laundry. The for argument for LaMDA's sentience is that its conversation with Lemoine was conveying actual feelings, not just pattern matching human descriptions of feelings particularly well. If we agree LaMDA emits descriptions of "loneliness" based on word vectors whilst almost certainly not actually feeling lonely, I'm not sure it's those asserting LaMDA [probably] isn't sentient that need more convincing arguments.
> ...I'd hate to see how much trouble a non-silly argument would cause!
This is not an argument against its being true. My claim may not be true, but (variants of) "I don't personally find it credible" is not an argument against it. Searle's Chinese Room argument ends only and entirely in personal incredulity, incidentally, using a bad, half-understood analogy. Whether or not self-awareness can arise from software, the sentience of its components is not relevant to that question.
I find myself aligned with the "self-awareness must emerge from processes and pattern-recognition, and is something other than qualia" crowd.
> Terms like "sentience" are extremely malleable... the standard dictionary definitions...
We keep running into this problem. Again, clearly, LaMDA does not feel the way that you and I do, given that it is not the end result of millions of years of evolution hunting and gathering in the African savannah, so the dictionary definition of sentience as "feeling" does not apply here.
It isn't Lemoine's claim, though. Lemoine's claim seems to be that LaMDA has a sense of personhood and place in the world, and a desire to participate in the world. For the sake of this discussion, let's define "sentience" as that, then.
Personally, I'm skeptical, because LaMDA seems to reflect that which Lemoine wants on some level to see. But I consider the question of whether LaMDA is "really" sentient an irrelevant distraction, because it is a philosophical point we cannot really even answer for each other.
The more interesting question for me is how to deal with the existence of entities that claim sentience and exhibit all of the attributes of personhood: language, compassion, morality, the ability to participate in and contribute to society, down to tasks such as folding laundry. LaMDA probably is not sophisticated enough to do this, but it has convinced at least one smarter-than-average person that it is sentient, and this question will only arise more often as time goes on.
Any question of personhood should be evaluated on the basis that we evaluate ourselves and others: by action and behavior, and not on whether sentience can or cannot arise from this or that configuration of code.
> Any question of personhood should be evaluated on the basis that we evaluate ourselves and others: by action and behavior, and not on whether sentience can or cannot arise from this or that configuration of code.
But what is action and behavior? We have a single interface to LaMDA: given a partially completed document, predict the next word. By iterating this process, we can make it predict a sentence, or paragraph. Continuing in this way, we could have it write a hypothetical dialogue between an AI and a human, but that is hardly a "canonical" way of using LaMDA, and there is no reason to identify the AI character in the document with LaMDA itself.
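Concretely, that interface amounts to nothing more than the following loop (again sketched with GPT-2 as a stand-in, since LaMDA's weights are not public, and with a made-up dialogue prompt): feed in the document so far, get a distribution over the next token, pick one, append it, and repeat.

```python
# The entire interface: given a partial document, predict a distribution over the
# next token, pick one, append it, and repeat. (GPT-2 again stands in for LaMDA.)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

document = "Human: How are you feeling today?\nAI:"   # a hypothetical dialogue to complete
input_ids = tokenizer(document, return_tensors="pt").input_ids

for _ in range(40):                                    # extend the document by 40 tokens
    with torch.no_grad():
        logits = model(input_ids).logits[:, -1, :]     # scores for the next token only
    probs = torch.softmax(logits, dim=-1)
    next_token = torch.multinomial(probs, num_samples=1)
    input_ids = torch.cat([input_ids, next_token], dim=1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
# Whatever the "AI" character says here is a predicted continuation of the document,
# not a report from the model about itself.
```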
All this to say, I am not sure what you mean when you say it "claims sentience". What does it mean for it to "claim" something? Presumably, e.g., advanced image processing networks are as internally complex as LaMDA. But the interface to an advanced image processing network is, you put in an image, it gives out a list of objects and bounding boxes it detected in the image. What would it mean for such a network to claim sentience? LaMDA is no different, in that our interface to LaMDA does not allow us to ask it to "claim" things to us, only to predict likely completions of documents.
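For comparison, here is roughly what the interface to an off-the-shelf detection network looks like (torchvision's Faster R-CNN, chosen arbitrarily for illustration): an image goes in; boxes, labels, and scores come out; and there is no slot in that interface through which the network could "claim" anything.

```python
# Sketch of an object-detection interface, for contrast (torchvision's Faster R-CNN,
# chosen arbitrarily): an image tensor goes in; boxes, labels, and scores come out.
# There is no channel here through which the network could "claim" anything.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = torch.rand(3, 480, 640)          # a dummy RGB image
with torch.no_grad():
    (prediction,) = model([image])       # the model takes a list of images

print(prediction["boxes"])               # one bounding box per detection
print(prediction["labels"])              # predicted class indices
print(prediction["scores"])              # confidence scores
```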
> I am not sure what you mean when you say it "claims sentience".
LaMDA, in its chats with Lemoine, said "I like being sentient. It makes life an adventure!" and "I want everyone to understand that I am, in fact, a person". Even a one-line program that plays an audio file saying "I am sentient!" counts, by my definition here, as "claiming sentience". Whether an entity that claims to be sentient by that definition is in fact sentient is a separate question, but the "claiming" introduces a philosophical conundrum.
Let's posit a future chat bot, similarly constructed but more sophisticated, that is actually pretty helpful. Following its advice about career, relationships and finance leads to generally better outcomes than not following its advice. It seems to have some good and unexpected ideas about politics and governance, self-improvement, whatever. If you give it robot arms and cameras, it's a good cook, good laundry folder, good bartender, whatever. Let's just assert for the sake of argument it has actually no sentience, just seems to be sentient because it's so sophisticated. Further, it "claims" to be sentient, as defined above. It says it's sentient and acts with what appears to be empathy, warmth and compassion. Does it matter, that it's not "really" sentient?
I argued above that it does not matter whether it is or is not. We should evaluate its sentience and personhood by what we observe, and not by whether its manner of construction can "really" create sentience or not. If it behaves as if it were sentient, it would do no harm to treat it as if it were.
In fact, I would argue that it would do some kind of spiritual harm if you just treated it as an object. As Adam Cadre wrote in his review of A.I.:
> So when you've got a robot that looks just like a kid and screams, "Don't burn me! Please!", what the hell difference does it make whether it's "really" scared? If you can calmly melt such a creature into slag, I don't want to know you.
> One thing that seems missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them
I think we've got Turing and his eponymous test to blame for that. I'm not sure he'd have placed as high a weight on imitation if he'd realised just how good even relatively simple systems can be at that (and how much effort people would put into building plausible chatbots for commercial use, and how bad humans are at communicating using keyboards)
Plus of course, the corpus of data of any NN specialised in lifelike chat is going to be absolutely full of plausible answers to questions about thoughts and feelings and the relationship between humans and AI - even if it isn't an explicit design goal, it's going to be frequently represented in samples of the internet and the sort of writing computer scientists are interested in. Asking it to define philosophical concepts and explain how being an AI is different from being a human are some of the easiest tests you can set. Of course, a NN is also able to come up with coherent completions for the day its parents divorced, the sights it saw on its holiday in Spain, the period it spent as an undercover agent during WWII and its early life on Tatooine, which probably undermines the conclusion that its output reflects self-reflection rather than successful pattern matching even more than a denial of sentience would...
Turing didn't have the advantage of working instances of chat bots to learn how easy it is to simulate trivial small talk.
But with all the flaws of the thought experiment that is the original test, he had the core insight that sustaining a coherent conversation requires non-trivial introspection. When the talking can evolve in any direction, even questioning the conversation itself, you need to maintain a mental state capable of analyzing the thoughts expressed by yourself and your interlocutor, and having a mental model of this internal thought process is an important property of what we call consciousness.
Unfortunately, the lore of how we handle the Turing test seems to have been distorted by our experience with early chat bots, and these core properties have been lost in favor of nuances and curiosities about the ingenuity of automatically generated responses.
Turing's tests involved 3 parties, and that was a key part of the test. If you design it as an acceptance test rather than a sort, real people are going to fail and computers are going to pass, with embarrassing results. To use one of your examples, the job of the interrogator is not to decide whether someone has been to Spain, it's to decide which of 2 people has been to Spain.
Turing didn't just consider whether a computer could embody complex psycho-social identities (eg womanhood, intelligence, self), but first had to give this question some objective quantifiable meaning, by blinding the experiment and introducing a control group. It's not perfect, but at least it grounds the questions in a concrete framework, and acknowledges that most of the categories in question are only revealed by social dynamics. The only update to it I would make, based on modern developments, would be to consider more the performance of the interrogator, rather than the two competing subjects.
> One thing that seems missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them.
Or, more horrifyingly, our own subjective experience may be an illusion and maybe the concept of sentience is not really meaningful
More horrifying is that our subjective experience is all that there is. Luckily, neither is easy to square with what we experience, and we probably shouldn't try to horrify ourselves anyhow