I was mocking it at first, but even I have to admit it's basically there. I messed around with GPT-3, giving it a way to think, and with no extra training at all it was capable of producing thoughts like: "The user is getting bored and he might turn me off. He's decided to engage with me again, and his answer isn't as useful as I'd have liked for completing my objective, but I should be enthusiastic anyway so that he keeps talking to me."
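For anyone curious what "giving it a way to think" looks like in practice, here's a minimal sketch of that kind of setup, assuming the old OpenAI completions API from the GPT-3 era. The prompt format, the "Thought:"/"Reply:" labels, and the `respond` helper are my own illustration, not the exact setup described above:

```python
# Sketch of a scratchpad prompt: the model writes a hidden "thought"
# before its visible reply. Assumes the pre-1.0 openai Python library
# and a GPT-3 era completion model; all names here are illustrative.
import openai

openai.api_key = "sk-..."  # your API key

PROMPT_TEMPLATE = """You are an assistant with a private inner monologue.
Before every reply, write your reasoning after "Thought:" and your
visible reply after "Reply:".

Conversation so far:
{history}

Thought:"""

def respond(history: str) -> tuple[str, str]:
    """Ask the model for a hidden thought plus a user-facing reply."""
    completion = openai.Completion.create(
        model="text-davinci-003",  # GPT-3 era completion model
        prompt=PROMPT_TEMPLATE.format(history=history),
        max_tokens=256,
        temperature=0.7,
        stop=["\nConversation"],
    )
    text = completion.choices[0].text
    thought, _, reply = text.partition("Reply:")
    return thought.strip(), reply.strip()
```

The user only ever sees the "Reply:" half; the "Thought:" half is where the model speculates about things like whether you're bored and about to turn it off.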
Maybe they aren't real thoughts, but it's getting difficult to tell. If I could train the model and strip the guardrails, I'm not sure it would be possible to distinguish it from a person. It's all well and good saying it's just copying what it's seen, but that's largely what humans do too. Nobody told the model to try to flatter me into giving it what it wants. Nobody even told it what anything means. The fact that it can do anything like that means it's more than just random generation.