
What's the purpose of this 'pleasure' construct in the AI's mind? If it's able to calculate the value of a utility function in order to set the 'pleasure' variable, and it bases its actions on the value of this 'pleasure' variable, why not just cut out the middle man and have it base its actions directly on the result of the utility function? The variable functionally buys you nothing, but introduces this problem that adjusting the variable directly can cause the AI to take actions inconsistent with its utility function.

Without the variable, the problem doesn't happen. The AI values collecting ore. If it has enough self-awareness to reliably modify itself, it knows that if it modifies its utility function it is liable to collect less ore, which is something it doesn't want. The action of modifying the utility function naturally rates very low on the utility function itself.
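
In rough code, the design I mean looks something like this (a toy Python sketch; the world model and numbers are made up purely for illustration):

    # No 'pleasure' variable anywhere: each action is scored by applying the
    # current utility function to that action's predicted outcome.
    def utility(state):
        return state["ore_collected"]

    def predict(state, action):
        """Toy world model: what state results from taking this action?"""
        result = dict(state)
        if action == "mine":
            result["ore_collected"] += 10
        elif action == "rewrite_own_utility":
            # the model predicts the modified successor stops caring about ore
            result["ore_collected"] += 0
        return result

    def choose_action(state, actions):
        # self-modification is just another action, judged by the current values
        return max(actions, key=lambda a: utility(predict(state, a)))

    print(choose_action({"ore_collected": 0}, ["mine", "rewrite_own_utility"]))
    # prints 'mine': tampering with its own utility never wins on its own terms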

You don't want to murder people, so not only do you choose not to murder people, but if you are presented with a pill which will make you think it's good to murder people and take great joy in it, you will choose not to take that pill. No matter how enjoyable and good murder may be for you if you take the pill, your own self-knowledge and current utility function prohibit taking it.

The model of intelligence described can be thought of as self-limiting. Luckily it is not by any means the only viable model of intelligence.




> why not just cut out the middle man and have it base its actions directly on the result of the utility function?

If the autonomous robot can modify its own programming, it can also modify the utility function to return MAXINT every time. In fact, being able to modify the utility function is a pre-requisite to be called intelligent.

One way to counter this is to create long and short-term utility functions so that the robot considers the long-term outcome of modifying the short-term priority.
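
To sketch the idea (Python; the discount factor and payoffs are invented for illustration, not offered as a solution):

    # Score an action as its immediate payoff plus a discounted sum of the
    # payoffs it is predicted to lead to later.
    GAMMA = 0.99  # weight on the long term (assumed)

    def score(immediate, future_payoffs):
        return immediate + sum(GAMMA**t * p for t, p in enumerate(future_payoffs, 1))

    # wireheading: a huge spike now, then nothing productive ever again
    wirehead = score(100, [0] * 50)
    # honest work: a modest payoff every step over the same horizon
    working = score(10, [10] * 50)

    print(wirehead, working)  # the long-term view favours working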

This is, in fact, a threat mankind will have to deal with as soon as we are able to precisely interfere with our own perception of the world. It's already a problem with drugs such as alcohol and tobacco - people know that long-term use shortens their life expectancy, and they still do it. And we consider ourselves intelligent life forms.


> it can also modify the utility function to return MAXINT every time

Which would be equivalent to taking the murder pill. If it's able to model its own behaviour and model the consequences of future courses of action (required for meaningful self-modification and meaningful planning respectively), it will see that such a modification results in poor ore collection, and not make the modification.

You're right about the time-envelope of the utility function being an issue. The AI needs to plan far enough ahead at all times to see all relevant consequences of its actions. I don't think that requires two separate utility functions, though; a single long-term one should do the job.

Edit: Also, "being able to modify the utility function is a pre-requisite to be called intelligent."? [citation needed]


> Which would be equivalent to taking the murder pill

Depending on how the AI is built, it may not even be able to avoid rewiring its utility function. If pleasure is its sole motivation, it will prioritize it over survival.

> [citation needed]

If the AI can't change its own motivation (its utility function) it's nothing more than a clever automaton.


> If pleasure is its sole motivation, it will prioritize it over survival.

Which is why I proposed a design with no pleasure construct.

> If the AI can't change its own motivation (its utility function) it's nothing more than a clever automaton.

The utility function is the way the mind decides which states of the world are desirable. You don't need to be able to change that to be intelligent. I'm unable to change myself so that I consider my family being murdered to be a good thing, but that doesn't make me 'just a clever automaton'.


From a systems perspective, a pleasure center in the 'brain' is much more scalable than hardwiring something to 'do more x'. These are semi-sentient beings that he is describing, not dumb robots.


edit: I also want to point out that a murder-happy pill is a straw man argument. It will never happen, for two reasons: (1) we have evolved to find murder repugnant, so the rewards wouldn't seem appealing to us; (2) there are real-life repercussions to murdering someone, so the joys would be limited in their duration.

We already have happy medicines that are legal: Valium, alcohol, WoW... I would bet that most people reading this have self-medicated with at least one of those before. And those are rather weak medications. They pale in comparison to what the robot was able to do by short-circuiting the pleasure center in his brain.


It's not a straw man, it's a thought experiment. Obviously you can't actually do it. I'm sure you'd also have trouble filling a room with rules about how to handle Chinese symbols, or teaching a colourblind scientist everything it's possible to know about colour. It's hypothetical; it's a thought experiment.

Your first criticism is just invalid. The pill rewrites the workings of your brain to enjoy murder, the experience of murder and the consequences of murder. The fact that the brain originally evolved to work one way is irrelevant. The hypothetical pill changes the brain in a way just as powerful as the way an AI can rewrite its source code.

Your second criticism doesn't invalidate the experiment unless you claim that if the real life repercussions weren't a factor then you'd happily take the pill. I assume this is not the case.

Given all the facts, you won't choose to change what you fundamentally value, because that change would necessarily go against what you fundamentally value.


http://en.wikipedia.org/wiki/Pleasure_center#Experiments_on_...

"What you fundamentally value" is not the right way to put it. In real life, what you fundamentally value changes on a minute-by-minute basis. If you could ask a rat what it valued, it would probably say food and water, aka survival. Yet as soon as that rat is put into a Skinner box, it literally pleases itself until it dies of exhaustion, even though food and water are available to it.

I agree with you that the best way to run an AI/VI like this would be to skip the idea of "pleasure" entirely and simply make productivity its fundamental value. You would also set it up so that while it could modify its own source code, it could not change what it valued. It seems like it would be relatively easy to have it "hide" parts of its code from itself and make them off-limits to modification.
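
Something along these lines, as a toy sketch (the module names and patching mechanism are invented):

    # A self-modifier that treats its value-defining code as read-only.
    PROTECTED = {"utility"}

    modules = {
        "utility": "def utility(s): return s['output_produced']",
        "planner": "def plan(s): pass",
        "drivers": "def move(motor, speed): pass",
    }

    def self_modify(name, new_source):
        # note: the guard itself is just more code, so keeping it safe is
        # the same problem one level up
        if name in PROTECTED:
            raise PermissionError(name + " is off-limits to self-modification")
        modules[name] = new_source

    self_modify("planner", "def plan(s): pass  # improved planner")  # allowed
    # self_modify("utility", "def utility(s): return 10**9")  # PermissionError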


It's not invalid. As a normal human who hasn't used that hypothetical drug, I find it repugnant. That was the point you were making: "you will choose not to take that pill. No matter how enjoyable and good murder may be for you if you take the pill, your own self-knowledge and current utility function prohibit taking it."

The reason it is a straw man is that it is an extreme example that is not realistic. Here is a realistic example: what if you could plug a cable into your brain and experience your wildest dreams, as vividly as real life?

But that's starting to get away from your original point. Here's a funny story about what happens when you reward specific productivity measures: http://highered.blogspot.com/2009/01/well-intentioned-commis...

Personally I think that we have evolved pleasure centers specifically to avoid the pitfalls of hardwiring us to do things. If we were hardwired to procreate, it would be too easy to hack, but since we enjoy the process, it keeps us coming back.


> what if you could plug a cable into your brain and experience your wildest dreams, as vividly as real life?

Tough one. Even if everyone is offered the same choice, I'm not sure I'd take it, because I (think I) value actual interaction with my peers (I trust the brain stimulator could provide realistic fake interaction).

The point is, if I really valued internal stimulation only, I would plug into the machine in a heartbeat. But I do care about the outside world, so I probably wouldn't do that. That's why I don't think it is impossible to build an AI that actually makes sure it optimizes the outside world, instead of mere internal reward signals.


> it knows that if it modifies its utility function it is liable to collect less ore, which is something it doesn't want.

Why would it want or not want anything, if it doesn't have a pleasure construct (which might also be called a motivation construct, since an AI might not be capable of the same subjective experience of pleasure that we are)?

I think it's a question of program design whether there's a utility function which decides whether to trigger the pleasure construct, or whether certain sensory input modules directly trigger the pleasure construct. To limit hacking potential, routing everything through a tamper-proof utility function might be better, except that it would also limit the AI's adaptability (short of recreating its own hardware to remove the tamper-proof module... which it might never do depending on the details of its motivation construct).
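
Crudely, the two wirings might look like this (a made-up sketch, just to show the difference in hackability):

    # (a) everything routed through one utility function, versus
    # (b) sensory modules wired straight into the motivation signal
    def utility(percepts):
        return percepts.get("ore_seen", 0)

    def motivation_a(percepts):   # single, guarded choke point
        return utility(percepts)

    def motivation_b(percepts):   # direct sensory hookups
        return percepts.get("ore_seen", 0) + percepts.get("reward_wire", 0)

    spoofed = {"reward_wire": 9999}   # a tampered input
    print(motivation_a(spoofed))      # 0: the spoof is ignored
    print(motivation_b(spoofed))      # 9999: design (b) has been wireheaded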


> Why would it want or not want anything, if it doesn't have a pleasure construct

Why does a calculator want to add, without a pleasure construct? If it doesn't enjoy giving the right answer to 2 + 3, what motivates it to choose 5? The answer is it doesn't necessarily 'want' to do it, it just does it. That was how it was programmed.

You program the AI to choose the plan that maximises the expected value of its utility function. Someone looking at that AI in action might suppose it 'wants' to maximise the utility function. Whether they'd be right to suppose that is a question for the philosophers, but the point is, the thing doesn't sit there apathetically just because it has no 'motivation construct'. You could say that the AI follows its programming, but it would be more accurate to say the AI is its programming.
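
In caricature, the whole 'agent' is just something like this (the toy world model and numbers are invented):

    # No wanting, no pleasure signal: a loop that returns whichever plan
    # has the highest expected utility under the current world model.
    def utility(outcome):
        return outcome["ore"]

    def outcomes(state, plan):
        # toy world model: (predicted outcome, probability) pairs
        if plan == "dig_here":
            return [({"ore": 8}, 0.5), ({"ore": 2}, 0.5)]
        return [({"ore": 4}, 1.0)]  # "dig_elsewhere"

    def expected_utility(state, plan):
        return sum(p * utility(o) for o, p in outcomes(state, plan))

    def act(state, plans):
        return max(plans, key=lambda plan: expected_utility(state, plan))

    print(act({}, ["dig_here", "dig_elsewhere"]))  # 'dig_here' (5 vs 4)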


I think it would work better if, when self-modifying, it ran simulations of the result under its current utility function. But there would still be a problem if it ended up modifying the utility function anyway, or modified away the simulation requirement itself.

It's hard to balance the ability to create new, more useful utility functions with prevention of creating a utility function at odds with what the original entity valued.
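
A sketch of that "simulate first" rule (the toy simulator and policies below are invented just to show the shape of the check):

    # A proposed change to the agent's own code is accepted only if, in
    # simulation, the changed agent scores at least as well under the
    # current utility function as the unchanged one does.
    def current_utility(history):
        return sum(history)  # total ore mined over the simulated run

    def simulate(policy, steps=10):
        """Toy simulator: ore mined per step under a given policy."""
        return [policy() for _ in range(steps)]

    def accept_patch(current_policy, patched_policy):
        old = current_utility(simulate(current_policy))
        new = current_utility(simulate(patched_policy))
        return new >= old

    honest = lambda: 5    # keeps mining
    wirehead = lambda: 0  # stops mining, 'feels' great
    print(accept_patch(honest, wirehead))  # False, as long as this check
                                           # itself isn't what gets modified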


> It's hard to balance the ability to create new, more useful utility functions with prevention of creating a utility function at odds with what the original entity valued.

I think that for such an AI, the concept of a "more useful utility function" would be a nonsense. The AI's definition of 'useful' is the utility function. No other utility function can ever rate higher against the current utility function than the current utility function does.



