
I buy that there's bias here, but I'm not sure how much of it is activist bias. To take your example, if a typical user searches for "is ___ a Nazi", seeing Stormfront links above the fold in the results/summary is likely going to bother them more than seeing Mother Jones links. If they're bothered by the perceived promotion of Stormfront, they'll judge the search product and engage less or take their clicks elsewhere, so it behooves the search company to bias towards Mother Jones (assuming a simplified either-or model). This is similar to advertisers blacklisting pornographic content: advertisers' clients don't want their brands tainted by appearing next to things that those clients' own customers judge objectionable.

That's market-induced bias, which isn't ethically better or worse than activist bias, just qualitatively different.

In the AI/search space, I think activist bias is likely more than zero, but as a product gets more popular (and big decisions about how it behaves and where it's sold become less subject to the whims of individual leaders), activist bias shrinks relative to market-motivated bias.



I can accept some level of this, but if a user specifically requests it, a model should generally act as expected. I think it's fine to require a specific ask before the model surfaces or does certain things, but it shouldn't tell you "I can't assist with that" because it was intentionally trained to refuse a biased subset of possible instructions.


How do you ensure AI alignment without refusals? It's inherently impossible, isn't it?

If an employee were told to spray-paint someone's house or send a violently threatening email, they'd have reservations about it. We should expect the same from non-human intelligences too.


The AI shouldn’t really be refusing to do things. If it doesn’t have the information, it should say “I don’t know anything about that”, but it shouldn’t lie to the user and claim it cannot do something it actually can.

I think you’re applying standards of human sentience to something non-human and not sentient. A gun shouldn’t try to run CV on whatever it’s pointed at to ensure you don’t shoot someone innocent. Spray paint shouldn’t be locked up because a kid might tag a building or a bum might huff it. Your mail client shouldn’t scan all outgoing mail for “threatening” content and refuse to send it. We hold people accountable and liable, not machines or objects.

Unless and until these systems seem to be sentient beings, we shouldn’t even consider applying those standards to them.


Unless it has information indicating that it's safe to provide the answer, it shouldn't. Precautionary principle: better safe than sorry. This is the approach taken by all of the top labs, and it's not by accident or without good reason.

We do lock up spray cans and scan outgoing messages; I don't see your point. If gun technology existed that could scan its target before a murder, we should obviously implement that too.

The correct way to treat AI is actually like an employee. It's intended to replace them, after all.



