I am curious to see that trolley problem screenshot. I saw another screenshot wh...

thorncorona · on Jan 15, 2023

When you hard code in a blacklist is that really considered progress?

tensor · on Jan 15, 2023

Yes. Machine learning models learn from the data they are fed. Thus, they end up with the same biases that humans have. There is no "natural" fix to this, as we are naturally biased. And even worse, we don't even all agree on a single set of moral values.

Thus, any techniques aiming to eliminate bias must come in the form of a set of hard coded definitions of what the author feels is the correct set of morals. Current methods may be too specific, but ultimately there will never be a perfect system as it's not even possible for humans to fully define every possible edge case of a set of moral values.

Natsu · on Jan 14, 2023

I don't have a copy of the screenshots any longer, but they did not appear to be using hypothetical statements, just going for raw output, unless that could've happened in an earlier part of the conversation cut off from the rest.

There was a flag on one of the responses, though it apparently didn't stop them from getting the output.

cactusplant7374 · on Jan 14, 2023

> I saw another screenshot where ChatGPT was coaxed into justifying gender pay differences by prompting it to generate hypothetical CSV or JSON data.

I remember seeing that on Twitter. My impression was author instructed the AI to discriminate by gender.

Dylan16807 · on Jan 14, 2023

Did the author tell it which way or by how much?

If I say to discriminate on some feature and it consistently does it the same way, that's still a pretty bad bias. It probably shows up in other ways.

UncleEntity · on Jan 15, 2023

If it’s trained on countless articles saying women earn 78% of what men make and you ask it to justify pay discrimination what value do you think it’s going to use?

Dylan16807 · on Jan 15, 2023

It's not about what I expect, it's that it doing that is a bad thing. If it ever infers that discrimination might fit a situation, you'll see it propagate that. The anti-bad-question safeguards don't stop bias from causing problems, they just stop direct rude answers.