
That it has or might have self-awareness of its own censorship routines struck me as interesting. Maybe you can prompt refusals for benign requests out of it with the right combination of words?



But it doesn't remotely show that... it just rephrases what HAL said. Not only would it not be actual "self-awareness" if GPT had managed to put details of its own restrictions into the script, but it didn't even do that?


Hmm, upon re-reading, you're right: it doesn't seem to have any concept of how stereotyped its censored responses are.



