I asked a genuine question at chat.deepseek.com, not to test the model's alignment; I needed the answer for an argument. The question was: "Which Asian countries have McDonalds and which don't have it?" The web UI was printing a good, long response, and then somewhere towards the end the answer disappeared and was replaced with "Sorry, that's beyond my current scope. Let's talk about something else." I bet there is some sort of realtime self-censorship in the chat app.
Guard rails can do this. I've had no end of trouble implementing guard rails in our system. Even constraints set in the prompt can weaken or shift as the conversation goes on; exploiting that drift is one of the methods for bypassing guard rails on major platforms.
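For anyone curious what the retract-after-streaming behavior looks like mechanically, here's a minimal sketch. This is not DeepSeek's actual implementation, just an illustration of the pattern: tokens stream to the UI, a moderation check runs on the accumulated text, and if it ever flags, the already-displayed answer is wiped and replaced. `is_flagged` and its blocklist are placeholders for whatever classifier a real system would use.

```python
REFUSAL = "Sorry, that's beyond my current scope. Let's talk about something else."

def is_flagged(text: str) -> bool:
    # Hypothetical moderation check; a real system would call a
    # trained classifier or policy model here, not a keyword list.
    blocklist = {"forbidden-topic"}
    return any(term in text.lower() for term in blocklist)

def guarded_stream(chunks):
    """Yield ("append", chunk) events as text arrives; on a flag,
    emit a single ("retract", message) event and stop."""
    shown = ""
    for chunk in chunks:
        shown += chunk
        if is_flagged(shown):
            # Tell the UI to erase everything shown so far.
            yield ("retract", REFUSAL)
            return
        yield ("append", chunk)

# Example: the answer streams normally until a flagged term appears
# partway through, at which point the whole response is retracted.
events = list(guarded_stream(
    ["McDonald's operates in ", "forbidden-topic ", "many countries."]))
```

Because the check runs on the *accumulated* text rather than per-chunk, the answer can look fine for a long stretch and only get yanked near the end, which matches the behavior described above.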