
I asked a genuine question at chat.deepseek.com, not trying to test the model's alignment; I needed the answer for an argument. The question was: "Which Asian countries have McDonald's and which don't have it?" The web UI was printing a good, long response, and then somewhere towards the end the answer disappeared and changed to "Sorry, that's beyond my current scope. Let’s talk about something else." I bet there is some sort of realtime self-censorship in the chat app.





Guard rails can do this. I've had no end of trouble implementing guard rails in our own system. Even constraints set in the prompt can lose their hold as the conversation goes on, and letting the conversation drift like that is one of the known methods for bypassing guard rails on major platforms.
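
The disappearing answer is consistent with a moderation pass that runs on the streamed output rather than on the prompt. A minimal sketch of that idea in Python, assuming a hypothetical is_disallowed() check and a stand-in stream_tokens() generator; this is not DeepSeek's actual implementation, just the general shape of an output-side guard rail:

    # Sketch of a post-hoc output guard rail on a streamed reply.
    # is_disallowed() and stream_tokens() are hypothetical stand-ins for
    # whatever classifier and model the real service uses.

    REFUSAL = "Sorry, that's beyond my current scope. Let's talk about something else."

    BLOCKLIST = {"mcdonald"}  # toy stand-in for a real policy classifier


    def is_disallowed(text: str) -> bool:
        """Toy moderation check: flag if any blocklisted term appears."""
        lowered = text.lower()
        return any(term in lowered for term in BLOCKLIST)


    def stream_tokens():
        """Stand-in for the model's token stream."""
        reply = "Most Asian countries have McDonald's, but a few do not ..."
        for word in reply.split():
            yield word + " "


    def guarded_stream() -> str:
        """Show tokens as they arrive, but re-check the full text each step.

        If the accumulated answer ever trips the check, stop streaming and
        return the canned refusal instead of the answer. In a web UI the
        partially rendered text would be replaced at that point, which is
        what the disappearing answer looks like from the outside.
        """
        shown = ""
        for token in stream_tokens():
            shown += token
            if is_disallowed(shown):
                return REFUSAL
            print(token, end="", flush=True)
        return shown


    if __name__ == "__main__":
        final = guarded_stream()
        print("\n--- final answer shown to the user ---")
        print(final)

Because the check runs on the accumulated output, the answer can stream for a while before the trigger appears, which would explain why the response got cut off only near the end.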

Maybe try again; it had no problem answering this for me.


