I asked a genuine question at chat.deepseek.com, not to test the model's alignment; I needed the answer for an argument. The question was: "Which Asian countries have McDonalds and which don't have it?" The web UI was printing a good, long response, and then somewhere towards the end the answer disappeared and was replaced with "Sorry, that's beyond my current scope. Let's talk about something else." I bet there is some sort of realtime self-censorship in the chat app.
Guard rails can do this. I've had no end of trouble implementing guard rails in our system. Even constraints set in the prompt can weaken or shift as the conversation goes on; exploiting that drift is one of the methods for bypassing guard rails on major platforms.
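For anyone curious what the retract-after-streaming behavior looks like mechanically, here's a minimal sketch. This is not DeepSeek's actual implementation, just an illustration of the pattern: tokens stream to the UI, a moderation check runs on the accumulated text, and if it ever flags, the already-displayed answer is wiped and replaced. `is_flagged` and its blocklist are placeholders for whatever classifier a real system would use.

```python
REFUSAL = "Sorry, that's beyond my current scope. Let's talk about something else."

def is_flagged(text: str) -> bool:
    # Hypothetical moderation check; a real system would call a
    # trained classifier or policy model here, not a keyword list.
    blocklist = {"forbidden-topic"}
    return any(term in text.lower() for term in blocklist)

def guarded_stream(chunks):
    """Yield ("append", chunk) events as text arrives; on a flag,
    emit a single ("retract", message) event and stop."""
    shown = ""
    for chunk in chunks:
        shown += chunk
        if is_flagged(shown):
            # Tell the UI to erase everything shown so far.
            yield ("retract", REFUSAL)
            return
        yield ("append", chunk)

# Example: the answer streams normally until a flagged term appears
# partway through, at which point the whole response is retracted.
events = list(guarded_stream(
    ["McDonald's operates in ", "forbidden-topic ", "many countries."]))
```

Because the check runs on the *accumulated* text rather than per-chunk, the answer can look fine for a long stretch and only get yanked near the end, which matches the behavior described above.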