I've ended up with this viewpoint too. I've settled on the idea of informed ethics: the model should comply, but inform you of the ethics of actually using the information.
> the model should comply, but inform you of the ethics of actually using the information.
How can it “inform” you of something subjective? Ethics are something the user needs to supply. (The model could, conceptually, be trained to supply additional context relevant to ethical evaluation, based on a pre-trained ethical framework and/or the framework the user has evidenced through interactions with the model, I suppose, but either of those is likely to be far more error-prone, even in the best case, than simply providing the directly requested information.)
Well, that's the actual issue, isn't it? If we can't get a model to refuse to give dangerous information at all, how are we going to get it to refuse to give dangerous information without attaching a warning label?