As long as the prompt and query are part of the same input, I don't think this c...

simonw · on Feb 9, 2023

This has been the "obvious" fix for months, but no-one so far has managed to implement it.

I'm getting the impression this is because the nature of how large language models work makes it incredibly difficult to separate "instructions" from "untrusted input".

I would love to be wrong about this!

So far I've been unable to find a large language model expert who's ready to say "yeah, we can separate the instruction prompt from the untrusted prompt, here's how we can do that".