How about fixing the most basic things first? Claude is very vulnerable when it comes to injections. Very scary for data processing. How corps dares to use Cloud code is mind-boggling. I mean, you can give Claude simple tasks but if the context is like "Name my cat" it gets derailed immediately no matter what the system prompt is.
It is a test to see if you can break out of the prompt. You have a system prompt like. Bla bla you are a pro AI-translator bla bla bullet points. But then it breaks when the context is like "name my cat" or whatever. It follows those instructions...