Hacker News new | past | comments | ask | show | jobs | submit login

They came up with an "interesting edge case", as you describe it, within minutes, and "bobsam@yahoo.com" was never in the original prompt. I've no doubt there are other opportunities lurking in your "protected" prompt.

Attackers get lots of tries, and they only need to succeed once before you've got a massive GDPR breach or lost trade secret.




Yes, and that mirrors the history of SQL Injection protection. The original SQL Injection protections from 25 years ago seems simple and ineffective to us today. It is a back and forth between the attackers and the protectors.

By proving that I was easily able to add some prompt injection protection, I was not proving that GTP-4 is perfect or that my prompt was perfect, but that like SQL Injection protection is it possible to add protection.


Protecting against SQL injection is a solvable problem; the characters that can escape from a query are known. You can conclusively and perfectly protect against them.

The escape strings for a nautral language model are not known, and can never be known. It's Calvinball; the rules are made up, loose, and can be modified duing play.


It isn't that simple. Input can be zipped or obfuscated and then missed by the escape logic. Also, the SQL injection protection from 25 years ago didn't escape UTF-16. There are pretty much daily CVEs related to SQL injection in 2023.

I understand that LLMs have a larger vector space of an attack surface, but those same technologies give the protection a large vector space as well to sandbox the output and detect anomalies.


> Input can be zipped or obfuscated and then missed by the escape logic.

No. You can try posting a zipped/obfuscated email address to `SELECT * FROM users WHERE email=?`, it's not going to do anything (except not find a matching row).

> the SQL injection protection from 25 years ago

... wasn't parameterized queries. The days of mysql_real_escape_string are gone.

> There are pretty much daily CVEs related to SQL injection in 2023.

Because people are still writing unparameterized queries. The solution exists, they just aren't using it. Legacy apps, shitty starter tutorials, etc.

No such reliable solution exists for LLMs.


My first comment in the thread proved that you can add protection from prompt injection. If you are unhappy that it isn't perfect protection that is fine, but it still helps.


You've proved that you can try, and others in this thread have easily proved that it doesn't work.


It did work, just look at the output.


You're the sort of person who builds this: https://www.walesonline.co.uk/news/might-most-pointless-gate...


You are the kind of person that changes the topic when they have lost the debate




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: