
>Executing "arbitrary code" that InstructGPT models output based on your instructions is not a danger.

This is a grossly irresponsible and categorically false statement, in very obvious ways.




Why would ChatGPT provide me with dangerous code if I prompt it benignly? Just the other day I used ChatGPT to create a web scraper for a friend's project, and it gave me an only slightly wrong implementation for scraping data from a specific website. After 15 minutes of reviewing the code and editing literally three lines, I was done. Unless it was trained on a bunch of web-scraping code that also contained malicious code, I just don't see how it could happen.

If anything, they could add an option to have ChatGPT review the code for you and confirm that it hasn't accidentally thrown in some fishy code.


>Why would ChatGPT provide me with dangerous code if I prompt it benignly?

Because it's a black box that's sensitive to unexpected inputs. Humans create bugs too, but an unaligned AI doesn't have the sense to be careful about certain classes of bugs.


When you execute arbitrary code that arrives as input, the burden of proof should always be on showing that it can't violate your security or reliability model. You shouldn't need to assume anything about that code other than that you're able to configure the runtime before it executes. Whether it comes from ChatGPT or from users shouldn't matter: assume some of those programs were written by hackers and that you don't know which ones.

Even if you trust ChatGPT, you should do this. Assume that one day hackers will intercept your connection to OpenAI and hand you malware; when that happens, they should either fail or be forced to use a zero-day exploit against e.g. Firecracker (modulo your threat model and such).
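
To make "configure the runtime before it executes" a bit more concrete, here's a rough Python sketch. It runs a generated snippet in a separate process with an empty environment, a throwaway working directory, a wall-clock timeout, and POSIX resource limits. The names `run_untrusted` and `_cap` and all the limit values are made up for illustration, and it assumes a Unix-like host. To be clear, this is process-level hardening only, not a security boundary: the child can still read files and open sockets. For code you genuinely don't trust, you'd want something like the VM isolation mentioned above.

```python
import resource  # POSIX only; this sketch assumes a Unix-like host
import subprocess
import sys
import tempfile


def _cap(kind: int, value: int) -> None:
    """Lower one resource limit, never asking for more than the current hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def _apply_limits() -> None:
    # Runs in the child process, after fork and before exec.
    _cap(resource.RLIMIT_CPU, 2)            # seconds of CPU time
    _cap(resource.RLIMIT_AS, 1 << 30)       # 1 GiB of address space
    _cap(resource.RLIMIT_FSIZE, 1 << 20)    # 1 MiB per file written


def run_untrusted(code: str, timeout_s: float = 5.0):
    """Run `code` in a child interpreter; return (returncode, stdout, stderr).

    returncode is None if the wall-clock timeout fired.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,          # scratch directory, deleted afterwards
                env={},               # no inherited secrets such as API keys
                capture_output=True,
                text=True,
                timeout=timeout_s,
                preexec_fn=_apply_limits,
            )
        except subprocess.TimeoutExpired:
            return None, "", "timed out"
        return proc.returncode, proc.stdout, proc.stderr
```

The point of the design is that nothing here depends on where the code came from or what it claims to do. A benign scraper and a hostile payload get the same limits, and the caller only ever sees an exit status and captured output.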



