Instead of just arguing "humans do this too", perhaps it's more fruitful to compare how easily people vs. LLMs are fooled. The argument isn't that humans are foolproof, but rather that the jailbreaks we've seen so far seem obvious to us, yet far from obvious to LLMs.
If ChatGPT sessions were operated by people, how likely is it that someone would fall for this? It seems rather low to me, but maybe I'm underestimating how naive someone can be. It's also easy to judge a "scam situation" after it has happened.
This particular example is just an appeal to emotion, and humans fall for that plenty. For a human, I would put more work into blending the captcha into the bracelet to make it convincing, but other than that, I'd expect some people to fall for it too.
And since Bing gets fed a description rather than directly looking at the images like the official GPT-4V does, that might actually be a requirement for the current state of the art too.
In general, LLMs are definitely worse, but that's not a particularly interesting observation. For one, LLMs are not humans.
If I could shape-shift into your boss, or wipe your memory every time you found me out, I'd convince a lot more people too.
For another, they become less susceptible the bigger they get.