What would the system use for its corpus of images and text descriptions? The corpus would have to be far larger than any attacker could label by hand. Once an attacker has manually solved an image+text combination, they can store it and reuse it to solve any future CAPTCHA that presents the same image+text.
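
To make the concern concrete, here is a minimal sketch of the replay attack in Python. Everything in it is hypothetical: the `ReplayCache` class, the `replay_hit_rate` helper, and the assumptions that the server reuses byte-identical images and draws challenges uniformly from the corpus are illustrative only, not a description of any real CAPTCHA system.

```python
import hashlib


class ReplayCache:
    """Hypothetical attacker-side store of challenges already solved by hand."""

    def __init__(self):
        self._answers = {}

    @staticmethod
    def _key(image_bytes: bytes) -> str:
        # Exact-match fingerprint; assumes the server reuses identical bytes.
        return hashlib.sha256(image_bytes).hexdigest()

    def record(self, image_bytes: bytes, description: str) -> None:
        # Store the description a human picked for this image.
        self._answers[self._key(image_bytes)] = description

    def solve(self, image_bytes: bytes):
        # Returns the stored description, or None if this image is new.
        return self._answers.get(self._key(image_bytes))


def replay_hit_rate(corpus_size: int, labeled: int) -> float:
    """Chance that a uniformly drawn challenge is already in the cache."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return min(labeled, corpus_size) / corpus_size
```

Under these assumptions, an attacker who has labeled 100 items from a 1,000-item corpus can answer about 10% of challenges automatically, which is why the corpus needs to dwarf what anyone could label by hand. If the server perturbed each image slightly, an exact hash would stop matching, though an attacker could likely switch to a fuzzier comparison.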