On the other hand, consider the difficulty of taking massive amounts of data from the modern web and narrowing it down to the subset that was actually generated by humans, rather than by previous generations of language models.
Definitely an interesting future problem. I'm sure OpenAI and others are thinking about it, but I don't think these models are ubiquitous enough to have much impact just yet.