I think this is a pretty compelling claim. The published full GPT-2 samples are significantly better than the initial release, but pretty similar to the 774M corpus output. They're disconcerting inasmuch as they're a leap forward for AI writing, and they have remarkably strong tone consistency. But I see two major weaknesses in GPT-2 that make it hard to imagine effective malicious use.
First, the output is inconsistent. The scary-good examples are scattered among a lot of duds, and it semi-regularly breaks down into either general incoherence or distinctively robotic loops. You couldn't turn it loose to populate a website or argue on forums without any human oversight, so it at most represents a cost reduction in the output of BS, not a qualitative shift.
Second, it's absolutely crippled by length. Antecedents often go missing after about a paragraph. Nouns (especially proper nouns) that are likely once are likely in numerous roles, so stories will pull in figures like Obama as subject, object, and commentator on the same issue. Even stylistically, the tight guidelines of news ledes slacken within a few paragraphs, which results in a loss of focus and a rising likelihood of loops and gibberish.
GPT-2 is absolutely an impressive breakthrough in machine writing; I don't mean to disparage that. But as far as its potential for deceit or trolling goes, it's not particularly threatening. Quantity of output is rarely the limiting factor on impact there, and GPT-2 doesn't offer enough in sophisticated tone to make up for what it loses in basic coherence.
(For anyone wondering "why is this sourced to a random tumblr?", nostalgebraist is some flavor of AI professional who's played with GPT-2 to produce some pretty interesting results, and produced some other useful essays like an explanation of Google's Transformer architecture (https://nostalgebraist.tumblr.com/post/185326092369/the-tran...).)
>You couldn't turn it loose to populate a website or argue on forums without any human oversight, so it at most represents a cost reduction in the output of BS, not a qualitative shift.
When you combine the output available now with the fact that a surprising number of humans on the internet would probably fail a written Turing test, it's definitely useful at least for denial-of-service style mischief.
It seems to be supervisable at this point, at least for shorter responses. I was thinking that if you're trying to astroturf or otherwise maliciously use message boards/forums, rather than giving it free rein you could present an operator with the prompting text and some arbitrary number of generated replies to choose from. That still entails some legwork, but it will likely yield more believable responses more frequently.
Sure, this is only economical (if at all) when replying to shorter prompts, but those are likely to be the majority on most forums and message boards.
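For concreteness, here's roughly the workflow I have in mind, as a minimal sketch assuming the HuggingFace transformers library and the public small GPT-2 checkpoint (the prompt and parameters are made-up placeholders, nothing tuned or benchmarked):

    # Minimal sketch of the operator-in-the-loop idea, assuming the HuggingFace
    # "transformers" library and the public small GPT-2 checkpoint. The prompt
    # is a made-up placeholder.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "Honestly, the new zoning proposal worries me because"
    candidates = generator(prompt, max_length=60, do_sample=True,
                           num_return_sequences=5)

    # The human operator skims the candidates and posts the most believable
    # one (or rejects the whole batch and re-rolls).
    for i, c in enumerate(candidates):
        print(f"[{i}] {c['generated_text']}\n")

That keeps a human in the loop for quality, but the per-reply cost drops to a skim-and-click instead of writing from scratch.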
I figured that expanding scope was probably a task we knew how to solve/improve; this makes sense. Beyond keeping antecedents, it seems like this could indirectly improve "essay structure" by retaining more knowledge of what's already been said.
I'm curious to see how much of GPT-2's length issues turn out to stem from long-range dependencies as opposed to other issues. Right now long output seems to suffer a combination of lost scope, lack of division/structure, diverging reference texts, self-contradiction, and redundancy/looping traps.
One of the most interesting patterns I've noticed is that GPT-2 "knows" which concepts relate to one another, but not which stances. Essays for and against Foo are both essays about Foo, so GPT-2 slips between them quite happily. It seems like the sort of problem that should be pretty tractable to improve, but all the obvious approaches would be pretty domain-specific.
Language models like GPT-2 have fundamental limitations for text generation. They simply model the most likely sequence of words; they don't have any "real understanding" of what they write. They don't have agency (a purpose), can't do explicit reasoning, and don't have symbol grounding.
Few people would have thought that we can get this type of quality using just language modeling, so it's unclear how much further we can go before we have to start tackling those fundamental problems.
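To make "just models the most likely sequence of words" concrete, here's a toy stand-in (a deliberately tiny bigram model, nothing like GPT-2's architecture, purely illustrative): generation is nothing but repeatedly sampling a likely next word, with no goal, reasoning, or grounding anywhere in the loop.

    # Toy bigram "language model" (a stand-in, not GPT-2). It only knows
    # P(next word | previous word), and generation is just sampling the next
    # likely word over and over.
    import random

    BIGRAMS = {
        "the":   {"mouse": 0.5, "tiger": 0.5},
        "mouse": {"ate": 1.0},
        "tiger": {"ate": 1.0},
        "ate":   {"the": 1.0},
    }

    def generate(start, n_steps):
        words = [start]
        for _ in range(n_steps):
            dist = BIGRAMS[words[-1]]
            words.append(random.choices(list(dist), weights=list(dist.values()))[0])
        return " ".join(words)

    print(generate("the", 6))  # e.g. "the mouse ate the tiger ate the"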
> Few people would have thought that we can get this type of quality using just language modeling, so it's unclear how much further we can go before we have to start tackling those fundamental problems.
Yep, that seems to be the core question. This isn't going to be the road to AGI, but it's already ahead of what most people thought could be done without underlying referents. As you say, GPT-2 has some fundamental limitations.
One is 'cognitive': it doesn't have referents or make any kind of logical assessments, only symbolic ones. Even if it develops to the point of writing publication-worthy essays, they'll be syntheses of existing texts, and inventing a novel argument (except by chance) should be essentially impossible. Even cross-format transference is basically out of the question; a news story reading "the mouse ate the tiger" won't be ruled out based on dictionary entries about mice and tigers.
The other, though, is 'intentful'. As Nostalgebraist points out, GPT-2 is essentially a text predictor impersonating a text generator. That difference creates an intrinsic limitation, because GPT-2 isn't trying to create output interchangeable with its inputs. It actively moves away from "write a novel" towards "write the most common/formulaic elements of a novel". At the current level of quality, this might actually improve results; GPT is less likely to make jarring errors when emulating LoTR's walking-and-eating sections than when emulating Gandalf's death. But as it improves and is used to generate whole texts, the "most likely" model leaves it writing unsurprising, mid-corpus text indefinitely.
Solving the cognitive limitation essentially requires AGI, and even adding crude referents in one domain would be quite tough. So I'll make a wild speculation that fixing the intentful limitation will be a big advance soon: training specifically for generation will produce models that more accurately recreate the structure and 'tempo' of human text, without being deterred by excess novelty.
I suspect GPT-2's limitations with larger-scale structure have less to do with the capacity to track long-range dependencies (which shouldn't be a problem for an attention-based architecture), and more to do with language modeling itself as a task.
Language modeling is about predicting what can be predicted about the rest of a text, given the first N tokens. Not everything in text can be predicted in this way, even by humans; the things we say to each other tend to convey novel information and thus aren't fully compressible. And indeed the compressibility of text varies across a text in a way that is itself relatively predictable. If someone writes "for all intents and" you can be pretty sure the next word is "purposes," i.e. you're unlikely to learn much when you read it; if someone is writing a dialogue between two characters, and you're about to see one of their names for the first time, you will learn something new and unpredictable when you read the next word, and you know that this will happen (and why).
A language modeling objective is only really natural for the first of these two cases. In the latter case, the "right" thing to do from the LM perspective is to output a fairly flat probability distribution over possible names (which is a lot of possibilities), assigning very low probability to any given name. But what this means is actually ambiguous between "I am unsure about my next observation because I don't understand the context" and "I understand the context, and it implies (predictably) that my next observation will be inherently unpredictable."
Since any model is going to be imperfect at judging whether it's about to see something unpredictable, it'll assign some weight to the next observation being predictable (say, a repeated topic or name) even if it's mostly sure it will be unpredictable. This will push up the probabilities of its predictions on the assumption of predictability (i.e. of a repeated topic/name), and meanwhile the probability of anything else is low, because if an observation is unpredictable then it might well be anything.
I hypothesize that this is behind behavior like putting a single name ("Obama" in your earlier example) in too many roles in an article: if only Obama has been mentioned, then either an upcoming name is "Obama" (in which case we should guess "Obama") or it's some other name (in which case we should guess against Obama in slight favor of any other name -- but this will only be conveyed to the model via the confusing signal "guess this arbitrary name! now this other one! now this one!", with the right trend only emerging in the average over numerous unpredictable cases, while the predictable-case rule where you guess the name that has already been mentioned is crystal-clear and reinforced in every case where it happens to be right).
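A toy version of that argument with invented numbers (not measured from GPT-2), just to show why the repeat wins even when the model thinks a repeat is unlikely:

    # Suppose the model gives a 20% chance that the next name is a repeat of
    # "Obama", and spreads the remaining 80% roughly evenly over a large pool
    # of alternative names. (Numbers are invented for illustration.)
    p_repeat = 0.20
    n_other_names = 1000
    p_each_other = 0.80 / n_other_names   # 0.0008 per individual alternative

    print(p_repeat, p_each_other)
    # Greedy or low-temperature decoding compares 0.20 against 0.0008, so the
    # repeated name wins almost every time, even though the model "believes"
    # a repeat is the less likely outcome overall.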
I also suspect the use of a sub-word encoding (BPE) in GPT-2 exacerbates this issue once we are doing generation, because the model can initially guess only part of the high-entropy word without fully committing to a repeat (say just the "O" in "Obama"), but once this becomes part of the context the probability of a repeat is now much higher (we already thought "Obama" was unusually probable, and now we're looking for a name that starts with "O").
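Again with invented numbers (I haven't checked how GPT-2's BPE actually segments "Obama"), the mechanism would look something like:

    # Before any name is emitted: repeat of "Obama" at 20%, and suppose names
    # starting with "O" (Obama plus a few others) total 25%.
    p_repeat_obama = 0.20
    p_name_starts_with_O = 0.25

    # If the first sampled piece is just "O", the context now demands a name
    # starting with "O", so the repeat's conditional share jumps:
    p_repeat_given_O = p_repeat_obama / p_name_starts_with_O
    print(p_repeat_given_O)   # 0.8 -- partial commitment makes the repeat dominant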
My understanding of the danger was a bit more practical and less abstract. If malicious content is generated GAN-style against your "best" spam classifier, then regardless of whether the text is any good, it will by definition be able to fool that classifier. So fighting machine-generated garbage has to be done by humans again, which scales much more poorly.
That's a really good point, and it never even occurred to me. Thank you!
Something I realize I don't know: how much spam detection is done via text analysis? I get the sense that most websites try to block traffic patterns or apply CAPTCHAs, and email spam filters are heavily based on keywords and senders. It seems like this would cause problems, but it's not immediately obvious to me where they'd crop up.