GPT-2 is not as dangerous as OpenAI thought it might be (nostalgebraist.tumblr.com)
161 points by luu on Sept 9, 2019 | 89 comments



I think this is a pretty compelling claim. The published full GPT-2 samples are significantly better than the initial release, but pretty similar to the 774M model's output. They're disconcerting inasmuch as they're a leap forward for AI writing, and they have remarkably strong tone consistency. But I see two major weaknesses with GPT-2 that make it hard to imagine effective malicious use.

First, the output is inconsistent. The scary-good examples are scattered among a lot of duds, and it semi-regularly breaks down into either general incoherence or distinctively robotic loops. You couldn't turn it loose to populate a website or argue on forums without any human oversight, so it at most represents a cost reduction in the output of BS, not a qualitative shift.

Second, it's absolutely crippled by length. Antecedents often go missing after about a paragraph. Nouns (especially proper nouns) that are likely once remain likely in numerous roles, so stories will pull in figures like Obama as subject, object, and commentator on the same issue. Even stylistically, the tight guidelines of news ledes slacken within a few paragraphs, which results in a loss of focus and a rising likelihood of loops and gibberish.

GPT-2 is absolutely an impressive breakthrough in machine writing, and I don't mean to disparage that. But as far as its potential for deceit or trolling goes, it's not particularly threatening. Quantity of output is rarely the limiting factor on impact for that, and GPT-2 doesn't offer enough in sophisticated tone to make up for what it loses in basic coherence.

(For anyone wondering "why is this sourced to a random tumblr?", nostalgebraist is some flavor of AI professional who's played with GPT-2 to produce some pretty interesting results, and produced some other useful essays like an explanation of Google's Transformer architecture (https://nostalgebraist.tumblr.com/post/185326092369/the-tran...).)


>You couldn't turn it loose to populate a website or argue on forums without any human oversight, so it at most represents a cost reduction in the output of BS, not a qualitative shift.

It might be getting closer than you think: https://old.reddit.com/r/SubSimulatorGPT2/ (Note can be nsfw, meta sub has best-of highlights)

Still a lot of complete nonsense but this is just a hobby project (not mine)


When you combine the output available now with the fact that a surprising number of humans on the internet would probably fail a written Turing test, it's definitely useful at least for denial-of-service style mischief.


It seems to be supervisable at this point, specifically for shorter responses. I was thinking that if you're trying to astroturf or otherwise maliciously use message boards/forums, rather than giving it free rein you could present an operator with the prompting text and some arbitrary number of generated replies to choose from. It still entails some legwork, but it will likely yield more believable responses more frequently.

Sure, this is only economical (if at all) when replying to shorter prompts, but those are likely going to be the majority on most forums and message boards.


Would be good for phishing too.


"Professional Sock with 8 years of experience in the advertising industry - AMA."

A personal favorite.


It's weird: after reading comments on that subreddit and coming back to HN, it feels like the comments here are generated too :)


Wait till you get to twitter :/


Impressive. It's mostly nonsense but compared to regular r/SubredditSimulator/ it definitely looks a lot more conscious.


The hilarious thing is that the project owner fine-tuned against /r/SubredditSimulator and the extra nonsense shines through.

E.g.: https://www.reddit.com/r/SubSimulatorGPT2/comments/d1zh7r/a_...

Stroll through the replies to get a flavor of how sane they are, then Ctrl+F "subredditsimulatorgp"


Transformer-XL should be better at learning long-term dependencies.

See the last section of this paper: https://arxiv.org/pdf/1901.02860.pdf
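The gist, as I understand it, is segment-level recurrence: each new segment attends over a cache of hidden states from earlier segments. A toy sketch of that idea (my own simplification in PyTorch, not the actual Transformer-XL code; it omits the causal mask and the relative positional encodings the paper uses):

  import torch

  d_model, seg_len, mem_len = 64, 16, 48
  Wq = torch.nn.Linear(d_model, d_model, bias=False)
  Wk = torch.nn.Linear(d_model, d_model, bias=False)
  Wv = torch.nn.Linear(d_model, d_model, bias=False)

  def attend(segment, memory):
      # Queries come from the new segment only; keys/values come from
      # [memory; segment], so the segment can look back past its own boundary.
      context = torch.cat([memory, segment], dim=0)
      q, k, v = Wq(segment), Wk(context), Wv(context)
      scores = q @ k.T / d_model ** 0.5
      return torch.softmax(scores, dim=-1) @ v

  memory = torch.zeros(0, d_model)
  for _ in range(4):                           # a stream of consecutive segments
      segment = torch.randn(seg_len, d_model)  # stand-in for token embeddings
      out = attend(segment, memory)
      # Cache detached states as memory for the next segment, keeping the last mem_len.
      memory = torch.cat([memory, out.detach()], dim=0)[-mem_len:]

Plain GPT-2, by contrast, only ever sees its fixed context window, so anything that scrolls out of it is simply gone.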


Interesting, thanks!

I figured that expanding scope was probably a task we knew how to solve/improve; this makes sense. Beyond keeping antecedents, it seems like this could indirectly improve "essay structure" by retaining more knowledge of what's already been said.

I'm curious to see how much of GPT-2's length issues turn out to stem from long-range dependencies as opposed to other issues. Right now long output seems to suffer a combination of lost scope, lack of division/structure, diverging reference texts, self-contradiction, and redundancy/looping traps.

One of the most interesting patterns I've noticed is that GPT-2 "knows" which concepts relate to one another, but not which stances. Essays for and against Foo are both essays about Foo, so GPT-2 slips between them quite happily. It seems like the sort of problem that should be pretty tractable to improve, but all the obvious approaches would be pretty domain-specific.


Language models like GPT-2 have fundamental limitations for text generation. They simply model the most likely sequence of words; they don't have a "real understanding" of what they write. They don't have agency (a purpose), can't do explicit reasoning, and don't have symbol grounding.

Few people would have thought that we can get this type of quality using just language modeling, so it's unclear how much further we can go before we have to start tackling those fundamental problems.


> Few people would have thought that we can get this type of quality using just language modeling, so it's unclear how much further we can go before we have to start tackling those fundamental problems.

Yep, that seems to be the core question. This isn't going to be the road to AGI, but it's already ahead of what most people thought could be done without underlying referents. As you say, GPT-2 has some fundamental limitations.

One is 'cognitive': it doesn't have referents or make any kind of logical assessments, only symbolic ones. Even if it develops to the point of writing publication-worthy essays, they'll be syntheses of existing texts, and inventing a novel argument (except by chance) should be essentially impossible. Even cross-format transference is basically out of the question; a news story reading "the mouse ate the tiger" won't be ruled out based on dictionary entries about mice and tigers.

The other, though, is 'intentful'. As Nostalgebraist points out, GPT-2 is essentially a text predictor impersonating a text generator. That difference creates an intrinsic limitation, because GPT-2 isn't trying to create output interchangeable with its inputs. It actively moves away from "write a novel" towards "write the most common/formulaic elements of a novel". At the current level of quality, this might actually improve results; GPT is less likely to make jarring errors when emulating LoTR's walking-and-eating sections than when emulating Gandalf's death. But as it improves and is used to generate whole texts, the "most likely" model leaves it writing unsurprising, mid-corpus text indefinitely.

Solving the cognitive limitation essentially requires AGI, and even adding crude referents in one domain would be quite tough. So I'll make a wild speculation that fixing the intentful limitation will be a big advance soon: training specifically for generation will produce models that more accurately recreate the structure and 'tempo' of human text, without being deterred by excess novelty.


I suspect GPT-2's limitations with larger-scale structure have less to do with the capacity to track long-range dependencies (which shouldn't be a problem for an attention-based architecture), and more to do with language modeling itself as a task.

Language modeling is about predicting what can be predicted about the rest of a text, given the first N tokens. Not everything in text can be predicted in this way, even by humans; the things we say to each other tend to convey novel information and thus aren't fully compressible. And indeed the compressibility of text varies across a text in a way that is itself relatively predictable. If someone writes "for all intents and" you can be pretty sure the next word is "purposes," i.e. you're unlikely to learn much when you read it; if someone is writing a dialogue between two characters, and you're about to see one of their names for the first time, you will learn something new and unpredictable when you read the next word, and you know that this will happen (and why).

A language modeling objective is only really natural for the first of these two cases. In the latter case, the "right" thing to do from the LM perspective is to output a fairly flat probability distribution over possible names (which is a lot of possibilities), assigning very low probability to any given name. But what this means is actually ambiguous between "I am unsure about my next observation because I don't understand the context" and "I understand the context, and it implies (predictably) that my next observation will be inherently unpredictable."

Since any model is going to be imperfect at judging whether it's about to see something unpredictable, it'll assign some weight to the next observation being predictable (say, a repeated topic or name) even if it's mostly sure it will be unpredictable. This will push up the probabilities of its predictions on the assumption of predictability (i.e. of a repeated topic/name), and meanwhile the probability of anything else is low, because if an observation is unpredictable then it might well be anything.

I hypothesize that this is behind behavior like putting a single name ("Obama" in your earlier example) in too many roles in an article: if only Obama has been mentioned, then either an upcoming name is "Obama" (in which case we should guess "Obama") or it's some other name (in which case we should guess against Obama in slight favor of any other name -- but this will only be conveyed to the model via the confusing signal "guess this arbitrary name! now this other one! now this one!", with the right trend only emerging in the average over numerous unpredictable cases, while the predictable-case rule where you guess the name that has already been mentioned is crystal-clear and reinforced in every case where it happens to be right).

I also suspect the use of a sub-word encoding (BPE) in GPT-2 exacerbates this issue once we are doing generation, because the model can initially guess only part of the high-entropy word without fully committing to a repeat (say just the "O" in "Obama"), but once this becomes part of the context the probability of a repeat is now much higher (we already thought "Obama" was unusually probable, and now we're looking for a name that starts with "O").
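To put toy numbers on that last point (invented for illustration, not measured from GPT-2):

  # Suppose the model already over-weights repeating "Obama" at a name slot,
  # and few of the alternative names it's considering start with "O".
  p_repeat = 0.20          # assumed prior probability the next name is "Obama" again
  p_other  = 0.80
  p_O_given_repeat = 1.00  # "Obama" always begins with the "O" piece
  p_O_given_other  = 0.05  # assumed: few other candidate names start with "O"

  # Once the single piece "O" has been sampled, Bayes' rule gives the updated repeat probability.
  p_O = p_repeat * p_O_given_repeat + p_other * p_O_given_other
  print(p_repeat * p_O_given_repeat / p_O)   # ~0.83

So a 20% lean toward the repeat before the first sub-word piece becomes roughly an 83% commitment after it.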


My understanding of the danger was a bit more practical and less abstract. If you use your "best" classifier in a GAN to generate malicious content, then regardless of whether or not it's any good, it will by definition be able to fool your spam classifier. So fighting machine generated garbage has to be done by humans again, which scales much more poorly.
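Even without a full GAN training loop, you get the same effect just by rejection-filtering samples against the defender's classifier; anything that survives evades it by construction. A toy sketch (generate_sample and spam_score are made-up stand-ins here, not a real API):

  import random

  def generate_sample() -> str:
      # Stand-in for a language-model sample.
      return random.choice([
          "buy cheap meds now!!!",
          "I think the article makes a fair point about coherence.",
          "click here to claim your prize",
          "Has anyone tried the 774M model on long-form text?",
      ])

  def spam_score(text: str) -> float:
      # Stand-in for the defender's own text classifier (higher = more spammy).
      return 0.9 if ("click" in text or "cheap" in text) else 0.1

  survivors = [s for s in (generate_sample() for _ in range(1000)) if spam_score(s) < 0.5]
  # Every kept sample passes this particular classifier by construction;
  # a human has to catch whatever slips through.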


That's a really good point, and it never even occurred to me. Thank you!

Something I realize I don't know: how much spam detection is done via text analysis? I get the sense that most websites try to block traffic patterns or apply CAPTCHAs, and email spam filters are heavily based on keywords and senders. It seems like this would cause problems, but it's not immediately obvious to me where they'd crop up.


The whole notion of GPT-2 being "dangerous" was ridiculous to begin with. OpenAI does some impressive technical work but they're a little too impressed with themselves and appear somewhat detached from reality.


Specifically, they believed it could be dangerous as a tool for generating really good fake news and related things, not necessarily AGI.


You don't need AI to do that, NI does it well enough.


So what? Lots of news is fake already, written by journalists who already have an agenda. You don't need any AI breakthrough to have that problem.


It's great for marketing though. Having something so 'powerful' you can't release it to the public. Also ties in with the 'fake news' and Russian bot hysteria very well.


Second-order thinking here: it was a great narrative for supporting the next AI winter


Sometimes it’s better to be cautious and err on the side of not releasing something safe than to err on the side of releasing something dangerous.


I think the danger of neural network bots overrunning social media with garbage is entirely valid.


It seems pretty clear to me by now that GPT-2 is not as dangerous as OpenAI thought (or claimed to think) it might be. (It may yet be as deadly as OpenAI believed it was, but we won't know until it's actually released). That's not even mentioning that it's entirely likely that it will become the first machine learning platform to win a Turing Award.

What do you think? Are you ready to get involved in the GPT-2 debate? Or do you already have your own AI platforms and don't like GPT-1? Or do you have questions and would like the community to shed light on them? Please join us in IRC at #gpt: http://gstreamer.freenode.net/?channel=gpt

Please refer to the #GPT4 thread at Wikipedia.

GPG key fingerprint: A58F D3A2 CAF7 2E0D D4A3 8CD1 C721 5FB8 2F8E 15C4

Links


I like the drift in GPT numbers across this message.


What's the point of having a debate? GPT is proprietary IP belonging to a private company. They can release it or not as they please, regardless of whether their stated reasons are valid.


I believe that sharemywin's post is satire. It is, at least, intended to appear to be GPT-generated.


Did you really mean Turing Award?

Or do you mean "pass the turing test"?


The GP comment was generated from GPT-2.


Social media is already pretty full of garbage. What is the actual threat here?


The actual threat is the danger of neural network bots eventually overrunning social media with far more entertaining content.


humans already do that and even provide advertising data along with it


It was a good publicity stunt that gave them wide media coverage and a lot of attention.


No, they're just responsible scientists. That's all.


In my opinion, the rationale that the original GPT-2 is too dangerous to release is not good, for two reasons. First, a small GPT-2 model finetuned on a targeted domain dataset (https://minimaxir.com/2019/09/howto-gpt2/) already gives better results for targeted news than the large default GPT-2 model. Second, despite independent researchers creating GPT-2 clones at a fraction of the cost available to adversarial organizations, there hasn't been any evidence of mass-produced fake news in the six months since GPT-2 was open-sourced and the paper was released.
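For reference, the workflow in that write-up boils down to a few lines (rough sketch assuming the gpt-2-simple package it describes; exact argument names and model identifiers may differ, and "headlines.txt" is a hypothetical targeted-domain corpus):

  import gpt_2_simple as gpt2

  gpt2.download_gpt2(model_name="124M")        # the smallest released model
  sess = gpt2.start_tf_sess()
  # Finetune on the hypothetical targeted-domain text file, then sample from it.
  gpt2.finetune(sess, "headlines.txt", model_name="124M", steps=1000)
  gpt2.generate(sess, length=100, temperature=0.7, nsamples=5)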

GPT-2 will likely be used more for mass-producing crazy erotica (https://twitter.com/Fred_Delicious/status/116678321475044557... [NSFW text]) than fake news.


>GPT-2 will likely be used more for mass-producing crazy erotica

We should, at any cost, not release this to the masses


The fake-news concern isn't the right place to focus. News-type content is low-volume, has a structured sharing model (i.e. people usually share the articles), is long-form, and requires fairly high precision. This is where humans excel and where automated language models are the weakest.

On the other hand, comments on sites like Facebook/Reddit/4chan are perfect for language model bots. The content is high-volume, semi-anonymous, usually shared without an explicit network, and can be extremely low precision.

So if you had a bot that could get onto any discussion network used for planning protests, for example, and spammed it with destructive and divisive fake comments, it could actually make organizing pretty hard. And while that places some demands on the language model, many comments are just a few sentences long. And the content doesn't need to be very precise; it just needs to be convincing enough to be distracting.

I also think that the worst abusers are likely to be current power-players like Google/Facebook/Chinese Gov rather than small actors.


I have played with OpenGPT-2 for a few hours, testing its capabilities for generating politically-motivated texts in particular.

Indeed, it fails at large sample sizes, but it is _quite_ capable of replying to tweets and writing e.g. comments on Hacker News, Reddit or elsewhere.

With some human assistance, it can produce believable text fragments that can be later copypasted and reassembled into genuine articles or blog posts. If your full-time job is shitposting and/or generating politically divisive content by megabytes, it increases your productivity by orders of magnitude.


This, 100% this. It's the massive productivity gains, multiplied by the ever-increasing ability to measure virality/persuasiveness, that make this combination dangerous.


The biggest problem here is that, like so many of these "smart machines", OpenGPT-2 is a self-correcting machine that does not care about real-world context.

...

That, of course, is exactly what GPT-2 wants you to think ;)

https://i.imgur.com/lFkB3gI.png


You have officially won this thread.


To illustrate your point, here is a Twitter account run by a model trained on Donald Trump's tweets: https://twitter.com/botustrump


Meh, all this "we can't release it because it's dangerous" business is just marketing


I know I'm treading on dangerous waters here considering Sam Altman's involvement, but I'm gonna burn some karma regardless: Is there anything open about OpenAI or is it just a name?


The mission statement reads, in part:

> We’re hoping to grow OpenAI into such an institution. As a non-profit, our aim is to build value for everyone rather than shareholders. Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.

I have no idea if they live up to this but "GPT-2 is too dangerous to release" crackpottery suggests they at least sometimes don't.


That mission statement has been abandoned. OpenAI restructured as a for-profit corporation several months ago.


Plot twist, the article was written by GPT-2


I gave the first sentence to https://talktotransformer.com and it returned this:

It seems pretty clear to me by now that GPT-2 is not as dangerous as OpenAI thought (or claimed to think) it might be. Generating fake news using GPT-2 is more difficult than the original "propaganda" model, but as we have seen it can be very effective.

The problem with GPT-1, and all AI projects that rely on it, is that they have completely misunderstood what human beings are like. Humans like to look and feel like they are human. It's how we have evolved. While the OpenAI AI team (and probably the GPT-1 team as well) believe they understand the human brain, they completely lack empathy when it comes to real humans. A fake news system that gets to the truth of a story based solely on "what it would feel like to be" a human being is no more human than a machine trying to "feel" what it is like.

As such, the idea that the OpenAI AI team believes their AI will be able to "see" what a human might see is a complete and utter failure. The truth is that we don't understand what


Dang, that's actually pretty convincing. It follows a reasonably coherent logical thread.


I couldn't believe it at first and I tried "the next sentence is garbage":

The next sentence is garbage. The author says that 'The next sentence is garbage.'

2) I can do anything (in this sentence).

3) So could you (in this sentence). This is garbage.

4) It's just a (in this sentence). Again garbage. Let's say the author writes 'So could you'. The sentence is still garbage: the word 'it' is a single-character word whose meaning does not exist. Let's compare this with

5) Let me try (in this sentence). This is garbage.

6) Can I (in this sentence)?

7) Yes I (in this sentence).

8) Does anyone (in this sentence)?

It is obvious from the preceding statements that these sentences are not true of any sentence, but of a series of sentences. It is not obvious how these sentences differ. Perhaps the author means it: I can do anything. Perhaps he means it: it's just a series of sentences. But then he says, 'So could you'. So how do these

------

I don't know if I should be scared or amazed.


...and that's just GPT-2_medium.


I actually thought snazz had forgotten to include the generated snippet. However, I didn't understand the reasoning in the second generated paragraph and stopped reading. At that point, I still thought they were human (and wrong).

I suppose my question is why it matters who wrote it. I've always been taught that an argument should be judged on its own merits, and from that perspective nothing changes.


Well, if the blog post actually had been written by GPT-2, its very existence would be a counterexample to one of its main claims, that GPT-2 isn’t really good enough to generate convincing long-form nonfiction.

Also, that part of the argument is not purely a priori, instead being supported by a wide range of factual claims about specific limitations of GPT-2. If the post had been written by GPT-2, those claims would probably be false, since GPT-2 is not designed to differentiate truth from plausible-sounding fiction. And false claims would invalidate the whole argument. Assuming a human author, on the other hand, the claims are probably true. They could be false if the author was either misinformed or lying, but those possibilities are subjectively unlikely.


> Assuming a human author, on the other hand, the claims are probably true.

Maybe this is where we differ? I don't agree. People are mistaken all the time. Your hypothetical even assumes an untrustworthy human is directing the algorithm.

To me, the most convincing point in favour of the truth of the claims in the article is that nobody has contested them. The claims appear to be easily falsifiable, so the more scrutiny they withstand, the more trust they deserve.


Funny your question is kinda addressed by the generated text:

'..."what it would feel like to be" a human being is no more human than a machine trying to "feel" what it is like.'


While this doesn't sound anything like Rob's post, it does sound like a generic post on AI.


It's quite fascinating. There are definitely some sentences that are nonsensical, but the problem is, there are often entire paragraphs that seem internally nonsensical in text written by what I assume to be humans, too.

Some more examples. Input (from wikipedia page on WW2):

>In Europe, Germany and Italy were becoming more aggressive. In March 1938, Germany annexed Austria, again provoking little response from other European powers. Encouraged, Hitler began pressing German claims on the Sudetenland, an area of Czechoslovakia with a predominantly ethnic German population.

-------------------------------- Output:

Meanwhile, Italy and France continued to make war on Germany, and in July 1938 they annexed the small kingdom of Yugoslavia without much fanfare. Meanwhile, in 1941, Churchill went to London to ask for the declaration of war on Germany. There, he argued that if it took Germany three months to attack Britain, and he said the war could be won within a matter of weeks, then Britain could afford to stand back while the Nazis moved across Europe. The Prime Minister's position was endorsed by other senior government ministers, and the British Foreign Office was given the power to intervene without the Commons' approval by the War Cabinet under Winston Churchill.

Churchill was also able to get Churchill's cabinet to approve the idea of establishing British military bases in the Far East. The British were convinced by their advisers at the time that the Russians might

-------------------------------- Input (from a random 0 upvote reddit thread on /r/wow):

>The servers have been having instability issues for the last few days. It's probably some residual issues from the DDOS attack.

-------------------------------- Output:

It shouldn't affect normal gameplay, but we are definitely experiencing delays in getting players to the new server. We will update this thread if this bug persists. (UPDATE: The server is back up). We're getting some feedback from the forums where you're encountering problems and we've worked with our technical team to resolve the problem. We will let you know whether you're going to get the newer version or the older version. We apologize for the inconvenience, but the server is more important to us than that tiny little lag spike in performance. (UPDATE: The player population was just too low to be on it). If you experience these issues on the old version, we suggest that you take your data to your data providers and ask them to migrate your character over to the new server. Some players in the forums have reported that there are more players in the game than the matchmaking system permits. This is an issue with the matchmaking system. We have been working with the server owner to fix this issue.


You read my mind


What's eye-opening is the need for articles like these to seriously investigate and characterize the distinction from human writing, and then for comments here to discuss it.

Beyond all the AI hype, there are notably more scenarios where the question has to be seriously asked and a detailed answer is informative.

It reminds me of the .com bubble in a way. There is too much hype, misunderstanding of the real current state of things, and bad predictions about the future.

On the flip side, however, something huge is simultaneously happening; it's real. The real non-academic milestones of progress are just smaller, more numerous, and subtler than the headlines.

In this case, it's not whether it's dangerous that's interesting as much as the subtle improvements made and how the march continues forward by degree.


fwiw I pumped 'GPT-2 is not as dangerous as' into writeup.ai and got:

> GPT-2 is not as dangerous as the older version. The second issue is that the newer version of the MSA- 1 does not support the new EFI bootloader. This means that the old MSA- 1 cannot be used to boot the newer MSA-2. The MSA-2 is a newer version of the MSA-1.

Talktotransformer.com thinks:

> GPT-2 is not as dangerous as erythromycin and sulfamethoxazole and should be used with caution during pregnancy.

While this tool is not yet smart enough to comment on this debate, a future version of it might be.

Other good prompts are 'A good pasta sauce begins with'


"hello" generates...

hello_world:

from funcs import funcs_builder from funcs_spec import funcs_spec from funcs_unit import Unit import ui import os import sys import sys.stdin import time import json from scopes import Context , QueryBuilder , StructContext , ContextHandler , QueryBuilderType , ContextType , StringContext def add_from ( builder , context_id ): """ Create a query builder that expects a builder object. """ # Create the context that contains the custom query builder if Context ( builders_context ) is None : context_type = StructContext ( builder ) . type . get () # Define the builder for this context in context_builder ( context_id ): context = builder . get_context ( context_id ) # Create the query builder for the query builder in context . query_builder ( context , context_id ): return builder return "hello_world" class CustomSchema ( QueryBuilder ): querysource = { 'base' : 'world' , 'world_scopes' : [], 'query_builder' : 'hello_world' , } query_builder = CustomSchema () def add_to


I'm probably not qualified to say this, so take it with a grain of salt, but I'd be surprised if OpenAI is concerned about a lone hobbyist; they're much more likely concerned about nation states with access to far more hardware, software, and experts, who can do things that no single person or small group of people could do.


Sounds like exactly the kind of actor that would have the budget and expertise to replicate GPT-2 itself, rather than having to rely on a pretrained model.


This is certainly a reasonable idea, and the OpenAI release statement did include a risk assessment attempting to determine who would be able to replicate their work.

It's sort of hard for me to understand how this led to the decision they made, though. OpenAI's original aim was democratizing access to AI tools so they wouldn't be abused by a handful of players, but releasing a scaled-down model seemed to primarily limit its utility for hobbyists. Assembling and training on a larger corpus is precisely the sort of step that's out of reach for individuals, but extremely approachable for governments or corporations with lots of computing power and expert-hours.

That said, OpenAI also noted that their staged-release model was something of an experiment, so it's not entirely clear to me how much danger they thought GPT-2 actually posed. It seems to have been a useful experiment in gradual release and collaboration with other researchers, and a test of 'inoculating' people via a weak model. GPT-2 isn't a great tool for tricking people or disrupting conversations, but publicizing awareness of AI text generation before turning it loose has probably further decreased that risk.


If you have the resources of a nation state then you already have the resources to push your agenda through traditional media. It's not clear what benefit this type of AI generated content would have over the traditional approach.


Asymmetric information warfare. Flood every channel with misleading and contradicting information beyond the capacity for fact-checking. Make entirely fake social media threads to steer perceptions towards a predetermined direction. Create exaggerated stereotypes of both sides of a debate to inflame and harden opinions.

All of this is already being done right now. The difference is that a robust language model would allow this to be done with far fewer humans involved. Fewer people to raise their hands about how far down the ethical rabbit hole they are willing to go. And a single AI model could react to news much more quickly than a team of people who have to sleep at some point.


That's not asymmetric, any nation could do it. Also, the Reddit farms wouldn't have to sleep, because the writers could work in shifts.


The human writers are constrained by what they think they can get away with. The automated writers don't even work that way. You can see from the examples used in discussions here, the more limited the domain of discourse the more convincing the samples are (except the hello world one that looked like a mash-up of every programming tutorial ever written). I think this thing could do a great job making fake announcements to air travelers about flight delays.

I think the synthetic texts are already convincing enough that the natural desire to impart meaning to writings/utterances will go a long way toward convincing forum-user "victims" that something is being said that needs to be thought about. And here in this discussion we know that the fakes are fakes (except for one comment which is unlabeled and I'm not sure about); in an attack on a forum, the fakes won't be labeled.


The less said, the easier it is for a language model to approximate a useful thread comment for the purposes of mass propaganda.

“These graphics look terrible. I will never play this game.”

“$Candidate is a corporate shill and everyone knows it.”

“I can’t wait for $Artist’s next album! They’re sooooo good!”

Doesn’t need to be an extensive, well thought out comment to drive thought and discourse. GPT2 is good enough for that.


You don't even need a nation state. One person could weaponize this against multiple individuals: train it on a target's social media/blog/articles to mimic their writing style, then use it to start various online accounts as the target, slowly building up followers with content and at some point switching to saying highly controversial, or downright offensive, things to immediately change people's opinion of the individual. Racist/sexist/etc. content, allegations against someone else, fake content about a company/product/service that could cause varying levels of financial damage, and so on.

It's also a new tool for blackhat SEO. Instead of paying people to write crappy content, or running relevant articles through any number of commercially available software packages that rewrite an article just enough that search engines treat it as unique for weeks or even years, you can simply train the software on stacks of books about a given subject and have it churn out semi-intelligible-to-humans articles, letting you build a rather large amount of content to manipulate search engine rankings or even just to create contextual/affiliate ad farms.

If you have the software churn out smaller sets of content and then quickly edit it yourself, a single individual could produce 10-20,000 words of content a day with banners/images/contextual ads inserted. A small blackhat SEO team in a country like Bangladesh or India could offer prices far cheaper than competitors using more traditional methods, produce articles that are more 'natural' in the customer's language, and deliver likely higher quality at a much higher volume.

This is the problem with entities like OpenAI: they're all "AI is great, AI is good, AI is our future savior, yay AI", but are any of them going "well, here are the 16 ways I can think of, off the top of my head, to wildly exploit this technology for personal/corporate/government gain"? AI doesn't have to be SkyNet or robotic killing drones to be exploited; an individual can benefit considerably (and cause considerable hardship for an entity) with stuff like this. Who at places like OpenAI is asking these questions? Where are the employees/advisors/consultants who look at each project and offer real-time feedback on ways to abuse the project in its current and near-future states?


My understanding was that the main issue is this would make spam too hard to detect.

There is also the question of whether we are overanalyzing the output of the model. The model pretty much spits out garbage, but our brains are hell-bent on finding patterns in that stream, because that's what brains do. A simple analysis would probably show that it's no more meaningful than random words.


Like other chatterbots, it doesn't really know what it's talking about, and coherence between sentences is poor. There's no internal model beyond likely words.

The next stage is something that takes an outline and cranks out a paper or speech to fit. There are specialized systems like that for sports reporting.


Could we talk about how horrible Tumblr's "privacy form" is? It is literally impossible to decline.


You can see how GPT-2 works on a social media site in the "SubSimulatorGPT2" subreddit, where every post and comment is generated by it (though probably just the smaller model). Sometimes it's funny, sometimes it's scary:

https://www.reddit.com/r/SubSimulatorGPT2/


Anyone claiming GPT-2 not to be dangerous is very uninformed about humans.

If you can generate text and find a way to measure its persuasive effectiveness (which is not hard in 2019), it will be used to push an agenda. And given enough time, it will do so with hypnotic levels of persuasion.

The claim it's not dangerous is absolutely missing the mark.


Did anyone else read that whole article waiting for the reveal that it was generated by GPT-2?

Ok fine... I skimmed it


Just a quick question: which graphics card should I use to run the 774M version? I tried running the 355M version on an RTX 2070 and it went out of memory.


Haven't tested myself, but the minimum for current SOTA NLP models is 1-2x Titan RTX.
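If the out-of-memory error was during fine-tuning rather than plain generation, a rough back-of-envelope estimate shows why 8 GB falls short (assuming fp32 weights and a standard Adam optimizer, and ignoring activations):

  params = 774e6
  print(params * 4 / 1e9)       # ~3.1 GB just for fp32 weights (generation is roughly this plus activations)
  print(params * 4 * 4 / 1e9)   # ~12.4 GB for weights + gradients + two Adam moments when fine-tuning

Which is roughly why people point at 24 GB cards like the Titan RTX, or at smaller models / gradient checkpointing.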


This is an article that the author is going to regret in five to ten years, sadly.

Basic claims:

1. Long form text coherence problems in GPT-2 mean it's not useful for creating propaganda.

2. No propaganda has been noticed in the wild since GPT-2 was announced, hence GPT-2 or its near successors are not dangerous.

3. Existing alt-right propaganda (presumed written by humans) is already plenty effective, who needs better written propaganda?

Maybe it's enough just to write down the logical premises, but I'll say what I think (and we can check back here in 10 years and see who's more correct; I hope it's the author) -- anything that changes the cost of information creation and dissemination is fundamentally a highly powerful force.

In A16Z terms, software looks like it's going to eat writing, at least certain forms of writing. We know from numerous examples that this means radically faster innovation and cheaper scale, with wealth creation and destruction likely not far behind.

Compounding the problem for threat assessment is the toupée fallacy -- only obviously poorly generated text is noticeable as "AI generated". I would urge anyone thinking about these things to flat-out disregard any statements that AI-generated text is not in the wild, or that if it is, it is not effective. You literally have no way of knowing whether you have read AI-generated text -- in fact, the examples curated from existing models suggest curated text can be of high enough quality that it is likely being used online in some way today.

It's not going to get harder to generate text that reads well. It's not going to get more expensive. It's not going to get slower. Architectures that tune text creation to create clickbait titles with underlying goals are going to get worked on and thought about and tested.

Whether it will be more typically a nation state tool, corporate tool or instead the equivalent of a molotov cocktail - cheap digital force extension for renegades or infoterrorists -- this isn't clear yet. But my money would be on all three, sadly.

On to solutions - comparatively little science-fiction thought work has been done about this that I'm aware of. Vernor Vinge speculated about broad-scale disinformation as a public service in Rainbows End -- The Friends of Privacy -- and at some level projects like Xanadu, and more recently advertisement-attribution (blockchain or not) tech startups, are all working on this from different angles.

To my mind, these questions come from a part of the Internet's architecture - text isn't generally signed or strongly attributed, and if it were, we don't really have a solid global identity infrastructure - hence a lot of worrying and hand wringing.

I'm trying to invest in stuff that works on this identity layer specifically, but honestly it's an immense problem with few compelling stories about how things could change.


> To my mind, these questions come from a part of the Internet's architecture - text isn't generally signed or strongly attributed, and if it were, we don't really have a solid global identity infrastructure - hence a lot of worrying and hand wringing.

You're way overthinking this. Any website with an SSL cert is a "channel" and if it doesn't publish AI generated text, then people can go there to get pure human generated nonsense instead of machine generated nonsense. No new infrastructures are required: in fact we already have many such websites, like newspapers or blogs.

Also, really, listen to yourself. "Infoterrorist"? What is an infoterrorist? You're making up meaningless new words on the fly to try and create a general sense of unease in your reader. Indeed you're trying to make people fear speech, which could itself be described as alt-left propaganda. Does it matter? No not really. You're just a guy posting on the internet, as am I. Spambots have existed since the start of the internet. The content is what matters, not where it came from.


> hence GPT-2 or its near successors are not dangerous

FWIW, my post that's linked here doesn't claim that GPT-2 won't have dangerous successors, just that GPT-2 itself is not very dangerous, and that this doesn't vary much with the model size. The point is not about the five-to-ten-year trajectory of natural language generation research, which could go all sorts of unexpected places -- indeed, GPT-2 itself was such an "unexpected place" from the perspective of a few years ago. The point is a narrower one about OpenAI's staged release plan, which assumed different levels of risk, or at least different prior distributions over risk, for different sizes of the same model.

I agree that

"Architectures that tune text creation to create clickbait titles with underlying goals are going to get worked on and thought about and tested"

but that is a distinct discussion, one about future research milestones enabled by the relative success of generation by sampling from transformer LMs. My claim is that significant further milestones will be necessary (not that they won't happen); there are big and relevant limitations on sampling from currently existing transformer LMs, and simply making the transformer LMs bigger does not remove these. In particular, these limitations make it unlikely to be cost-effective to make clickbait or propaganda by curating and cleaning samples from such a model.

If you doubt this, I'd encourage you to actually go through the exercise: imagine you want to generate a lot of text of some sort for a specific real-world purpose that would otherwise require human writers, and try out some strategy of sampling, curation, and (optionally) fine-tuning with one or more of the available GPT-2 sizes. If your experience is like mine, you'll draw the same conclusion I've drawn. If not, the results would be fascinating and I would honestly love to see them written up (and OpenAI might as well).
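If it helps anyone actually try this, the bare-bones "sample, then curate" loop looks something like the following (a sketch using the Hugging Face transformers package rather than OpenAI's original code; the final filter is a deliberately silly stand-in for the human curation step, which is where all the real cost lives):

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tok = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

  def sample(prompt, max_new_tokens=60, top_k=40):
      ids = tok.encode(prompt, return_tensors="pt")
      with torch.no_grad():
          for _ in range(max_new_tokens):
              logits = model(ids)[0][0, -1]               # next-token logits
              top = torch.topk(logits, top_k)             # crude top-k sampling
              probs = torch.softmax(top.values, dim=-1)
              next_id = top.indices[torch.multinomial(probs, 1)]
              ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
      return tok.decode(ids[0])

  candidates = [sample("The city council voted on Tuesday to") for _ in range(20)]
  # Toy "curation": keep only samples with some lexical variety and no paragraph break.
  keepers = [c for c in candidates if "\n\n" not in c and len(set(c.split())) > 30]

The generation part is cheap; deciding which of the candidates is actually usable for your purpose is the part that stubbornly stays manual.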


Neal Stephenson addresses this problem and solutions to it, among other topics, really well in his newest novel "Fall, or Dodge in Hell".


I disagree. You can have software spit out these meh articles, then pay a room full of people with the writing ability of an average 15-year-old to go line by line rewriting the content in their own 'voice', and easily churn out massive amounts of content based on whatever you trained it on.

For over a decade, individuals have used software to take already-written content and change it just enough to fool search engines for SEO purposes; it's been an effective tactic despite the articles often being largely unintelligible to a human. If you can make something moderately intelligible to a human about a given area and have some minimum-wage employee 'personalize' it, BAM: you can turn a room of 10, 20, 100 people into a hardcore content machine.

I've suggested to Altman before that, given a sufficient body of work, you could churn out fiction in the style of famous deceased authors by having the machine do the bulk of the work then having a small team go in and edit the work to make it fully coherent and an enjoyable read.

With someone like me, that uses their name as their username virtually everywhere, you could sufficiently train the machine on my reddit and blog alone to imitate me on social media platforms. It could learn my writing style, my habits of using 'heh' and 'haha' way too much on reddit/twitter/facebook and you suddenly create Bizarro Ryan that you can create new social media accounts for and start tossing in some hate speech in an anti-me campaign. While this wouldn't do much to me, to a celebrity/politician/expert in a field it could absolutely ruin their career, even if later proven to have been faked because popular opinion will still associate that person with that undesirable behavior.

While pursuing this technology would be amazing for creating new literary works from people like Verne, Heinlein, Burroughs (your favorite authors here), I can weaponize it RIGHT NOW.

This is the problem with entities like OpenAI: they're all "AI is great, AI is good, AI is our future savior, yay AI", but are any of them going "well, here are the 16 ways I can think of, off the top of my head, to wildly exploit this technology for personal/corporate/government gain"? AI doesn't have to be SkyNet or robotic killing drones to be exploited; an individual can benefit considerably (and cause considerable hardship for an entity) with stuff like this. Who at places like OpenAI is asking these questions? Where are the employees/advisors/consultants who look at each project and offer real-time feedback on ways to abuse the project in its current and near-future states?

Maybe they have someone, but methinks they don't.


> given a sufficient body of work, you could churn out fiction in the style of famous deceased authors by having the machine do the bulk of the work then having a small team go in and edit the work to make it fully coherent and an enjoyable read.

This is interesting but very dubious in my opinion. The current state of the art tech seems to be good at low-level stuff, like stylistic mimicry and maintaining (relative) coherence at the sentence level (and sometimes the paragraph level). It seems weaker at higher-level coherence, and I've seen no evidence that it would be capable of creating a book-length, or even short-story-length, work with a plot that made any sense (let alone a compelling one) or characters that are plausible (let alone interesting). If it does fail at those things, what are you supposed to do with the okay-in-isolation fragments that it spits out? You'd be lucky if they could be stitched together into anything worthwhile, even with a lot of human effort.

> With someone like me, that uses their name as their username virtually everywhere, you could sufficiently train the machine on my reddit and blog alone to imitate me on social media platforms. It could learn my writing style, my habits of using 'heh' and 'haha' way too much on reddit/twitter/facebook and you suddenly create Bizarro Ryan that you can create new social media accounts for and start tossing in some hate speech in an anti-me campaign. While this wouldn't do much to me, to a celebrity/politician/expert in a field it could absolutely ruin their career, even if later proven to have been faked because popular opinion will still associate that person with that undesirable behavior.

If someone wanted to target an individual, or a small number of people, couldn't they already do this manually? And if they wanted to target a huge number of people, surely they would very quickly burn the credibility of the platforms they hijacked.


>If someone wanted to target an individual, or a small number of people, couldn't they already do this manually? And if they wanted to target a huge number of people, surely they would very quickly burn the credibility of the platforms they hijacked.

Yes, but you can add incredible amounts of credibility to claims if you've used AI to create a bunch of deep faked images of completely artificial people, populated social media profiles, had AI create photos of these individuals together in random settings, create a network of these accounts that follow each other as well as real people/are friends with each other and real people, and organically feed claims out.

This sort of AI use, for faking images/video/audio/text, makes this much, much easier to do with more believability, and makes it considerably easier to scale for personal use or for hiring out.

You can already go on various darknet markets and hire various harassment services.

You can already go on various websites and order blackhat SEO that uses very 'dumb' software to generate content to spam to obviously fake social media accounts, blog posts, etc for SEO purposes - there are dozens and dozens of VPS services that rent you a VPS with gobs and gobs of commercial software pre-installed (with valid licenses) specifically intended for these uses and if you'd rather just farm it out there are hundreds of providers on these forums that sell packages where you provide minimal information and in days or weeks they deliver you a list of all of the links of content they've created and posted.

With stuff like GPT-2 you suddenly get more coherent sentences, tweets, short blog posts, reviews, etc., trained on a specific demographic in their native language, instead of text written by an English-as-a-third-language individual and then reworded by software to pass Copyscape protection. Pair it with deepfaked images/video that you then add popular social media filters to, and you can suddenly create much more believable social media presences that don't scream 'BOT', because it isn't a picture of an attractive woman in a bikini with the name Donald Smith that's only friends with women in bikinis with names like "Abdul Yussef" "Greg Brady" "Stephanie Greg" "Tiffany London", the kind you constantly see sending people friend requests on fb or following you on twitter/instagram because you used #love in a post.

Software applications like this, make the process much easier to do with a higher level of believability. Humans, without knowing, are often decent at detecting bullshit when they read a review or a comment. Inconsistent slang or regional phrasing, grammar that feels wrong but not necessarily artificial (English as a second language for a German speaker for example, where it might be something like "What are you called?" instead of "What is your name?" or more subtle like "What do you call it?" instead of "What's it called"?) which can be defeated with AI that is trained on tweets/blog posts/instagram posts that someone scrapes of 18-23 year old middle class women, or 30-60 year old white male gun owners, or 21-45 year old British working class males.

The whole point of AI is to make some tasks easier by automating them. When you're dealing with AI that mimics images/video/speech, you're just making it far easier for individuals who already employ these tactics manually (or with 'dumb' software) to scale them up and increase their efficacy.


It was a great PR stunt though


“It is less good at larger-scale structure, like maintaining a consistent topic or (especially) making a structured argument with sub-parts larger than a few sentences.”

Have you surfed Facebook recently? Shockingly few people are good at these either. The quality of actual human writing on social media is so bad that today’s neural networks can be of equal quality, which is very dangerous if used to make certain ideas seem more widely held than they are. After all, our morals/ethics are basically set by what we observe in the world around us (compare your views on torture, public execution, slavery, etc. to what they would be if a baby with your exact DNA had been born in Ancient Babylon).



