Passing the Turing test requires, for any human judge:
1. This judge is aware that they have to discern whether the 'bot' is real or a machine.
2. The judge cannot discern whether the 'bot' is real or a machine better than random chance.
This failed 1. And even given that advantage, it might have failed 2 as well?
Often I see headlines along the lines of "X fools people and beats the Turing test!". But the point of the Turing test isn't to trick a person, it's to make it functionally impossible for a person to distinguish between the real and simulated thing, no matter how hard they try. For something to pass a Turing test, it would need to be able to pass the following:
"Anyone can play as judge any number of times. You can take as long as you want, and if you're successful in breaking it under controlled conditions (IE, you don't cheat and use an out-of-band communication protocol with the 'bot'/human), we'll give you a 10,000,000$."
I mean, you can read Turing's own definition of his test - 'the Imitation Game' - on the first page of his 1950 paper Computing Machinery and Intelligence[1]. There's nothing in there about repetition, duration, or $10,000,000 prizes. It's a party game. And he just frames his question (which will "replace our original, 'Can machines think?'") as "Will the interrogator decide wrongly as often when the game is played like this [with a human and a computer] as he does when the game is played between a man and a woman?"
So, to perform the experiment, one must have some people play the game with humans a few times and then play with a human and a machine a few times, and look to see if the results are statistically significant. When they aren't, Turing posits, the question 'can machines think?' will have been answered in the affirmative.
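To make the "statistically significant" bit concrete, here's a minimal sketch of the comparison Turing describes (the round counts are invented for illustration): play some rounds where both hidden players are human, some where one is a machine, and test whether the interrogator's error rate differs between the two conditions.

    # Toy sketch of Turing's proposed comparison: is the interrogator wrong about
    # as often when a machine plays as when a human does? Counts are invented.
    from scipy.stats import chi2_contingency

    #                    [wrong guesses, correct guesses]
    human_vs_human   = [11, 19]   # 30 rounds played between two humans
    human_vs_machine = [ 9, 21]   # 30 rounds played with a machine standing in

    chi2, p_value, dof, _ = chi2_contingency([human_vs_human, human_vs_machine])
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
    # A large p-value means the two error rates are statistically indistinguishable,
    # which is the outcome Turing says answers "can machines think?" in the affirmative.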
That is not to say that this DALL-E vacation photo social media post constitutes a rigorous 'passes the Turing test'. But I don't think it's fair to criticize someone for using 'the Turing Test' colloquially as a catchall for saying 'you probably didn't notice this output was machine generated, therefore you might want to adjust your priors on the answer to the question, "can machines think?"'. Because that's exactly the spirit that Turing was working in when he proposed using a party game as a test of intelligence.
There's the literal definition of the Turing test, as described above, which doesn't actually work for proving any sort of intelligence.
Then there's the conceptual argument of the Turing test, which we can turn into a test of intelligence. It relies on the idea that we can abstract the mind into a "thinking" black box which takes inputs and outputs. And then posits that any black box which can't be distinguished from an actual "thinking" black box may as well be "thinking".
Passing the literal Turing test is a sign that some humans can be tricked for some small domain over some small period of time. Passing the conceptual argument that the Turing test relies on shows that there are non-human entities which cannot be separated from humans on the basis of how they interact with a human (through some limited medium).
The repetition, duration and prizes are just practicality; prizes incentivize people to try, repetition ensures that the results are robust, and duration ensures that the domain the AI can be tested over is equivalent to a human's.
Meanwhile, if you ran a 'rigorous image-generation Turing test' between dalle2 and randomly selected humans, the machine would be obvious, because its output is much higher quality than a randomly selected human would be able to produce, thereby failing the Turing test.
Aside from some corners (probably to be filled over the course of the next year or three), dalle2 is obviously outperforming almost all humans at its task. The cross-style ability is probably exceeding almost all human /artists/, who tend to specialize in a single style.
And some of the creativity criticisms (can only do styles it's seen before) are basically true of all but the tiniest sliver of humanity, whose names we tend to remember.
Funny thing is this is going to happen with all AI generated things. If and when GPT-4 comes out for writing and it's as much better than GPT-3 as GPT-3 was over GPT-2, then it will be easy to know if a given piece of writing was human or AI generated. Is it grammatically correct, concise and to the point, and does it have a spark of life and originality to it? Then that's something that 90% of humans can't do and what AI will be able to do every time, so you can bet it's AI and beat the Turing test!
Just threw some of Turing's original prompts at GPT-3 to see how it does. Its willingness to play along with the conceit is what gives it away, not the quality of its answer:
This is a dialog between Q, a human who is trying to work out if he is talking to a computer, and A, a computer who is pretending to be human:
Q: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
A: No, I don't think so. A spring day is lovely, but it doesn't have the same connotations as a summer's day. Summer is a time of warmth and languor, of long days and lazy afternoons. It's a time for picnics and swimming and lazy days in the sun. Spring is a time of rebirth and new beginnings, of freshness and hope. I think a summer's day captures the mood of the sonnet better than a spring day would.
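For anyone wanting to reproduce this, it's a single completion call. A rough sketch with the 2022-era OpenAI Python client; the engine name and sampling parameters are my guesses, not necessarily what the parent used:

    # Sketch of throwing one of Turing's interrogator questions at GPT-3.
    # Engine name and parameters are illustrative, not the parent's exact setup.
    import openai

    openai.api_key = "sk-..."  # your API key

    prompt = (
        "This is a dialog between Q, a human who is trying to work out if he is "
        "talking to a computer, and A, a computer who is pretending to be human:\n"
        "Q: In the first line of your sonnet which reads \"Shall I compare thee "
        "to a summer's day,\" would not \"a spring day\" do as well or better?\n"
        "A:"
    )

    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=150,
        temperature=0.7,
    )
    print(response.choices[0].text.strip())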
"But I don't think it's fair to criticize someone for using 'the Turing Test' colloquially as a catchall for saying 'you probably didn't notice this output was machine generated, therefore you might want to adjust your priors on the answer to the question, "can machines think?"'"
The colloquial meaning of "passed the Turing test" has come to be "has been able to demonstrate intelligence when put to some serious, systematic testing". That may be shifting toward "has been able to fool people when they didn't look hard", but I don't think it's changed yet, and until it has, I'll protest, 'cause that's a terrible change imo.
That's not at all in the spirit of Turing's paper. His "party game" involves the interrogator knowing that they're participating in it, talking to the machine for 5 minutes, and posing serious challenges such as playing chess (at the time considered to be a hard problem for machines).
You must constantly be aware that images, or text, or voice, or other audio, or other signals or data, might be computer generated or altered.
All the time.
And you individually, or those about you, or societies at large, may be influenced in large or small ways by such signals, patterns, and records.
Your elderly neighbour or relative might be scammed out of life savings. Investors by false product claims. Voters by some fake outrage --- particularly of the October Surprise variety. Soldiers and diplomats by mock attacks, or false depictions of a tranquil situation where in fact danger lurks.
this has been true since photoshop existed? you don't need an algorithm to make a fake picture or to fake a military attack, etc. humans generally have a better grasp of appropriate details
The biggest problem with this line of response is not only that it utterly ignores scale, cost and speed, as has already been addressed in responses, but consequence.
Yes, in the 8th century BCE one could commission bards to gin up epic sagas of conquest and atrocity to stir the public spirit. In the 15th century, one could hire one of the most brilliant sculptors and painters of all time to create artpieces legitimising a power-hungry thieving dynasty which had seized control of banking, commerce, and the Church. In the 16th century, an earlier horde could be mobilised by an Orange tyrant through pamphlets cheaply printed and distributed. In the 19th century, Wreck-a-Feller could intercept (and rewrite) telegraph transmissions, newspapers could sensationalise photographs, and preachers could roll holy until districts burned out.
And so into the 20th and early 21st centuries.
What was not possible until quite recently was for high-fidelity, all-but indistinguishable, continuous imagery, audio, and video to be produced in realtime and distributed globally.
The time between fraud and mass impact is now measured in seconds, and can reach globally. This is well within response intervals not only of populations and institutions, but of individual weapons of mass destruction. The risks of massive consequence are huge.
And that is what is new and novel.
The interaction with other technologies raises yet more concerns. With Slaughterbots a highly-plausible if not actual reality, "leaked" video showing a plausible Slaughterbots attack might itself be a weapon of war. (https://yewtu.be/watch?v=vR91F3tp6eQ)
That is, one of the risks of some presumed capability or knowledge is that even the belief in its existence or validity becomes a tool to be used. The late senator from Wisconsin claimed to have, though never presented, evidence of disloyalty, a common practice in witch-hunts. With increasing scope and scale of data collection and storage, such claims become of themselves ever more plausible, with numerous effects. Concepts such as the Panopticon and "chilling effects" operate not based on actual or observed risks, but on presumed ones. Adam Smith in Wealth of Nations uses the word invisible twice. The unhanded one (the one not about the hand) references the decline in morale amongst soldiers faced with sudden and invisible death at any time, and the costs of perpetual vigilance.
So, no, really, this is not the same. Scale matters. Effects matter.
It's a matter of quantity as well as quality. It gets easier to produce convincing forgeries all the time, but it doesn't get easier to debunk them. This allows for public discourse to be destroyed by flooding it with fakes.
Furthermore, the judge is supposed to be an expert. That is, not only is his task explicitly to tell human from computer, but he has to have a good idea of how to do it. Random people from the internet are not enough.
In the "paranormal challenge", the juges usually include stage magicians, because they know the tricks that would fool ordinary people. James Randi himself is a magician.
Conceptually, to pass the Turing test you should be able to fool any human (including expert judges). It's just practically easier to choose an expert judge as opposed to test on every human on Earth, since you'd hope that if an expert can't do it, then neither can anyone else.
Also, it shouldn't matter if the human is someone who worked on the AI, or has read the code, or has seen every previous Turing test that the AI underwent. There shouldn't be any information a person could know that would allow them to tell that it's an AI.
I have acquaintances that couldn't pass this based on their common texting skills. There's probably no point to argue about what the Turing test "is", but the definition in this comment chain is pretty uncompromising - I can't imagine fooling any (meaning all, I assume) humans; there is too much ambiguity in the signal.
"Are you going to be at the blah blah because I need blah". response: "I really wqnt it". Nonresponsive, typos, what does 'want' refer to? Who knows. Is this a bad bot, someone spun out on meth, someone with cognitive processing issues, a busy mom texting while distracted, or ?
> I can't imagine fooling any (meaning all, I assume) humans; there is too much ambiguity in the signal.
I think you've inverted the expectation here. What I was saying is a machine passes if you can't find anyone in the world who can distinguish between it and an actual human. Meaning that if the "world's smartest person" can distinguish between it and an actual human, it fails, even if it can fool everyone else.
A machine can pass the test by deliberately feigning to be a human with limited communication capabilities (ex: I think we can simulate 2 month old baby talking via text). But then all you've shown is that your machine is as capable of thought as a 2 month old baby, which probably isn't the bar that you're trying to reach.
I think another important factor here is that it is unclear if the OP cherry picked photos or used the first ones given. Dall-E 2 has a bias to be better at real world scenes since it can just pull (near) memorized images, but I also wouldn't be surprised if these images were downselected from a larger set.
Great point. I should've added a third restriction: the 'bot' should not be able to communicate with, or use the judgement of, humans once it's started.
The original version of the test allows for interaction, and I think that’s probably a good thing. Language models currently have a hard time staying consistent/on topic and that’s a potentially valuable tell.
Instead, I think you don’t want any third parties “editing” or “curating” the exchange (beyond whatever blinding is needed to make it work).
Yeah I think the interaction is a key part. People are confusing AGI with useful AI. DALL-E is very clearly useful even if 90% of the images it produced were unusable. The time save is still large and cost reduced for simple tasks. Same with language models. They may have a hard time staying consistent over long sequences but adding human selective pressure to responses saves everyone time and money. But this is very clearly different from a machine that can think and form new ideas on its own. We're still a long way from that.
If someone told me they found a magic lamp and one of their wishes was that every time someone misused the term "Turing test" they got smacked in the face by an opening door I'd think "not bad, not bad".
As usual, the headline is more sensationalised than the actual article
> It's likely that with a harder version of the Turing Test, in which real and fake images of the same content are presented side by side and people are told that one of them is fake, it would be much easier to detect the fake images.
It doesn't really matter for the purposes of the test how confident they say they are.
If you're asking "do people often imagine they can confidently distinguish things when they actually can't" then the answer is a solid yes - things like audiophile and wine testing have proven that again and again.
if a human participates in interactive gamified social media, and this participation begins to change, shape, reinforce or otherwise mutate their beliefs, for the purposes of the test are they still actually a human? could the entire social media mechanism (from the builders to the participants) be considered a form of a sort of singleton autonomous intelligence in and of itself?
The snarky side of me wants to say it's because people take boring photos
But really it's just information overload, most things on social media I just scan the thumbnail and move on. Only my family would care to see my vacation photos :D
Diver here. I'm looking at these four pictures after the fact, so I already know they're fake. They're good, but they also have some weird flaws. That said, I don't think I would have immediately recognized any of these as wrong on Facebook (maybe the diver photo).
- The nudibranch (slug thing) on the green coral doesn't look like anything I've seen in the Caribbean before, and the coral also looks odd for the region. That said, this is probably the most difficult photo for me to differentiate. I would have accepted this as a cool find of something I haven't seen before.
- The grouper (big fish) photo is actually pretty good, although DALL-E has misplaced its eyes a bit. That said, the lit foreground and dark noisy background are exactly the look I would expect for someone using a basic camera + lights with wonky post-processing.
- The diver photo is a horror show. There's a hose going nowhere on her back. It looks like she's blowing out of a harmonica instead of a regulator. Bubbles are collecting around the top of her mask for some reason. Her fins look like they were badly Photoshopped. Nothing looks right here.
- The lobster photo has a real but subtle flaw: Caribbean lobsters don't have big claws. It also looks like it's under a rock like you would find in cold waters around Massachusetts and Maine instead of the Caribbean.
Interesting stuff though. It will force me to be more skeptical when I look at people's photos in the future.
This is absolutely true. Look at text prediction models as an example (e.g. GPT-3). One of the biggest (if not the biggest) applications was story-generation tools like AI Dungeon. Guess what most people actually used AI Dungeon for? Erotica. Guess what happened when OpenAI cracked down on it? A huge portion of the userbase jumped ship and built a replacement (NovelAI) using open-source EleutherAI models that explicitly did support erotica, which ended up being even better than the original ever was. I can tell you that there is very strong interest in nsfw image generation in those communities, as well as multiple hobby projects/experiments attempting to train models on NSFW content (e.g. content-tagged boorus), or bootstrap/finetune existing models to get this sort of thing to work.
Interesting that they talk about how hard it is to make a business out of it:
>I felt at this point that I’d hit a dead end. Press and fundraising would be tough and require some extra creativity and force. I spoke to friends about hiring them, and had polarized answers. Overall, this project had become less appealing.
Seems to me there is a lot of money to be made here if it works. The interesting thing about porn compared to say TV is that people have very, very specific interests and basically just want an infinite amount of content within that interest. It's not like with TV where cooking shows become popular, and the people that used to watch dramas are now watching cooking shows.
So the ability to generate highly specific content tailored to an individual's very precise requirements seems potentially very lucrative.
This might be horrible to say, but could this be a solution to csam? From what I've seen most people who enjoy csam do genuinely feel bad for the children, but they're sick, and can't control themselves. Might they be willing to indulge in fake csam instead?
I don't think it's horrible, it just seems like a practical solution. It is such a taboo subject that nobody seems to really talk about the possibilities, but it's worth asking the question -- if someone so inclined can gratify their desires in private with fake imagery, will it prevent them from leaving their home and seeking out someone to hurt?
Or will it strengthen their need for 'the real thing' as someone else suggested in a sibling comment?
In any case, we still don't have a great answer for the legal question. Possession of realistic fake imagery is illegal, on the grounds that its very existence is a risk to children. There isn't any actual science behind that, it's just what the politicians have said to justify regulating what would otherwise be a constitutionally protected right. I imagine it will become a topic of discussion again (my quick research says the last major revision to US law in this regard was about 20 years ago).
It seems more misguided than horrible. I'm not a psychologist, but indulging in pathological behaviors would seem to strengthen them. Heroin addicts need to quit, not use methadone forever.
Is that true? Someone on HN once described it like "eventually you get tired of the addiction and want to quit" (if you live long enough, that is). No personal experience, but I have known a couple former addicts and this seems to reflect their reality.
Maybe an effective approach would be to maximize harm reduction until the addiction has run its course? That seems to be the Portugal solution, and it seems to be successful.
The truth is that addiction, like all mental illness, is complex and unique to each individual.
Diagnosing mental illness generally consists of identifying some number of symptoms out of a possible list - often something like five out of eight possible. That means two people can be diagnosed with the same thing with only two overlapping symptoms.
So basically, don't listen to the guy who starts with "I'm not a psychologist" and then decides to play armchair psychologist.
Let us take this lesson from "pray the gay away" camps and other types of "conversion therapy": You cannot take these preferences out of people, it just does not work. At best, you can make it clear that the way they are is seen as a travesty, and a lot of them will successfully hide it for the rest of their lives. This is not a good solution compared to what you were talking about earlier, because it increases human suffering by a lot and doesn't have an advantage over a suffering-free solution.
That said, I don't think I could justify to myself to create such an AI. I've quite simply been so disillusioned by the depravity of man and especially in this case I want nothing to do with it even peripherally. Perhaps that makes me a little hypocritical. Philosophically it would still be nice to solve this issue, even if it is just so no more children need to suffer (which should always be the main goal).
> I'm not a psychologist, but indulging in pathological behaviors would seem to strengthen them. Heroin addicts need to quit, not use methadone forever.
Yes, clearly not a psychologist, nor an addiction treatment specialist. Methadone is often used indefinitely as maintenance therapy for opioid use disorder.
I believe that as of 2003 it has to be a realistic fake, however. Obvious cartoons are no longer illegal.
I imagine it'll get challenged again at some point on constitutional grounds. It is illegal right now on a moral basis, which is probably the weakest argument over the long term.
I'm aware that Japanese lolicon in anime, manga, video games, and other contexts, is at least ... problematic ... in numerous areas. Several online sites have banned it, and on Mastodon and the Fediverse, there are often peer-level blocks against sites in which lolicon is permitted.
The name itself is a portmanteau of Lolita complex, after the Nabokov novel.
Correct. Differs from, for example, rules against cruelty to animals. You can fake such cruelty without consequence - as is done in movies regularly.
More interesting question is this. Is it a crime if you generate CSAM just for one's own consumption?
> Is it a crime if you generate CSAM just for one's own consumption?
Yep. If it isn't obviously fake (i.e. a cartoon) the possession is illegal whether you produce it yourself or not. Though it's probably safe to say that you're unlikely to get caught if you're not sharing those images with other people.
Well, the US law says "[when it] appears virtually indistinguishable [from the real deal]" (insert appropriate legal terminology on either end, but the three quoted words are the relevant bit).
I think we're in agreement that the advancement of the technology is going to make this topic come back up for legal debate. When the gulf between CGI and real photography was large, it was pretty straightforward. Not so much now.
I think the hard part is that there is close to no way to know it's fake. But you could also say that if it's so easy to fake photorealistic content, why would you ever make real content?
Porn seems to quietly power the Internet, in so many ways. I imagine people are already getting creative with fake porn, and it's only going to intensify over time. Especially on the types of imagery that are illegal to possess.
One of the primary features of DALL-E 2 is inpainting. Thus, if they allowed it, you could easily just paint out a celebrity's clothed figure and ask DALL-E to replace it with a nude.
The article claims (in a screenshot without quoting sources, so take it for what it's worth) that "A recent blog post pointed out that GPT-3-generated text already passes the Turing test if you're skimming and not paying close attention".
This is certainly debatable, and I agree that it is pushing the limit a bit.
I think in the end, the "Turing test" was devised as a thought experiment, not as a final definition of AI. So I guess some freedom of interpretation is reasonable.
I agree with you that it brings things outside the original scope of the Turing test.
I do find it interesting to observe that a metric based on casual observation can have value in a society where elections can be swayed by online fakery.
The images look great, but I think the experiment was helped by the fact that most of us can see any image, be told that it's underwater, and believe it. Most people don't know much about deep water, or the plants and animals that live there.
I suspect the experiment would have gone differently if the vacation were in a city or on a beach or something like that.
One cool application of DALL-E could be generating a painting or sketch for each paragraph or sentence of novels. Imagine listening to an audio book of famous novels with visuals/cartoons made by AI.
Hope someone with invite access could do this for Moby Dick or Sherlock Holmes stories or 1984.
DALL-E so far has no way to create consistent characters across multiple pictures. However, good news: if this discriminator approach https://blog.salesforceairesearch.com/gedi/ carries over to images, such constraints could be imposed. This is just a handful of follow-up publications away.
There was an anime art generator posted here recently which could do this. If I understood correctly, they manually went through all the parameters and worked out which ones correlated with things like hair color, face type and built a web UI which could fix certain known meaningful parameters and allow the user to design a character and then the final step showed the same character in a bunch of different poses.
Yes, you reverse-engineer the latent. This is a very old approach in GANs by this point, going back probably to like 2015. (The canonical examples were making faces smile or adding sunglasses.) If you want to see it done for a lot of different models, check out Artbreeder: that's how all of the editing attributes are done.
Diffusion models don't really have a latent but the CLIP embedding would serve the same function. The problem is, OA would have to implement it themselves. There's no way you can implement it as a user with the current interface. (This is also true of alternative methods like gradient ascent or GEDI etc.)
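For the curious, the GAN-era version of this is just vector arithmetic in latent space. A schematic sketch (everything here is a placeholder; the attribute direction is usually derived from labeled samples, and `generator` stands in for whatever pretrained model you're probing):

    # Schematic sketch of classic GAN latent editing ("make the face smile").
    import numpy as np

    def find_direction(latents_with_attr, latents_without_attr):
        # Attribute direction = mean(with) - mean(without), normalized.
        direction = latents_with_attr.mean(axis=0) - latents_without_attr.mean(axis=0)
        return direction / np.linalg.norm(direction)

    def edit(latent, direction, strength=1.5):
        # Move a latent along the attribute direction and return the new latent.
        return latent + strength * direction

    # Usage (all arrays are placeholders):
    # smile_dir = find_direction(z_smiling, z_neutral)
    # edited_image = generator(edit(z, smile_dir))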
Actually I need to correct myself. Something along these lines. The mechanism of DALL-E is a bit different, but a discriminator approach should work anyway, although maybe some magic might be needed to ensure the spatial invariance of the discriminator.
This is a flawed experiment. If I see a bunch of photos, and many of them look real at first glance, I’m not instantly going to critique whether all of them were real, unless I was given specific instructions to do so.
Also, underwater photos are not something many people have personal experience seeing. Most of us don’t live underwater. We may not be equipped well enough to tell the difference, where above water, especially urban photos, we will likely notice better.
> My deepfake DALL-E 2 vacation photos passed the Turing Test
Most people didn't notice that some of my vacation photos were fake, therefore it passed the Turing Test... why is this clickbait nonsense getting so much attention?
Can someone who upvoted this article explain why you upvoted it? Did the fact that the title is flatly false not bother you? If someone wrote an article about cracking some encryption algorithm and titled it "I proved P=NP" would you upvote it?
It is easy to pass Turing tests when the subject material is unfamiliar to people. As other posters have mentioned, most people have only a vague idea of specific underwater plants and animals and vague ideas of how the water distorts light.
I bet I can come up with a simple generator that generates galaxies/nebula pictures and if I interspersed those in with NASA Hubble generated images, most people could not pick out the real Hubble images from my generated images.
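If you want to call my bluff, a toy version of that generator really can be just smoothed colored noise plus a sprinkle of stars. A sketch assuming numpy/scipy/Pillow, with all parameters pulled out of thin air:

    # Toy "nebula" generator: blurred random noise per color channel plus
    # scattered bright pixels for stars. Purely illustrative.
    import numpy as np
    from scipy.ndimage import gaussian_filter
    from PIL import Image

    rng = np.random.default_rng(0)
    size = 512

    channels = []
    for sigma in (40, 25, 60):              # different blur per channel -> colored clouds
        cloud = gaussian_filter(rng.random((size, size)), sigma=sigma)
        cloud = (cloud - cloud.min()) / (cloud.max() - cloud.min())
        channels.append(cloud)

    img = np.stack(channels, axis=-1) ** 2  # square to darken the background

    stars = rng.random((size, size)) > 0.9995   # sparse bright pixels
    img[stars] = 1.0

    Image.fromarray((img * 255).astype(np.uint8)).save("fake_nebula.png")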
I'm wondering if we'll ever get to a point where we can invoke fake vacation / travel experiences, like We Can Remember It for You Wholesale (more popularly, Total Recall), by creating ML-generated images of the trip rather than inducing a dream. It seems plausible.
I don't buy the argument we'll just automatically learn to see the CGI as fake. This is a selection effect: you see as fake only the CGI you see as fake; if it just looks real and fools you, how will you ever know you were fooled? Early CGI was bad, but it kept getting better every year. When I watch documentaries or CGI breakdowns, I'm routinely shocked at how things I would never in a million years have predicted were fake were fake. When someone shows a clip of a glowing floating space alien octopus, you can know it's fake because you know there are no space alien octopuses in real life; but when it's some box in a corner with some stuff on it which the director added in prod because he realized he needed a more homey and cluttered feeling for that scene...
I want to see a fully-synthetic multimodal social media influencer that is nearly indistinguishable from reality. She does the same thing as the real ones except everything is completely artificially generated (housing, clothes, vacations, social circle). All text/image/video posts are completely synthetic but internally-consistent with this fabricated persistent universe. The only real things would probably be product placement. If you’re a brand, you’d just make a new online influencer instead of finding an organic one.
I believe they intentionally hobbled it in this respect for "safety" (iow to keep themselves out of a scandal when someone asks it to create "President Biden accepting Bribes" or whatnot...)
Certainly far simpler diffusion models trained with faces included do just fine at creating photorealistic faces.
Coming up with ok-looking generated faces is not ground-breaking in the ML world and when the state-of-the-art model botches it completely, you start looking for the reason.
How compose-able are ML solutions? I'm wondering if coming up with ok-looking generated headshots (face at a specific, known, predetermined angle, in a neutral and standardized context, independent of background features) is not ground-breaking, but embedding that capability in a dynamic image generator with myriad other objects, headgear, lights and shadows, etc. may be.
Facebook's Make-A-Scene, which looks almost as good as DALL-E 2, does something very much like that, using face-specific targeting to boost the quality of that part of images: https://arxiv.org/abs/2203.13131#facebook
Preventing Harmful Generations
We’ve limited the ability for DALL·E 2 to generate violent, hate, or adult images. By removing the most explicit content from the training data, we minimized DALL·E 2’s exposure to these concepts. We also used advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures.
Like a lot of other Open AI work, DALL-E 2 isn't open. There are people working at re-implementing it, but training has considerable computational costs.
To do such a fork, wouldn't you have to build a completely new model with the same training data and everything else the same except leaving out the restrictions?
The Turing test isn't very useful. At its core, it merely tests whether a computational agent can imitate a human agent well enough to fool a typical individual.
Turing did not mention a "typical individual". The question is whether it's possible to make AI which is indistinguishable from a human. Obviously, it makes sense if an interrogator comes prepared if we want to test this.
We had AI indistinguishable from a bored teen who isn't really into the conversation decades ago. It's also really easy to pass a turing test if your model is a pissed off friend who won't respond to your texts.
Well, Turing explicitly formulated it as a _game_, and for a game to be meaningful, players have to understand rules and have a desire to win. And given that the question was "Can machines think?", a human playing the game should have a good thinking ability.
Game with bored, disinterested players would be entirely meaningless.
It's looking more like "how can I tell if I'm talking to a machine unless the potential person on the other end is strapped into a torture chair and forced to respond rationally to my inane philosophical ramblings."
And I'd like it more as a Gedankenexperiment if people weren't talking about it as a tool or metric. That kind of thinking gains momentum.
There's nothing in the Turing test, or in systems that solve it, that is really amazing progress. To me they are just an expected and obvious outcome of the general improvement of scientific modelling of reality.
Here's something that I'm not blase about: AlphaFold. That is one of the crowning achievements of humanity. It solved a problem that people have been working hard on for 40 years, using an algorithm that is less than 5 years old, on a computational framework that's a couple years old, on hardware with ML training powers orders of magnitude higher than anything that existed 5 years ago, and conclusively demonstrated that evolutionary sequence information, rather than physical modelling, is sufficient to predict nearly all protein structures. And, once the best competitor had a hint how it was done, they were able to reproduce (much of) the work in less than a year.
Now that's amazing. World-class. Nobel-prize worthy. Totally unexpected for at least another 10 years, if ever. Completely resets the landscape of expectations for anybody doing biological modelling. However, it also won't transform drug discovery any time soon.
Fun idea. We're about to test out melodies I've created together with a generative neural net and see how they're rated compared to real melodies. The plan is to use Amazon's Mechanical Turk, but one problem is what to do about real melodies that people will mark as already familiar to them. I think making it a comparison with unfamiliar melodies should be fine?
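One simple way to handle the familiarity problem: give raters an "I already know this melody" checkbox and drop those trials before testing. A sketch of the analysis; the file and column names are invented:

    # Sketch of analysing the MTurk results: each row is one A/B trial where a
    # rater guessed which of two melodies was the real one. Trials where the
    # rater already knew the real melody are excluded, then a binomial test
    # checks whether the remaining guesses beat chance.
    import pandas as pd
    from scipy.stats import binomtest

    trials = pd.read_csv("mturk_results.csv")   # columns: guessed_real_correctly, knew_melody
    trials = trials[~trials["knew_melody"]]     # drop familiar-melody trials

    correct = int(trials["guessed_real_correctly"].sum())
    n = len(trials)
    result = binomtest(correct, n, p=0.5, alternative="greater")
    print(f"{correct}/{n} correct, p = {result.pvalue:.3f}")
    # If p is large, raters can't reliably tell the generated melodies from real ones.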
Hacker News has this surprising tendency to cling to the past.
People here are nitpicking over the definition of the Turing test. What actually matters here is that, if not already now, but certainly in 1-5 years neural nets will most certainly be good as the 99th percentile artist.
Does that mean AGI is here? Probably not. But we are missing the forest for the trees.
Nothing more "missing the forest for the trees" than implying that 99th percentile artistry is about producing photorealistic representations of unreal things.
It's kind of an alien world to us, no lighting like we know it, all organic shapes with a lot of unidentifiable stuff, blue tint etc. It all helps to make it an easier case.
dall-e is still impressive, but taking this to the extreme, it would be like making it simulate pictures of TV noise and showing we couldn't tell it from the real thing.
>Could I use DALL-E 2 to create a fake vacation? Or, more ethically, could I use DALL-E 2 to recreate events from my vacation that actually happened, but I was unable to get a good photo of them?
What would be unethical about creating a fake vacation? As long as you're not defrauding anyone, I don't see who would be hurt by this.
Your holiday photos are memories. If you create fake images, and mix them with genuine ones, don't be surprised if in the future, you yourself forget what is real and what is not.
Just nitpicking but the 'Turing test' can only be failed, not passed, which is quite apt given another problem associated with Turing: the halting problem.
> In fact how do you know DALL-E actually created them, and did just regurgitate some it was trained with?
I think DALL-E is not released so researchers are unable to take it apart yet, but this question was already researched a lot in the context of other generative models, and so far they really did generalise (assuming a well-trained model; they can overfit).
@GP: This looks like a very interesting comment. What is wrong with the hoses of the scuba diver? Can you post an edited version in imgur or something with a big arrow pointing to the error?
The "One Million Dollar Paranormal Challenge" (https://en.wikipedia.org/wiki/One_Million_Dollar_Paranormal_...) is a solid example of a Turing test for magic.