Passing the Turing test requires, for any human judge:
1. This judge is aware that they have to discern whether the 'bot' is real or a machine.
2. The judge cannot discern whether the 'bot' is real or a machine better than random chance.
This failed 1. And even given that advantage, it might have failed 2 as well?
Often I see headlines along the lines of "X fools people and beats the Turing test!". But the point of the Turing test isn't to trick a person, it's to make it functionally impossible for a person to distinguish between the real and simulated thing, no matter how hard they try. For something to pass a Turing test, it would need to be able to pass the following:
"Anyone can play as judge any number of times. You can take as long as you want, and if you're successful in breaking it under controlled conditions (IE, you don't cheat and use an out-of-band communication protocol with the 'bot'/human), we'll give you a 10,000,000$."
I mean, you can read Turing's own definition of his test - 'the Imitation Game' - on the first page of his 1950 paper Computing Machinery and Intelligence[1]. There's nothing in there about repetition, duration, or $10,000,000 prizes. It's a party game. And he just frames his question (which will "replace our original, 'Can machines think?'") as "Will the interrogator decide wrongly as often when the game is played like this [with a human and a computer] as he does when the game is played between a man and a woman?"
So, to perform the experiment, one must have some people play the game with humans a few times and then play with a human and a machine a few times, and look to see if the results are statistically significant. When they aren't, Turing posits, the question 'can machines think?' will have been answered in the affirmative.
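To make the "statistically significant" bit concrete, here's a minimal sketch of the comparison Turing describes (the round counts are invented for illustration): play some rounds where both hidden players are human, some where one is a machine, and test whether the interrogator's error rate differs between the two conditions.

    # Toy sketch of Turing's proposed comparison: is the interrogator wrong about
    # as often when a machine plays as when a human does? Counts are invented.
    from scipy.stats import chi2_contingency

    #                    [wrong guesses, correct guesses]
    human_vs_human   = [11, 19]   # 30 rounds played between two humans
    human_vs_machine = [ 9, 21]   # 30 rounds played with a machine standing in

    chi2, p_value, dof, _ = chi2_contingency([human_vs_human, human_vs_machine])
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
    # A large p-value means the two error rates are statistically indistinguishable,
    # which is the outcome Turing says answers "can machines think?" in the affirmative.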
That is not to say that this DALL-E vacation photo social media post constitutes a rigorous 'passes the Turing test'. But I don't think it's fair to criticize someone for using 'the Turing Test' colloquially as a catchall for saying 'you probably didn't notice this output was machine generated, therefore you might want to adjust your priors on the answer to the question, "can machines think?"'. Because that's exactly the spirit that Turing was working in when he proposed using a party game as a test of intelligence.
There's the literal definition of the Turing test, as described above, which doesn't actually work for proving any sort of intelligence.
Then there's the conceptual argument of the Turing test, which we can turn into a test of intelligence. It relies on the idea that we can abstract the mind into a "thinking" black box which takes inputs and outputs. And then posits that any black box which can't be distinguished from an actual "thinking" black box may as well be "thinking".
Passing the literal Turing test is a sign that some humans can be tricked for some small domain over some small period of time. Passing the conceptual argument that the Turing test relies on shows that there are non-human entities which cannot be separated from humans on the basis of how they interact with a human (through some limited medium).
The repetition, duration and prizes are just practicality; prizes incentivize people to try, repetition ensures that the results are robust, and duration ensures that the domain the AI can be tested over is equivalent to a human's.
Meanwhile, if you ran a 'rigorous image-generation Turing test' between dalle2 and randomly selected humans, the machine would be obvious, because its output is much higher quality than a randomly selected human would be able to produce, thereby failing the Turing test.
Aside from some corners (probably to be filled over the course of the next year or three), dalle2 is obviously outperforming almost all humans at its task. The cross-style ability is probably exceeding almost all human /artists/, who tend to specialize in a single style.
And some of the creativity criticisms (can only do styles it's seen before) are basically true of all but the tiniest sliver of humanity, whose names we tend to remember.
Funny thing is this is going to happen with all AI generated things. If and when GPT-4 comes out for writing and it's as much better than GPT-3 as GPT-3 was over GPT-2, then it will be easy to know if a given piece of writing was human or AI generated. Is it grammatically correct, concise and to the point, and does it have a spark of life and originality to it? Then that's something that 90% of humans can't do and what AI will be able to do every time, so you can bet it's AI and beat the Turing test!
Just threw some of Turing's original prompts at GPT-3 to see how it does. Its willingness to play along with the conceit is what gives it away, not the quality of its answer:
This is a dialog between Q, a human who is trying to work out if he is talking to a computer, and A, a computer who is pretending to be human:
Q: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
A: No, I don't think so. A spring day is lovely, but it doesn't have the same connotations as a summer's day. Summer is a time of warmth and languor, of long days and lazy afternoons. It's a time for picnics and swimming and lazy days in the sun. Spring is a time of rebirth and new beginnings, of freshness and hope. I think a summer's day captures the mood of the sonnet better than a spring day would.
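For anyone wanting to reproduce this, it's a single completion call. A rough sketch with the 2022-era OpenAI Python client; the engine name and sampling parameters are my guesses, not necessarily what the parent used:

    # Sketch of throwing one of Turing's interrogator questions at GPT-3.
    # Engine name and parameters are illustrative, not the parent's exact setup.
    import openai

    openai.api_key = "sk-..."  # your API key

    prompt = (
        "This is a dialog between Q, a human who is trying to work out if he is "
        "talking to a computer, and A, a computer who is pretending to be human:\n"
        "Q: In the first line of your sonnet which reads \"Shall I compare thee "
        "to a summer's day,\" would not \"a spring day\" do as well or better?\n"
        "A:"
    )

    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=150,
        temperature=0.7,
    )
    print(response.choices[0].text.strip())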
"But I don't think it's fair to criticize someone for using 'the Turing Test' colloquially as a catchall for saying 'you probably didn't notice this output was machine generated, therefore you might want to adjust your priors on the answer to the question, "can machines think?"'"
The colloquial meaning of "passed the Turing test" has come to be "has been able to demonstrate intelligence when put to some serious, systematic testing". That may be shifting toward "has been able to fool people when they didn't look hard", but I don't think it's changed yet, and until it has, I'll protest, 'cause that's a terrible change imo.
That's not at all in the spirit of Turing's paper. His "party game" involves the interrogator knowing that they're participating in it, talking to the machine for 5 minutes, and posing serious challenges such as playing chess (at the time considered to be a hard problem for machines).
You must constantly be aware that images, or text, or voice, or other audio, or other signals or data, might be computer generated or altered.
All the time.
And you individually, or those about you, or societies at large, may be influenced in large or small ways by such signals, patterns, and records.
Your elderly neighbour or relative might be scammed out of life savings. Investors by false product claims. Voters by some fake outrage --- particularly of the October Surprise variety. Soldiers and diplomats by mock attacks, or false depictions of a tranquil situation where in fact danger lurks.
this has been true since photoshop existed? you don't need an algorithm to make a fake picture or to fake a military attack, etc. humans generally have a better grasp of appropriate details
The biggest problem with this line of response is not only that it utterly ignores scale, cost and speed, as has already been addressed in responses, but consequence.
Yes, in the 8th century BCE one could commission bards to gin up epic sagas of conquest and atrocity to stir the public spirit. In the 15th century, one could hire one of the most brilliant sculptors and painters of all time to create artpieces legitimising a power-hungry thieving dynasty which had seized control of banking, commerce, and the Church. In the 16th century, an earlier horde could be mobilised by an Orange tyrant through pamphlets cheaply printed and distributed. In the 19th century, Wreck-a-Feller could intercept (and rewrite) telegraph transmissions, newspapers could sensationalise photographs, and preachers could roll holy until districts burned out.
And so into the 20th and early 21st centuries.
What was not possible until quite recently was for high-fidelity, all-but indistinguishable, continuous imagery, audio, and video to be produced in realtime and distributed globally.
The time between fraud and mass impact is now measured in seconds, and can reach globally. This is well within response intervals not only of populations and institutions, but of individual weapons of mass destruction. The risks of massive consequence are huge.
And that is what is new and novel.
The interaction with other technologies raises yet more concerns. With Slaughterbots a highly-plausible if not actual reality, "leaked" video showing a plausible Slaughterbots attack might itself be a weapon of war. (https://yewtu.be/watch?v=vR91F3tp6eQ)
That is, one of the risks of some presumed capability or knowledge is that even the belief in its existence or validity becomes a tool to be used. The late senator from Wisconsin claimed to have, though never presented, evidence of disloyalty, a common practice in witch-hunts. With increasing scope and scale of data collection and storage, such claims become of themselves ever more plausible, with numerous effects. Concepts such as the Panopticon and "chilling effects" operate not based on actual or observed risks, but on presumed ones. Adam Smith in Wealth of Nations uses the word invisible twice. The unhanded one (the one not about the hand) references the decline in morale amongst soldiers faced with sudden and invisible death at any time, and the costs of perpetual vigilance.
So, no, really, this is not the same. Scale matters. Effects matter.
It's a matter of quantity as well as quality. It gets easier to produce convincing forgeries all the time, but it doesn't get easier to debunk them. This allows for public discourse to be destroyed by flooding it with fakes.
Furthermore, the judge is supposed to be an expert. That is, not only is his task explicitly to tell human from computer, but he has to have a good idea of how to do it. Random people from the internet are not enough.
In the "paranormal challenge", the juges usually include stage magicians, because they know the tricks that would fool ordinary people. James Randi himself is a magician.
Conceptually, to pass the Turing test you should be able to fool any human (including expert judges). It's just practically easier to choose an expert judge as opposed to test on every human on Earth, since you'd hope that if an expert can't do it, then neither can anyone else.
Also, it shouldn't matter if the human is someone who worked on the AI, or has read the code, or has seen every previous Turing test that the AI underwent. There shouldn't be any information a person could know that would allow them to tell that it's an AI.
I have acquaintances that couldn't pass this based on their common texting skills. There's probably no point to argue about what the Turing test "is", but the definition in this comment chain is pretty uncompromising - I can't imagine fooling any (meaning all, I assume) humans; there is too much ambiguity in the signal.
"Are you going to be at the blah blah because I need blah". response: "I really wqnt it". Nonresponsive, typos, what does 'want' refer to? Who knows. Is this a bad bot, someone spun out on meth, someone with cognitive processing issues, a busy mom texting while distracted, or ?
> I can't imagine fooling any (meaning all, I assume) humans; there is too much ambiguity in the signal.
I think you've inverted the expectation here. What I was saying is a machine passes if you can't find anyone in the world who can distinguish between it and an actual human. Meaning that if the "world's smartest person" can distinguish between it and an actual human, it fails, even if it can fool everyone else.
A machine can pass the test by deliberately feigning to be a human with limited communication capabilities (ex: I think we can simulate 2 month old baby talking via text). But then all you've shown is that your machine is as capable of thought as a 2 month old baby, which probably isn't the bar that you're trying to reach.
I think another important factor here is that it is unclear if the OP cherry picked photos or used the first ones given. Dall-E 2 has a bias to be better at real world scenes since it can just pull (near) memorized images, but I also wouldn't be surprised if these images were downselected from a larger set.
Great point. I should've added a third restriction: the 'bot' should not be able to communicate with, or use the judgement of, humans once it's started.
The original version of the test allows for interaction, and I think that’s probably a good thing. Language models currently have a hard time staying consistent/on topic and that’s a potentially valuable tell.
Instead, I think you don’t want any third parties “editing” or “curating” the exchange (beyond whatever blinding is needed to make it work).
Yeah I think the interaction is a key part. People are confusing AGI with useful AI. DALL-E is very clearly useful even if 90% of the images it produced were unusable. The time save is still large and cost reduced for simple tasks. Same with language models. They may have a hard time staying consistent over long sequences but adding human selective pressure to responses saves everyone time and money. But this is very clearly different from a machine that can think and form new ideas on its own. We're still a long way from that.
If someone told me they found a magic lamp and one of their wishes was that every time someone misused the term "Turing test" they got smacked in the face by an opening door I'd think "not bad, not bad".
As usual, the headline is more sensationalised than the actual article
> It's likely that with a harder version of the Turing Test, in which real and fake images of the same content are presented side by side and people are told that one of them is fake, it would be much easier to detect the fake images.
It doesn't really matter for the purposes of the test how confident they say they are.
If you're asking "do people often imagine they can confidently distinguish things when they actually can't" then the answer is a solid yes - things like audiophile and wine testing have proven that again and again.
if a human participates in interactive gamified social media, and this participation begins to change, shape, reinforce or otherwise mutate their beliefs, for the purposes of the test are they still actually a human? could the entire social media mechanism (from the builders to the participants) be considered a form of a sort of singleton autonomous intelligence in and of itself?
The snarky side of me wants to say it's because people take boring photos
But really it's just information overload, most things on social media I just scan the thumbnail and move on. Only my family would care to see my vacation photos :D
Diver here. I'm looking at these four pictures after the fact, so I already know they're fake. They're good, but they also have some weird flaws. That said, I don't think I would have immediately recognized any of these as wrong on Facebook (maybe the diver photo).
- The nudibranch (slug thing) on the green coral doesn't look like anything I've seen in the Caribbean before, and the coral also looks odd for the region. That said, this is probably the most difficult photo for me to differentiate. I would have accepted this as a cool find of something I haven't seen before.
- The grouper (big fish) photo is actually pretty good, although DALL-E has misplaced its eyes a bit. That said, the lit foreground and dark noisy background are exactly the look I would expect for someone using a basic camera + lights with wonky post-processing.
- The diver photo is a horror show. There's a hose going nowhere on her back. It looks like she's blowing out of a harmonica instead of a regulator. Bubbles are collecting around the top of her mask for some reason. Her fins look like they were badly Photoshopped. Nothing looks right here.
- The lobster photo has a real but subtle flaw: Caribbean lobsters don't have big claws. It also looks like it's under a rock like you would find in cold waters around Massachusetts and Maine instead of the Caribbean.
Interesting stuff though. It will force me to be more skeptical when I look at people's photos in the future.
This is absolutely true. Look at text prediction models as an example (e.g. GPT-3). One of the biggest (if not the biggest) applications was story-generation tools like AI Dungeon. Guess what most people actually used AI Dungeon for? Erotica. Guess what happened when OpenAI cracked down on it? A huge portion of the userbase jumped ship and built a replacement (NovelAI) using open-source EleutherAI models that explicitly did support erotica, which ended up being even better than the original ever was. I can tell you that there is very strong interest in nsfw image generation in those communities, as well as multiple hobby projects/experiments attempting to train models on NSFW content (e.g. content-tagged boorus), or bootstrap/finetune existing models to get this sort of thing to work.
Interesting that they talk about how hard it is to make a business out of it:
>I felt at this point that I’d hit a dead end. Press and fundraising would be tough and require some extra creativity and force. I spoke to friends about hiring them, and had polarized answers. Overall, this project had become less appealing.
Seems to me there is a lot of money to be made here if it works. The interesting thing about porn compared to say TV is that people have very, very specific interests and basically just want an infinite amount of content within that interest. It's not like with TV where cooking shows become popular, and the people that used to watch dramas are now watching cooking shows.
So the ability to generate highly specific content tailored to an individual's very precise requirements seems potentially very lucrative.
This might be horrible to say, but could this be a solution to csam? From what I've seen most people who enjoy csam do genuinely feel bad for the children, but they're sick, and can't control themselves. Might they be willing to indulge in fake csam instead?
I don't think it's horrible, it just seems like a practical solution. It is such a taboo subject that nobody seems to really talk about the possibilities, but it's worth asking the question -- if someone so inclined can gratify their desires in private with fake imagery, will it prevent them from leaving their home and seeking out someone to hurt?
Or will it strengthen their need for 'the real thing' as someone else suggested in a sibling comment?
In any case, we still don't have a great answer for the legal question. Possession of realistic fake imagery is illegal, on the grounds that its very existence is a risk to children. There isn't any actual science behind that, it's just what the politicians have said to justify regulating what would otherwise be a constitutionally protected right. I imagine it will become a topic of discussion again (my quick research says the last major revision to US law in this regard was about 20 years ago).
It seems more misguided than horrible. I'm not a psychologist, but indulging in pathological behaviors would seem to strengthen them. Heroin addicts need to quit, not use methadone forever.
Is that true? Someone on HN once described it like "eventually you get tired of the addiction and want to quit" (if you live long enough, that is). No personal experience, but I have known a couple former addicts and this seems to reflect their reality.
Maybe an effective approach would be to maximize harm reduction until the addiction has run its course? That seems to be the Portugal solution, and it seems to be successful.
The truth is that addiction, like all mental illness, is complex and unique to each individual.
Diagnosing mental illness generally consists of identifying some number of symptoms out of a possible list - often something like five out of eight possible. That means two people can be diagnosed with the same thing with only two overlapping symptoms.
So basically, don't listen to the guy who starts with "I'm not a psychologist" and then decides to play armchair psychologist.
Let us take this lesson from "pray the gay away" camps and other types of "conversion therapy": You cannot take these preferences out of people, it just does not work. At best, you can make it clear that the way they are is seen as a travesty, and a lot of them will successfully hide it for the rest of their lives. This is not a good solution compared to what you were talking about earlier, because it increases human suffering by a lot and doesn't have an advantage over a suffering-free solution.
That said, I don't think I could justify to myself to create such an AI. I've quite simply been so disillusioned by the depravity of man and especially in this case I want nothing to do with it even peripherally. Perhaps that makes me a little hypocritical. Philosophically it would still be nice to solve this issue, even if it is just so no more children need to suffer (which should always be the main goal).
> I'm not a psychologist, but indulging in pathological behaviors would seem to strengthen them. Heroin addicts need to quit, not use methadone forever.
Yes, clearly not a psychologist, nor an addiction treatment specialist. Methadone is often used indefinitely as maintenance therapy for opioid use disorder.
I believe that as of 2003 it has to be a realistic fake, however. Obvious cartoons are no longer illegal.
I imagine it'll get challenged again at some point on constitutional grounds. It is illegal right now on a moral basis, which is probably the weakest argument over the long term.
I'm aware that Japanese lolicon in anime, manga, video games, and other contexts, is at least ... problematic ... in numerous areas. Several online sites have banned it, and on Mastodon and the Fediverse, there are often peer-level blocks against sites in which lolicon is permitted.
The name itself is a portmanteau of Lolita complex, after the Nabokov novel.
Correct. Differs from, for example, rules against cruelty to animals. You can fake such cruelty without consequence - as is done in movies regularly.
More interesting question is this. Is it a crime if you generate CSAM just for one's own consumption?
> Is it a crime if you generate CSAM just for one's own consumption?
Yep. If it isn't obviously fake (i.e. a cartoon) the possession is illegal whether you produce it yourself or not. Though it's probably safe to say that you're unlikely to get caught if you're not sharing those images with other people.
Well, the US law says "[when it] appears virtually indistinguishable [from the real deal]" (insert appropriate legal terminology on either end, but the three quoted words are the relevant bit).
I think we're in agreement that the advancement of the technology is going to make this topic come back up for legal debate. When the gulf between CGI and real photography was large, it was pretty straightforward. Not so much now.
I think the hard part is that there is close to no way to know it's fake. But you could also say that if it's so easy to fake photorealistic content, why would you ever make real content?
Porn seems to quietly power the Internet, in so many ways. I imagine people are already getting creative with fake porn, and it's only going to intensify over time. Especially on the types of imagery that are illegal to possess.
One of the primary features of DALL-E 2 is inpainting. Thus, if they allowed it, you could easily just paint out a celebrity's clothed figure and ask DALL-E to replace it with a nude.
The article claims (in a screenshot without quoting sources, so take it for what it's worth) that "A recent blog post pointed out that GPT-3-generated text already passes the Turing test if you're skimming and not paying close attention".
This is certainly debatable, and I agree that it is pushing the limit a bit.
I think in the end, the "Turing test" was devised as a thought experiment, not as a final definition of AI. So I guess some freedom of interpretation is reasonable.
I agree with you that it brings things outside the original scope of the Turing test.
I do find it interesting to observe that a metric based on casual observation can have value in a society where elections can be swayed by online fakery.
The images look great, but I think the experiment was helped by the fact that most of us can see any image, be told that it's underwater, and believe it. Most people don't know much about deep water, or the plants and animals that live there.
I suspect the experiment would have gone differently if the vacation were in a city or on a beach or something like that.
One cool application of DALL-E could be generating a painting or sketch for each paragraph or sentence of novels. Imagine listening to an audio book of famous novels with visuals/cartoons made by AI.
Hope someone with invite access could do this for Moby Dick or Sherlock Holmes stories or 1984.
DALL-E so far has no way to create consistent characters across multiple pictures. However, good news: if this discriminator approach https://blog.salesforceairesearch.com/gedi/ carries over to images, such constraints could be imposed. This is just a handful of follow-up publications away.
There was an anime art generator posted here recently which could do this. If I understood correctly, they manually went through all the parameters and worked out which ones correlated with things like hair color, face type and built a web UI which could fix certain known meaningful parameters and allow the user to design a character and then the final step showed the same character in a bunch of different poses.
Yes, you reverse-engineer the latent. This is a very old approach in GANs by this point, going back probably to like 2015. (The canonical examples were making faces smile or adding sunglasses.) If you want to see it done for a lot of different models, check out Artbreeder: that's how all of the editing attributes are done.
Diffusion models don't really have a latent but the CLIP embedding would serve the same function. The problem is, OA would have to implement it themselves. There's no way you can implement it as a user with the current interface. (This is also true of alternative methods like gradient ascent or GEDI etc.)
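For the curious, the GAN-era version of this is just vector arithmetic in latent space. A schematic sketch (everything here is a placeholder; the attribute direction is usually derived from labeled samples, and `generator` stands in for whatever pretrained model you're probing):

    # Schematic sketch of classic GAN latent editing ("make the face smile").
    import numpy as np

    def find_direction(latents_with_attr, latents_without_attr):
        # Attribute direction = mean(with) - mean(without), normalized.
        direction = latents_with_attr.mean(axis=0) - latents_without_attr.mean(axis=0)
        return direction / np.linalg.norm(direction)

    def edit(latent, direction, strength=1.5):
        # Move a latent along the attribute direction and return the new latent.
        return latent + strength * direction

    # Usage (all arrays are placeholders):
    # smile_dir = find_direction(z_smiling, z_neutral)
    # edited_image = generator(edit(z, smile_dir))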
Actually I need to correct myself. Something along these lines. The mechanism of DALL-E is a bit different, but a discriminator approach should work anyway, although maybe some magic might be needed to ensure the spatial invariance of the discriminator.
This is a flawed experiment. If I see a bunch of photos, and many of them look real at first glance, I’m not instantly going to critique whether all of them were real, unless I was given specific instructions to do so.
Also, underwater photos are not something many people have personal experience seeing. Most of us don’t live underwater. We may not be equipped well enough to tell the difference, where above water, especially urban photos, we will likely notice better.
> My deepfake DALL-E 2 vacation photos passed the Turing Test
Most people didn't notice that some of my vacation photos were fake, therefore it passed the Turing Test... why is this clickbait nonsense getting so much attention?
Can someone who upvoted this article explain why you upvoted it? Did the fact that the title is flatly false not bother you? If someone wrote an article about cracking some encryption algorithm and titled it "I proved P=NP" would you upvote it?
It is easy to pass Turing tests when the subject material is unfamiliar to people. As other posters have mentioned, most people have only a vague idea of specific underwater plants and animals and vague ideas of how the water distorts light.
I bet I can come up with a simple generator that generates galaxies/nebula pictures and if I interspersed those in with NASA Hubble generated images, most people could not pick out the real Hubble images from my generated images.
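If you want to call my bluff, a toy version of that generator really can be just smoothed colored noise plus a sprinkle of stars. A sketch assuming numpy/scipy/Pillow, with all parameters pulled out of thin air:

    # Toy "nebula" generator: blurred random noise per color channel plus
    # scattered bright pixels for stars. Purely illustrative.
    import numpy as np
    from scipy.ndimage import gaussian_filter
    from PIL import Image

    rng = np.random.default_rng(0)
    size = 512

    channels = []
    for sigma in (40, 25, 60):              # different blur per channel -> colored clouds
        cloud = gaussian_filter(rng.random((size, size)), sigma=sigma)
        cloud = (cloud - cloud.min()) / (cloud.max() - cloud.min())
        channels.append(cloud)

    img = np.stack(channels, axis=-1) ** 2  # square to darken the background

    stars = rng.random((size, size)) > 0.9995   # sparse bright pixels
    img[stars] = 1.0

    Image.fromarray((img * 255).astype(np.uint8)).save("fake_nebula.png")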
I'm wondering if we'll ever get to a point where we can invoke fake vacation / travel experiences, like We Can Remember It for You Wholesale (more popularly, Total Recall), by creating ML-generated images of the trip rather than inducing a dream. It seems plausible.
I don't buy the argument we'll just automatically learn to see the CGI as fake. This is a selection effect: you see as fake only the CGI you see as fake; if it just looks real and fools you, how will you ever know you were fooled? Early CGI was bad, but it kept getting better every year. When I watch documentaries or CGI breakdowns, I'm routinely shocked at how things I would never in a million years have predicted were fake were fake. When someone shows a clip of a glowing floating space alien octopus, you can know it's fake because you know there are no space alien octopuses in real life; but when it's some box in a corner with some stuff on it which the director added in prod because he realized he needed a more homey and cluttered feeling for that scene...
I want to see a fully-synthetic multimodal social media influencer that is nearly indistinguishable from reality. She does the same thing as the real ones except everything is completely artificially generated (housing, clothes, vacations, social circle). All text/image/video posts are completely synthetic but internally-consistent with this fabricated persistent universe. The only real things would probably be product placement. If you’re a brand, you’d just make a new online influencer instead of finding an organic one.
I believe they intentionally hobbled it in this respect for "safety" (iow to keep themselves out of a scandal when someone asks it to create "President Biden accepting Bribes" or whatnot...)
Certainly far simpler diffusion models trained with faces included do just fine at creating photorealistic faces.
Coming up with ok-looking generated faces is not ground-breaking in the ML world and when the state-of-the-art model botches it completely, you start looking for the reason.
How compose-able are ML solutions? I'm wondering if coming up with ok-looking generated headshots (face at a specific, known, predetermined angle, in a neutral and standardized context, independent of background features) is not ground-breaking, but embedding that capability in a dynamic image generator with myriad other objects, headgear, lights and shadows, etc. may be.
Facebook's Make-A-Scene, which looks almost as good as DALL-E 2, does something very much like that, using face-specific targeting to boost the quality of that part of images: https://arxiv.org/abs/2203.13131#facebook
Preventing Harmful Generations
We’ve limited the ability for DALL·E 2 to generate violent, hate, or adult images. By removing the most explicit content from the training data, we minimized DALL·E 2’s exposure to these concepts. We also used advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures.
Like a lot of other Open AI work, DALL-E 2 isn't open. There are people working at re-implementing it, but training has considerable computational costs.
To do such a fork, wouldn't you have to build a completely new model with the same training data and everything else the same except leaving out the restrictions?
The Turing test isn't very useful. At its core, it merely tests whether a computational agent can imitate a human agent well enough to fool a typical individual.
Turing did not mention a "typical individual". The question is whether it's possible to make AI which is indistinguishable from a human. Obviously, it makes sense if an interrogator comes prepared if we want to test this.
We had AI indistinguishable from a bored teen who isn't really into the conversation decades ago. It's also really easy to pass a turing test if your model is a pissed off friend who won't respond to your texts.
Well, Turing explicitly formulated it as a _game_, and for a game to be meaningful, players have to understand rules and have a desire to win. And given that the question was "Can machines think?", a human playing the game should have a good thinking ability.
Game with bored, disinterested players would be entirely meaningless.
It's looking more like "how can I tell if I'm talking to a machine unless the potential person on the other end is strapped into a torture chair and forced to respond rationally to my inane philosophical ramblings."
And I'd like it more as a Gedankenexperiment if people weren't talking about it as a tool or metric. That kind of thinking gains momentum.
There's nothing in the Turing test, or in systems that solve it, that is really amazing progress. To me they are just an expected and obvious outcome of the general improvement of scientific modelling of reality.
Here's something that I'm not blase about: AlphaFold. That is one of the crowning achievements of humanity. It solved a problem that people have been working hard on for 40 years, using an algorithm that is less than 5 years old, on a computational framework that's a couple years old, on hardware with ML training powers orders of magnitude higher than anything that existed 5 years ago, and conclusively demonstrated that evolutionary sequence information, rather than physical modelling, is sufficient to predict nearly all protein structures. And, once the best competitor had a hint how it was done, they were able to reproduce (much of) the work in less than a year.
Now that's amazing. World-class. Nobel-prize worthy. Totally unexpected for at least another 10 years, if ever. Completely resets the landscape of expectations for anybody doing biological modelling. However, it also won't transform drug discovery any time soon.
Fun idea. We're about to test out melodies I've created together with a generative neural net and see how they're rated compared to real melodies. The plan is to use Amazon's Mechanical Turk, but one problem is what to do about real melodies that people will mark as already familiar to them. I think making it a comparison with unfamiliar melodies should be fine?
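One simple way to handle the familiarity problem: give raters an "I already know this melody" checkbox and drop those trials before testing. A sketch of the analysis; the file and column names are invented:

    # Sketch of analysing the MTurk results: each row is one A/B trial where a
    # rater guessed which of two melodies was the real one. Trials where the
    # rater already knew the real melody are excluded, then a binomial test
    # checks whether the remaining guesses beat chance.
    import pandas as pd
    from scipy.stats import binomtest

    trials = pd.read_csv("mturk_results.csv")   # columns: guessed_real_correctly, knew_melody
    trials = trials[~trials["knew_melody"]]     # drop familiar-melody trials

    correct = int(trials["guessed_real_correctly"].sum())
    n = len(trials)
    result = binomtest(correct, n, p=0.5, alternative="greater")
    print(f"{correct}/{n} correct, p = {result.pvalue:.3f}")
    # If p is large, raters can't reliably tell the generated melodies from real ones.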
Hacker News has this surprising tendency to cling to the past.
People here are nitpicking over the definition of the Turing test. What actually matters here is that, if not already now, but certainly in 1-5 years neural nets will most certainly be good as the 99th percentile artist.
Does that mean AGI is here? Probably not. But we are missing the forest for the trees.
Nothing more "missing the forest for the trees" than implying that 99th percentile artistry is about producing photorealistic representations of unreal things.
It's kind of an alien world to us, no lighting like we know it, all organic shapes with a lot of unidentifiable stuff, blue tint etc. It all helps to make it an easier case.
dall-e is still impressive, but taking this to the extreme, it would be like making it simulate pictures of TV noise and showing we couldn't tell it from the real thing.
>Could I use DALL-E 2 to create a fake vacation? Or, more ethically, could I use DALL-E 2 to recreate events from my vacation that actually happened, but I was unable to get a good photo of them?
What would be unethical about creating a fake vacation? As long as you're not defrauding anyone, I don't see who would be hurt by this.
Your holiday photos are memories. If you create fake images, and mix them with genuine ones, don't be surprised if in the future, you yourself forget what is real and what is not.
Just nitpicking but the 'Turing test' can only be failed, not passed, which is quite apt given another problem associated with Turing: the halting problem.
> In fact how do you know DALL-E actually created them, and did just regurgitate some it was trained with?
I think DALL-E is not released so researchers are unable to take it apart yet, but this question was already researched a lot in the context of other generative models, and so far they really did generalise (assuming a well-trained model; they can overfit).
@GP: This looks like a very interesting comment. What is wrong with the hoses of the scuba diver? Can you post an edited version in imgur or something with a big arrow pointing to the error?
The "One Million Dollar Paranormal Challenge" (https://en.wikipedia.org/wiki/One_Million_Dollar_Paranormal_...) is a solid example of a Turing test for magic.