Deviation Game: Draw in a way that humans can guess, but AI can't (deviationgame.com)
62 points by lukko on April 5, 2023 | 20 comments



After AlphaGo, I scoffed at the idea that anything significant had been accomplished in terms of general intelligence. I said, "Call me back when it can play Pictionary".

Uh...

https://devpost.com/software/pictonet-ai-pictionary

I still don't think it's anything like AGI, but my snappy comeback is nowhere near sufficient any more.


Biological intelligence is the result of a set of connected modules with specific capabilities that are individually not intelligent, but which in cooperation allow for behavior we consider intelligent. However, actual human intelligence is error-prone. We are easily fooled into a variety of silly mistakes by a variety of means, including misleading prompts.

For example, a person who has just said "spot" ten times, when asked what to do at a green light, usually says "Stop!" Likewise, a person who has just spelled "silk" ("S-I-L-K"), when asked what cows drink, usually says "milk".

The behavior of LLMs like ChatGPT is very similar to what is observed from the https://en.wikipedia.org/wiki/Left-brain_interpreter in humans, complete with immediate facile explanations that hallucinate facts not in evidence. We do better than ChatGPT on this largely because other parts of our brain supply immediate fact checks for many of the obvious mistakes. Similarly, ChatGPT, when combined with external tools and patterns for when to consult them, also improves dramatically in the quality of its output.
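
That tool pattern is easy to sketch. Here's a minimal toy version, assuming a hypothetical ask_llm() helper (standing in for any chat-completion API call) and an invented "CALC:" convention for when the model should defer to a calculator:

    def ask_llm(prompt: str) -> str:
        # Hypothetical stand-in for any chat-completion API call.
        raise NotImplementedError("wire up a real LLM API here")

    def answer_with_tools(question: str) -> str:
        reply = ask_llm(
            "Answer the question. If you need arithmetic done, reply "
            "only with 'CALC: <expression>'.\nQuestion: " + question
        )
        while reply.startswith("CALC:"):
            expr = reply[len("CALC:"):].strip()
            # Toy calculator only; a real system would use a safe parser.
            result = eval(expr, {"__builtins__": {}})
            reply = ask_llm(
                f"Question: {question}\nTool result: {expr} = {result}\n"
                "Now answer the question directly."
            )
        return reply

The point is just that the "when to check" logic lives outside the model: the model signals, the harness verifies, and the verified fact goes back into the prompt.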

Therefore I agree that ChatGPT is NOT AGI. But we may be far closer than you think to having a bunch of components which, when wired together right, DO act like something that reasonably should be called AGI. Whether it is one becomes a definitions game. And I foresee a period of apparent infinite regress, with definitions shifting so that people can keep justifying that we don't have AGI, until in finite time we're faced with what is clearly a superhuman intelligence.


I think a big difference is that humans have the ability to introspect, and examine their own mental processes to see how they drew a conclusion and what information went into it. We can then choose to update our behavior. Sometimes we "hallucinate" that information, but we do have some level of insight into ourselves. ChatGPT has no ability to examine itself, and in fact has no idea what it is doing. This, I believe, is a big shortcoming of neural nets in general. If we could overcome this hurdle, I think it would be a lot closer to an AGI.


Sorry, not buying it.

Neuroscientists who have attempted to compare what people think about how we think with how we ACTUALLY think have concluded that self-reports are entertaining to listen to, but not very informative. That is, we have more the illusion of introspection than the reality.

The level at which we ARE self-aware of our own thinking does not seem to me any more impressive than the ability of something like ChatGPT to generate training data that can in turn be used to create much smaller models with similar capabilities, or future models with better ones. In other words, LLMs are capable of some form of learning through introspection, and we do not appear to be close to the limits of how much self-improvement they are capable of.
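
A rough sketch of that recipe (roughly the self-instruct / Alpaca approach), again with ask_llm() as a hypothetical wrapper around a large-model API: the big model writes (instruction, response) pairs that a smaller model can later be fine-tuned on.

    import json

    def ask_llm(prompt: str) -> str:
        # Hypothetical wrapper around a large-model API.
        raise NotImplementedError("wire up a large-model API here")

    # The big model drafts tasks and then answers them; the pairs go
    # into a JSONL file used as fine-tuning data for a smaller model.
    with open("distilled.jsonl", "w") as f:
        for _ in range(1000):
            task = ask_llm("Invent one new, self-contained task for a helpful assistant.")
            f.write(json.dumps({"instruction": task, "response": ask_llm(task)}) + "\n")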

But https://simonwillison.net/2023/Mar/17/beat-chatgpt-in-a-brow... suggests that for less than an average developer's salary we could train something equivalent to ChatGPT that runs in a browser. And better systems are coming soon.


> ChatGPT has no ability to examine itself,

This isn't 100% true... I think people are referring to something called Reflexion, where you ask GPT for an answer, then feed that answer back to GPT and ask what is wrong with it, allowing self-examination and a significant improvement over its previous answer.
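
The loop is roughly this (a minimal sketch assuming a hypothetical ask_llm() helper, not the paper's actual implementation):

    def ask_llm(prompt: str) -> str:
        # Hypothetical stand-in for a chat-completion API call.
        raise NotImplementedError("wire up a chat-completion API here")

    def reflect(question: str, rounds: int = 2) -> str:
        answer = ask_llm(question)
        for _ in range(rounds):
            critique = ask_llm(
                f"Question: {question}\nAnswer: {answer}\n"
                "What, if anything, is wrong with this answer?"
            )
            answer = ask_llm(
                f"Question: {question}\nPrevious answer: {answer}\n"
                f"Critique: {critique}\nWrite an improved answer."
            )
        return answer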

Now, GPT has no method of long-term learning by reincorporating these answers back into its system to improve its future behavior, which is a huge limitation.


From the paper, it seems like Reflexion is an outer loop that provokes GPT to re-evaluate itself, which says nothing about the self-reflective capabilities of the underlying GPT model.

But overall, I find there's a bit of a boundary problem when considering current ML models. Is GPT considered self-reflective if we embed it in a system that mimics such an ability?


The boundary problem you describe seems unimportant to me. Who cares what we classify GPT as when we have successfully built a system that clearly has that capability?

Feedback loops are critical to intelligent thought. We know this, and it should come as no surprise that we'll need to explicitly add them to AI systems.


Interesting - it reminds me of the ‘strange loopiness’ in Douglas Hofstadter’s books - we have a model of ‘I’ that reflects in on itself.

We obviously have a huge evolutionary pressure for having a semi-accurate internal model of ourselves that we can use to project out into the future and avoid possible harm - I guess there is no similar pressure / incentive for LLMs.


But it was never sufficient? Any observable behavior can be computed by some neural network, so what were you getting at? Are you implying Pictionary was AGI-complete but now it's not?

Like, the only criterion you are looking for is being subjectively impressed, which just means you are using a bad metric for threshold-finding, right?


I'm implying that I thought Pictionary was AI-complete, or very close to it, but I was wrong.

I'm assuming that we're talking about a Pictionary in which you could draw any concept in the world. It would be vastly easier if you restricted it to just the actual Pictionary cards, but that's not the spirit of what I meant, and I don't think it's what my link above is doing.

I believed that something that open-ended would not be feasible any time soon without effectively solving general intelligence. I think I'm wrong about that -- though I suppose it's still possible that it's just a matter of scaling what they've already got.


The designers are StudioPlayFool, and they have a lot of really cool work.

https://www.studioplayfool.com/work


This looks like a really smart way to gather difficult training data that helps improve the model. Luis von Ahn would be proud!


I agree. Although it's annoying that people have gone so far towards "take every bit of data that's not nailed down" that anything I post on the Internet could be used against me by an AI replacement competing with me.


For people in London, the game is on show at Somerset House this week: https://nowplaythis.net/

Lots of good stuff.


I wonder who is behind this website. Clever girl


The credits listed are Tomo Kihara and Playfool (Daniel Coppen & Saki Coppen). No velociraptors in sight.


Hi, my name's Roberto Realmann and I like inhaling and exhaling, so you can trust me.


"teach AI to pass as human"


"Have fun generating the data we need" has proven a solid approach in other contexts. Duolingo is one of them.


This is fantastic



