GPT-4 reasons about a lot of gotcha logic puzzles correctly; any pre-GPT-4 opinions should be reconsidered. Those are effectively two different epochs in the history of AI effectiveness and reasoning.
> Me: Mike leaves the elevator at the 2nd floor. Jenny leaves at the 9th floor. Who left the elevator first?
> GPT-4: Mike left the elevator first, as he got off at the 2nd floor, while Jenny left at the 9th floor.
> Me: If the elevator started at the 10th floor, who would have left first?
> GPT-4: If the elevator started at the 10th floor and went downward, then Jenny would have left first, as she got off at the 9th floor, while Mike left at the 2nd floor.
> Me: How many footballs fit in a coffee mug?
> GPT-4: A standard football (soccer ball) has a diameter of around 22 centimeters (8.65 inches), while a coffee mug is typically much smaller, with a diameter of around 8-10 centimeters (3-4 inches). Therefore, it is not possible to fit a standard football inside a coffee mug. If you were to use a mini football or a much larger mug, the number of footballs that could fit would depend on the specific sizes of the footballs and the mug.
It easily answered all of your questions and produced explanations I would expect most reasonable people to make.
That changes exactly nothing about the validity of my statement.
Yes, GPT-4 is better at this mimicry than GPT-3 or GPT-3.5. And GPT-3 was better at it than GPT-2. And all of them were better than my just-for-fun home-built language model projects that I trained on small (<10 GiB) datasets, which in turn were better at it than my PoC models trained on just a few thousand words.
But being better at mimicking reason is still not reasoning. The model doesn't know what a coffee mug is, and it doesn't know what a football is. It also has no idea how elevators work. It can form sequences that make it look to us as if it does know all these things, but in reality it only knows that "then Jenny would have left first" is a more likely sequence of tokens at that point, given that the sequence before included "started at the 10th floor".
Bear in mind, this doesn't mean that this mimicry isn't useful. It is, tremendously so. I don't care how I get correct answers, I only care that I do.
Simple: I know that humans have intentionality and agency. They want things, they have goals both immediate and long term. Their replies are based not just on the context of their experiences and the conversation but their emotional and physical state, and the applicability of their reply to their goals.
And they are capable of coming up with reasoning about topics for which they have no prior information, by applying reasonable similarities. Example: even if someone has never heard the phrase "walking a mile in someone else's shoes", most humans (provided they speak English) have no difficulty figuring out what it means. They also have no trouble figuring out that it is a figure of speech, and not a literal action.
>Simple: I know that humans have intentionality and agency. They want things, they have goals both immediate and long term. Their replies are based not just on the context of their experiences and the conversation but their emotional and physical state, and the applicability of their reply to their goals.
This all seems orthogonal to reasoning, but also who is to say that somewhere in those billions of parameters there isn't something like a model of goals and emotional state? I mean, I seriously doubt it, but I also don't think I could evidence that.
Correct, but the problem is that the way you prove that for humans is by looking at the output and inferring it. You can apply the same criteria to ML models. If you don't, you need some other criterion to rule out that assumption for ML models.
For humans I can simply refer to my own internal state and look at how I arrive at conclusions.
I am of course aware that this is essentially a form of ipse dixit, but I will do it anyway in this case, because I am saying it as a human, about humans, and to other humans, and so the audience can just try it for themselves.
> You assume that. You can only maybe know that about yourself.
I can also only say with certainty that planetary gravity is an attracting force on the very spot I am standing on. I haven't visited every spot on every planet in the universe after all.
That doesn't make it any more likely that my extrapolation of how gravity works here is wrong somewhere else. Russell's Teapot works both ways.
> How do you know that the ML model doesn't?
For the same reason I know that a hammer or an operating system doesn't: I know how they work. Not in the most minute detail, and of course the actual model is essentially a black box, but its architecture and MO are not.
It completes sequences. That is all it does. It has no semantic understanding of the things these sequences represent. It has no understanding of true or false. It doesn't know math, it doesn't know who person xyz is, it doesn't know that 1993 already happened and 2221 did not. It cannot have abstract concepts of the things represented by the sequences, because the sequences are the things in its world.
It knows that a sequence is more or less likely to follow another sequence. That's it.
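To make that concrete, here is a deliberately tiny sketch of the principle (my own toy illustration, not GPT-4's actual architecture; every token and probability in it is made up): a "model" that stores nothing but which token tends to follow which, and completes a prompt from those statistics alone.

```python
# Toy illustration only: a "model" that knows nothing but sequence statistics.
# The tokens and probabilities below are made up for the example.

# Hypothetical learned statistics: given the previous two tokens,
# how likely is each candidate next token?
next_token_probs = {
    ("elevator", "at"): {"the": 0.9, "a": 0.1},
    ("would", "have"):  {"left": 0.7, "gone": 0.3},
    ("left", "first"):  {",": 0.6, ".": 0.4},
}

def complete(prev_two):
    """Return the likeliest next token, using nothing but the statistics above."""
    probs = next_token_probs.get(prev_two)
    if probs is None:
        return "<unknown>"
    return max(probs, key=probs.get)  # greedy pick: most probable continuation

# It produces the "right" continuation without any concept of elevators or floors:
print(complete(("would", "have")))  # -> left
```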
From that limited knowledge, however, it can very successfully mimic things like math, logic, and even reasoning to an extent. And it can mimic them well enough to be useful in a lot of areas.
But that mimicry, however useful, is still mimicry. It's still the Chinese Room thought experiment.
However, I don't really accept the idea that this isn't reasoning, though I'm not entirely sold either way.
I'd say if it mimics something well enough then eventually it's just doing the thing, which is the same side of the argument I fall on with Searle's Chinese Room Argument. If you can't discern a difference, is there a difference?
So far GPT-4 can produce better work than maybe 50% of humans, and better responses to brain-teaser questions than most of them too. At the very least I'm in a bubble and don't run into people that stupid very often, which makes it easier for me to still see the gaps.
> I'd say if it mimics something well enough then eventually it's just doing the thing
Right up to the point where it actually needs to reason, and the mimicry doesn't suffice.
My above example about the football and the coffee mug is an easy one; those objects are well represented in its training data. What if I need a reason why the service ping spikes every 60 seconds? Here is the code, please look it up, LLM. I am sure I will get a great and well-written answer.
I am also sure it won't be the correct one, which is that some dumb script I wrote, which has nothing to do with the code shown, blocks the server for about 700ms every minute.
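To make the scenario concrete, the culprit is the kind of thing sketched below (a hypothetical stand-in; the names and exact timings are illustrative, not from any real system), and nothing in the service code an LLM gets shown would point to it:

```python
# Hypothetical sketch of the kind of unrelated culprit I mean: a standalone
# maintenance script that, once a minute, holds up the same resource the
# service needs for roughly 700 ms. The service's own code never mentions it.
import time

def maintenance_tick():
    # Stand-in for "grab a shared lock / hammer the disk / pin a CPU core".
    time.sleep(0.7)  # blocks for ~700 ms

if __name__ == "__main__":
    while True:
        maintenance_tick()
        time.sleep(60 - 0.7)  # fires roughly every 60 seconds -> the ping spike
```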
Figuring out that something cannot be explained with the data represented, and thus may come from a source unseen, is one example of actual reasoning. And this "giving up on the data shown" is something I have yet to see any AI do.
I could say the same about most second-rate software engineers. That's why I'm not moved by your arguments. There are plenty of people just as stupid who will give you confidently wrong answers.