It’s not only their core strength — it’s what transformers were designed to do and, arguably, it’s all they can do. Any other supposed ability to reason or even retain knowledge (rather than simply regurgitate text without ‘understanding’ its intended meaning) is just a side effect of this superhuman ability.
I see your point, but I think there's more to it. It's kind of like saying "all humans can do is perceive and produce sound, any other ability is just a side-effect". We might be focusing too much on their mechanism for "perception" and overlooking other capabilities they've developed.
Sure, but that claim wouldn't be true for humans, right? So it's a non sequitur.
The relevant claim would be: all humans can do is move around in their environments, adapt the world around them through action, observe using adaptive sensorimotor systems, grow and adapt their brains and bodies in response to novel and changing environments, abstract sensorimotor techniques into symbolic concepts, vocalize this using inherited systems of meaning acquired as very young children in adaptation to their environments, etc.
In the case of transformers all they can do is, in fact, sample from a compression of historical texts using a weighted probability metric.
If you project both of these into "problems an office worker has"-space, then they can appear similar -- but this projection is an incredibly dumb one, offered as a sales pitch by charlatans looking to pretend that a system which can generate office emails can communicate.
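To make that concrete, here is roughly what "sample from a compression of historical texts using a weighted probability metric" cashes out to at each step (a toy sketch of my own, not any model's actual code): scores over a vocabulary, a softmax, and a weighted random draw.

```python
import numpy as np

# Toy next-token step: the model maps a context to scores ("logits") over a
# fixed vocabulary, and generation is just a weighted random draw from them.
vocab = ["the", "cat", "sat", "on", "mat"]
rng = np.random.default_rng(0)

def sample_next_token(logits, temperature=1.0):
    """Draw one token from the softmax of the logits."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                      # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return vocab[rng.choice(len(vocab), p=probs)]

# Pretend the (frozen) network produced these scores for the next position.
logits = [2.0, 0.5, 1.0, -1.0, 0.1]
print(sample_next_token(logits))                # e.g. "the"
```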
Abstract functions are fully representable by function approximations in the limit n -> inf; i.e., sampling from a circle becomes a circle as the number of samples goes to infinity.
This makes all "studies" whose aim is to approximate a fully representable abstract mathematical domain irrelevant to the question.
This is just more evidence of the naivety, mendacity, and pseudoscientific basis of ML and its research.
As you sample all pixels from all photos on a mountain, the pixels don't become the mountain.
The structure of a mountain is not a pattern of pixels. So there is no function for a statistical alg to approximate, no n->infinity which makes the approximation exact.
By sampling from historical pixel patterns in previous images you can generate images in a pixel order that makes sense to a person already acquainted with what they represent. E.g., someone who has seen a mountain (and has perspective, colour vision, depth, counterfactual simulation, imagination, ...).
In all these disagreeably dumb research papers that come out showing "world models" and the like, you have the bad mathematicians and bad programmers called "AI researchers" giving a function-approximation alg an abstract mathematical domain to approximate.
I.e., if the goal is to "learn a circle" and you sample points from a circle, your approximation becomes exact as n -> inf, because the target is *ABSTRACT*.
It's so dumb it's kind of incomprehensible. It shows the profound lack of understanding of science that is rampant across the discipline.
MNIST, Games, Chess, Circles, Rulesets, etc. are all mathematical objects (shapes, rules). It is trivial to find a mathematical approximation to a mathematical object.
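Here's a toy sketch of my own (nothing to do with any particular paper) of exactly that triviality in the circle case: sample points from the abstract target, fit it, and the error vanishes as n grows.

```python
import numpy as np

# The target is an abstract object: the circle of radius R centred at C.
rng = np.random.default_rng(42)
C, R = np.array([1.0, -2.0]), 3.0

def fit_circle(points):
    """Crude estimator: centre = sample mean, radius = mean distance to it."""
    centre = points.mean(axis=0)
    radius = np.linalg.norm(points - centre, axis=1).mean()
    return centre, radius

for n in (10, 1_000, 100_000):
    theta = rng.uniform(0, 2 * np.pi, n)
    pts = C + R * np.column_stack([np.cos(theta), np.sin(theta)])
    centre, radius = fit_circle(pts)
    err = np.linalg.norm(centre - C) + abs(radius - R)
    print(f"n={n:>7}  error={err:.4f}")          # shrinks towards 0 as n grows
```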
The world is not made out of pixels. Models of pixel patterns are not their targets.
> all they can do is, in fact, sample from a compression of historical texts using a weighted probability metric.
I don't think that's all they can do.
I think they know more than what is explicitly stated in their training sets.
They can generalize knowledge and generalize relationships between the concepts that are in the training sets.
They're currently mediocre at it, but the results we observe from SOTA generative models are not explainable without accepting that they can create an internal model of the world that's more than just a decompression algorithm.
I'm going to step away from LLMs for a moment, but: How are video generator models capable of creating videos with accurate shadows and lighting that is consistent in the entire frame and consistent between frames?
You can't do that simply by taking a weighted average of the sections of videos you've seen in your training set.
You need to create an internal 3D model of the objects in the scene, and their relative positions in space across the length of the video. And no one told the model explicitly how to do that, it learned to do it "on its own".
>You need to create an internal 3D model of the objects in the scene, and their relative positions in space across the length of the video. And no one told the model explicitly how to do that, it learned to do it "on its own".
Compression is understanding. If you have a model which explains shadows you can compress your video data much better. Since you "understand" how shadows work.
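To put a rough number on that (a toy sketch, using the standard fact that an ideal arithmetic coder spends about -log2 p(symbol) bits per symbol under your predictive model): a model that actually captures the structure of the data compresses it far better than one that doesn't.

```python
import math
from collections import Counter

data = "abababababababababababab"   # made-up data with an obvious pattern

# Model 1: no understanding of the data -- uniform over the alphabet.
alphabet = sorted(set(data))
def uniform(prev, c):
    return 1 / len(alphabet)

# Model 2: a tiny model of the structure -- bigram frequencies fit to the data.
pair_counts = Counter(zip(data, data[1:]))
prev_counts = Counter(data[:-1])
def bigram(prev, c):
    return pair_counts[(prev, c)] / prev_counts[prev]

def code_length_bits(text, model):
    """Ideal code length under arithmetic coding: sum of -log2 p(next | prev)."""
    return sum(-math.log2(model(p, c)) for p, c in zip(text, text[1:]))

print(f"uniform model: {code_length_bits(data, uniform):.1f} bits")
print(f"bigram model:  {code_length_bits(data, bigram):.1f} bits")   # far fewer bits
```

The same logic scales up: a video model that has internalised how shadows behave needs fewer bits to describe the next frame than one that hasn't.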
> In the case of transformers all they can do is, in fact, sample from a compression of historical texts using a weighted probability metric.
You seem to think LLMs operate independently from humans. That doesn't happen in practice. We prompt LLMs, they don't just sample at random. We teach them new skills, share media and stories with them, work, learn and play together. It's not LLMs alone. They are pulled outside their training distribution by the user. The user brings their own unique life experience into the interaction.
Well, yes — absolutely. You could say something similar about any system with complex emergent behaviour. 'All computers can do are NAND operations and any other ability is just a side effect', or something.
However, I do think that in this case it's meaningful. The claim isn't that LLMs are genuinely exhibiting reasoning ability — I think it's quite clear to anyone who probes them for long enough that they're not. I was fooled initially too, but you soon come to realise it's a clever trick (albeit not one contrived by any of the human designers themselves). The claim is usually some pseudo-philosophical claim that the very definition of reasoning is simply 'outputting (at least some of the time) correct sentences' and so there's no more to be said. But this is just silly. It's quite obvious that being able to manipulate language and effectively have access to a vast (fuzzily encoded) database of knowledge will mean you can output true and pertinent statements a lot of the time. But this doesn't require reasoning at all.
Note that I'm not claiming that LLMs exhibit reasoning and other abilities 'as a side effect' of language manipulation ability — I'm claiming there's no reason to believe they have these abilities at all based on the available evidence. Humans are just very easily convinced by beings that seem to speak our language and are overly inclined to attribute all sorts of desires, internal thought processes and whatever else for which there is no evidence.
>I think it's quite clear to anyone who probes them for long enough that they're not.
I disagree and so do a lot of people who've used them for a long while. This is just an assertion that you wish to be true rather than something that actually is. What happens is that for some bizarre reason, for machines, lots of humans have a standard of reasoning that only exists in fiction. Devise any reasoning test you like that would cleanly separate humans from LLMs. I'll wait.
> The claim is usually some pseudo-philosophical claim that the very definition of reasoning is simply 'outputting (at least some of the time) correct sentences' and so there's no more to be said.
There is nothing philosophical or pseudo-philosophical about saying reasoning is determined by output. If anything, the opposite is what's philosophical nonsense. The idea that there exists some "real" reasoning that humans perform and "fake" reasoning that LLMs perform and yet somehow no testable way to distinguish this is purely the realm of fiction and philosophy. If you're claiming a distinction that doesn't actually distinguish, you're just making stuff up.
LLMs clearly reason. They do things, novel things, that no sane mind would see a human do and call anything else. They do things that are impossible to describe as anything else unless you subscribe to what I like to call statistical magic - https://news.ycombinator.com/item?id=41141118
And all things considered, LLMs are pretty horrible memorizers. Getting one to regurgitate training data is actually really hard. There's no database of knowledge. It clearly does not work that way.
> Devise any reasoning test you like that would cleanly separate humans from LLMs. I'll wait.
Well, you don’t have to wait. Just ask basic questions about undergraduate mathematics, perhaps phrased in slightly out-of-distribution ways. It fails spectacularly almost every time and it quickly becomes apparent that the ‘understanding’ present is very surface level and deeply tied to the patterns of words themselves rather than the underlying ideas. Which is hardly surprising and not intended as some sort of insult to the engineers; frankly, it’s a miracle we can do so much with such a relatively primitive system (that was originally only designed for translation anyway).
The standard response is something about how ‘you couldn’t expect the average human to be able to do that so it’s unfair!’, but for a machine that has digested the world’s entire information output and is held up as being ‘intelligent’, this really shouldn’t be a hard task. Also, it’s not ‘fiction’ — I (and many others) can answer these questions just fine and much more robustly, albeit given some time to think. LLM output in comparison just seems random and endlessly apologetic. Which, again, is not surprising!
If you mean ‘separate the average human from LLMs’, there probably are examples that will do this (although they quickly get patched when found) — take the by-now-classic 9.9 vs 9.11 fiasco. Even if there aren’t, though, you shouldn’t be at all surprised (or impressed) that the sum of pretty much all human knowledge ever + hundreds of millions of dollars worth of computation can produce something that can look more intelligent than the average bozo. And it doesn’t require reasoning to do so — a (massive) lookup table will pretty much do.
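(For anyone who missed that fiasco: models were insisting that 9.11 > 9.9. Numerically that's wrong, but if the digits are compared piecewise, the way version numbers or tokens are, 9.11 does look "bigger", which is roughly the confusion. A trivial check:)

```python
# Numerically, 9.9 is larger than 9.11...
print(9.9 > 9.11)          # True

# ...but compared piecewise, the way version numbers (or digit tokens) are,
# "9.11" looks bigger -- roughly the confusion behind the famous failure.
print((9, 11) > (9, 9))    # True
```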
> There is nothing philosophical or pseudo-philosophical about saying reasoning is determined by output.
I don’t agree. ‘Reasoning’ in the everyday sense isn’t defined in terms of output; it usually refers to an orderly, sequential manner of thinking whose process can be described separately from the output it produces. Surely you can conceive of a person (or a machine) that can output what sounds like the output of a reasoning process without doing any reasoning at all. Reasoning is an internal process.
Honestly — and I don’t want to sound too rude or flippant — I think all this fuss about LLMs is going to look incredibly silly when in a decade or two we really do have reasoning systems. Then it’ll be clear how primitive and bone-headed the current systems are.
This overlooks how they do it. We don't really know. It might be logical reasoning, it might be a very efficient content-addressable human-knowledge-in-a-blob-of-numbers lookup table... It doesn't matter, so long as they work, which they do, sometimes scarily well. Dismissing their abilities because they 'don't reason' is missing the forest for the trees, in that they'd be capable of reasoning if they were able to run SAT solvers on their output mid-generation.
Dismissing claims that LLMs "reason" because these machines perform no actions similar to reasoning seems pretty motivated. And I don't think "blindly take input from a reasoning capable system" counts as reasoning.
Does it? I think Blindsight (the book) had a good commentary on reason being a thing we think is a conscious process but doesn't have to be.
I think most people talking past each other are really discussing whether the GPT is conscious, has a mental model of self, that kind of thing. As long as your definition of reasoning doesn't include consciousness, it clearly does it (though not well).
Hinton's opinions on LLMs are frankly bonkers. Just because you're famous — and intelligent and successful — doesn't mean you can't be completely wrong.
Also: what's his rationale? It's no use simply claiming something without evidence. And as far as I (and seemingly most others) can see, there's no such evidence other than that they can sometimes output sentences that happen to be true. But so can Wikipedia — does that mean Wikipedia is reasoning?
Also, any form of reasoning in the usual sense of the word would surely require the ability to allocate arbitrary amounts of computation (i.e. thought) to each question. LLMs don't do this — they don't sit and ponder; each token takes exactly the same amount of computation to produce. Once they hit an 'end of text' token, they're done.
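Schematically, generation is just the loop below (my own sketch; `forward` is a stand-in that returns random logits, not any real library's API). The point is that each token costs one pass of the same fixed network, and the loop simply stops at the end-of-text token.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE, END_OF_TEXT = 50, 0

def forward(tokens):
    """Stand-in for the network: returns logits for the next token.
    The point is only that this call does the same amount of work every
    time, no matter how hard the question in the prompt happens to be."""
    return rng.normal(size=VOCAB_SIZE)

def generate(prompt_tokens, max_tokens=20):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        logits = forward(tokens)                           # one fixed-cost pass
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        next_token = int(rng.choice(VOCAB_SIZE, p=probs))  # weighted draw
        tokens.append(next_token)
        if next_token == END_OF_TEXT:                      # 'end of text' -> done
            break
    return tokens

print(generate([7, 3, 9]))
```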
Even empirically speaking, LLMs' ability to reason can be seen to be nonexistent. Just try asking basic mathematics questions. As soon as you ask anything for which the answer isn't available — practically verbatim — on the web already, it produces intelligent-sounding gibberish.
This whole idea that 'LLMs must be able to reason because in order to learn to fake reasoning you must learn to actually reason' is like some kind of inverted no true Scotsman fallacy.
Yes, Hinton can be wrong, and is wrong on many things, like his misunderstanding of Chomsky and language.
But I also think he has spent thousands of hours testing these systems scientifically.
Your last sentence puts a lot of words in people's mouths. But to continue down that line: 'fake reasoning' versus 'actual reasoning' sounds like the Chinese Room. Is that the argument you are making?
We don't understand our own mental processes well enough, so I try to not anthropomorphize reasoning and cognition.
> Your last sentence puts a lot of words in peoples mouths.
Well, it’s the most common sentiment I see both on here and (before I gave up) on the AI-centred parts of reddit.
It’s not quite the Chinese Room, since LLMs can’t even simulate reasoning very well. So there’s no need to debate the distinction between ‘fake reasoning and actual reasoning’ — there may or may not be a difference, but it’s not the point I’m making.
As for Hinton: I’m sure he has. But inventors are often not experts on their own creations/discoveries, and are probably just as prone to FUD and panic in the face of surprising developments as the rest of us. No one predicted that autoregressive transformers would get us this far, least of all the experts whose decades of work led us to this point.