“Just a statistical text predictor” (asanai.net)
45 points by NumberWangMan on May 14, 2023 | 87 comments



> it is very rare for GPT4 not to be able to understand why it went wrong.

That's not what's going on. GPT4 can't know the real reason why it wrote anything. It can never tell you the real reason why it made a mistake. Instead, we can say that GPT4 is fairly good at inventing plausible explanations for why someone did something, but it's no more accurate at explaining its own mistakes than it would be at explaining one of yours. [1]

A better explanation of what happened is that asking for an explanation resulted in chain-of-thought reasoning, and by telling it that its previous answer was wrong, it was biased to pick a different answer.

But this is often fragile. You'd want to do the same test multiple times, and compare regular chain-of-thought reasoning with what happens when you explicitly tell it that a particular answer is wrong.

[1] https://skybrian.substack.com/p/ai-chatbots-dont-know-why-th...
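
For concreteness, here's a minimal sketch of that kind of repeated test, assuming the OpenAI Python client; the model name, question, and trial count are placeholders rather than anything from the article:

    # Compare plain chain-of-thought prompting with being told a specific answer is wrong.
    from openai import OpenAI

    client = OpenAI()
    QUESTION = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
                "than the ball. How much does the ball cost?")

    def ask(messages, n_trials=5, model="gpt-4"):
        answers = []
        for _ in range(n_trials):
            resp = client.chat.completions.create(model=model, messages=messages)
            answers.append(resp.choices[0].message.content)
        return answers

    # Condition A: plain chain-of-thought prompting.
    cot = ask([{"role": "user", "content": QUESTION + " Think step by step."}])

    # Condition B: the same question, but the model is first told its answer was wrong.
    biased = ask([
        {"role": "user", "content": QUESTION},
        {"role": "assistant", "content": "The ball costs $0.10."},
        {"role": "user", "content": "That answer is wrong. Think step by step and try again."},
    ])

    print(cot)
    print(biased)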


> GPT4 can't know the real reason why it wrote anything. It can never tell you the real reason why it made a mistake.

Define "know" such that humans have it and GPT-4 does not have it.


Okay, this isn't quite what you asked but it's close: we can decide to go on an errand and remember what we were doing even if we didn't write it down. (Though sometimes we do forget, so shopping lists are useful.)

This doesn't mean our justifications are accurate, but at least we do sometimes have "debug logs," some memories (perhaps faint and inaccurate) of internal thoughts that we can use to try to reconstruct what we were thinking.

An LLM cannot remember anything it didn't write down. It can't do that at all. They would need to change the API.


An LLM has no runtime state other than what it wrote down (assuming the user even provides it back). Comparing with humans, I think LLMs are not like our minds but like the inner voice. Chat history is the working memory.


Yep, and we have the text. It’s very transparent that way.

But it might be worth thinking about what a satisfying explanation of the LLM’s behavior would look like. It would have to be a high-level summary of the calculations that were done to predict each new token. Knowing which previous tokens got attention and what higher-level concepts it calculated would help. (Also, if the temperature isn’t zero, there’s a random number generator involved.)

The LLM doesn’t have access to that, once the calculation is done. It has the text, same as us. It can guess what might have happened, same as us. When it guesses it’s going to anthropomorphize by picking an explanation that would make sense for a human, because it’s trained to write justifications that a human would write.
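
For illustration, this is roughly what that introspection looks like on an open model, using GPT-2 via Hugging Face as a stand-in (GPT-4's internals aren't exposed through any API). The attention weights and the next-token distribution exist at inference time, but the model never sees them again once the token is emitted:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    inputs = tok("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)

    # out.attentions: one tensor per layer, shaped (batch, heads, seq_len, seq_len).
    # Average the last layer over heads and see where the final token attended.
    attn = out.attentions[-1][0].mean(dim=0)
    tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
    for t, w in zip(tokens, attn[-1]):
        print(f"{t:>12}  {w.item():.3f}")

    # The next-token distribution the prediction is sampled from.
    probs = torch.softmax(out.logits[0, -1], dim=-1)
    top = torch.topk(probs, 5)
    print([tok.decode([i]) for i in top.indices.tolist()],
          [round(p, 3) for p in top.values.tolist()])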


> An LLM cannot remember anything it didn't write down. It can't do that at all.

Ok, but I'm not sure what that has to do with knowledge. You can ask ChatGPT to reflect on and correct its own answers, so it clearly has access to its own outputs, which is a form of memory.


By default ChatGPT keeps a context of 8000 tokens, which includes user provided prompts. That's the size of the working memory.
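
For illustration, here's roughly what that working memory amounts to from the application side, as a sketch; token counting uses tiktoken as an approximation, and the 8,000-token figure is the one quoted above:

    # A chat front end re-sends the history every turn and trims it to the context
    # window, so anything trimmed is simply gone.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    MAX_TOKENS = 8000

    def n_tokens(message):
        return len(enc.encode(message["content"]))

    def trim_history(history):
        """Drop the oldest messages until the conversation fits the window."""
        trimmed = list(history)
        while trimmed and sum(n_tokens(m) for m in trimmed) > MAX_TOKENS:
            trimmed.pop(0)  # whatever falls off here, the model can never "remember"
        return trimmed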


Epistemology 101. Statistical pattern matching has no epistemological truth beyond a correspondence of linguistic statistics.

It is merely statistical probability, which hardly can be classified as an epistemological basis. If we pretend for a moment it is, the ontologies formed are most certainly peculiar, and we can expect ideologies that emerge out of the ontologies are problematic.


Bayesians accept an epistemological foundation of statistical priors updated by experience.

Predictive processing is well established in the neuroscience community.

Capital T Truth is really of very little interest to anyone outside of faith-based epistemology like religion.


Physicists are very interested in capital T Truth. They work with explanations not just statistical priors.


This talk by Feynman is slightly related:

https://youtu.be/obCjODeoLVw


The issue is that the definition of “statistics” is anchored in a magnitude of frequency of glyphs. The “information” is fabricated in this regard, pulled up out of the ether, and by decree christened as “meaning”.

Numbers carry no meaning, nor do the magnitudes arbitrarily assigned to meaning. The map is not the territory.


> The “information” is fabricated in this regard, pulled up out of the ether, and by decree christened as “meaning”.

No, not fabricated, but inferred from a structured corpus of information generated by other semantic processes (humans).

> Numbers carry no meaning, nor do the magnitudes arbitrarily assigned to meaning.

Prove it.

> The map is not the territory.

Except if the territory is information, in which case the map is literally the territory. Knowledge is information, is it not?


Have you heard of information theory?

Numbers can mean anything. A multitude of numbers as voltage potentials and ion gradients sufficiently describe your brain.

Biology manifests this arrangement as a brain, without which this arrangement would also be similarly meaningless.

Your argument against deriving meaning from statistics completely ignores that the brain also works this way.


> Have you heard of information theory?

> Your argument against deriving meaning from statistics completely ignores that the brain also works this way.

The brain is not predicting; it's compressing.


> The brain is not predicting; it's compressing

Compression requires prediction, therefore your brain requires prediction.
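
To make the link concrete: an ideal coder spends about -log2(p) bits on a symbol it predicted with probability p, so better prediction directly buys better compression. A toy illustration:

    import math

    text = "ab" * 16

    def bits_needed(text, predict):
        """Bits an ideal coder needs if predict(prefix) gives next-char probabilities."""
        return sum(-math.log2(predict(text[:i]).get(ch, 1e-12))
                   for i, ch in enumerate(text))

    # Predictor 1: knows nothing, assumes 26 equally likely letters.
    uniform = lambda prefix: {c: 1 / 26 for c in "abcdefghijklmnopqrstuvwxyz"}

    # Predictor 2: a crude model of the pattern: after "a" expect "b", and vice versa.
    def alternating(prefix):
        if prefix.endswith("a"):
            return {"b": 0.95, "a": 0.05}
        if prefix.endswith("b"):
            return {"a": 0.95, "b": 0.05}
        return {"a": 0.5, "b": 0.5}

    print(bits_needed(text, uniform))      # ~150 bits
    print(bits_needed(text, alternating))  # ~3 bits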


Some form of prediction being used by the higher-level neurons doesn't make the brain a prediction engine.


I'm not sure who claimed the brain was a "predictive engine" or what that means exactly. The OP specifically referenced predictive coding which describes precisely what is meant, and has empirical support.

If you meant this as a comparison to machine learning, then a predictive coding model closely matches.


The current prevailing theory in neuroscience is in fact the brain is a prediction engine.

https://en.m.wikipedia.org/wiki/Predictive_coding


[flagged]


My former claim is that everything in the natural world can be reduced to statistics, so saying meaning cannot be derived "because statistics" is a very poor argument.

The second is a theory for the underlying mechanisms of the brain.

I'm sorry you don't understand.


> Epistemology 101. Statistical pattern matching has no epistemological truth beyond a correspondence of linguistic statistics. It is merely statistical

Your belief in epistemic truth is a statistical inference from your perception of apparently reliable causality. How do you ground this inductive inference?

Your attempt to appeal to epistemology 101 with a casual dismissal as "mere statistical probability" covers this deep, gaping maw. Bayesian inference reduces to classical logic when all probabilities are pinned to 0 and 1, but in what circumstances can we actually demonstrably infer absolute certainty? None that I can think of, except one's own existence.


I did not lay claim to epistemic truth. I referenced epistemology. That is, there are a plurality of epistemologies, all of which have their own epistemic truth mechanics.

So I would agree with you, entirely!

I was merely pointing out that statistical correlation alone affords no epistemological basis.


Statistical pattern matching is how human brains work.


> It can never tell you the real reason why it made a mistake.

From my experience it can. Sometimes it writes the code and I have to point out that it produces a wrong result probably because of some function's behavior. GPT-4 can figure out the rest and fix it.

However, in general, it can go circles giving conflicting statements.


That's slightly different. It can sometimes spot a bug in some code it wrote, in the same way it could spot a bug in the code someone else wrote. It could also give a human-like justification for why the bug is there.

But that's different from a real explanation for why it wrote the code with a bug. A real explanation would have to be rather non-human.


OP knows it looks like that, but is trying to explain what might cause that appearance, rather than genuine understanding.


>That's not what's going on. GPT4 can't know the real reason why it wrote anything.

Boy do I have news for you.

People can't recreate previous mental states, it's all plausible post-hoc rationalization.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3196841/

And the rationalizations aren't always grounded either. The brain is just fine making up completely bogus explanations you believe to be true but couldn't possibly be so.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4204522/

https://pure.uva.nl/ws/files/25987577/Split_Brain.pdf

The conscious self is at best a co-pilot. A co-pilot that has shockingly little insight into the operations of the plane but thinks he knows much more.

Even your senses get some pretty heavy post-processing by your brain that you're consciously unaware of.

For example, ever wondered why the first second seems longer when you suddenly look at a clock?

Human beings may not have the best visual acuity, but we are quite good at distinguishing colors and exceptional at processing visual information. A significant portion of our brain is dedicated to processing the visual stimuli from our binocular vision.

Nevertheless, binocular vision necessitates that both eyes focus on a single point. Our eyes can track a focused target as it moves, but they are unable to smoothly scan across our visual field. This is too difficult to process, so our eyes naturally jump from one point to another. Yet, during these jumps, known as "saccades," our brain does not even attempt to comprehend the blurry sensations that come in. In effect, you temporarily go completely blind!

However, you don't feel like you went blind, right? Your vision doesn't go black or anything, and it seems like you could see the entire time. This is because our brains fill in the lag while our eyes are moving, creating the illusion that we were seeing something instead of being blind during that time. Strangely enough, our brains fill it in with *what we see when they stop*! In other words, our brains go back and edit our memory of a split second ago to make it seem like we were looking at our new point of focus the whole time our eyes were moving to actually see it.

This phenomenon occurs when you look at the second hand of a clock. When you glance over at the clock, your brain tells you that you were looking at it for longer than you actually were, adding the time your eyes were moving but not yet seeing the clock. If the second hand was moving during this time, your brain will believe the second hand was in its new position longer than it actually was. Based on this flawed memory, it will appear as if the second hand stayed in that position for longer than a second.


Yes, of course. We often don't know our real motivations either, our justifications are often self-serving and made up, and more generally there's a lot we don't know about how the brain works.

But we do at least have some memories of our own thought processes and some experience observing our own reactions to things. You can get better at introspection. Attempting to understand and explain your own mistakes is useful.

An LLM won't be able to do this sort of learning without changing its training or architecture somehow.


I actually just added a long paragraph showing how even memories and real-time sense data are partly pleasant fabrication. Also, every time you remember something, you're not really recalling it. Your brain rewrites your memory every time you "remember", often introducing false data or inconsistencies. It's one of the reasons human memory is so fallible and so often inadmissible as evidence.

That said, I don't disagree with the overall point of what you've said. Attempted Introspection is useful. The point I'm making is to not dismiss LLM "introspection" simply because it's rationalization. It can be extremely useful too.


Fair enough!

The part I see missing for an LLM is learning from feedback about whether its justifications "work" for explaining its own actions. Keep in mind that an LLM is not human and its justifications often shouldn't look like human justifications. But it's going to be scored by how human its justifications look, so you will never get that. (With the exception of things explicitly trained in like knowing its own cut-off date - that's quite a non-human justification for not knowing something.)

Also, it's easy to come up with examples showing that we do remember what we were thinking, even if we didn't say it out loud. A simple example might be running an errand. Of course, sometimes we do forget, which is why a shopping list is useful.

An LLM cannot do that at all. It writes it down or it's lost. It can only pretend to remember an internal thought it had.


>An LLM cannot do that at all. It writes it down or it's lost.

I mean, this is likely a solvable architecture issue at some point, once we have the computing power to perform 'excess' computation and re-learn quickly. Consciousness, in my mind, is a form of 'writing it down' and very rarely the primary thought itself; that is, consciousness is reflection.

We see improvements in GPT behavior when we allow externalized consciousness via reflection. It's able to 'see' its thoughts written down, reflect on whether they are correct, and then make corrections to improve the quality of its final answer.

The biggest holdup in LLMs is the amount of effort needed to retrain the LLM and ensure it hasn't gone off the rails. Currently they take months to train. Now, if Nvidia somehow makes the hardware a million times more powerful like they want to, will that take the training time down to minutes or hours?

>its own cut-off date - that's quite a non-human justification for not knowing something.)

That sounds very human to me... "I worked in the X industry from date Y to Z" is something people say all the time to put a qualification date on the information they have about something.


Yes, there's no physical reason it couldn't be fixed. However, I think it's less a matter of computing power, and more one of architecture and training.

The inference-time code could be changed to keep a log of which neurons were activated the most when it generated a token. But then, how would we train it to use that log and report it accurately? Training on Internet documents isn't enough.
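
To be fair, the logging half is already easy on an open model; here's a minimal sketch using PyTorch forward hooks on GPT-2 as a stand-in. The hard part is exactly the second half: training the model to use such a log and report it faithfully.

    # Log the five most active MLP units per layer for the final input token.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    activation_log = {}

    def make_hook(name):
        def hook(module, inputs, output):
            # Keep the indices of the five most active units at the last position.
            activation_log[name] = torch.topk(output[0, -1], 5).indices.tolist()
        return hook

    for i, block in enumerate(model.transformer.h):
        block.mlp.c_fc.register_forward_hook(make_hook(f"layer_{i}.mlp"))

    ids = tok("The Eiffel Tower is in", return_tensors="pt").input_ids
    with torch.no_grad():
        model(ids)

    print(activation_log)  # top-5 unit indices per layer, for the final input token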

Chain-of-thought reasoning helps as a workaround. However, beware that even then, the justifications aren't a reliable explanation. There's an interesting paper on this:

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

https://arxiv.org/abs/2305.04388


> People can't recreate previous mental states, it's all plausible post-hoc rationalization. [...]

How could you possibly claim people are simply incapable of recreating mental states and that all they do is plausible post-hoc rationalization? Surely people can sometimes remember what they were thinking before making a decision? And in fact I would think people can do this more frequently than GPT can. Or are you really claiming those studies proved an implausible negative here?


First: these studies are unrelated to AI and mostly predate modern LLMs, so no, they draw no direct comparisons. At best, this is pointing out two surprisingly similar behaviors we observe in LLMs and the actual functioning of the human mind.

Second: Being unable to reconstruct a prior mental state =/= remembering a thought process. I'll use an accursed programming analogy because we are on an accursed programming website: it's like the difference between debug logs and time-travel debugging. In plain language: humans retroactively reason about their thought process based on the artifacts produced by that thought process. The quality and reliability of these artifacts will vary heavily, but the reasoning itself is always backward-looking and speculative in nature.


That's all fine, but if the claim is really "I cannot know which exact neurons were firing at some moment in the past" (that's what time-travel debugging would be), that's not a particularly insightful or useful statement. I'm pretty sure I can reconstruct some states of my brain for some moments in the past sufficiently adequately to be able to explain my reasoning in that moment without fabricating anything. The claim was that this is always impossible for all humans, which seems quite obviously false, and in any case, unsupported by the studies linked.


Honestly, the burden of proof that you can recall your memory state is on you, as I don't believe anyone has brought forth evidence of what you're suggesting.

Almost everything I've seen and read about this suggests the opposite. Activating memory changes memory. Intermediate events between the initial activation and the final questioning can also change those memories. And furthermore, that you are only aware of your conscious memories (and even most of that is forgotten) and that memory is accessed unconsciously all the time.

It's all post hoc.


The sense data you consciously perceive is pleasant fabrication. Moreover, every time you remember something, you're not really recalling it. Your brain rewrites your memory every time you "remember" (what you're really doing is remembering the last time you remembered it), often introducing false data or inconsistencies. It's one of the reasons human memory is so fallible and so often inadmissible as evidence.

The reason I linked the split brain experiments is to show that what you think is the reason for making a decision, even if you "remember" the thought processes before can just be wrong. Asking one side to perform an action and asking the other side to explain the reason for performing the action ended up producing some really wild results.


> The reason I linked the split brain experiments is to show that what you think is the reason for making a decision, even if you "remember" the thought processes before can just be wrong. Asking one side to perform an action and asking the other side to explain the reason for performing the action ended up producing some really wild results.

"Can" just be wrong and "produces some" wild results, sure. In the previous comment were claiming they're always fabrications (as, I note, seems to be the case with GPT!), not just that they could be.

The fact that both GPT and the human brain can sometimes fabricate things similarly does not mean the human brain (or GPT for that matter) does this all the time, let alone that they do this equally frequently, or always to equal degrees. The studies might be showing surprising results, but you're greatly exaggerating what the studies actually show.


Memories are always part-fabrications.

Our brain stores the "key" moments it thinks are important, then makes up the rest. But this process is not really trustworthy on the finer details and is extremely receptive to any sort of suggestion, because it simply doesn't have the factual data, so it happily jumps to anything that seems to fit the narrative. This is especially true in scenarios where you experience strong emotions. If anything feels like it fits (say, someone paints a scenario that creates the same type of emotion), our brain just accepts it as a matter of fact and slots the puzzle piece in.

It's why implanting false memories is extremely easy and leading questions work so well. https://en.m.wikipedia.org/wiki/Lost_in_the_mall_technique

Anything that you remember from your life is mostly fabrication, with a couple of actual memories used as scaffolding; your brain just builds the rest of the structure around them, constantly changing it.


And yet we have a very hard time believing or accepting this reality, preferring to believe in our own infallibility.


I think some of what is misleading you here is something you think the brain does well, which it does not...

The brain is not a 'good' memory device. It is a really terrible one. Pick up any hard drive and write data to it and it's going to be a massively more faithful and reliable memory device.

Memory is not the power of the brain. Filtering is by far one of our greatest brain powers. At extremely low power levels we can input absolutely massive amounts of sensing data and capture information from noise. For example, have someone put up a wall of pictures in a room. When you first see it, your brain will capture a bunch of information, but every time after that it turns into more filtered noise you don't capture. And this noise is really easy to switch out with different objects and thoughts. When one of your trickster friends switches out the picture of your grandmother for Hitler, it's exceptionally unlikely you'll notice.

But what's even better about this is when someone points this out to you, how they point it out to you will influence "when you noticed it". If they say something like "We switched out this picture months ago and you noticed it pretty quickly" your brain will rewrite history when you recall this gag later and you will think you noticed it months beforehand unless you go to a pretty significant effort not to be gaslit.


Exactly, the people making confident statements about the inner workings of the brain are just as ridiculous as the people making the same statements about LLMs.

We should have a healthy dose of humility here and realize we know very little about either the brain or the meaning hidden in those ~200 billion parameters.


"what they were thinking before making a decision?" Human brains are notoriously unreliable, we lie to ourselves all the time, we can't trust our memories.


> "what they were thinking before making a decision?" Human brains are notoriously unreliable, we lie to ourselves all the time, we can't trust our memories.

"Unreliable" is a far cry from "lie to ourselves all the time". Some of the time, sure. But all the time? Everything I'm telling myself is a post-hoc fabrication just like with GPT? Like you're telling me I can't even trust my memory of who my mom and dad are, or which country I live in? I'm justifying all of these post-facto somehow?


Perhaps those specific details are just a few heavily reinforced data points. Perhaps they become more and more statistically likely to be true the more they are reinforced?


Or perhaps they aren't, who knows. Regardless, is this a hypothesis that leads to anything useful?


Humans work in a completely different way, and human performance is irrelevant to the obvious, systemic limitations of LLMs.

You haven't addressed the substance of the comment you're responding to, only diverted attention. Unless you're claiming that humans are nothing but LLMs...


> Humans work in a completely different way

Citation?


Hugo Mercier was recently on Sean Carroll's podcast talking about the concepts from his new book The Enigma of Reason. The gist of it is that most of the function of what we call reasoning doesn't exist to implement logical reasoning, but rather evolved to provide the social utility of providing "reasons" to convince or placate other humans. I guess what I'm saying is we may be closer to ChatGPT than we think - we may not usually be logically reasoning when we provide reasons, but inventing after the fact reasons that are statistically likely to be accepted for what we did or what we want to do.

Podcast: https://www.preposterousuniverse.com/podcast/2023/04/17/233-...


Not a fan, OP.

No, reducing LLMs to just “statistical text predictor” is an absurdity; but so is leisurely stringing together observations from unstructured experimentation to form a “matter-of-fact” conclusion.

Read the ChatGPT paper and specifically home in on "transformer" and "input processing" (paraphrased). It will give you a clearer picture of what it actually is, rather than what it appears to be.


We do not expect just knowing the differential equations governing biological neurons to give us much insight into human intelligence. Why do you think it should be different for artificial ones?

I've read the relevant papers, I've even implemented some of them, and I do not think any of this gave me a better understanding of its capabilities than playing with it as a black box for a few weeks.


> We do not expect just knowing the differential equations governing biological neurons to give us much insight into human intelligence.

No, I expect not.

But I think it’s a mistake in thinking and loose use of language to say that differential equations, or any mathematical representation, governs anything. They are models of whatever may actually be happening, and are necessarily modeling a simplification of the behavior observed.

Don’t mistake the model for reality.


There is no "reality" for you either. Everything you perceive is a pleasant model based fabrication of data the brain receives. This is why you see a second tick by faster when you suddenly look at the clock. This is why it seems you see something when you shift your eyes quickly instead of the completely black you should be seeing. This is why most will not see what they should be seeing here - https://www.youtube.com/watch?v=Ahg6qcgoay4

Every memory you store is part-fabrication too. Only what the brain perceives as "key" is stored, scaffolded by made up events. Every memory you retrieve is even more fabrication because the brain rewrites memories every time they're recalled. This is why implanting false memories is extremely easy and why leading questions work so well. https://en.m.wikipedia.org/wiki/Lost_in_the_mall_technique

The vast majority of what you know and perceive was not derived from first principles.

Make no mistake here, you are working off a model of reality too.


Fair enough, but I would argue that even having perfect knowledge of everything happening in a biological neuron down to the quantum level would not help very much when you are trying to understand how the brain makes decisions.


This is a good overview of a number of emergent things that remain unexplained about the capabilities of some large language models:

https://www.scientificamerican.com/article/how-ai-knows-thin...

You can follow the links from there to the published papers that describe each of the cases in greater depth.


Looking at the Transformer model architecture and its training schemes to understand how these models think misses the point. You wouldn't gain insight into how the game Doom works by just looking at the schematics of an x86 processor.

These systems are Turing complete and they can execute any computation. To see what they actually execute requires looking at what happens in memory while Doom runs, in the activations when the Transformer model executes.

At this time we don't have good tools to extract the actual algorithms and submodels these LLMs execute, but there are reasons to believe that better algorithms have been learned in there, for example for reinforcement learning, than anything we have ever been able to engineer.

See for example my research: https://github.com/keskival/king-algorithm-manifesto#readme


Mechanistic interpretability is just getting started, but there's been some interesting research on induction heads. Here's an explanation:

https://www.lesswrong.com/posts/TvrfY4c9eaGLeyDkE/induction-...


That was fascinating, thanks. Please do consider posting this here as its own article; it would be interesting to read others' comments.

My reading of this - and please correct me if I'm wrong, I'm still learning - is that you're extracting hyper-parameter planes from the data flow in the model's embedding space?

It's really exciting to think of the hidden knowledge and relationships we could extract from our own linguistic interactions.


> You wouldn't gain insight into how Doom game works by just looking at schematics of an x86 processor.

How is this a relevant analogy?


> … though it initially became cautious:

> [GPT4]: “equine.”

> [Human]: What sort of equine?

> [GPT4]: “horse.”

The sentence it had been asked to complete ended with “… the resulting animal will look like an”. I suspect “equine” wasn’t the product of caution, but of matching the article an. (There may still be some English accents that use “an horse”—perhaps some Indian and some parts of England—but the vast majority now use “a horse”.)


I think this misunderstands the reason why people say LLMs are "just" statistical text predictors and misunderstands LLMs in general.

Here's where you're wrong:

> The real reason behind this bizarre answer is that it matches the form of a more elaborate puzzle where a similar indirect approach is necessary. GPT4 gets caught up in the tendency to match previous examples, and this is a genuine reflection of its cognitive history; it started out as a statistical text predictor. This is not the same as not understanding the question; this is a reflection of a faulty cognitive architecture that can be fixed relatively easily.

This is not anything that can be easily fixed. This type of faulty problem-solving is a fundamental property of how LLMs work. It can be attenuated or avoided, but not repaired or removed.

LLMs work by assembling language into plausibly-human-written sentences, and if you ask them to engage in problem-solving, they will often write their way to a correct solution, but examples like the 12L/6L jug question are revealing: The LLM doesn't actually reason. It imitates the appearance of reasoning, and does so extremely well, but that's not the same. It doesn't understand why you would pour one jug into another; it's been taught that pouring jugs into each other and arriving at the required number of litres are characteristics of the solutions of similar problems it's been trained on, and so it does all of these things. LLMs imitate human writing, and can, by proxy, reproduce human reason, but they have a tenuous and indirect grasp on it, and this drawback is built into their design.
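
For contrast, here's roughly what actually modelling this class of puzzle looks like: a brute-force search over jug states under the real rules (fill, empty, pour). The 12L/6L sizes come from the thread; the 6L target is my assumption for illustration. Because it models the rules rather than the surface form of past solutions, it returns the trivial one-step answer immediately:

    from collections import deque

    def solve(capacities, target):
        start = tuple(0 for _ in capacities)
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            state, steps = queue.popleft()
            if target in state:
                return steps
            moves = []
            for i, cap in enumerate(capacities):
                moves.append((f"fill jug {i} ({cap}L)",
                              tuple(cap if j == i else v for j, v in enumerate(state))))
                moves.append((f"empty jug {i}",
                              tuple(0 if j == i else v for j, v in enumerate(state))))
                for k, cap_k in enumerate(capacities):
                    if k == i:
                        continue
                    amount = min(state[i], cap_k - state[k])
                    new = list(state)
                    new[i] -= amount
                    new[k] += amount
                    moves.append((f"pour jug {i} into jug {k}", tuple(new)))
            for desc, new_state in moves:
                if new_state not in seen:
                    seen.add(new_state)
                    queue.append((new_state, steps + [desc]))
        return None

    print(solve((12, 6), 6))  # -> ['fill jug 1 (6L)']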


I see absolutely no reason (except cost) that LLM interfaces shouldn't utilize an internal dialog which is hidden from the user and which can be queried from the first layer of interaction and potentially vice versa.

This would emulate much of what we consider human reasoning. And if it quacks like a duck it might be a duck...
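
Here's a rough sketch of that hidden internal dialogue, assuming the OpenAI Python client; the model name and prompts are placeholders, and this is the simplest possible version (one private pass, then a visible one):

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4"

    def answer_with_hidden_scratchpad(question):
        # Stage 1: a private scratchpad the user never sees.
        scratch = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": "Think step by step about this question in a private "
                                  "scratchpad.\n\n" + question}],
        ).choices[0].message.content

        # Stage 2: the visible reply, conditioned on the hidden reasoning.
        reply = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": f"Question: {question}\n\n"
                                  f"Private notes (do not reveal them): {scratch}\n\n"
                                  "Give a concise final answer."}],
        ).choices[0].message.content
        return reply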


Asking the LLM to be more verbose can bring contradictions to the surface and lead to more reliable answers, but it still doesn't fix the underlying problem. Even when you ask it to show its work, it can still contradict itself, blindly imitate patterns in its training data, and contrive plausible-sounding but false answers.

I've got a (rather long) transcript I could post of a simple logic problem which I asked ChatGPT to solve step by step. It started off okay, but then it started completely breaking the rules of the puzzle and the internal consistency of its answer. When I asked it to identify its mistake, it made up a totally different mistake it never made.

The thing is, it's not emulating. It's imitating, and there's a big difference: When you emulate a logical process, you're copying it from the bottom up, and the internal logic is the same. If there are mistakes, they're fairly marginal. When you imitate a process, however, you're trying to reproduce the same output without recreating the internal logic that creates that output, and it's easy to detect and extrapolate patterns in the output in ways that don't make sense (like with the 12L/6L jug puzzle).


> it can still contradict itself

Do humans not quite often contradict themselves?


Not the way LLMs do. Typically, humans are only hypocritical when they're trying to conceal their true values. For example, when someone makes a "states' rights for me but not for thee" argument, they don't actually believe in "states' rights." They're just cynically arguing whatever position will forward their agenda.

An LLM, conversely, will sometimes directly gainsay itself, hallucinating different versions of its own output from just paragraphs earlier. That's not just hypocrisy—that's not having any actual concept of truth beyond an imperfect idea of when a sentence does or doesn't look "realistic."


>Typically, humans are only hypocritical when they're trying to conceal their true values.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3196841/

People sure as hell think they know what previous mental states and preferences inform their decisions, but that's just as likely to be nice post-hoc rationalization as truth.

You're a co-pilot at best. You don't actually know why you make the decisions you do. No need for any "concealing my true self" rhetoric.


The article you linked has nothing to do with your claim and neither the article nor your claim have anything to do with my point. The article demonstrates proof of choice-supportive bias, not that people have no recollection of their previous mental states. And LLMs don't merely exhibit choice-supportive bias—they wantonly contradict things they said moments earlier and wildly hallucinate things that never happened.


I didn't say people have no recollection of previous mental states. I said people don't know what mental states and preferences actually inform decisions. They just think they do. And previous mental states can't be recreated.

Hitler had his share of Jews he saved from the Holocaust. For example, he befriended a Jewish girl, Rosa Bernile Nienau, among a few others, and kept in regular contact with her. So tell me, what "secret self" was Hitler hiding?

Humanity is chock full of contradictory actions. This nonsense "secret self" theory doesn't even make sense for most of them.

>they wantonly contradict things they said moments earlier and wildly hallucinate things that never happened.

Brains do this in spades. Sense data you perceive is partly fabricated for the conscious self. This is why you see a second tick by faster when you suddenly look at the clock. This is why it seems you see something when you shift your eyes quickly instead of the completely black you should be seeing.

Memories are always part-fabrications too. Only what the brain perceives as "key" is stored; the rest gets made up. Every time you recall a memory, the brain rewrites part of it, often introducing more fabrications. It's extremely easy to implant targeted false memories for this exact reason. https://en.m.wikipedia.org/wiki/Lost_in_the_mall_technique

And then there's the results of the split brain experiments too

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7305066/

So this idea that your brain doesn't "hallucinate" or make up details is very laughable. Your brain is quite literally constantly doing that, with the conscious mind completely unaware.


> Brains do this in spades. Sense data you perceive is partly fabricated for the conscious self.

You must be sprinting to move the goalposts that far that fast. I'm not talking about convenient sensory illusions or details of perception. I'm talking about ChatGPT telling bald-faced and obvious lies about things it said mere sentences earlier. Would you engage with my points instead of switching to totally unrelated topics like Hitler's single Jewish friend?

Humans are not perfectly rational beings and our memory is faulty, but we are broadly rational and have a generally consistent model of the world; LLMs are painfully inconsistent and exhibit wild bouts of irrationality, and this behaviour is totally in line with the way they are trained.


I think it is a mistake to consider this an inherent weakness of LLMs. Nothing precludes a system architecture for a next-generation LLM from having loops for rationality or morality reasoning which filter the output.

We are likely to see many human psychotic disorders arising in the LLM output when we start doing this but we will likely see less of the pivoting on answers we see now.


No, you're massively anthropomorphizing them. You can't just add a "loop for rationality"—writing is a second-order product of first-order reasoning. If a model trained on that second-order product isn't reproducing the first-order behaviour, you can't just tack it on in an extra module.

It's like trying to teach something to dance based on the footprints of dancers. Depending on the dance, maybe it'll work and maybe not, but with the dance of reason and the footprints of writing it's not working, and no extra module we tack on will change the relationship between the dance and the footprints. It's a fundamental flaw in the LLM architecture.


Hiding part of the LLMs output from the user doesn't make what it's doing fundamentally different than what it does now.

Also, anecdotally, I can totally reason about things non-verbally in my mind and then verbally express a conclusion. Not everything is an internal dialogue for me. But so far, everything has to be an internal (or external) dialogue for an LLM or it can't reason at all.


Some people don't have an inner monologue at all, and reason entirely non-verbally except when they're writing or speaking aloud.

https://www.cbc.ca/news/canada/saskatchewan/inner-monologue-...


This article could really be shortened to:

Calling LLMs a statistical text predictor is like calling a brain a bunch of neurons responding to electrical impulses. It's true, but it misses the thing we care about, which is the emergent behavior of such systems; in humans we call it consciousness. For LLMs, which are akin to a language-processing center all by itself, consciousness isn't quite the word, but some as-yet-unnamed thing.


"If it looks like magic to me, it must be magic."


I understand that you may believe your ability to think is magic, but I assure you, you are also just a statistical predictor, and not magic. Thinking and cognition themselves, beyond the medium they happen within, are moving out of the realm of philosophy and into the realm of science now. It just so happens that quite a lot of people are uncomfortable with the idea of not being as special and unique in this way anymore.


A lot of people are quite unhappy with it, yes, but... Consciousness is still very much a philosophical problem. Think of Marr's levels, functionally, LLMs and humans are quite similar, but algorithmically? How about the implementation layer? There are still many, many mysteries to be solved in the realm of philosophy of mind and LLMs should serve as reminders that Consciousness and Intelligence are not the same.

May I suggest this reading? https://qualiacomputing.com/2022/06/19/digital-computers-wil...


> I understand that you may believe your ability to think is magic.

Nobody said that. I am sick of the false dichotomy of being given only two choices:

a) It's magic

b) It's akin to our latest discovery


Yes. I think this is exactly right and feel like I'm waiting for people to get over the shock. It's really interesting (but yes, initially scary) to think about what really makes us human.

In a sense we _get_ to think about all the things we can do that robots can't. I hope at some point most feel a sense of relief that a robot can assist in the robotic things we're often forced to do.


You're positing this with way too much confidence, like GPT. We absolutely do not understand what thinking is. As far as I'm concerned, we are at least self-correcting, conscious, aware statistical predictors.

It's not only prediction either. Some of it is direct response or autonomous behavior (your heart doesn't beat because it tries to predict something).


Any sufficiently advanced statistical predictor is magic?


> One of my first engagements with GPT4 was to discuss the concept of maralia. [...] I chose maralia because the discussion involved a complex philosophical concept that I knew was not in its training set. [...] GPT4 not only understood the idea, but it was able to use the concept appropriately [...]

This is like the time when Alan Sokal submitted an article to Social Text but this time it is the social scientist playing a trick on themself.


> "I think this assessment is wrong on many levels"

Ok so in your expert opinion, what is it actually doing?

> "Exactly what it does, no one really knows, not even its creators."

I mean, it obviously couldn't just be feeding all the previous text into a model that statistically predicts the next likely words and then does some post processing to intelligently pick one of the high ranked words, just like the developers say. Because that would just be a statistical text predictor.

People seem to really hate the idea that something so simple can fool them into thinking it is alive, like it is a challenge to their own intelligence.
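
And to spell out the "pick one of the high ranked words" part: that post-processing is temperature plus top-k (or top-p) sampling over the model's next-token scores. A toy version with made-up logits and vocabulary:

    import numpy as np

    def sample_next_token(logits, temperature=0.8, top_k=3, rng=np.random.default_rng()):
        logits = np.asarray(logits, dtype=float) / temperature  # sharpen or flatten
        top = np.argsort(logits)[-top_k:]                       # keep the k best tokens
        probs = np.exp(logits[top] - logits[top].max())
        probs /= probs.sum()
        return rng.choice(top, p=probs)

    vocab = ["cat", "dog", "horse", "equine", "car"]
    logits = [2.1, 1.9, 3.5, 3.2, -1.0]        # made-up scores for the next word
    print(vocab[sample_next_token(logits)])    # usually "horse", sometimes "equine"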


> I mean, it obviously couldn't just be feeding all the previous text into a model that statistically predicts the next likely words and then does some post processing to intelligently pick one of the high ranked words, just like the developers say.

Of course that's what it's doing, but if I ask you how you typed that comment you wouldn't say "trillions of synapses whose connection strength was determined based partly on my genetics but mostly on all my sensory data since I've been alive ended up firing in order to make my fingers press the 'reply' button".

Do not mistake understanding the substrate and the loss function with understanding the actual model.


Sure, but saying "it is magic and nobody really knows how" when we have a solid scientific basis for how messages are carried from the brain to the fingers is highly disingenuous.


I agree that saying that "it is magic and nobody knows how" is not a good explanation for anything. But saying "we understand the substrate so we are done" is no better.


It is a statistical text predictor, and fine-tuning (RLHF) is the "magic" that gives it meaning and reasoning via weights.



