
I'm not sure "hallucination" is the right word.

I've seen it referred to as "stochastic parroting" elsewhere, and that probably gives more insight into what is happening. These large language models are trained to predict the next word for a given input. And they don't have a choice about this; they must predict the next word, even if it means that they have to make something up.

So perhaps the solution would be to include the prediction confidence in the output. E.g. gray out the parts of the text that are low-confidence predictions, like downvoted HN comments.
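Something along these lines is already possible with open models. A minimal sketch, assuming the Hugging Face transformers library and GPT-2 (the model choice and the 0.3 threshold are arbitrary choices of mine):

    # Sketch: surface per-token probabilities and flag low-confidence tokens,
    # roughly the "gray out" idea. gpt2 and the 0.3 threshold are arbitrary.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of Australia is", return_tensors="pt")
    out = model.generate(
        **inputs,
        max_new_tokens=8,
        do_sample=False,
        return_dict_in_generate=True,
        output_scores=True,  # per-step logits for each generated token
    )

    new_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
    for tok_id, step_logits in zip(new_tokens, out.scores):
        p = torch.softmax(step_logits[0], dim=-1)[tok_id].item()
        flag = "  <-- low confidence" if p < 0.3 else ""
        print(f"{tokenizer.decode(tok_id)!r}  p={p:.2f}{flag}")

Note this is confidence in the wording, not in the facts, which is the catch pointed out further down the thread.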



Isn't the problem more _because_ it's a language model, and not a knowledge model? It's not based on facts, or even able to go out and find facts. If it's not in the training set it simply doesn't know.

It seems like this is only a single layer to something that should be larger. It should be able to tell if what it's saying is true, or to go out and find facts when it's missing them.

The fact that it's only a language model probably means that this is just out of scope.


> It seems like this is only a single layer to something that should be larger

Absolutely correct, and I believe anyone working on these models would agree and, other than as a fun demo, would never suggest that the raw model output gets used for any real purpose. A similar analogy would be self-driving cars. Somewhere "under the hood" there is an ML computer vision model, but it's not like the output layer is just hooked up to the gas and steering. There is all sorts of other logic to make sure the car behaves as intended and fails gracefully under ambiguity.

People see these language models and their flaws and somehow interpret that as a flawed overall product, when they are really just seeing the underlying model. Admittedly, OpenAI hasn't helped much by building and promoting a chatbot the way they have.

Lots of cool potential for large language models; very little of it comes from raw interaction.


That doesn't stop the companies churning out these models from pretending otherwise x)


If you would like another Latin word for it, take "confabulation" from neuroscience-land: https://en.wikipedia.org/wiki/Confabulation#Signs_and_sympto...


I had an elderly neighbor who unfortunately suffered from this. I spoke with her off-and-on over the first year or so, and she loved to talk. She would tell me about her daughter and grandkid, things that she saw that day, etc.

It was all very plausible but I always felt like there was something off about her. Then one day she told me a story about me, and things I’d said, done, and experienced, and it was all absolutely made up, from the overarching plot down to the finest details. It never happened, couldn’t have happened, and couldn’t even have been something that happened to someone else.

I tried to politely correct her at first, but she was so certain that she began worrying about me and why I couldn’t remember, so I decided to just stand and nod to avoid stressing her out.


Came here to say the same thing. Medically, confabulation is different from hallucination and far more similar to what is being described. Confabulation is seen in Wernicke-Korsakoff syndrome, which can be found in long-standing alcohol use disorder. The patient makes up stories to fill the gaps in their memory without necessarily realizing that is what they are doing.

Hallucinations, by contrast, are sensory disturbances happening in the present moment.


Or simply use "filling-in": https://en.wikipedia.org/wiki/Filling-in


That still wouldn't help here. We don't want the prediction confidence that the sequence of words you produced might appear in a valid English-language sentence produced by humans. We want the prediction confidence that the sentence is factually accurate. These models aren't given that kind of data to train on and I'm not sure how they even could be. There are oodles and oodles of human-generated text out there, but little in the way of verification regarding how much of it is true, to say nothing of categories of language like imperative and artistic that don't have truth values at all.


> I'm not sure "hallucination" is the right word. I've seen it referred to as "stochastic parroting" elsewhere, and that probably gives more insight into what is happening.

It may give more insight, but it seems to me that hallucination is very similar: the brain completing some incomplete/random data to what it thinks is plausible and/or desirable.


That's how sensory processing works in general, not just hallucinations.


Extrapolating could be an alternative phrasing.


Hallucination is commonly used in ML parlance and gets the point across without needing to know what "stochastic" means.


"Stochastic" means "random, not supported by facts, hypothetical" in every context in which it is used, across many fields.

The real problem is that anyone thought that they could pull factual material out of a giant language correlation network.


Stochastic screening in printing (as opposed to halftoning) samples constrained random color points from a real/actual image.

The uses of "stochastic" I've seen in the wild have nothing to do with two-thirds of that definition.


The temperature parameter selects randomly from the model's output distribution, with the sampling being more or less random/predictable depending on its value (stochastic sampling).

Not contradicting you, but wanted to add it. I was reading about it today.
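If it helps, here's a toy sketch of what temperature does to the distribution being sampled from (the four logits are made-up numbers, not real model output):

    # Toy temperature sampling: low T sharpens the distribution (near-greedy),
    # high T flattens it toward uniform (more random).
    import numpy as np

    def sample_with_temperature(logits, temperature, rng=np.random.default_rng(0)):
        scaled = np.asarray(logits, dtype=float) / temperature
        probs = np.exp(scaled - scaled.max())  # numerically stable softmax
        probs /= probs.sum()
        return rng.choice(len(probs), p=probs), probs

    logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for 4 tokens
    for t in (0.2, 1.0, 2.0):
        _, probs = sample_with_temperature(logits, t)
        print(f"T={t}: {np.round(probs, 2)}")

As T approaches 0 this degenerates into always picking the most likely token; as T grows it approaches picking tokens uniformly at random.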


It’s also a misleading anthropomorphization that can get across the wrong message - in particular among those who don’t know what “stochastic” means but also among those who should know better.


If people in the ML community don't know what stochastic means, then how can they communicate with each other? Precision in communication in such contentious areas seems to me to be of paramount importance, especially when speaking to people not in one's immediate circle.


They are not forced to come up with new ideas. They can also write something like "I have no further information about that". But in training this is probably discouraged, because they shouldn’t answer all questions like that.


I don't think it works that way. The models don't have a database of facts, so they never reach a point where they know that something they're saying is based on the real world. In other words, I think they literally operate by just predicting what comes next, and sometimes that stuff is just made up.


ChatGPT has responded to a lot of my requests with an answer along the lines of "I don't have information about that" or "It's impossible to answer that without more information, which I can't get."

Sometimes, starting a new session will get it to give an actual answer. Sometimes asking for an estimate or approximation works.


This is covered in ChatGPT’s learn more section:

> Limitations

> ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.

https://openai.com/blog/chatgpt/


That's a filter answering, not GPT. And there are ways to disable those filters (e.g. "Browsing: Enabled" was reported to work, though I haven't tried it myself, and it would let you get around the "I can't browse the web" filter).


ChatGPT has done that for me too, but as you note asking the question a slightly different way produced a positive response. I think they simply trained it to produce “I don’t know” as a response to certain patterns of input.


Yes, the training doesn't encourage this. It encourages guessing, because if it guesses the next word and it's right, the guessing is reinforced.

Whenever the model gets something right, it's the result of good guesses that were reinforced. It's all guesswork, it's just that some guesses are right.


> And they don't have a choice about this; they must predict the next word, even if it means that they have to make something up.

No, they could easily generate the end-of-sequence symbol, or the words “I don’t know.”


> The word "hallucination" itself was introduced into the English language by the 17th-century physician Sir Thomas Browne in 1646 from the derivation of the Latin word alucinari meaning to wander in the mind. For Browne, hallucination means a sort of vision that is "depraved and receive[s] its objects erroneously".[8]

I'm not sure if we know enough about hallucination to confirm that it's that much different from what GPT is doing.


The next word is always chosen based on some sort of probability output, correct? Then why isn't it possible to notice when the highest probability drops and the output is likely nonsense? Being able to say "I'm not sure" would be a massive improvement to this model.

Another cool feature would be to provide sources for the information: which web pages contributed most to a specific statement. Then a human can follow up manually.
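If per-token probabilities were exposed (as in the GPT-2 snippet further up), the "I'm not sure" part could be a simple policy on top. A toy sketch, with an arbitrary threshold and fallback text, and with the caveat the reply below raises, that fluent text can be high-probability and still false:

    # Toy abstention policy over already-computed per-token probabilities.
    # The 0.2 threshold and the fallback wording are arbitrary choices.
    def answer_or_abstain(tokens, probs, threshold=0.2):
        if min(probs) < threshold:
            return "I'm not sure."
        return "".join(tokens)

    print(answer_or_abstain(["The", " capital", " is", " Canberra", "."],
                            [0.61, 0.45, 0.72, 0.38, 0.90]))
    print(answer_or_abstain(["The", " capital", " is", " Melbourne", "."],
                            [0.61, 0.45, 0.72, 0.04, 0.90]))  # abstains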


The problem is that "I'm not sure" has only a few synonyms, like "I don't know", but the correct answer to a complex question can be phrased in many ways. For instance, "How do owls catch mice?" could be answered by "Researchers in Britain have found...", or "Owls in Europe...", or "Bird claws can be used to...", or "Mice are often found in...", etc. Even if the model "knows" the answer with high probability, it could be that any particular way of expressing that knowledge is less likely than an expression of ignorance.

And besides that technical issue, since a GPT-style model is trained to mimic the training data, it is _supposed_ to say "I don't know" with a certain probability that reflects how many people commenting on the matter don't know, even when there are other people who do know. That's not what you want in a system for answering questions.

The enterprise is fundamentally misguided. A model for predicting the next word as a person might produce it is not a reliable way of obtaining factual information, and trying to "fix" it to do so is bound to fail in mysterious ways - likely dangerous ways if it's actually used as a source of facts.

In contrast, there are many ways that a GPT-style model could be very useful, doing what it is actually trained to do, particularly if the training data were augmented with information on the time and place of each piece of training text. For example, an instructor could prompt with exam questions, to see what mistakes students are likely to make on that question, or how they might misinterpret it, in order to create better exam questions. Or if time and place were in the training data, one could ask for a completion of "I saw two black people at the grocery store yesterday" in Alabama/1910 and California/2022 to see how racial attitudes differ (assuming that the model has actually learned well). Of course, such research becomes impossible once the model has been "fixed" to instead produce some strange combination of actual predictions and stuff that somebody thought you should be told.


Look at the titles, author names, and links. The model would be very confident in that output because it is so close to the mean. The model doesn't know that it is confused; instead it confidently parrots the most generic, bland continuation it can come up with.


I wonder if "cargo culting" would be an accurate characterization.



