
> since the issue is general

I'm not sure what that means specifically. I don't agree overall. Only certain types of problems encountered by LLMs map cleanly to well-understood problems where existing solvers are perfect.




I am stating that, since the ability to solve those puzzles is critical to an intelligence, and the general questions I can think of require an intelligence as their processor, then if LLMs "should write code" to solve those problems, they should in general write code.

All problems require proficient reasoning to get a proper solution - not only puzzles. Without proper reasoning you can get some "heuristic", which can only be useful if all you needed was an unreliable result based on "grosso modo" criteria.


> Without proper reasoning you can get some "heuristic"

Right, but the question is whether this is good enough. And what counts as "proper". A lot of what we call proper reasoning is still quite informal, and even mathematics is usually not formal enough to be converted directly into a formal language like Coq.

So this is a deep question: is talking reasoning? Humans talk (out loud, or in their heads). Are they then reasoning? Sure, some of what happens internally is not just self-talk, but the thought experiment goes: if the problem is not completely ineffable, then (a bit like Borges' library) there is some 1000-word text which is the best possible reasoned, witty, English-language solution to the problem. In principle, an LLM can generate that.

If your goal is a reductio - i.e. that my statement must be false since it implies models should write code for every problem - then I disagree, because while the ability to solve these problems might be a requirement to be deemed "an intelligence", many other problems which require an intelligence don't require the ability to solve these problems.


> Are they then reasoning

Reasoning properly is at least operating through processes that output correct results.

> Borges' library

Which in fact is made exactly of "non-texts" (the process that produces them is `String s = numToString(n++);` - they are encoded numbers, not "woven ideas").
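
A minimal sketch of that enumeration, to make the point concrete (the 27-symbol alphabet of space plus a-z and the class name are my own assumptions, nothing specific to Borges):

    public class Library {
        static final String ALPHABET = " abcdefghijklmnopqrstuvwxyz"; // 27 symbols

        // Each "book" is just the counter n written out over the alphabet:
        // an encoded number, not a woven idea.
        static String numToString(long n) {
            StringBuilder sb = new StringBuilder();
            do {
                sb.append(ALPHABET.charAt((int) (n % ALPHABET.length())));
                n /= ALPHABET.length();
            } while (n > 0);
            return sb.reverse().toString();
        }

        public static void main(String[] args) {
            for (long n = 0; n < 30; n++) {
                System.out.println(numToString(n)); // " ", "a", ..., "z", "a ", "aa", ...
            }
        }
    }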

> many other problems which require an intelligence don't require the ability to solve these problems

Which ones? Which problems that demand producing correct solutions could be solved by a general processor which could not solve a "detective game"?


I can't reply to your new post below, I guess the thread is too deep. But you've bitten the bullet and stated that what humans do is not reasoning, I think.

You didn't like "what colour is the sky" (without looking), ok. "Given the following [unseen during training] page of text, can you guess what emotion the main character is feeling at the end?" This is a problem that a human can solve, and many LLMs can solve, even if they can't solve the detective puzzle. In case it doesn't sound important, this can be reframed as a customer-service sentiment-recognition problem.
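
As a purely hypothetical sketch of that reframing - callModel is a stand-in for whatever LLM client you use, not a real API:

    public class EmotionGuess {
        // Stand-in: a real implementation would send the prompt to an LLM endpoint.
        static String callModel(String prompt) {
            return "frustration"; // dummy reply so the sketch runs
        }

        public static void main(String[] args) {
            String message = "I've been on hold for an hour and nobody can tell me where my order is.";
            String prompt = "Read the following customer message and answer with one word naming "
                    + "the dominant emotion (e.g. anger, relief, confusion):\n\n" + message;
            System.out.println(callModel(prompt));
        }
    }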


> I can't reply to your new post below, I guess the thread is too deep

(I'd instead guess that you tried to reply before the timer allowed you - the one that makes HN members wait a delay that grows with the depth of the discussion tree.)

> do is not reasoning

What some people do is «not reasoning», for lack of training, or for lack of resources (e.g. time - Herbert Simon's "satisficing"), or for lack of ability. Since the late-2022 boom I have had to write that "if the cousin you write about is consistently not using the faculty of intelligence, you can't call her "intelligent" for the purpose of this discussion". I have just written in another parallel discussion that «there is a difference between John, who has a keen ethical sense, Ron, who does not exercise it, and Don, who is a clinical psychopath whose missing cerebral modules make him completely values-blind» - of course, if we had to implement ethics we would "reverse-engineer" John and use Don as a counter-model.

> can you guess what emotion

Let me remind you of my words: «Without proper reasoning you can get some "heuristic", which can only be useful if all you needed was an unreliable result based on "grosso modo" criteria». Is that a problem with "true solutions" or one with "good enough solutions"?

Let me give another example. Bare LLMs can be "good" (good enough) at, e.g., restoring capitalization and punctuation in "[a-z0-9 ]" texts, such as raw subtitles. That is because they operate without explicitly pondering the special cases in which it is subtle to decide unequivocally whether the punctuation "should have been a colon or a dash"; such cases are generally rare, so a heuristic seems to suffice.
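
For concreteness, here is roughly what that input looks like (the restoration step itself is just a prompt to the model; none of this names a specific API):

    import java.util.Locale;

    public class RawSubtitle {
        // Reduce text to the "[a-z0-9 ]" form: lowercase, no punctuation, single spaces --
        // roughly what a raw subtitle stream looks like.
        static String strip(String text) {
            return text.toLowerCase(Locale.ROOT)
                       .replaceAll("[^a-z0-9 ]", " ")
                       .replaceAll(" +", " ")
                       .trim();
        }

        public static void main(String[] args) {
            String original = "One thing was clear: the colon, or was it a dash, had to go somewhere.";
            String raw = strip(original);
            System.out.println(raw);
            // -> "one thing was clear the colon or was it a dash had to go somewhere"
            // A bare LLM prompted with "Restore capitalization and punctuation: " + raw
            // will usually do fine, without ever explicitly weighing colon vs. dash.
        }
    }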

Similar engines are useless and/or dangerous in all cases in which correct responses are critical. Important problems are those which require correct responses.


> What some people do is «not reasoning», for lack of training, or for lack of resources (e.g. time - Herbert Simon's "satisficing"), or for lack of ability.

According to your definition of reasoning, which involves surely getting the right answer, no human does reasoning. Probably less than 1% of published mathematics meets the definition.

> Important problems are those which require correct responses.

There are many important problems where formal reasoning is not possible, and yet a decision is required, and both humans and LLMs can provide answers. "Should I accept this proposed business deal / should I declare war / what diagnostic test should I order?" We would like to have correct responses for these problems, but it is not possible, even in principle, to guarantee correctness. So yes, we use heuristics and approximate reasoning for such problems. Is an LLM "unreliable" or "dangerous" in such problems? Maybe yes, and maybe more so than humans, but maybe not, it depends on the case. To try to keep the point of the thread in focus, an LLM should probably not try to solve such problems by writing code.


> According to your definition of reasoning, which involves surely getting the right answer

No. Let me reiterate: «"proper reasoning" is that process which, given sufficient input, will surely lead to a correct output owing to the effectiveness of its inner workings», given that enough resources are spent. I.e.: it is a matter of method.

And a processor that cannot solve the "detective games" shows that it lacks that method. (I.e.: the general capabilities exercised in solving a "detective game" are required, though not exhaustive, for the reasoner.)

> we use heuristics and approximate reasoning for such problems

But we are expected to still use decent reasoning, even when bounded.

So: there may be no need to try to solve problems through writing code when the reasoning machine has the procedural modules that allow it to reason as if it were running code, whenever such a form of "diligence" is needed. When the decision is not that impactful (e.g. "best colour for the car"), let the decider "feel"; when the decision will be impactful, I want the decider to be able to reason.


I've said several times that according to your definition, humans do not reason. You haven't really responded to that and I guess you're not going to. I can't quite parse your overall position, i.e. specifically, as I already said, whether you genuinely think LLMs should output code for most problems, or whether you were using that as a reductio against my initial statement. So, I will stop here and thank you for the discussion.


> according to your definition, humans do not reason

Some do.

> whether you genuinely think LLMs should output code for most problems, or whether you were using that as a reductio against my initial statement

No, they should not. But proper reasoning is related to procedural operations like code.


> Reasoning properly is at least operating through processes that output correct results.

Human "reasoning" (ie speech or self-talk that sounds a bit like reasoning) often outputs correct results. Does "often" fit the definition?

> Which problems that demand producing correct solutions could be solved by a processor which could not solve a "detective game"?

For example, "what colour is the sky right now?". A lot of people could solve this (even if they haven't looked outside), and so could a lot of language models, which can't solve this detective game.


> Does "often" fit the definition?

No: "proper reasoning" is that process which given sufficient input will surely bring to a correct output owing to the effectiveness of its inner workings.

> what colour is the sky right now

That does not show a general problem solver: "output the most common recorded reply to a question" is certainly not one, and the responses from such a box will easily be worthless in all the special cases where the question actually makes sense.



