I tend to agree. I think a lot of what LLMs do that looks like reasoning is just basic statistical pattern matching. If you see the string `2 + 2 = 4` often enough, you don't need to know maths to know that the next character after `2 + 2 = ` is going to be `4`.
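To make that concrete, here's a minimal sketch of what pure frequency-based prediction looks like: a toy character-level n-gram counter that "learns" to complete `2 + 2 = ` with `4` purely because that continuation is the most common one in its training text, with no notion of arithmetic. The function names and the repeated-corpus setup are just illustrative assumptions, not how any actual LLM is implemented.

```python
from collections import Counter, defaultdict

def train_ngram(corpus: str, n: int = 8) -> dict:
    """Count which character follows each length-n context in the corpus."""
    counts = defaultdict(Counter)
    for i in range(len(corpus) - n):
        context = corpus[i:i + n]
        counts[context][corpus[i + n]] += 1
    return counts

def predict_next(counts: dict, context: str, n: int = 8):
    """Return the most frequently observed character after the last n characters."""
    context = context[-n:]
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

# A corpus that repeats the same arithmetic string many times stands in for
# "seeing `2 + 2 = 4` enough times" in web text.
corpus = "2 + 2 = 4. " * 1000
model = train_ngram(corpus, n=8)
print(predict_next(model, "2 + 2 = "))  # -> '4', from frequency alone
```

Real transformers obviously do something far richer than a lookup table, but the point stands: for text this common, memorised statistics alone are enough to get the right answer.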
A lot of the stuff people ask LLMs to do probably falls into this category, to be honest. If I were to guess, the vast majority of maths on the internet is just generic textbook examples, which an LLM might be tempted to brute force. There are likely much better things an LLM can dedicate network capacity to in order to reduce error than learning high-level mathematical models, given that most text doesn't contain maths, and most of the maths it sees is probably just the generic `2 + 2 = 4` type stuff.
I'd argue humans often do this too, despite our general intelligence. Students prepping for exams often read all of the course material the night before and simply try to remember the bits they need to pass the test, rather than spending unnecessary time building higher-level mental models of the material.
It's only when you frequently need to predict something novel that comprehending at a higher level becomes necessary.
I think, given the size of modern LLMs and the amount of data they're trained on, it's likely that they are now starting to build these higher-level abstractions as the limits of brute-force statistical pattern matching are reached.
With GPT-3.5/4, it seems that the kinds of reasoning needed to follow a piece of written text are specifically what it does quite well. And this isn't surprising, given the training data probably consists of a lot of news articles and fictional text, and relatively little unique maths.