
> ... means LLMs do encode concepts of human cognition

AND

> ... do encode structural elements of our language and hence thought

Quite true. I think the trivial "proof" that what you are saying is correct is that a significantly smaller model can generate sentence after sentence of fully grammatical nonsense. If grammar alone can be captured by a small network, then the additional information encoded in a larger network must be knowledge, not just syntax (word order).
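
As a rough illustration (assuming the Hugging Face transformers library and the small distilgpt2 checkpoint, ~82M parameters, purely as stand-ins), even a small LM will happily sample fluent-sounding text that carries little real knowledge, especially at high sampling temperature:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
    model = AutoModelForCausalLM.from_pretrained("distilgpt2")

    # high temperature makes the "grammatical but nonsense" quality more obvious
    inputs = tokenizer("The committee concluded that", return_tensors="pt")
    out = model.generate(
        **inputs,
        do_sample=True,
        temperature=1.3,
        max_new_tokens=60,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(out[0], skip_special_tokens=True))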

Similarly, when too much quantization is applied, the output starts to resemble that of a mere grammatical sentence generator and becomes harder to mistake for intelligence.
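
A toy sketch of why that happens, assuming PyTorch and using naive round-to-nearest uniform quantization (real schemes such as GPTQ or bitsandbytes are far more careful): squeezing weights into very few bits simply discards information the network had encoded.

    import torch

    def quantize_uniform(w: torch.Tensor, bits: int) -> torch.Tensor:
        # map weights onto 2**bits evenly spaced levels, then back to floats
        levels = 2 ** bits - 1
        w_min, w_max = w.min(), w.max()
        scale = (w_max - w_min) / levels
        q = torch.round((w - w_min) / scale)   # integer codes 0..levels
        return q * scale + w_min               # dequantized approximation

    w = torch.randn(4096, 4096)
    for bits in (8, 4, 2):
        err = (w - quantize_uniform(w, bits)).abs().mean()
        print(f"{bits}-bit mean abs error: {err:.4f}")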

I make the argument that LLMs are time-series predictors because, under the hood, they are just predicting the next token in a sequence, and that prediction happens to look a bit magical from a human perspective.
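
Concretely, the "time series" loop looks roughly like this (again assuming transformers and distilgpt2 as stand-ins): predict a distribution over the next token, sample one, append it to the sequence, repeat.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
    model = AutoModelForCausalLM.from_pretrained("distilgpt2")
    model.eval()

    ids = tokenizer("The forecast for tomorrow is", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(40):
            logits = model(ids).logits[:, -1, :]            # scores for the next "step"
            probs = torch.softmax(logits, dim=-1)
            next_id = torch.multinomial(probs, num_samples=1)
            ids = torch.cat([ids, next_id], dim=-1)         # the series grows by one token
    print(tokenizer.decode(ids[0], skip_special_tokens=True))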

In the same way that some pesticides convincingly mimic the chemical signals insects use to make decisions, LLMs convincingly produce output that feels to humans like intelligence and reasoning.

Future LLMs will be able to convincingly create the impression of love, loyalty, and many other emotions.

Humans, too, know how to feign reasoning and emotion, and how to detect bad reasoning, false loyalty, etc.

Last night I baked a batch of gingerbread cookies with a recipe suggested by GPT-4. The other day I asked GPT-4 to write a dozen more unit tests for a code library I am working on.

> just about every sentence ever written across a wide swath of languages

I view LLMs as a new way that humans can access and harness the information of our civilization. It is a tremendously exciting time to be alive, to witness and interact with human knowledge in this way.



