Does ChatGPT actually manipulate symbols, or does it just chain together sequences of characters based on what most frequently occurs next? I haven't seen anything that indicates actual work with symbols, logic, or ideas.
There is a difference between treating _War and Peace_ as a string of ISO-Latin1 characters which are frequently found in that order, and knowing it as a "novel by Tolstoy which contains many insights into the human condition" _and_ understanding the symbolism of said insights.
A grade school student who has read and understood _War and Peace_ would be able to write a paper which is original to the degree that it did not previously exist as that exact sequence of characters, even if it had no original insights, while ChatGPT would regurgitate the most frequent combinations of characters which the metadata and so forth indicate are in the context of writing about that novel.
I think it is working with embeddings, meaning that each token (a word or word fragment) is represented by a very long vector of floating point numbers. These vectors are the result of something like word2vec, though as far as I understand they are learned together with the rest of the model. The input text is transformed into these embedding vectors and lined up in sequence (the maximum length of that sequence is called the context window), then this gargantuan ML model starts to process this gargantuan input sequence.
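To make the embedding step a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the whitespace tokenizer, the tiny vocabulary, the random vectors, and the names `build_embedding_table`, `tokenize`, and `embed` are all made up. Real models use learned vectors and subword tokenizers, so treat this as a picture of the idea, not of how ChatGPT is implemented.

```python
import random

EMBEDDING_DIM = 8  # toy size; real models use far larger vectors


def build_embedding_table(vocab, dim=EMBEDDING_DIM, seed=0):
    """Assign each token a vector of floats (random here, learned in a real model)."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def embed(text, table, context_window=16):
    """Turn text into a sequence of vectors, truncated to the context window."""
    tokens = tokenize(text)[:context_window]
    # Tokens outside the vocabulary fall back to the "<unk>" vector.
    return [table.get(tok, table["<unk>"]) for tok in tokens]


vocab = ["<unk>", "war", "and", "peace", "by", "tolstoy"]
table = build_embedding_table(vocab)
sequence = embed("War and Peace by Tolstoy", table)
print(len(sequence), "vectors of length", len(sequence[0]))
```

The point is only that the model never sees the characters of "War and Peace" directly; it sees a sequence of number vectors, one per token, and everything downstream operates on those.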