If you expect "the right way" to be something _other_ than a system which can generate a reasonable "state + 1" from a "state" - then what exactly do you imagine that entails?
That's how we think. We think sequentially. As I'm writing this, I'm deciding the next few words to type based on my last few.
Blows my mind that people don't see the parallels to human thought. Our thoughts don't arrive fully formed as a god-given answer. We're constantly deciding the next thing to think, the next word to say, the next thing to focus on. Yes, it's statistical. Yes, it's based on our existing neural weights. Why are you so much more dismissive of that when it's in silicon?
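To make the "state + 1 from a state" framing concrete, here's a toy sketch of the generation loop. The bigram table stands in for an LLM's learned weights, and the corpus, function names, and sampling scheme are all invented for illustration; the only point is that output is produced one step at a time, by feeding each state back in to get the next one.

```python
# A minimal sketch of "state -> state + 1" generation, using a toy bigram
# model instead of a neural network. The corpus and sampling scheme are made
# up for illustration; an LLM does the same loop with a learned distribution
# over a far larger context.
import random
from collections import defaultdict

corpus = "we are constantly deciding the next thing to think the next word to say".split()

# Count which word follows which (the "learned weights" of this toy model).
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def step(state):
    """Given the current state (last word), sample a plausible next word."""
    candidates = follows.get(state)
    if not candidates:
        return None  # dead end: nothing ever followed this word in the corpus
    return random.choice(candidates)

# Generate: repeatedly feed the current state back in to get state + 1.
state = "the"
output = [state]
for _ in range(8):
    state = step(state)
    if state is None:
        break
    output.append(state)

print(" ".join(output))
```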
Finite-state machines are a limited model. In principle, you can use them to model everything that can fit in the observable universe. But that doesn't mean they are a good model for most purposes.
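A toy example of that limitation, with an invented 3-bit counter: the explicit state table is a perfectly valid FSM, but it only works because the system is tiny, and the same behaviour is captured far better by one line of arithmetic.

```python
# A 3-bit counter written as an explicit finite-state machine. The state
# table is exhaustive and correct, but it already needs 8 entries for 3 bits;
# a 64-bit counter would need 2**64 states, while the "better model" is a
# single line of arithmetic.
transitions = {n: (n + 1) % 8 for n in range(8)}  # explicit FSM: state -> next state

def fsm_step(state):
    return transitions[state]

def arithmetic_step(state, bits=3):
    return (state + 1) % (1 << bits)  # same behaviour, no enumerated states

state = 6
for _ in range(4):
    assert fsm_step(state) == arithmetic_step(state)
    state = fsm_step(state)
print(state)  # 2: wrapped around as expected
```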
The biggest limitation with the current LLMs is the artificial separation between training and inference. Once deployed, they are eternally stuck in the same moment, always reacting but incapable of learning. At best, they are snapshots of a general intelligence.
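A rough sketch of that split, using a deliberately silly one-parameter "model" (all names here are made up, not any real training API): the parameters only move during training, and whatever the model "sees" at inference lives in the context and evaporates with it.

```python
# Toy illustration of the training/inference separation described above.
class ToyModel:
    def __init__(self):
        self.weight = 0.0        # updated only during training
    def train_step(self, example, lr=0.1):
        self.weight += lr * (example - self.weight)   # weights change here
    def infer(self, prompt_context):
        # At inference the weights are frozen; the only "memory" is whatever
        # sits in the prompt context, and it disappears when the context does.
        return self.weight + sum(prompt_context)

model = ToyModel()
for x in [1.0, 1.0, 1.0]:
    model.train_step(x)          # training: the snapshot being baked in

print(model.infer([2.0]))        # reacts to the context...
print(model.infer([]))           # ...but nothing it "saw" at inference persists
print(model.weight)              # parameters untouched by inference
```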
I also have a vague feeling that a fixed set of tokens is a performance hack that ultimately limits the generality of LLMs, and that hardcoded assumptions make tasks that build on those assumptions easier and seeing past them harder.
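Here's a toy illustration of that worry, with an invented five-word vocabulary: anything the vocabulary anticipated encodes cleanly, and anything it didn't gets flattened into `<unk>` (real tokenizers shred it into sub-pieces instead, but the baked-in assumption is the same).

```python
# A toy "fixed set of tokens": the vocabulary is frozen at training time, so
# anything outside it is squashed into <unk>. The vocab is invented for the
# example.
VOCAB = {"<unk>": 0, "the": 1, "next": 2, "word": 3, "to": 4, "say": 5}

def encode(text):
    return [VOCAB.get(w, VOCAB["<unk>"]) for w in text.split()]

print(encode("the next word to say"))      # [1, 2, 3, 4, 5] - fits the assumptions
print(encode("the next jpeg to decode"))   # [1, 2, 0, 4, 0] - detail lost at the boundary
```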
> As I'm writing this, I'm deciding the next few words to type based on my last few.
If that were all, you could have written this as a newborn baby. You're determining these words based on a lifetime of experience. LLMs don't do that: every instance of ChatGPT is the same newborn baby, while a thousand clones of you could all end up vastly different.
We argue that representations in AI models, particularly deep networks, are converging. First, we survey many examples of convergence in the literature: over time and across multiple domains, the ways by which different neural networks represent data are becoming more aligned. Next, we demonstrate convergence across data modalities: as vision models and language models get larger, they measure distance between datapoints in a more and more alike way. We hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato’s concept of an ideal reality. We term such a representation the platonic representation and discuss several possible selective pressures toward it. Finally, we discuss the implications of these trends, their limitations, and counterexamples to our analysis.
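One way to make "measure distance between datapoints in a more and more alike way" concrete is to compare the pairwise-distance structure of two models' embeddings of the same inputs. The sketch below correlates distance matrices over random stand-in embeddings; it is not the mutual-nearest-neighbour alignment metric the paper itself uses, just the general shape of such a measurement.

```python
# Rough sketch of measuring representational alignment between two models:
# embed the same items with both, then compare their pairwise-distance
# structure. The embeddings here are random stand-ins, not real model outputs.
import numpy as np

rng = np.random.default_rng(0)
n_items = 50
emb_vision = rng.normal(size=(n_items, 512))    # stand-in for a vision model's features
emb_language = rng.normal(size=(n_items, 768))  # stand-in for a language model's features

def pairwise_distances(x):
    # Euclidean distance matrix between all rows of x.
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    return np.sqrt(np.maximum(d2, 0.0))

def alignment(a, b):
    # Correlate the upper triangles of the two distance matrices:
    # values near 1.0 mean the two models space the datapoints the same way.
    iu = np.triu_indices(len(a), k=1)
    return np.corrcoef(pairwise_distances(a)[iu], pairwise_distances(b)[iu])[0, 1]

print(alignment(emb_vision, emb_language))        # near 0 for unrelated random features
print(alignment(emb_vision, emb_vision * 3 + 1))  # ~1.0: same geometry up to scale and shift
```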