
You're talking apples and oranges. The plateau the frontier models have hit reflects the limited further gains to be had from dataset scaling (and the corresponding model/compute scaling).

These new reasoning models are taking things in a new direction basically by adding search (inference time compute) on top of the basic LLM. So, the capabilities of the models are still improving, but the new variable is how deep of a search you want to do (how much compute to throw at it at inference time). Do you want your chess engine to do a 10 ply search or 20 ply? What kind of real world business problems will benefit from this?
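
To make that "search depth" knob concrete, here is a minimal sketch of best-of-N sampling, the simplest form of inference-time search; generate() and score() are hypothetical stand-ins for a stochastic model call and a verifier, not any real API:

    import random

    def generate(prompt: str) -> str:
        # Hypothetical stand-in for one stochastic LLM call (temperature > 0).
        return f"candidate-{random.random():.3f}"

    def score(answer: str) -> float:
        # Hypothetical stand-in for a verifier / reward model.
        return random.random()

    def best_of_n(prompt: str, n: int) -> str:
        # The knob: larger n = more inference-time compute and a better
        # expected answer, with no change to the model's weights.
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=score)

    print(best_of_n("What is 17 * 23?", n=8))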



"New" reasoning models are plain LLMs with clever reinforcement learning. o1 is itself reinforcement learning on top GPT-4o.

They found a way to make test-time compute a lot more effective, and that is an advance, but the idea is not new and the architecture is not new.

And the vast majority of people convinced that LLMs had plateaued reached that conclusion without accounting for test-time compute.


The fact that these reasoning models may compute for extended durations, using exponentially more compute for linear performance gains (according to OpenAI), while producing outputs that are better but not necessarily any longer (more tokens) than before, all points to a different architecture: some type of iterative calling of the underlying model (essentially a reasoning agent using the underlying model).
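
A minimal sketch of what such an iterative wrapper could look like; generate() is a hypothetical single fixed-compute model call, and the critique/revise loop and stop condition are assumptions for illustration, not OpenAI's published method:

    def generate(prompt: str) -> str:
        # Hypothetical stand-in for one fixed-compute LLM call.
        return "draft answer for: " + prompt[:50]

    def reason(question: str, max_steps: int = 10) -> str:
        # Variable compute via iteration: the same fixed-compute model is
        # called repeatedly, critiquing and revising its own draft. Total
        # compute grows with max_steps, yet the final answer need not be
        # any longer than a single-pass answer.
        draft = generate(question)
        for _ in range(max_steps):
            critique = generate("Find flaws in this answer: " + draft)
            if "no flaws" in critique.lower():  # assumed stop signal
                break
            draft = generate("Revise to fix '" + critique + "': " + draft)
        return draft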

A plain LLM does not use variable compute: it runs a fixed number of transformer layers, spending a fixed amount of compute on every token generated.


Architecture generally refers to the design of the model. In this case, the underlying model is still a transformer-based LLM, and so is its architecture.

What's different is the method for _sampling_ from that model: it seems they have encouraged the underlying LLM to perform a variable-length chain-of-thought "conversation" with itself, as has been done with o1. In addition, they _repeat_ these chains of thought in parallel, using a tree of some sort to search and rank the outputs. This apparently scales performance on benchmarks as you scale both the length of each chain of thought and the number of chains.
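
The "repeat and rank" part, reduced to its simplest form (majority vote over parallel chains, i.e. self-consistency); sample_chain() is a hypothetical rollout, and since the actual tree search is not public this is only an assumed approximation:

    import random
    from collections import Counter

    def sample_chain(question: str) -> str:
        # Hypothetical stand-in for one variable-length chain-of-thought
        # rollout ending in a final answer; stochastic, so repeated calls
        # can disagree.
        return random.choice(["42", "42", "41"])

    def repeat_and_rank(question: str, k: int = 16) -> str:
        # Scale the second axis: k independent chains, ranked by the
        # crudest possible criterion, agreement. A learned ranker or a
        # tree search would slot in where Counter does.
        answers = [sample_chain(question) for _ in range(k)]
        return Counter(answers).most_common(1)[0][0]

    print(repeat_and_rank("What is 6 * 7?"))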


No disagreement, although the sampling + search procedure is obviously adding quite a lot to the capabilities of the system as a whole, so it really should be considered part of the architecture. It's a bit like AlphaGo or AlphaZero: generating potential moves (cf. the LLM) is only one component of the overall solution architecture, and the MCTS sampling/search is equally (or more) important.


Ah, I see. Yeah that's a fair assessment and in hindsight is probably the way architecture is being used in the article.


I think throwaway already explained what I was getting at.

That said, I probably did downplay the achievement. It may not be a "new" idea to do something like this, but finding an effective method for reflection that doesn’t just lock you into circular thinking, and that is applicable beyond well-defined problem spaces, is genuinely tough and a breakthrough.



