It's not so much the cost as the fact that they got a slightly better result by throwing 172x more compute at each task. The fact that it may have cost somewhere north of $1 million just helps give a better idea of how absurd the approach is.
It feels a lot less like a breakthrough when the solution looks so much like simple brute-forcing.
But you might be right, who cares? Does it really matter how crude the solution is if we can achieve true AGI and bring the cost down by increasing the efficiency of compute?
That’s the thing that’s interesting to me, though; I had the same first reaction. It’s a very different problem from brute-forcing chess. The model has one chance to arrive at the correct answer. Running through thousands or millions of options means nothing if it can’t determine which one is correct. And each of these visual problems involves a combination of different interacting concepts. Solving them requires understanding, not mimicry. So no matter how inefficient and “stupid” these models are, they can be said to understand these novel problems. That’s a direct counter to everyone who ever called them stochastic parrots and said they were a dead end on the road to AGI, only ever searching an in-distribution training set.
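To make that concrete, here is a toy sketch of the "sample many, then pick one" setup (Python; sample_fn is a hypothetical stand-in for whatever generates candidate output grids, not a description of how o3 actually works). With no external verifier for the hidden test output, the only selection signal is something like self-consistency, so extra samples are worthless unless the correct grid actually dominates:

    # Toy sketch: brute sampling only pays off if the selection step can surface the right grid.
    from collections import Counter

    def solve_by_sampling(sample_fn, n_samples=1024):
        # sample_fn() is a hypothetical stand-in that returns one candidate output grid
        # (a list of lists of ints); serialize each grid to a hashable tuple for counting.
        candidates = [tuple(map(tuple, sample_fn())) for _ in range(n_samples)]
        # There is no ground-truth checker to call here, so the best we can do is a
        # majority vote: if the correct grid never dominates, more samples don't help.
        grid, _votes = Counter(candidates).most_common(1)[0]
        return [list(row) for row in grid]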
The compute costs are currently disappointing, but so was the cost of sequencing the first whole human genome. That went from about $3 billion to a few hundred bucks from your local doctor.
Let's make two generous assumptions:
1. ARC-AGI actually generalizes to human intelligence
2. Every 172x-ing of the compute cuts the remaining gap to 100% in half. It took 172x more compute to go from ~75% to ~87%, so getting to 99% (the level of a STEM graduate) takes roughly four more such jumps
That works out to roughly 10^9 times more compute, or roughly the US military budget every half hour, to get the intelligence of one (!) STEM graduate (not any kind of superhuman intelligence).
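Back-of-envelope for that 10^9 figure, purely an extrapolation under assumption 2 above (Python, numbers rounded):

    # Assumption 2: each 172x increase in compute halves the remaining gap to 100%.
    import math

    factor_per_halving = 172
    gap_now = 12.5        # points left after the high-compute run (~25 roughly halved)
    gap_target = 1.0      # points left at the "STEM graduate" level of 99%

    halvings = math.ceil(math.log2(gap_now / gap_target))   # log2(12.5) ~= 3.6, round up to 4
    extra_compute = factor_per_halving ** halvings           # 172**4 ~= 8.8e8, i.e. ~10^9
    print(halvings, f"{extra_compute:.1e}")                  # -> 4 8.8e+08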
Of course, algorithms will get better, but this particular approach feels like wading in a plateau of efficiency improvements, very, very far down the X axis.