> while using this implicit knowledge and feedback it gets from running the prog...

scotty79 · 2024-12-21T10:59:39 1734778779

Basically, the solutions that were doing well on ARC just threw thousands of ideas at the wall and kept the ones that stuck. They were literally generating thousands of Python programs, running them, and checking whether any produced the correct output when fed the inputs from the examples.
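A toy sketch of that generate-and-test loop, to make the idea concrete. Everything here is an assumption for illustration: the primitive names, the `search` helper, and the example grids are made up, and candidates are enumerated by brute force rather than generated by a model as the actual ARC entries did.

```python
import itertools

# Hypothetical primitive grid transforms (not from any real ARC solver).
def identity(g):
    return [list(row) for row in g]

def flip_h(g):
    return [list(row[::-1]) for row in g]

def flip_v(g):
    return [list(row) for row in g[::-1]]

def transpose(g):
    return [list(col) for col in zip(*g)]

PRIMITIVES = {
    "identity": identity,
    "flip_h": flip_h,
    "flip_v": flip_v,
    "transpose": transpose,
}

def candidate_programs(max_depth):
    """Yield every sequence of primitive names up to max_depth long."""
    for depth in range(1, max_depth + 1):
        yield from itertools.product(PRIMITIVES, repeat=depth)

def run(program, grid):
    """Apply the primitives in order to the grid."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid

def search(examples, max_depth=2):
    """Keep only the candidates that reproduce every example output."""
    return [
        program
        for program in candidate_programs(max_depth)
        if all(run(program, inp) == out for inp, out in examples)
    ]

# Two made-up example pairs; the hidden rule is "rotate 90 degrees clockwise".
examples = [
    ([[1, 2], [3, 4]], [[3, 1], [4, 2]]),
    ([[5, 6, 7]], [[5], [6], [7]]),
]

solutions = search(examples)
for program in solutions:
    print(" -> ".join(program))
```

Most candidates fail and are thrown away; the few survivors are the ones that "stuck".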

This o3 doesn't need to run Python. It executes programs written in tokens inside its own context window, which is wildly inefficient but gives better results and is potentially more general.

TheOtherHobbes · 2024-12-21T11:29:34 1734780574

So basically it's a massively inefficient trial-and-error leetcode solver which only works because it throws incredible amounts of compute at the problem.

This is hilarious.

scotty79 · 2024-12-21T18:08:37 1734804517

Previous best specialized ARC solver was exactly that.

This o3 thing might be a bit different because it's just a chain-of-thought LLM that can do many other things as well.

It's not uncommon for people to have a handful of wrong ideas before they stumble upon a correct solution either.

empiko · 2024-12-21T11:23:26 1734780206

I assume that o3 can run Python scripts and observe the outputs.