Or the test wasn't testing anything meaningful, which IMO is what happened here. I think ARC was basically looking at the distribution of what AI is capable of, picked an area that it was bad at and no one had cared enough to go solve, and put together a benchmark. And then we got good at it because someone cared and we had a measurement. Which is essentially the goal of ARC.
But I don't much agree that it is any meaningful step towards AGI. Maybe it's a nice proof point that AI can solve simple problems presented in intentionally opaque ways.
I'd agree with you if there hadn't been very deliberate work towards solving ARC for years, and if the conceit of the benchmark weren't specifically based on a conception of human intuition as, put simply, learning and applying out-of-distribution rules on the fly. ARC wasn't some arbitrary inverse set; it was designed to benchmark a fundamental capability of general intelligence.