"Note on "tuned": OpenAI shared they trained the o3 we tested on 75% of the Public Training set. They have not shared more details. We have not yet tested the ARC-untrained model to understand how much of the performance is due to ARC-AGI data."
Really want to see the number of training pairs needed to achieve this socre. If it only takes a few pairs, say 100 pairs, I would say it is amazing!
Trained with 300 raw pairs directly from the ARC training set without using any data augmentation process, such as generating many more pairs with some kind of ARC generator? That's amazing.
Really want to see the number of training pairs needed to achieve this socre. If it only takes a few pairs, say 100 pairs, I would say it is amazing!