I wonder if he tried the trick that Microsoft recommended in their GPT-4 evaluation paper: ask GPT to go step by step and explain each step. It tends to produce much better results, simply because it is a better fit for the prediction mechanism GPT uses. The model generates its answer one token at a time, so every step it writes out becomes context for the next one, and it tends to predict better when the steps are smaller.
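
For what it's worth, here is a rough sketch of what I mean, in Python. The function name `step_by_step_prompt` and the exact wording of the instruction are my own assumptions, not something taken from the paper; it only builds the prompt text and does not call any model.

```python
def step_by_step_prompt(question: str) -> str:
    """Wrap a question with a request to reason in small, explained steps.

    Hypothetical helper: the instruction wording is an assumption,
    not a quote from any paper or vendor documentation.
    """
    return (
        f"{question.strip()}\n\n"
        "Work through this step by step, explaining each step, "
        "and give the final answer on the last line."
    )


if __name__ == "__main__":
    # Paste the printed text into whatever chat interface or API you use.
    print(step_by_step_prompt("What is 17 * 24?"))
```

Asking for the final answer on the last line is just a convenience so the result is easy to pick out after the explanation.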