So, basically, it's chain of thought as a service? Not a model, per se, but a se...

KeplerBoy · 2024-09-13T07:11:14 1726211474

Who knows? Certainly not the public.

It might be a finetuned model that works better in such a setting.

OkGoDoIt · 2024-09-13T16:48:15 1726246095

The linked blog post explains that it is fine-tuned with some reinforcement learning process. It doesn’t go into details, but they do claim it’s not just the base model with chain of thought; there’s some fine-tuning going on.