To be fair, the claim wasn't that it always produces the wrong answer, just that there exist circumstances where it does. A pair of examples where it was correct hardly justifies a "demonstrably false" response.
It kind of does, though, because it means you can never trust the output to be correct. The possibility of error matters far more than correctness in any specific case.
You can never trust the outputs of humans to be correct either, but we find ways of verifying and correcting mistakes. The same extra layer is needed for LLMs.
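A minimal sketch of that extra layer, assuming a hypothetical `ask_llm` call (the real verification step depends entirely on the task): treat the model as an untrusted oracle and only accept an answer that passes an independent, deterministic check.

```python
from typing import Callable

def ask_llm(prompt: str) -> list[int]:
    """Hypothetical model call; stands in for any LLM API."""
    # Stubbed for the sketch: pretend the model returned a sorted list.
    return [1, 2, 3, 5, 8]

def verified(prompt: str, check: Callable[[list[int]], bool],
             retries: int = 3) -> list[int]:
    """Accept a model answer only if an independent check passes."""
    for _ in range(retries):
        answer = ask_llm(prompt)
        if check(answer):
            return answer
    raise ValueError("no verified answer within retry budget")

data = [5, 3, 8, 1, 2]
result = verified(
    f"Sort this list ascending: {data}",
    # Ground truth we can compute without trusting the model.
    check=lambda ans: ans == sorted(data),
)
print(result)  # [1, 2, 3, 5, 8]
```

The point isn't this toy check; it's the shape: wherever verification is cheaper than generation (sorting, type checking, running tests, cross-referencing sources), the untrusted output becomes usable.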