Tbf that was exactly my point. An adult might use 'inference' and 'reasoning' to ask for clarification, or go with an internal logic of their choosing.
ChatGPT here went with lexicographic ordering in Python for some reason, and then proceeded to make false statements from false observations, while also defying its own internal logic.
"six" > "ten" is true because "six" comes after "ten" alphabetically.
No.
"ten" > "seven" is false because "ten" comes before "seven" alphabetically.
No.
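For reference, Python's actual lexicographic comparison gives the opposite of both claims:

```python
# Python compares strings character by character, by Unicode code point.
print("six" > "ten")    # False: 's' sorts before 't', so "six" comes first
print("ten" > "seven")  # True:  't' sorts after 's', so "ten" comes later
```

So not only were the stated results wrong, the "comes before/after alphabetically" justifications were backwards too.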
From what I understand of LLMs (which, I admit, is not very much), logical reasoning isn't an inherent property of LLMs the way information retrieval is. I'm sure this problem can be solved at some point, but a good solution would require developing many more kinds of inference and logic engines than exist today.