While true, I think this is still a valid criticism, considering how many people are quick to jump on the "AGI" bandwagon when discussing the current generation of LLMs.
No one's thinking a 7B-70B LLM is going to be an AGI lol, but a 700B-1T LLM likely gets pretty damn close, especially with some of the newer attention mechanisms.
And yet GPT-4, reportedly with 1-2 trillion parameters, still fails at basic math, sometimes even at tasks as simple as adding up a set of ten numbers (hence the Wolfram comment). That's as clear evidence as any that intelligence is more than just language proficiency.
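For contrast, the kind of task being described is trivial for any deterministic evaluator, which is exactly why tool delegation (e.g. to Wolfram) makes sense. A minimal sketch, with the list of numbers made up for illustration:

```python
# Summing a set of ten numbers: a one-liner for deterministic code,
# yet a token-predicting LLM can still get it wrong.
numbers = [3, 17, 42, 8, 91, 5, 23, 64, 11, 30]  # arbitrary example values
total = sum(numbers)
print(total)  # 294
```

The point isn't that the computation is hard; it's that an LLM produces the answer by next-token prediction rather than by executing arithmetic, so correctness isn't guaranteed.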