That's not a fair comparison. By that time the toddler can also ask questions, generate new labels using adjectives, label novel instances as compositions of previously acquired knowledge and generate sentences representing complex internal states. They are not limited to observed labels. In fact there is very little supervised learning in the form of [item, label, loss]. Beyond that, with enough stimulation and simply from interacting with each other, children can even spontaneously generate languages with complex grammar; without labeled supervision.
They'd also have gained the ability to do very (seriously) difficult things like walking, climbing objects, the rudiments of folk physics, picking things up and throwing things. They'd have some rudimentary ability modeling other agents.
It's good to be happy with current progress and I do not suffer from the AI-effect but being too lenient can hamper creativity and impede progress by occluding limitations.
We can train a CNN to do that in a few days.