That's exactly the problem I was getting at: feature richness is the enemy of simplicity. Having to read a schema (and learn one of the schema languages) makes getting started with XML much harder than getting started with JSON.
You implicitly need a schema in JSON too: it's the API documentation.
Then every implementer in a statically typed language needs to translate the JSON format into structures in their own language. That is impossible to automate unless the JSON API uses a formally defined schema (e.g. OpenAPI or GraphQL schemas), but then we're back to square one.
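To make the hand-translation step concrete, here is a rough sketch in Python with type hints. The `User` shape and its `id`, `name` and `email` fields are made up for illustration; in practice you would be reading them off the API docs, which is exactly the implicit schema.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    # Hypothetical fields, transcribed by hand from imaginary API docs.
    id: int
    name: str
    email: Optional[str] = None


def user_from_json(text: str) -> User:
    """Parse a JSON object and check it against the hand-written structure."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(raw.get("id"), int):
        raise ValueError("'id' must be an integer")
    if not isinstance(raw.get("name"), str):
        raise ValueError("'name' must be a string")
    email = raw.get("email")
    if email is not None and not isinstance(email, str):
        raise ValueError("'email' must be a string or null")
    return User(id=raw["id"], name=raw["name"], email=email)


user = user_from_json('{"id": 7, "name": "Ada"}')
```

Every check in `user_from_json` is something a schema would have stated formally; without one, it lives in prose and gets re-derived by each implementer.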
If you don't care about that (e.g. in dynamically typed languages), you can parse XML just like you parse JSON, using something like https://github.com/martinblech/xmltodict
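For a feel of what that looks like, here is a minimal stdlib-only sketch of the same idea. This is not xmltodict's implementation, and `parse_xml` and `element_to_dict` are names I made up; it only borrows the `@attribute` / `#text` key convention and ignores namespaces, mixed content and other details a real library handles.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert one element into a str (leaf) or a dict (anything else)."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text  # plain leaf element: just its text
    # Attributes become "@name" keys, roughly as xmltodict does.
    result = {"@" + key: value for key, value in elem.attrib.items()}
    if text:
        result["#text"] = text
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            existing = result[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                # Second occurrence of a tag: promote to a list.
                result[child.tag] = [existing, value]
        else:
            result[child.tag] = value
    return result


def parse_xml(text):
    """Parse an XML string into nested dicts, keyed by the root tag."""
    root = ET.fromstring(text)
    return {root.tag: element_to_dict(root)}


doc = parse_xml('<user id="7"><name>Ada</name><tag>a</tag><tag>b</tag></user>')
name = doc["user"]["name"]
```

After that, `doc` is ordinary nested dicts, lists and strings, so you navigate it the same way you would the result of `json.loads`.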
> XML is a markup language. It's in the name even.
It can be used for markup, but isn't limited to that.