Is it the redundancy? Or is it because markup is a more natural way to annotate language, which obviously is what LLMs are all about?
Genuinely curious, I don’t know the answer. But intuitively JSON is nice for easy to read payloads for transport but to be able to provide rich context around specific parts of text seems right up XML’s alley?
The lack of inline context is a failing of JSON and a very useful feature of XML.
Two simple but useful examples would be inline markup to define a series of numbers as a date or telephone number or a particular phrase tagged as being a different language from the main document. Inline semantic tags would let LLMs better understand the context of those tokens. JSON can't really do that while it's a native aspect of XML.
Genuinely curious, I don’t know the answer. But intuitively JSON is nice for easy to read payloads for transport but to be able to provide rich context around specific parts of text seems right up XML’s alley?