The topic is about decision problems, not about backward compatibility. That said, JSON has corner cases too, as bad or a little worse than XML's. All those complaints are just rationalizations of hipster propaganda.
JSON has corner cases but the advantage is that every JSON document has a single obvious mapping to language primitives — dicts, lists, strings, floats. Nobody has to agree beforehand how to load JSON data.
In contrast, the generic mapping for XML is a Tree[str (name), Dict[str, str] (attrs), Union[str, Tree] (body)], which maps so poorly between languages that people do one of two things: implement serialization formats on top of XML, which leads to non-interoperability when different software does it differently, or parse into a database-like "Abstract XML" object that you query with XPath.
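Here's roughly what the generic parse hands you in each case (a minimal sketch with Python's standard library; the payloads are made up):

    import json
    import xml.etree.ElementTree as ET

    # JSON: the generic parse is already dicts, lists, strings and numbers.
    doc = json.loads('{"users": [{"name": "a", "age": 3}]}')
    print(doc["users"][0]["age"])      # 3, an int, no schema involved

    # XML: the generic parse is a tree of (tag, attrs, children, text) nodes;
    # how lists, mappings and numbers are encoded in it is up to each format.
    root = ET.fromstring('<users><user name="a" age="3"/></users>')
    node = root.find("user")
    print(node.tag, node.attrib)       # user {'name': 'a', 'age': '3'}
    print(int(node.attrib["age"]))     # still a string until you convert it yourself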
JSON maps well to JavaScript types, but not anything else. And float isn't an obvious mapping: the standard does have a concept of an integer number, and most numbers are indeed integers. Arrays do everything objects do, but with better performance and better-defined behavior.
>Nobody has to agree beforehand how to load JSON data.
Such agreement is never necessary; it's up to the programmer what to write, and the standard doesn't specify the behavior of JSON parsers anyway, it only defines JSON documents. For example, there's no need to use hashtables: that's an accidental JavaScript artifact of originally parsing JSON with the eval function.
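For what it's worth, nothing forces a parser to hand you hashtables. A small sketch with Python's standard json module (the object_pairs_hook parameter is real; the payload is made up):

    import json

    # Load JSON objects as ordered lists of (key, value) pairs instead of dicts.
    pairs = json.loads('{"b": 1, "a": 2, "a": 3}', object_pairs_hook=lambda p: p)
    print(pairs)   # [('b', 1), ('a', 2), ('a', 3)]; duplicates and order preserved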
>when different software does it differently
I assume you mean schemaless documents here. Those are always abstract databases, in XML and JSON alike. I suppose there's jq for querying abstract JSON databases.
> JSON maps well to JavaScript types, but not anything else.
I grant you that JSON might be equally as awkward as XML in languages like C, but pretty much every high-level language (Python, Ruby, Java) has a very sane mapping to and from JSON types. You don't ever really have to "query" a JSON object; you just `json.loads` and `for item in obj["key"]:`. Even in the cases with schemas you're still usually only working with primitive types.
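Concretely, something like this (a trivial sketch; the payload is made up):

    import json

    obj = json.loads('{"key": [1, 2, 3]}')
    for item in obj["key"]:
        print(item + 1)   # plain Python ints, no casting or schema involved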
> Such agreement is never necessary, it's up to the programmer what to write...
What I mean is that there's no weirdness like having to encode types in the base document. You don't have to do things like annotating every value with its type (a sketch of what I mean is below), where different projects / parsers might do it differently. The "abstract JSON types" are actually useful and expressive, whereas in XML everyone has to carve out their own way to represent lists, mappings, and numbers out of trees, because basically nobody works with just trees in day-to-day work.
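The sketch, roughly what I have in mind (Python standard library; the type="int" attribute is a hypothetical convention, which is exactly the problem):

    import json
    import xml.etree.ElementTree as ET

    # JSON round-trips the number as a number.
    assert json.loads(json.dumps({"count": 5})) == {"count": 5}

    # XML needs a home-grown convention to say "this is an int".
    xml_doc = '<count type="int">5</count>'      # hypothetical convention
    el = ET.fromstring(xml_doc)
    value = int(el.text) if el.get("type") == "int" else el.text
    print(value)   # 5, but only because this reader knows this project's convention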
I think we might be talking about two different use-cases. If what you want to do with XML / JSON is serialize arbitrary classes in $specific_language and then read them back, then nothing really matters; the on-disk format is just an implementation detail. But abstract JSON works really, really well as a schema that everyone agrees on and that every language supports.
> You don't have to do things like [...] carve out their own way to represent lists, mappings, and numbers
I work with XML extensively, and out of hundreds of classes and fields I've needed an arbitrary dictionary maybe a handful of times. A mapping/dictionary is JSON's abysmal replacement for a class/struct, in which case you'd have XML like:
<MyClass>
<Field1>Value</Field1>
</MyClass>
IOW, _the tag is the key_! A list? Simply repeated elements. Numbers? What are you talking about, they're directly representable in XML, and XSD knows about integers, floats, etc. (unlike JSON).
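Consuming that convention generically is a few lines anyway; a rough sketch (Python's ElementTree; the to_dict helper is mine, not anything standard):

    import xml.etree.ElementTree as ET

    def to_dict(el):
        # Treat each child tag as a key; repeated tags become a list.
        out = {}
        for child in el:
            value = to_dict(child) if len(child) else child.text
            if child.tag in out:
                existing = out[child.tag]
                out[child.tag] = existing if isinstance(existing, list) else [existing]
                out[child.tag].append(value)
            else:
                out[child.tag] = value
        return out

    root = ET.fromstring('<MyClass><Field1>Value</Field1><Item>1</Item><Item>2</Item></MyClass>')
    print(to_dict(root))   # {'Field1': 'Value', 'Item': ['1', '2']}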
That doesn't show anything. Sure, you can walk through any schemaless JSON document, because it has the generic JSON document structure, but the same can be done for XML too in any language. You can't make sense of the document this way beyond its well-formedness. The presence of numbers doesn't help you much either: you can't tell anything about them beyond them being numbers.
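To illustrate, a generic walk looks about the same for both (rough sketch, Python standard library, made-up documents):

    import json
    import xml.etree.ElementTree as ET

    def walk_json(node, depth=0):
        # Recurse over whatever structure json.loads produced.
        print("  " * depth + type(node).__name__)
        if isinstance(node, dict):
            children = node.values()
        elif isinstance(node, list):
            children = node
        else:
            children = []
        for child in children:
            walk_json(child, depth + 1)

    def walk_xml(el, depth=0):
        # Recurse over the element tree; all you get generically is tags and text.
        print("  " * depth + el.tag)
        for child in el:
            walk_xml(child, depth + 1)

    walk_json(json.loads('{"a": [1, 2]}'))
    walk_xml(ET.fromstring('<a><b/><b/></a>'))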
>JSON maps well to JavaScript types, but not anything else.
Except it does. Take Python. Ruby. Any language that has a notion of dicts, lists and strings/ints/floats. That's basically every high-level language ever. Even exotic stuff like Tcl. And even C, a low-level language, has a thousand implementations of those same structures.
The often-mentioned design paralysis of choosing between elements and attributes has a JSON counterpart: in JSON there can be many ways to implement a collection of name/value pairs. One interesting case is a compound key: you can use a mini serialization format for the key and still make it an object (I actually saw this in the wild).
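A sketch of that compound-key case (made-up data; the "row|col" string is the mini serialization format I mean):

    import json

    # A mapping keyed by (row, col) pairs, flattened into string keys.
    grid = {f"{row}|{col}": row * col for row in range(2) for col in range(2)}
    text = json.dumps(grid)   # {"0|0": 0, "0|1": 0, "1|0": 0, "1|1": 1}

    # Every reader now has to know how to split the key back apart.
    loaded = {tuple(map(int, k.split("|"))): v for k, v in json.loads(text).items()}
    print(loaded[(1, 1)])     # 1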
> in JSON there can be many ways to implement a collection of name/value pairs.
Are there? JSON has dicts and lists. I mean you could store a collection of name value pairs in a list, maybe even a list of lists, but that's just stupid and an incorrect usage of the format.
Whereas in XML there really are tons of ways to do that, and ALL of them are awkward.
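For example, here is a rough sketch (Python's ElementTree; the variants are illustrative, not from any particular spec) of four encodings of the same {"a": 1, "b": 2}, each needing its own reader:

    import xml.etree.ElementTree as ET

    # Four different XML encodings of the same name/value collection.
    variants = [
        '<pairs a="1" b="2"/>',
        '<pairs><a>1</a><b>2</b></pairs>',
        '<pairs><pair name="a" value="1"/><pair name="b" value="2"/></pairs>',
        '<pairs><pair><name>a</name><value>1</value></pair><pair><name>b</name><value>2</value></pair></pairs>',
    ]

    # Each one needs its own extraction logic.
    readers = [
        lambda r: dict(r.attrib),
        lambda r: {c.tag: c.text for c in r},
        lambda r: {c.get("name"): c.get("value") for c in r},
        lambda r: {c.findtext("name"): c.findtext("value") for c in r},
    ]

    for xml_text, read in zip(variants, readers):
        print(read(ET.fromstring(xml_text)))   # {'a': '1', 'b': '2'} every time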
Most of these just amount to 'some parsers are bad', which... I mean, sure? But XML's surface area for parsers to be 'bad' is so much greater, and its gotchas are thus much more subtle. I hope you're not trying to suggest that all XML parsers are identical and perfect.
Your assertion was that JSON's edge cases are "as bad or a little worse", but IMO this document doesn't suggest that at all. Every single thing listed in it can go more wrong with XML, not less.