Parsing Malformed JSON (peteris.rocks)
9 points by goranmoomin on Nov 17, 2019 | 12 comments



Encountered almost exactly this problem yesterday. Fortunately I was able to find and fix the issue in the generator, though it made me wonder about alternatives like this.


Nice work, the post looks good.

It also sounds much better than "Parsing Malformed XML" ;)


Something that would never happen with XML, because it has a grammar. But yes, it's associated with enterprise software and that is bad by definition, right?

Well, live and learn.


JSON has a grammar too. That is why they had to make a malformed JSON parser.

I can ship malformed XML too just like some person shipped malformed JSON. Maybe I forget to XML encode my text or something. Your XML parser would choke on it and you might be making this same exact post.

I don't care about JSON vs XML, but let us not spread the lie that this cannot happen with XML. It happens all the time.
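To make that concrete, here is a minimal sketch using Python's standard library. The language and both inputs are assumptions chosen for illustration, not anything from the linked post. A strict parser for either format rejects a document with a single syntax slip.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical malformed inputs, invented for illustration.
bad_json = '{"name": "Tom & Jerry",}'   # trailing comma is not valid JSON
bad_xml = "<name>Tom & Jerry</name>"    # a bare '&' must be written as &amp;


def parses_as_json(text: str) -> bool:
    """Return True if the strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parses_as_xml(text: str) -> bool:
    """Return True if the strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses_as_json(bad_json))
print(parses_as_xml(bad_xml))
```

Both helpers return False for these inputs, and True once the trailing comma is removed or the ampersand is escaped.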


In fact, malformed XML is more frequent than malformed JSON, and it's not hard to see why.

JSON types match the basic object model of many dynamic languages; not just JavaScript, but also Python and to a lesser extent Lua and PHP. So when people want to generate JSON, they usually do it the 'proper' way: by constructing a native structure consisting of built-in dictionaries and lists in their language of choice and then serialising it to a string. This always generates valid JSON syntax; the worst risk is that an empty dictionary could be mis-serialised as an empty array or vice versa, in languages that don't distinguish the two.
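A minimal sketch of that 'proper' route in Python, with a made-up record (the field names are hypothetical). The serialiser takes care of quoting and escaping, so the output round-trips through a strict parser. Python keeps dictionaries and lists distinct, so the empty-container mix-up does not arise here.

```python
import json

# Hypothetical record containing characters that are significant in JSON.
record = {
    "title": 'He said "hi" \\ bye',
    "tags": ["a", "b"],
    "meta": {},    # empty dictionary, serialised as {}
    "items": [],   # empty list, serialised as []
}

# Serialise the native structure; no manual quoting or escaping involved.
encoded = json.dumps(record)

# Parsing the result with the strict parser gives back the same structure.
decoded = json.loads(encoded)
print(decoded == record)
```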

XML has no such easy support; when you generate XML, the least-effort solution is to use sprintf (or an equivalent), which creates the hazard of the programmer forgetting to escape syntax-significant characters (and let's not forget binary/text confusion in some languages).
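The hazard can be sketched in Python, using `%` formatting as the sprintf equivalent. The sample text and the `is_well_formed` helper are hypothetical, introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical user-supplied text with syntax-significant characters.
text = "Tom & Jerry <3"


def is_well_formed(document: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Least-effort approach: interpolate the text with no escaping.
naive = "<title>%s</title>" % text

# Same template, but the text is escaped first (&, < and > are replaced).
escaped = "<title>%s</title>" % escape(text)

# Alternative: build the element with a library and let it do the escaping.
element = ET.Element("title")
element.text = text
built = ET.tostring(element, encoding="unicode")

print(is_well_formed(naive))
print(is_well_formed(escaped))
print(is_well_formed(built))
```

Only the naive string fails to parse. The other two recover the original text when read back.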

JavaScript had to evolve E4X and JSX to solve the interpolation problem with XML. It didn't have to create any new syntax to support JSON; it is purely a library feature.

Of course, some programming languages favour neither, like C or Java.


Malformed XML is everywhere. Especially in enterprise environments. And there's a reason web browsers have an extremely detailed spec on how to parse malformed XML.


Web browsers refuse to parse malformed XML.


Only when specific uncommon headers are set. Under normal circumstances, if you feed them a malformed XHTML document, they will parse it.


Sending XML with the proper application/xml (or some other) media type is rather common.

Sending XHTML with the HTML media type doesn't count. In that case you're just sending malformed HTML which the browser doesn't recognise as XML at all.


What do you mean? They definitely try to. A lot of effort is made in this direction, and IMO the results are fairly impressive.


No, they don't. Try putting

   <foo
in foo.xml and opening it in a browser.


Except your thing that actually works with the parsed XML will choke and die, because your application never expected those kinds of tags.



