
I couldn't really care less if someone POSTs some garbage JSON that results in them getting a 500 response. Better than someone POSTing an XML bomb and affecting other people's requests. Please enlighten me if you know of a serialization format with libraries for all common languages that lacks any gotchas or edge cases.



All software has edges, so edge cases are unavoidable.

The best you can do is:

- interpret the spec to the letter.

- for every fragment of a statement you write, consider whether it might conceivably go wrong, and handle those cases (in the simplest manner, because 'handling' means writing code, and that code, too, needs to go through this process).

For example, a JSON parser must be prepared to handle missing values, extremely long keys and values (integers may have thousands of digits, think long about the question whether 64 bits always is enough for storing a string length, etc.), etc.

- if you are truly paranoid, have very stringent security requirements, or expect to be heavily attacked, run the parser in a separate process.

- fuzz your implementation.
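To make the fuzzing point concrete, here is a minimal sketch of a mutation fuzzer, assuming the parser under test is any callable that takes a string. The names `SEEDS`, `mutate` and `fuzz` are made up for illustration, and treating `ValueError` and `RecursionError` as "clean rejections" is my assumption about what counts as acceptable behaviour. A real setup would use a coverage-guided fuzzer rather than blind mutation.

```python
import json
import random

# Hypothetical seed corpus: small valid documents to mutate.
SEEDS = ['{"a": [1, 2.5, "x", null, true]}', '[]', '"\\u00e9"', '{"k": {"n": -0.0e10}}']


def mutate(text, rng):
    """Apply one random edit: delete, repeat, or replace a single character."""
    if not text:
        return rng.choice('[]{}",:0')
    i = rng.randrange(len(text))
    op = rng.choice(("delete", "repeat", "replace"))
    if op == "delete":
        return text[:i] + text[i + 1:]
    if op == "repeat":
        return text[:i] + text[i] * rng.randint(2, 50) + text[i + 1:]
    return text[:i] + rng.choice('[]{}",:0-eE.\\ \x00') + text[i + 1:]


def fuzz(parse, iterations=2000, seed=0):
    """Feed mutated documents to `parse` and collect unexpected exceptions."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        doc = rng.choice(SEEDS)
        for _ in range(rng.randint(1, 5)):
            doc = mutate(doc, rng)
        try:
            parse(doc)
        except (ValueError, RecursionError):
            pass  # clean rejection; json.JSONDecodeError is a ValueError
        except Exception as exc:
            findings.append((doc, exc))  # anything else deserves a look
    return findings


if __name__ == "__main__":
    print(len(fuzz(json.loads)), "unexpected exceptions")
```

Pointing `fuzz` at your own parser instead of `json.loads` is the interesting use; each entry in the returned list is an input plus the exception it triggered.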


> think long about the question whether 64 bits always is enough for storing a string length, etc.), etc.

I'm struggling to think of any realistic scenario where this isn't true!


So do I, but you should still consciously decide whether to add an overflow check.
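For what such a check might look like, here is a small sketch. Python integers never wrap, so the 64-bit limit is modelled explicitly; `U64_MAX` and `checked_add_u64` are names I made up for illustration, not part of any library.

```python
U64_MAX = 2**64 - 1  # largest value a 64-bit unsigned length field can hold


def checked_add_u64(length: int, extra: int) -> int:
    """Return length + extra, or raise if the sum would not fit in 64 bits."""
    if length < 0 or extra < 0:
        raise ValueError("lengths must be non-negative")
    # Compare before adding, the way you would in a language where the
    # addition itself could wrap around.
    if extra > U64_MAX - length:
        raise OverflowError("string length would exceed 64 bits")
    return length + extra
```

The point is less the three lines of code than the decision: either the check is there, or there is a written-down reason why it isn't.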

Let's do a quick estimate: one can read on the order of 2^24 bytes/second from disk. A day has on the order of 2^16 seconds, so that's 2^40 bytes/day.

=> You will need 2^24 days to read 2^64 bytes. That's about 46k years. An attacker trying to generate a buffer overflow this way is a risk I would take, even if I thought the hardware had room to store that string.
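The estimate is easy to redo in a few lines. This sketch uses the same assumed read rate of 2^24 bytes/second and compares the rounded 2^16 seconds/day with the exact 86,400; the constant and function names are mine. Either way the answer lands in the tens of thousands of years.

```python
BYTES_PER_SECOND = 2**24          # assumed disk read rate from the estimate
SECONDS_PER_DAY_ROUNDED = 2**16   # power-of-two approximation
SECONDS_PER_DAY_EXACT = 86_400
DAYS_PER_YEAR = 365.25
TOTAL_BYTES = 2**64


def years_to_read(total_bytes, bytes_per_second, seconds_per_day):
    """Years needed to read total_bytes at a constant rate."""
    days = total_bytes / (bytes_per_second * seconds_per_day)
    return days / DAYS_PER_YEAR


rounded = years_to_read(TOTAL_BYTES, BYTES_PER_SECOND, SECONDS_PER_DAY_ROUNDED)
exact = years_to_read(TOTAL_BYTES, BYTES_PER_SECOND, SECONDS_PER_DAY_EXACT)
print(f"rounded day: {rounded:,.0f} years; exact day: {exact:,.0f} years")
```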

The only way I can foresee a real risk is when an optimizer can optimize away the computation of a string whose length it is asked to compute by an attacker.

That's still very much far-fetched, and if no string gets allocated it's hard to see how it could become a security issue, but it could be a reason to be extra careful, for example when providing online access to a C++ compiler, with its template metaprogramming capabilities.


If all you're doing is parsing their JSON for its own sake, it's just a 500; but that's the boring case. Consider what happens when typical web code is interacting with the JSON parser.



