
Yeah, this is silly. Pretty much every serializer in existence is going to handle this case. If the attacker wrote their own, then you might get lucky.





AFAIU CSV is fundamentally ambiguous and can't actually be parsed in a fully deterministic way.

Edge cases get hard when dealing with embedded commas, and there's no universally honored escape sequence.

Probably matters less with a two column arrangement, but things get really hairy really fast when you start adding types or BLOBs in the CSV.


AFAIK it's only "ambiguous" in the sense that if you get a CSV file you can't determine the exact parsing behavior to use, but if you know what program created the CSV (or what encoder options were used), it's not ambiguous to parse.
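To make that concrete, here is a minimal sketch using Python's standard csv module. The delimiter and quoting settings are assumptions picked for illustration, not anything a particular exporter is known to use. A field with an embedded comma, quote, and newline round-trips exactly when the reader uses the writer's settings, and comes back wrong when the reader guesses a different delimiter.

```python
import csv
import io

# One row whose first field contains a comma, a double quote, and a newline.
row = ['Smith, "Bob"\nJr.', "42"]

# Encoder settings (assumed for this example): comma delimiter,
# double-quote quoting, embedded quotes escaped by doubling them.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", quotechar='"',
                    doublequote=True, lineterminator="\n")
writer.writerow(row)
encoded = buf.getvalue()

# Reading with the same settings recovers the original row.
same = next(csv.reader(io.StringIO(encoded), delimiter=",",
                       quotechar='"', doublequote=True))

# Reading with a wrongly guessed delimiter does not.
wrong = list(csv.reader(io.StringIO(encoded), delimiter=";", quotechar='"'))

print(repr(encoded))
print(same == row)     # the matching dialect round-trips
print(wrong == [row])  # the mismatched dialect does not
```

The ambiguity lives in not knowing the settings, not in the parsing itself.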

>but things get really hairy really fast when you start adding types or BLOBs in the CSV.

AFAIK BLOBs are hex encoded, which makes them a non-issue.


Hah! Half the time people will even do silly things like cat together multiple CSVs from different sources.

If blobs got consistently hex encoded, that would also be nice. Base64 is common, and there are multiple types of base64 encoding people use too.
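A small sketch of why the encoding has to be known up front, using only the Python standard library. The sample bytes and the `decode_blob` helper are hypothetical, made up for this example. The same four bytes come out as three different strings, and the two base64 alphabets are not interchangeable.

```python
import base64
import binascii

# Sample bytes chosen so the two base64 alphabets differ visibly.
blob = bytes([0xFB, 0xEF, 0xFF, 0x00])

hex_text = blob.hex()
b64_std = base64.b64encode(blob).decode("ascii")          # uses '+' and '/'
b64_url = base64.urlsafe_b64encode(blob).decode("ascii")  # uses '-' and '_'

def decode_blob(text, encoding):
    """Hypothetical helper: decode a CSV cell given its declared encoding."""
    if encoding == "hex":
        return bytes.fromhex(text)
    if encoding == "base64":
        # validate=True rejects characters outside the standard alphabet
        # instead of silently skipping them.
        return base64.b64decode(text, validate=True)
    if encoding == "base64url":
        return base64.urlsafe_b64decode(text)
    raise ValueError(f"unknown encoding: {encoding}")

print(hex_text, b64_std, b64_url)

# Decoding with the wrong base64 variant fails loudly under validation.
try:
    decode_blob(b64_url, "base64")
    mismatch_detected = False
except binascii.Error:
    mismatch_detected = True
print(mismatch_detected)
```

Without `validate=True`, Python's `b64decode` discards characters it does not recognize, so a variant mix-up can pass unnoticed.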

Personally, I tend to think of CSV imports as something you can expect to have a ‘yield’ - and it’s never 100%.


Yeah, so just do BSV, a bell-separated file. We already have "\n" newline-separated files. We just need a cell separator, '\a'. Problem solved.

On the plus side, accidentally cat’ng it to your terminal will be pleasantly musical.
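For the curious, a rough sketch of what a bell-separated file could look like with Python's csv module, assuming ASCII BEL (0x07, written "\a" in Python) as the delimiter. The sample rows are made up. Commas stop needing quotes, but a cell that itself contains a bell still has to be quoted, so the quoting rules don't go away.

```python
import csv
import io

BELL = "\a"  # ASCII BEL, 0x07

# Made-up rows: one cell with a comma, one with quotes, one with a bell.
rows = [
    ["name", "note"],
    ["Smith, Bob", 'says "hi"'],
    ["ding", "has a \a in it"],
]

buf = io.StringIO()
csv.writer(buf, delimiter=BELL, lineterminator="\n").writerows(rows)
bsv_text = buf.getvalue()

# Reading back with the same delimiter recovers every cell.
parsed = list(csv.reader(io.StringIO(bsv_text), delimiter=BELL))

print(repr(bsv_text))
print(parsed == rows)
```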


