
Filenames may contain newlines. JSON strings may contain newlines.

The modular aspects of the UNIX philosophy are pretty cool; the data interchange format (un-typed \n-delimited strings) is irrational (and dangerous).
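To make the hazard concrete, here is a minimal sketch (the file names are made up for illustration) of how a newline-delimited list of names splits into the wrong records, while a NUL-delimited list, as produced by tools such as `find -print0` and consumed by `xargs -0`, round-trips:

```python
# Hypothetical file names; the second one contains an embedded newline.
names = ["report.txt", "notes\n2024.txt", "data.csv"]

# Newline-delimited stream: one name per line, newline-terminated.
newline_stream = "\n".join(names) + "\n"
# Drop the empty string that follows the final terminator.
parsed_by_newline = newline_stream.split("\n")[:-1]

# NUL-delimited stream: NUL cannot appear inside a Unix file name,
# so it is an unambiguous terminator.
nul_stream = "\0".join(names) + "\0"
parsed_by_nul = nul_stream.split("\0")[:-1]

print(parsed_by_newline)  # four "names": the real second name was cut in two
print(parsed_by_nul)      # the original three names
```

The consumer of the newline-delimited stream has no way to tell that "notes" and "2024.txt" were one name.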

JSON with a JSON-LD @context and XSD type URIs may also contain newlines (which should be escaped).
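As a rough illustration of the escaping point, Python's standard `json` module writes a newline inside a string as the two characters `\n`, so one serialized record stays on one physical line. The record below is invented for the example:

```python
import json

# Hypothetical record whose string value contains a real newline.
record = {"name": "notes\n2024.txt", "size": 12}

# json.dumps escapes the newline, so the output is a single line.
line = json.dumps(record)
print(line)

# Decoding restores the original string, newline included.
restored = json.loads(line)
```

That is what lets one-JSON-document-per-line streams work even when the values themselves contain newlines.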

Note that, with OSX bash, tab \t must be specified as $'\t'.

And, sometimes, it's \r\n instead of just \n (which is extra-format metadata).
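A small sketch of what goes wrong with mixed line endings, using made-up input bytes: splitting on \n alone leaves a stray \r on some records, while `bytes.splitlines()` accepts \n, \r\n and \r.

```python
# Hypothetical input with mixed CRLF and LF line endings.
raw = b"alpha\r\nbeta\ngamma\r\n"

# Naive split on LF only: carriage returns leak into the records,
# and the final terminator produces a trailing empty record.
naive = raw.split(b"\n")

# bytes.splitlines() treats \n, \r\n and \r as line boundaries.
tolerant = raw.splitlines()

print(naive)
print(tolerant)
```

Note that `splitlines()` also splits on a lone \r, which may or may not be what a given format intends.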

And then Unicode. Oh yeah, unicodë.




You don't have to use $''; you can also use literal tabs (ctrl-v to insert literals). The main difference between macOS and Linux that people notice is that macOS sed doesn't itself interpret \t (so you have to use literal tabs or $'' there).

\r\n is Windows, not Unix.

What about Unicode? (Btw, UTF-8 was created by unixers Rob Pike and Ken Thompson https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt )


With Ctrl-V <tab>, it's not possible to determine by eye whether it's spaces or tabs (without cursoring over the \s|\t).
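One way to sidestep the ambiguity is to print the whitespace explicitly. This is only a sketch; `show_whitespace` is a hypothetical helper, not part of any tool mentioned above:

```python
def show_whitespace(text: str) -> str:
    """Render tabs as a literal backslash-t and spaces as a middle dot."""
    return text.replace("\t", "\\t").replace(" ", "·")

# Hypothetical line containing one tab followed later by spaces.
line = "a\tb  c"
print(show_whitespace(line))
print(repr(line))  # repr() also shows the tab as an escape sequence
```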

When you're parsing a text file, or streaming lines of text delimited with \n, how do downstream programs know whether it's ASCII or Unicode?
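A plain byte stream carries no encoding label, so about the best a downstream program can do is validate the bytes against the encodings it expects. A minimal sketch follows, assuming only ASCII and UTF-8 are candidates; `classify` is a hypothetical helper. Since ASCII is a subset of UTF-8, anything that passes the ASCII check is also valid UTF-8.

```python
def classify(data: bytes) -> str:
    """Guess the encoding by strict validation; this is not proof of intent."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"

print(classify(b"plain text\n"))
print(classify("unicodë\n".encode("utf-8")))
print(classify("unicodë\n".encode("latin-1")))
```

Validation can only rule encodings out: bytes that happen to be valid UTF-8 might still have been written in some other encoding.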



