
Is there an encoding that is less wasteful than base64 but not vulnerable to text editor corruption issues? I think avoiding 0x0 to 0x20 should be enough to not get corrupted by text editors, though base64 avoids a lot more than that.



If you can count on every printable ASCII character surviving unmangled, you can use Ascii85/base85/Z85 (5 ASCII characters per 4 bytes) instead of base64.
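
For anyone who wants to see the 5-to-4 ratio in practice, here is a minimal sketch using Python's standard `base64` module. The sample bytes are arbitrary and only there for illustration; `b85encode` uses the git-style alphabet, which is a different alphabet from Z85.

```python
import base64

data = bytes(range(1, 9))  # 8 arbitrary sample bytes, purely illustrative

b64 = base64.b64encode(data)  # base64: 4 characters per 3 bytes, '=' padded
a85 = base64.a85encode(data)  # Ascii85: 5 characters per 4 bytes
b85 = base64.b85encode(data)  # base85 with the git-style alphabet

# Input length, then the three encoded lengths.
print(len(data), len(b64), len(a85), len(b85))

# Both base85 variants round-trip back to the original bytes.
assert base64.a85decode(a85) == data
assert base64.b85decode(b85) == data
```

Note that Ascii85 output can contain quotes, backslashes and angle brackets, so it only helps if the surrounding format tolerates those.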


There's also base91, with an efficiency of about 6.5 bits of data per printable character, compared to 6.4 with Ascii85 and 6.0 with base64.
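
The figures for the fixed-block encodings fall out of simple arithmetic, sketched below. The helper name `bits_per_char` is made up for this example. The `log2` line is only an upper bound for an alphabet of that size, not a statement about how any particular base91 implementation packs bits.

```python
import math

def bits_per_char(n_bytes: int, n_chars: int) -> float:
    """Data bits carried per output character for a fixed-block encoding."""
    return n_bytes * 8 / n_chars

base16_bits = bits_per_char(1, 2)   # 1 byte  -> 2 hex digits
base64_bits = bits_per_char(3, 4)   # 3 bytes -> 4 characters
ascii85_bits = bits_per_char(4, 5)  # 4 bytes -> 5 characters

print(base16_bits, base64_bits, ascii85_bits)

# Theoretical ceiling for a 91-symbol alphabet: log2(91) bits per character.
base91_ceiling = math.log2(91)
print(round(base91_ceiling, 2))
```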


There's probably a base(bigger number) with Unicode chars today


base65536, and look who the author is :-D

https://github.com/qntm/base65536


Who is the author?




But you need to make sure to use UTF-16 or UTF-32 instead of UTF-8, or you may be worse off.
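
A rough way to see why UTF-8 can hurt, as a sketch under simplifying assumptions: pretend each output character carries 16 bits of payload, and use U+3400 as a hypothetical stand-in for one such symbol. Real alphabets may use code points that cost more bytes than this one. The helper `payload_ratio` is invented for the example.

```python
def payload_ratio(ch: str, codec: str, payload_bits: int = 16) -> float:
    """Payload bits divided by stored bits for one character in a codec."""
    stored_bits = len(ch.encode(codec)) * 8
    return payload_bits / stored_bits

symbol = "\u3400"  # hypothetical stand-in for one 16-bit symbol

for codec in ("utf-8", "utf-16-le"):
    print(codec, round(payload_ratio(symbol, codec), 3))

# base64 stores 6 payload bits in each 8-bit ASCII character.
BASE64_RATIO = 6 / 8
print("base64", BASE64_RATIO)
```

Under these assumptions the symbol takes three bytes in UTF-8, so it stores fewer payload bits per byte than plain base64, while in UTF-16 the same symbol takes two bytes.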


Those get mangled by text editors that don't support them.


While a couple of people suggested Base65536, that encoding isn't particularly compact, and it can't be as elegant as 65536 would suggest because it has to dodge special cases in Unicode.

It's almost always the case that either Base32768 is denser, or encodings with 2^17 or 2^20 characters are denser.


At that point you're basically doing yEnc.


If you mean the thing you want to encode is mostly ASCII, then there's https://en.wikipedia.org/wiki/Quoted-printable ... it's a real throwback, I've not seen this in the wild since the 90s, but it's there in the Python standard library (quopri), Perl (MIME::QuotedPrint), etc.
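
A quick sketch with Python's `quopri` module, assuming a short mostly-ASCII input picked just for illustration: ASCII bytes pass through as-is, and each non-ASCII byte becomes a three-character `=XX` escape.

```python
import quopri

text = "café au lait".encode("utf-8")  # sample input; only the é is non-ASCII

encoded = quopri.encodestring(text)
print(encoded)

# Each of the two UTF-8 bytes of é grows from 1 byte to 3 characters.
print(len(text), len(encoded))

# Decoding restores the original bytes.
assert quopri.decodestring(encoded) == text
```

The flip side is that input with many non-ASCII bytes triples in size, so this only pays off for text that really is mostly ASCII.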


Base85, also called Ascii85. Also yEnc.


base16


Woah, why the downvotes.



