
The = does not appear if the encoded data is already a multiple of 4 characters long (i.e. the input length is a multiple of 3 bytes). So you wouldn't know whether aGVsbG8I is one stream or two. The = is not a separator, only padding to bring the base64 stream up to a multiple of 4 characters, for some reason.
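
To make that concrete, here is a small Python sketch using only the standard `base64` module. The helper name `padding_count` is made up for illustration; it just counts the `=` characters produced for an input of a given length.

```python
import base64

def padding_count(n: int) -> int:
    """Number of '=' characters in the base64 encoding of n arbitrary bytes."""
    return base64.b64encode(b"x" * n).count(b"=")

# Padding depends only on the input length modulo 3.
pads = [padding_count(n) for n in range(1, 7)]

# An 8-character string with no '=' decodes fine as a single 6-byte value,
# and nothing in it says whether it was meant as one piece or two.
one_stream = base64.b64decode("aGVsbG8I")
two_streams = base64.b64decode("aGVs") + base64.b64decode("bG8I")
```

Both readings of aGVsbG8I give the same bytes here, which is the point: the absence of = carries no boundary information.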

I only mentioned the concatenation because Wikipedia claims this use case requires padding while in reality it doesn't.




Base64 doesn't have a concept of a "stream". Conceptually, a base64-encoded string with padding is a concatenation of fragments that are always 4 characters long but can encode one to three bytes. Concatenating two padded base64 strings therefore doesn't destroy the fragment structure, and the result can be decoded into a byte sequence that is the concatenation of the two original inputs. Without padding, fragments can also be 2 or 3 characters long, and short fragments are indistinguishable from long ones, so concatenation destroys the fragment structure.
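
A rough sketch of that argument in Python. The function `decode_quanta` is a hypothetical helper written for this comment, not a library API; it decodes each 4-character fragment on its own so the result doesn't depend on how any particular decoder treats = in the middle of its input.

```python
import base64

def decode_quanta(text: str) -> bytes:
    """Decode padded base64 one 4-character fragment at a time."""
    if len(text) % 4 != 0:
        raise ValueError("padded base64 must be a multiple of 4 characters")
    out = bytearray()
    for i in range(0, len(text), 4):
        # Each fragment is itself a valid padded base64 string.
        out += base64.b64decode(text[i:i + 4])
    return bytes(out)

a = base64.b64encode(b"hello").decode()  # "aGVsbG8="
b = base64.b64encode(b"world").decode()  # "d29ybGQ="

# With padding, fragment boundaries survive concatenation.
joined_padded = decode_quanta(a + b)

# Without padding, the second string starts mid-fragment and everything shifts.
stripped = a.rstrip("=") + b.rstrip("=")
repadded = stripped + "=" * (-len(stripped) % 4)
joined_unpadded = decode_quanta(repadded)
```

Here `joined_padded` comes back as the original bytes joined together, while `joined_unpadded` starts correctly and then turns into garbage once the fragments are misaligned.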


Oh I see, so it's for concatenating multiple base64 fragments of the same single piece of data? But where is this used? I've never seen that. JavaScript's base64 decoder gives an error for ='s in the middle (but I just found out the Linux base64 -d command supports it!)


I actually don't know if it's intentional, but it is the only explanation that makes sense. It should be noted that the original PEM specification (RFC 989) did have a similar use case where encrypted and unencrypted portions can be interleaved, delimited by `*` characters, but you are still required to pad each portion to 4n characters (e.g. `c2VjcmV0LCA=*cHVibGlj*IGFuZCBzZWNyZXQgYWdhaW4=`). It is still the closest thing to what I think padding characters are required for.
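
For what it's worth, the example string splits and decodes like this in Python. This only illustrates the `*` framing and per-portion padding; it ignores the actual encrypted/unencrypted handling in the RFC, and the variable names are mine.

```python
import base64

message = "c2VjcmV0LCA=*cHVibGlj*IGFuZCBzZWNyZXQgYWdhaW4="

# Each '*'-delimited portion is padded to a multiple of 4 characters,
# so every portion can be decoded independently.
portions = message.split("*")
decoded = [base64.b64decode(p) for p in portions]
text = b"".join(decoded)
```

Each portion has a length that is a multiple of 4, and the decoded pieces join up into one readable sentence.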


It would decode correctly but you wouldn't know the boundary, if that matters. I see, thanks.



