Not quite. The ‘=’ isn’t strictly padding - it’s the padding marker. You pad the original data with one or two bytes of zeroes, then add one ‘=’ per padding byte to indicate how many you had to add.
This is because if you’ve only got one of the three bytes you’re going to need, your data looks like this:
XXXXXXXX
Then when you group into 6-bit base64 digits you get
XXXXXX XX????
Which you have to pad with two bytes’ worth of zeroes, because otherwise you don’t even have a full second digit:
XXXXXX XX0000 000000 000000
So to encode all your data you still need the first two of these four base64 digits - although the second one will always have zeroes in its low four bits, so it’ll be 0, 16, 32, or 48.
The two ‘=’ aren’t just telling you those last 12 bits are zeroes - they’re telling you to ignore the last four bits of the previous digit too.
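Here’s a rough Python sketch of that one-byte case, in case it helps to see the bit shuffling. The helper name `encode_one_byte` is made up for illustration, and the alphabet is the standard one (A-Z, a-z, 0-9, +, /); the stdlib `base64` module is only there as a cross-check.

```python
import base64

# Standard base64 alphabet: the index in this string is the 6-bit digit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_one_byte(b: int) -> str:
    """Encode a single leftover byte (hypothetical helper, for illustration)."""
    bits = b << 16               # XXXXXXXX followed by two zero bytes = 24 bits
    d0 = (bits >> 18) & 0x3F     # XXXXXX
    d1 = (bits >> 12) & 0x3F     # XX0000, so always 0, 16, 32 or 48
    # The last two digits would be pure padding, so they become '=' markers.
    return ALPHABET[d0] + ALPHABET[d1] + "=="

print(encode_one_byte(0xFF))     # "/w=="
print(base64.b64encode(b"\xff").decode("ascii"))  # same string from the stdlib
```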
Similarly with two bytes remaining:
XXXXXXXX YYYYYYYY
That groups as
XXXXXX XXYYYY YYYY??
Which pads out with one byte of zeroes to
XXXXXX XXYYYY YYYY00 000000
And now your third digit is some multiple of 4, because its low two bits are forced to be zero. The single ‘=’ replaces the fourth digit, which is all padding.
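The two-byte case looks much the same in code. Again a sketch only: `encode_two_bytes` is a made-up name, and the stdlib call is just a sanity check.

```python
import base64

# Standard base64 alphabet: the index in this string is the 6-bit digit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_two_bytes(x: int, y: int) -> str:
    """Encode two leftover bytes (hypothetical helper, for illustration)."""
    bits = (x << 16) | (y << 8)  # XXXXXXXX YYYYYYYY plus one zero byte
    d0 = (bits >> 18) & 0x3F     # XXXXXX
    d1 = (bits >> 12) & 0x3F     # XXYYYY
    d2 = (bits >> 6) & 0x3F      # YYYY00, so always a multiple of 4
    # The fourth digit would be pure padding, so it becomes a single '='.
    return ALPHABET[d0] + ALPHABET[d1] + ALPHABET[d2] + "="

print(encode_two_bytes(0xFF, 0xFF))  # "//8="
print(base64.b64encode(b"\xff\xff").decode("ascii"))  # same string from the stdlib
```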
Funny side effect of this:
Some base64 decoders will accept a digit right before the padding that isn’t either a multiple of four (with one byte of padding) or of 16 (with two).
They will decode the digit as normal, then discard the lower bits.
That means it’s possible in some decoders for dissimilar base64 strings to decode to the same binary value.
Which can occasionally be a security concern, when base64 strings are checked for equality, rather than their decoded values.
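To make that concrete, here’s a toy lenient decoder in Python. It’s a stand-in I wrote for illustration (`lax_decode` and `is_canonical` are made-up names), not any particular library’s behaviour. One possible defence, sketched here, is to re-encode the decoded bytes and compare against the input before trusting string equality.

```python
import base64

# Standard base64 alphabet: the index in this string is the 6-bit digit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def lax_decode(text: str) -> bytes:
    """Toy lenient decoder: decode every digit, keep whole bytes, drop leftover bits."""
    acc = 0      # bit accumulator
    nbits = 0    # number of bits in acc not yet emitted
    out = bytearray()
    for ch in text.rstrip("="):
        acc = (acc << 6) | ALPHABET.index(ch)
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    # Any bits still counted in nbits are silently discarded, zero or not.
    return bytes(out)

def is_canonical(text: str) -> bool:
    """True if re-encoding the decoded bytes gives back exactly the same string."""
    return base64.b64encode(lax_decode(text)).decode("ascii") == text

a, b = "/w==", "/x=="   # second digits are 48 and 49; only the low bits differ
print(a == b)                            # False: the strings differ
print(lax_decode(a) == lax_decode(b))    # True: both decode to b"\xff"
print(is_canonical(a), is_canonical(b))  # True False
```

So anything that compares or deduplicates the encoded strings would treat these as two different values, even though this decoder hands back identical bytes for both.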