One thing I was confused about. The document says there are 7 byte types, but I thought UTF-8 was variable width up to only 4 bytes. Did I misunderstand something?
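Both are correct: the original UTF-8 design allowed sequences of up to 6 bytes and could encode values up to 2^31. But because UTF-16 can only reach 17 planes of 64K values each (the base plane plus 16 supplementary planes), Unicode has a hard limit of U+10FFFF, which is 1,114,112 code points, a little over 2^20.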
This means a UTF-8 sequence of more than 4 bytes can never represent a valid Unicode code point, even if it decodes to a value that fits in 32 bits.
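If it helps to see the arithmetic, here is a rough Python sketch of how many payload bits each sequence length carries in the original scheme, compared against the Unicode limit. The names `payload_bits` and `UNICODE_MAX` are made up for this illustration; the last part relies on Python's built-in decoder, which follows the modern 4-byte restriction.

```python
UNICODE_MAX = 0x10FFFF  # highest code point reachable through UTF-16


def payload_bits(n: int) -> int:
    """Payload bits in an n-byte sequence of the original (up to 6-byte) UTF-8."""
    if n == 1:
        return 7  # 0xxxxxxx
    # The lead byte keeps (7 - n) bits; each continuation byte carries 6.
    return (7 - n) + 6 * (n - 1)


for n in range(1, 7):
    max_value = (1 << payload_bits(n)) - 1
    covers = max_value >= UNICODE_MAX
    print(f"{n} byte(s): {payload_bits(n):2d} bits, max {max_value:#x}, "
          f"covers all of Unicode: {covers}")

# The highest code point already fits in 4 bytes.
print(len(chr(UNICODE_MAX).encode("utf-8")))

# A 5-byte sequence in the old layout (lead byte 0xF8) is rejected today.
try:
    bytes([0xF8, 0x88, 0x80, 0x80, 0x80]).decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Four bytes give 21 payload bits, which is already more than U+10FFFF needs, so the 5- and 6-byte forms have nothing valid left to encode.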