overlapping b and i elements
<p>he<b>ll<i>o w</b>or</i>ld</p>
Contrary to the article, it can still be represented as a tree by decomposing the children into their own nodes. In this case each character becomes a node, with child nodes expressing which formatting is active, followed by the letter, and then turning off all the active formatting.
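A rough sketch of that per-character decomposition, assuming bare tags with no attributes. The `decompose` function and the `{ char, formats }` node shape are made up for illustration, not anything the article or a browser defines:

```javascript
// Flatten markup with overlapping tags into one node per character.
// Assumption: tags are bare names such as <b> or </i>, with no attributes.
function decompose(markup) {
  const active = []; // formats currently open, in the order they were opened
  const nodes = [];
  const token = /<(\/?)([a-z]+)>|([^<])/g;
  for (const [, closing, tag, ch] of markup.matchAll(token)) {
    if (ch !== undefined) {
      // A text character: record it with a copy of the active formats.
      nodes.push({ char: ch, formats: [...active] });
    } else if (closing) {
      // A closing tag: drop the most recent matching format, wherever it sits.
      const i = active.lastIndexOf(tag);
      if (i !== -1) active.splice(i, 1);
    } else {
      // An opening tag: the format becomes active.
      active.push(tag);
    }
  }
  return nodes;
}

const nodes = decompose("he<b>ll<i>o w</b>or</i>ld");
console.log(nodes.map((n) => `${n.char}:${n.formats.join("+")}`).join(" "));
```

Every character carries its own list of formats, so nothing has to be nested inside anything else. The cost is that the original element boundaries are no longer stored anywhere.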
No, that's just nesting. It's overlapping if the lifetime of a child tag extends beyond the lifetime of the parent tag.
For example, if you have two paragraphs and bold the end of one and the start of the next:
<p>hello <b>world</p> <p>this is</b> your captain speaking</p>
Obviously bold is a poor example as you can just terminate and start a new bold without penalty. But if these were more semantic elements like "sections" and "verses" and "lines" then it might not be possible.
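To make the "terminate and start a new bold" workaround concrete, here is a minimal sketch. The `splitAcrossParagraphs` function and its offset convention are hypothetical, chosen only for this example:

```javascript
// Split one formatting range across paragraph boundaries so that every
// paragraph gets its own properly closed element.
// Assumptions: `start` and `end` are offsets into the paragraph texts joined
// with no separator, and the text needs no HTML escaping.
function splitAcrossParagraphs(paragraphs, start, end, tag = "b") {
  let offset = 0; // position of the current paragraph in the joined text
  return paragraphs.map((text) => {
    const from = Math.max(start - offset, 0);
    const to = Math.min(end - offset, text.length);
    offset += text.length;
    if (from >= to) return `<p>${text}</p>`; // range does not touch this paragraph
    return (
      `<p>${text.slice(0, from)}` +
      `<${tag}>${text.slice(from, to)}</${tag}>` +
      `${text.slice(to)}</p>`
    );
  });
}

const html = splitAcrossParagraphs(
  ["hello world", "this is your captain speaking"],
  6,
  18
);
console.log(html.join("\n"));
```

The output has two separate `<b>` elements. That is harmless for bold, but for something like a verse or section the split loses the fact that both pieces were one unit, unless you add an extra attribute to tie them back together.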
It’s actually fiddlier than you may think. Take “Ta” for an example: in most decent fonts, there will be a kerning pair that tightens those characters, tucking the “a” underneath the beam of the “T” a little. The shaper thus needs to follow the actual fonts being used, for kerning purposes, rather than the markup—but this is still visible at the element level, with getBoundingClientRect().
Take this demo (which depends on your default font having such a kerning pair; if it doesn’t, you may need to find one that does and change the font by inserting <html style="font-family:sans-serif"> or similar after the comma):
This shows five variants of “Ta”, with the last two being <b>Ta</b> and <b>T</b><b>a</b>, and prints five numbers to the console, the widths of each <b> element. Numbers one and four (both corresponding to a <b>T</b>) differ if you have a kerning pair such as I describe: for me, the first is 11.7px, and the second 10.73333px (though it overflows that width in its rendering) because of the <b>a</b> that follows it. If you gave bold elements the style `display: inline-block`, it wouldn’t kern the pair and would thus go back to 11.7px.
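A minimal sketch of the measurement step only, not the demo itself. The `boldWidths` helper is a hypothetical name, and the markup being measured is whatever page you run it against:

```javascript
// Collect the rendered width of every <b> element under `root`.
// Assumption: `root` is a Document or Element in a browser, or any object
// exposing querySelectorAll whose results have getBoundingClientRect.
function boldWidths(root) {
  return Array.from(
    root.querySelectorAll("b"),
    (el) => el.getBoundingClientRect().width
  );
}

// Only runs where a DOM is available, for example a browser console.
if (typeof document !== "undefined") {
  console.log(boldWidths(document));
}
```

If the font has a kerning pair for "Ta", the width reported for a `<b>T</b>` directly followed by `<b>a</b>` should come out smaller than for a `<b>T</b>` with no "a" after it.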
Most fonts could really use italic-aware kerning (that is, kerning a pair where one glyph is regular and the other italic), but it’s sadly not a thing.