Yeah, there are some additional tags you could toss in. I even thought of adding div and span, which have their occasional uses.
The issues I've seen tend to run the other way though: far too often sites abuse low-level elements (or higher level ones) to accomplish things the wrong way. Example time:
• Google+. They're too good for <i> or <em>. Italicized or bolded text gets its own CSS class, which you can't reverse engineer. The entire CSS shitpile (thanks to minification) is a mess of utterly nonsemantic elements. I'm far, far, far too well aware of this, having written a very extensive personal stylesheet to fix its many, many, many UI/UX presentation issues. If you're exporting your G+ posts (which I have, as I slowly exit the system), use the JSON export option, which includes your original marked-up input. That is far, far, far more portable.
• Blogger's "dynamic" templates. There's something to be said about a design for a text-oriented web site that is so utterly broken as to make it impossible to actually read the text. It's so broken (and dynamic) that I can't get it to sit still long enough to actually fix it with Stylebot. I've actually written bloggers asking them to change their templates so I can read their content. Oh, and my usual fallback, Readability, fails miserably as well. Another Google property, whaddyaknow.
• Some Indian news journal site. A stunning cascade of intricately nested divs. Someone was clearly doing the needful ... to gin up consulting revenues. But I have a doubt.
• Some old-school Web 1.0 sites where ... all text is bolded (fixable with b, strong { font-weight: normal }), or written inside <h4> tags (WTF?), or background images are used (more WTF, plus a plaintive "why?!!").
• Paragraphs separated by <br> elements. No <p> tags. Better: Long streams of <br> tags inserted into RSS feeds. These can be addressed with sibling rules: "br + br + br { display: none; }".
• Paragraph indents accomplished by consecutive &nbsp; entities. That's what CSS rules are for. Sigh.
• Mixing px and pts or px and ems for any character sizing / positioning. IF IT'S TEXT, USE PT OR EMS.
• Fucking with letter spacing.
• Random <span> elements with hardcoded, px-sized, fonts.
• Inline styles.
• Tables used for text layout. Yeah, pg, I'm looking at you.
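For what it's worth, here's roughly what a user stylesheet for a few of the items above could look like. Treat it as a sketch, not a drop-in fix: only the br rule comes straight from the list, while the span selector, the letter-spacing reset and the 1.5em indent are my own assumptions about what a given site needs, and you'd scope all of it per site in Stylebot or similar.

```css
/* Collapse long runs of <br>: hide the third and every later one in a row. */
br + br + br { display: none; }

/* Un-bold sites that bold everything; font-weight is what controls boldness. */
b, strong { font-weight: normal; }

/* Neutralize spans with hardcoded inline font sizes (assumes they are set
   via a style attribute); !important is needed to beat an inline style. */
span[style*="font-size"] { font-size: inherit !important; }

/* Undo letter-spacing tinkering everywhere. */
* { letter-spacing: normal !important; }

/* Indent paragraphs with CSS instead of leading non-breaking spaces.
   The 1.5em value is an arbitrary illustration. */
p { text-indent: 1.5em; }
```

One caveat on the br rule: the + combinator ignores text between elements, so on a page that separates paragraphs with single <br> tags it will also hide the third and later breaks inside the same parent. It's best kept to feeds and pages where the breaks really do come in streams.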
Don't forget the divitis of responsive web design - one of my utter bugbears. Should people jump ahead of technology so much, or wait for Flexbox support to catch up and then do it? I know what I think...