I disagree, more or less completely. The semantics of HTML matter to the following:
1. Browsers which render the HTML
2. Machines, such as search engines, which "look" at it
3. People who edit the HTML (in the case that it is a template)
Whether your HTML is semantically correct, or even stylistically valid, doesn't much matter to modern browsers because they've been designed under the assumption that most people will be too lazy or inept or uncaring to generate "proper" HTML. Obviously, this issue doesn't exist for compilers: if a compiler doesn't output proper machine code, the program simply doesn't work.
Where this starts to matter, however, is when someone leaves the comfort of their Firefox 3 on a huge screen and goes to Safari on an iPhone, for instance. Hacker News, for example, does not scale properly on the iPhone, which leads to unnecessary and annoying horizontal scrolling past a certain nesting level, among other issues. This is tied to semantics, though whether or not a font element is used doesn't matter -- what does cause this, however, is the ridiculous reliance on tables, which obviously aren't semantically proper since you're not displaying strictly tabular data.
What about these limited browsers, and "machines" like Google? If you want a sidebar on the left side of your page, with tables that sidebar has to come before the main content in the source. Using proper CSS techniques (semantics of HTML notwithstanding), you can place it after the actual content. People with smaller browsers could view your main content before your navigation with minimal changes, and crawlers such as Google will index your content more accurately because of the placement of content in the source document. Semantics come into the picture when elements are misused or not used, such as header elements. These elements do have meaning to search engines.
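To make the source-order point concrete, here is a minimal sketch. The two pages, the ids "content" and "sidebar", and the float rules are all made up for illustration; the parser just collects text in the order it appears in the source, which is roughly what a crawler or a small-screen browser that ignores layout gets to work with.

```python
from html.parser import HTMLParser

# Hypothetical page using a layout table: the sidebar cell must come first.
TABLE_PAGE = (
    "<table><tr>"
    "<td>Home | About | Contact</td>"
    "<td><h1>Article title</h1><p>Main content.</p></td>"
    "</tr></table>"
)

# Hypothetical page using divs: content first in the source, sidebar after.
CSS_PAGE = (
    '<div id="content"><h1>Article title</h1><p>Main content.</p></div>'
    '<div id="sidebar">Home | About | Contact</div>'
)

# One possible stylesheet (an assumption) that still shows the sidebar on the left.
CSS = "#content { float: right; width: 75%; } #sidebar { float: left; width: 25%; }"


class TextInSourceOrder(HTMLParser):
    """Collects non-empty text chunks in the order they occur in the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def source_order(markup):
    parser = TextInSourceOrder()
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    print(source_order(TABLE_PAGE))  # navigation text comes before the article
    print(source_order(CSS_PAGE))    # article comes before the navigation
```

Both pages can look the same on a big screen; only the second one leads with the content when the styling is ignored.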
How about screen readers and other assistive devices? Many of these take context clues from the HTML, such as the type of element used to wrap content and its attributes, as a way to present pages more accurately to the user. Semantics matter a lot when it comes to accessibility. What about page size? The more tables you have on a page, the larger that page becomes to download. Perhaps broadband is becoming ubiquitous in many places (as far as America goes, at least in somewhat urban areas), but that doesn't mean that it makes sense to waste time transferring bytes.
There are a slew of other applications where HTML semantics bleed out into the real world, such as Microformats and the like, but many of these applications haven't gained widespread traction, so they probably aren't worth adding to my argument.
The point here is that semantics matter when whatever is "looking" at something cares about the semantics of it. Machine code is very simple: a machine looks at it and executes it based on strict rules. If a shift instruction is used, it's because that instruction will (presumably) give some kind of performance benefit, not because the machine views it as having a different "meaning" from any other multiplication method. The fact that you can achieve the same thing with a table as you can with a div or paragraph tag doesn't mean it's correct to do so. If everything was meant to work properly and to its full potential as a table-based element, why have all these other silly layout elements?
So, at the end of the day, what do we "CSS Zealots" and semantic HTML proponents get ourselves? Well, we get more maintainable, easier to read, accessible, flexible, more accurately indexed web pages. And in my world (the world of skilled web developers) we all still edit our web pages by hand -- using templates that are just a thin layer of abstraction above web pages. The day you see a META tag that says "Generated by Dreamweaver" on TicketStumbler is the day Dreamweaver has managed to produce equally good or better HTML than I can by hand -- HTML which affords all the advantages and luxuries that the stuff I write does.
Edit: As a final note, I'd like to point out that there is a middle ground here. You can create generally "good" HTML without spending a day deciding what a class should be named. TS supports Microformats for events, but running it through an HTML validator would likely produce some minor errors. I'm not suggesting everyone put 10x the time into writing proper HTML and CSS to present it, I'm just suggesting that they put a little time into it -- for the sake of their site, their visitors, the search engines, and the sanity of web developers like me everywhere... if I have to write another XPath string that looks like /table/tr/td/table/tr/td/table/tbody/td[...] I am going to spit on somebody.
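For anyone who hasn't had to scrape a table-based page, a rough sketch of the XPath pain. Both documents and the class name "event-title" are invented for this example, and it uses the limited XPath support in Python's standard xml.etree.ElementTree, so the markup has to be well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical page where the title is buried in nested layout tables.
TABLE_LAYOUT = """
<html><body>
<table><tr><td>
  <table><tr><td>
    <table><tr><td>Concert tonight</td></tr></table>
  </td></tr></table>
</td></tr></table>
</body></html>
"""

# Hypothetical page where the same title sits in a heading with a class hook.
SEMANTIC_LAYOUT = """
<html><body>
<div id="content">
  <h2 class="event-title">Concert tonight</h2>
</div>
</body></html>
"""


def first_text(document, path):
    """Return the stripped text of the first node matching path, or None."""
    node = ET.fromstring(document.strip()).find(path)
    if node is None or node.text is None:
        return None
    return node.text.strip()


# The table version needs the full chain of layout cells spelled out.
table_title = first_text(TABLE_LAYOUT, "./body/table/tr/td/table/tr/td/table/tr/td")

# The semantic version can ask for the thing by what it is.
semantic_title = first_text(SEMANTIC_LAYOUT, ".//h2[@class='event-title']")

if __name__ == "__main__":
    print(table_title, "|", semantic_title)
```

The first path breaks as soon as someone wraps the page in one more layout table; the second keeps working as long as the heading keeps its class.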
1. No. Browsers just have to execute html, not understand it, just as a processor when it shifts bits doesn't have to know that the purpose is multiplication.
2. If you want programs to understand the information on a page at something above a textual level, the right way to do that is to explicitly encode whatever you want as xml, not by using divs instead of tables.
3. I agree: this is the case where you'd want something like CSS. But editing html by hand is not the only way to get web pages. You can also have software generate them. And in that case munging the results afterwards is as inelegant as patching machine code.
1. Yeah, I guess I worded that incorrectly, though you can see the tie-in between semantics and how that html is executed (rendered). Using proper semantics = not using tables for page layout = better rendering (under circumstances mentioned)
2. ... so you're saying we should write HTML, then write XML to describe that HTML? Why not just write HTML which describes itself in the first place? It seems awfully redundant to do it any other way, considering HTML comes with all these handy tags which, when used correctly, describe the content already. Hell, XHTML was created as an HTML which conforms to the XML spec for precisely this purpose.
Then, yeah, you've got your work cut out for you to give that any sort of meaning...
3. "And in that case munging the results afterwards is as inelegant as patching machine code." -- Right, so your choices are write it yourself or get a bunch of garbage generated by an IDE or Framework or whatever. If you're happy with the garbage then you probably don't need to edit it afterwards or just "edit" it using the same IDE or language or whatever. What happens when your IDE or framework or language can't generate HTML/CSS to do something you need? Then you have to start editing that crap by hand... this should eventually lead to suicide.
Personally I think the most important of the anti zealot side of this debate is the insistence on everything being centred on code being written by hand.
Dingy little WYSIWYG editors, machine generated web pages, & (in this context, I suppose) software that allows users to generate anything from blogs to shops to social news sites without any technical knowledge, and all their friends have probably done more for the web (& its accessibility) than standards have.
We aren't there yet. From online shops to Slinkset, you still need a little css & html to get you past a certain level of control. But I think it would be an achievement, if these weren't needed any more & 'view source' got taken out of browsers from lack of use.
I don't really know what "View Source" has to do with not writing your own HTML, but I completely agree with your argument. I'm not saying I want to always write HTML by hand, I'm saying it should be done until such a time that WYSIWYG editors, online web site creators and so forth get to the level that they generate proper HTML. I don't get any more pleasure out of writing boiler-plate HTML than I do from writing boiler-plate code for the backend. If someone has created a framework or library that already does what I need (or gets me on the road to what I need without getting in my way), I'm certainly going to use it.
Needing a little CSS and HTML to get you past a certain level is no big deal, assuming the initial code generated is semantic, (mostly) valid, extensible, blah blah blah. Having tables nested 5-deep is none of these things.
No sorry, I think I was unclear. I meant the opposite. 'View Source' will be removed because no one should want to view source.
What I was saying is that the ability of non technical users to make sites, no matter how bad that code, is much more important than anything else in this debate. Standards should (I think) almost be written specifically for machines.
Sure, you can think of creating the best possible environment that makes sense & so that the pages can be used on future browsers & browser like things, but that is dwarfed by the prospect of letting grandma have a website.
Sure, you can think of creating the best possible environment that makes sense & so that the pages can be used on future browsers & browser like things, but that is dwarfed by the prospect of letting grandma have a website.
I would argue that the continued evolution and improvement of the Web are far, far more important than letting Grandma have a website. Besides, Grandma can already have a website -- there are dozens of services out there like Typepad, Wordpress, Blogspot, etc. which allow Joe Public to have a website -- and he should be able to have a website, so long as whoever (or whatever) makes that site uses proper HTML, CSS, and Standards to create it.
Would it be great to get to a point where Standards are written for machines, I can write some super high-level markup that describes how I want a page laid out (or just draw it!) and nobody has to write another line of HTML or CSS, ever? Sure. But it's not going to happen any time soon and pretending like it has just hurts everybody.
If your framework generates a 30-character ID for damn near every element on the page (hi, Microsoft), it's Not Good Enough.
If your language generates tables for damn near everything, it's Not Good Enough.
I would argue that the continued evolution and improvement of the Web are far, far more important than letting Grandma have a website
That's a little cheeky. What I was implying is that letting Grandma have a website is more important to (not than) the continued evolution and improvement of the Web. Standards are (to that end) important inasmuch as they allow the guys that make Typepad, Wordpress & friends to let her do so. That is the criterion they should be judged by.
So you're against rss feeds? You think it would be a more elegant solution if instead of having an rss feed for a site like hacker news, other apps scraped the front page?
Of course not, but you seemed to be saying that if any semantic meaning is desired, it should be encoded as such and made available separate from the HTML... I don't agree.
Take Microformats, for instance. Say you're already displaying Events somewhere on a site; to add Microformat support all you do is add a few attributes to your existing mark-up and any Microformat-aware browser can properly recognize it. Sure, you could create another file entirely that contains XHTML specifically created as a Microformat, but that would be redundant and not nearly as helpful to visitors.
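A rough sketch of what "a few attributes" means here. The class names vevent, summary, dtstart and location come from the hCalendar microformat; the event itself and the tiny reader below are made up for illustration, and a real Microformat parser handles far more cases (nesting, multiple events, other properties).

```python
from html.parser import HTMLParser

# Hypothetical event listing: ordinary visible markup plus hCalendar class names.
EVENT_HTML = (
    '<div class="vevent">'
    '<span class="summary">Concert</span> on '
    '<abbr class="dtstart" title="2008-09-12">September 12</abbr> at '
    '<span class="location">Main Hall</span>'
    "</div>"
)

PROPERTIES = {"summary", "dtstart", "location"}


class EventReader(HTMLParser):
    """Very small reader for one flat hCalendar-style event (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for prop in PROPERTIES & classes:
            if attrs.get("title"):
                # Machine-readable value carried in the title attribute.
                self.event[prop] = attrs["title"]
            else:
                self._current = prop

    def handle_data(self, data):
        if self._current:
            self.event[self._current] = self.event.get(self._current, "") + data

    def handle_endtag(self, tag):
        self._current = None


def read_event(markup):
    reader = EventReader()
    reader.feed(markup)
    reader.close()
    return reader.event


if __name__ == "__main__":
    print(read_event(EVENT_HTML))
```

The visitor still sees "Concert on September 12 at Main Hall"; the same markup gives a program the summary, date and location without a second file.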
Or, say, a sitemap. Many times it makes sense to have a completely separate XML-based Sitemap for, say, Google. Then again, what if you just added the requisite semantic mark-up to an existing sitemap, one your visitors can use, but so can Google? You basically get two for the price of one.
RSS is a different animal and in most cases couldn't logically be encoded into the XHTML itself because you wouldn't have every RSS item on one page, etc. That's why the <link> tag is used to include it in the source, just like you include JavaScript and CSS and so forth.
2. that's what xhtml is all about isn't it? like xml + html? so then you're already encoding xml when you write xhtml, so proper semantic should be important.
At least that's what I think, may be misunderstanding the whole thing.
Also, HTML 5 and XHTML 2 seem to be emphasizing semantics.