HTML or XHTML - does it really matter?

Jonathan Snook is asking a simple question with a not-so-simple answer: Which is better: HTML or XHTML? More specifically, he is asking which is the better of HTML 4.01 Strict and XHTML 1.0 Strict.

Like so many other topics in web development, this has been discussed before. I don’t see a problem with bringing it up again though. A renewed discussion can bring forward new arguments. Or we can just have some fun.

My opinion is that it doesn’t really matter, but I do think that if you choose to use HTML 4.01, you should make it as XHTML compatible as possible. Basically write your HTML as if it were XHTML - make element and attribute names lower case, quote all attribute values, and don’t skip optional end tags. You will thank yourself if you some day decide to switch to XHTML.
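
As a rough illustration of the first two rules, here is a small Python sketch that flags upper-case element or attribute names and unquoted or minimized attribute values. The function name `lint_start_tags` and both regular expressions are made up for this example and are not taken from any validator. The check deliberately skips the third rule (omitted end tags), which would need a real parser, and it will be fooled by a `>` inside an attribute value.

```python
import re

# Rough start-tag matcher: "<", an element name, optional attributes, optional "/", ">".
# This is a sketch, not a full HTML tokenizer.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)((?:\s+[^<>]*?)?)\s*/?>")

# One attribute: a name, optionally followed by "=" and a quoted or unquoted value.
ATTRIBUTE = re.compile(
    r"""([A-Za-z_:][-A-Za-z0-9_:.]*)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+))?"""
)


def lint_start_tags(markup):
    """Return a list of messages about start tags that are not XHTML-style."""
    problems = []
    for tag in START_TAG.finditer(markup):
        name, attrs = tag.group(1), tag.group(2)
        if name != name.lower():
            problems.append(f"element name not lower case: {name}")
        for attr in ATTRIBUTE.finditer(attrs):
            attr_name, value = attr.group(1), attr.group(2)
            if attr_name != attr_name.lower():
                problems.append(f"attribute name not lower case: {attr_name}")
            if value is None:
                # Minimized attributes such as "checked" are not allowed in XHTML.
                problems.append(f"attribute without a value: {attr_name}")
            elif value[0] not in "\"'":
                problems.append(f"attribute value not quoted: {attr_name}")
    return problems


for message in lint_start_tags("<P CLASS=intro>Hello<br>"):
    print(message)
```

Markup written the XHTML-compatible way, such as `<p class="intro">Hello</p>`, comes back with an empty list.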

Posted on January 17, 2006 in (X)HTML, Quicklinks

Comments

  1. The main reason for me is consistency. We chose XHTML 1.0 at work because it enforces things like closing tags, tag case, and quoting attributes that HTML 4.01 doesn’t. It makes it a little easier to share code among projects and developers because we know what the rules are, and we have lots and lots of validators to enforce them.

  2. My point is simple: write valid HTML. If you decide to convert it to XHTML later you can use something like HTML Tidy for that. And, frankly, I do not see many reasons why I would want to convert my valid HTML to XHTML in the future…

  3. I use XHTML for the same reason Kevin stated: it enforces closed tags and leaves little room for error. I like to have everything clean and laid out, and little things like self-closing tags help me point out where things are in my coding. Uppercase tags just look plain obnoxious too.

  4. I think and (now) practice exactly that: write HTML 4.01 which is as compliant as possible with real XHTML.

    Of course you should choose the strict version of the HTML 4.01 DOCTYPE.

    I (now) refuse to write XHTML and serve it as HTML, but I keep the HTML as strict as possible.

    As for HTML vs. XHTML I answered on Jonathan’s site.

  5. I always write in XHTML, whatever it’s served as. It’s for much the same reasons stated before (coding standards, easy to read etc). I do try to serve as xml over text but sometimes that’s not an option. Besides, I’ve spent the last month persuading my colleagues to adopt XHTML so I’m not going to go back on it!

    As for which is better, I’d say they’re on an even playing field as far as today’s technology is concerned.
    Once XHTML is fully supported it’ll become a no-brainer.

  6. Since XHTML was released (and all browsers support it) I see no reason to go back to HTML.

    XHTML is similar to the original HTML, but the rules make all the difference. Today, XHTML preserves the semantics much more than HTML, because it encourages you to code in two parts: structure and layout.

    sorry for bad english :)

  7. January 17, 2006 by Edward J. S. Atkinson

    It doesn’t matter. Well formed code of any sort is fine as long as it’s just that, well formed. The doctype declaration is inconsequential at this point in time and not worth heavy debate.

    I use well formed (or XHTML compliant) HTML 4.01, which is what I would suggest to others as well. I think people miss the point of all of this.

    The point of HTML and XHTML is to provide structure to a document. Both structures are perfectly valid and hardly different. Just flip a coin :-D

  8. Since XHTML was released (and all browsers support it) I see no reason to go back to HTML.

    When did all browsers start supporting xhtml? Saying that xhtml is more semantic than html is rather odd, given that xhtml actually has fewer elements than html (yes, I know those elements are mostly unsemantic).

    Nothing beats nice and tidy html, you know since IE

  9. There are certain things that XHTML can do that HTML can’t, such as interleaving other flavors of XML. MathML and SVG (for instance) aren’t widely used now, largely because 90+% of web users won’t be able to do anything with them.

    I’m hopeful that it won’t be long until standards-compliant browsers become the norm and we’ll be free to serve much richer documents over the web. Moving towards XHTML now puts you in a better position to make that transition.

  10. HTML or XHTML usually doesn’t really matter all that much - except when it does :-)

    It matters to me, so I usually end up with valid and working XHTML1.0, that HTML browsers can then render as HTML. Apart from that: no big deal.

  11. I prefer XHTML for many of the reasons listed above. Since it supports namespaces it is natively more extensible (SVG, Docbook, MathML, OFX, etc) and it also automatically enforces a certain level of quality since it has to be well formed.

    The problem with XHTML is that it isn’t well supported yet. IE chokes on the application/xhtml+xml MIME type and good luck getting Google AdSense to work. You get the idea.

    In my opinion, if you are able to eliminate the IE problem (not likely at this point in time) and can also eliminate the need for or use an XML friendly alternative to AdSense then XHTML is the way to go. If not, stick with HTML strict for now. I think the market will make the move towards an XML based web but there is a lot of legacy junk out there that will be around for a long while.
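
One approach sometimes used to work around the MIME type problem described above is content negotiation: send application/xhtml+xml only to clients whose Accept header lists it, and text/html to everyone else. The Python sketch below shows the decision only. The function name `choose_content_type` is hypothetical, the header parsing is simplified (it ignores wildcards and does not rank types by q-value), and whether this kind of switching is a good idea is exactly what the thread is arguing about.

```python
def choose_content_type(accept_header):
    """Pick a MIME type for an XHTML page from a request's Accept header.

    Returns application/xhtml+xml only when the client names that type
    explicitly with a non-zero q-value; otherwise falls back to text/html.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when the client does not give one
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat an unparsable q-value as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# A header that lists the XHTML type, and one that only sends a wildcard.
print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("*/*"))
```

A wildcard alone is treated as "not supported" on purpose: a client that accepts anything has not said it can actually parse XML.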

  12. I write XHTML 1.0 Strict, because I like it to be XML-parsable (not that I really do parse it, but having the option is nice). And I’ve gotten those /> in my fingers by now, so switching to HTML would be more work than necessary.

    And yes, I do [sometimes] serve it as text/html. Kill me.

  13. January 18, 2006 by Florian Hardwig

    XHTML is similar to the original HTML, but the rules make all the difference. Today, XHTML preserves the semantics much more than HTML, because it encourages you to code in two parts: structure and layout.

    This is nonsense. Why should XHTML be more semantic? The separation of structure and presentation has nothing to do with XHTML. Just because a lot of web design authors mention “XHTML+CSS” as if these two came as an inseparable (ha!) couple doesn’t mean that you cannot use CSS with HTML. Feel free to code non-presentational markup with ye olde HTML, it is allowed! :)

  14. Yay! This debate’s always fun :-). But the conclusion is always the same: use HTML 4!

  15. @Florian -
    I don’t want to throw the thread off track but semantics is a little more complex than simply separating content and style. Basically speaking, semantics is the process of assigning context or meaning to the individual objects within a document (or any other collection of objects).

    If you want a practical example, look at the structure of Docbook and compare it with the structure of XHTML. Docbook is much more semantic in terms of document construction.

  16. I generally practice developing in xhtml using the strict doctype, but I’m really with you Roger… I really don’t think it matters. I think developers need to pick a standard and stick with it.

    I like the rules of xhtml because it makes development easier. It forces me to write code like xml.

    Overall, html or xhtml, the important part is that it’s semantic.

  17. January 18, 2006 by Tommy Olsson

    It seems as if several posters above unfairly compare XHTML 1.0 Strict to HTML 4.01 Transitional. I guess many used to use a transitional doctype until they jumped on the XHTML bandwagon, therefore thinking that XHTML is ‘more semantic’ than HTML.

    That’s utter nonsense, of course, since XHTML 1.0 is merely a reformulation of HTML 4.01 as an application of XML.

    Personally, I prefer HTML 4.01 Strict since that’s the latest ‘standard’ with any user agent support worth mentioning.

    The day IE supports XHTML, I may reconsider.

    I agree with Roger, though: write your HTML 4.01 Strict as if it were XHTML. Then you’re only a few slashes away from XHTML markup if you find a reason to switch. (Providing you don’t use much JavaScript, of course.)

  18. I work with XHTML because I want to use script to walk through the DOM without the parser choking on invalid code.

    Basically: if it has to be machine-readable, make it xhtml.

  19. Woolly Mittens,

    Shall I assume you use only namespace-aware methods in JS?

    Regardless, using XHTML doesn’t enforce anything. You can write invalid and even ill-formed XHTML; if you serve it as text/html you’ll never know the difference.

    If you serve it as XHTML, you can still produce invalid XHTML without the XML parser catching an error.

    And by the way you can write valid HTML.
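
The distinction drawn above between ill-formed and invalid markup can be shown with any non-validating XML parser. The Python sketch below uses the standard library’s `xml.etree.ElementTree`. The helper name `is_well_formed` and the two sample documents are invented for the example. The second sample puts a `div` inside a `p`, which the XHTML 1.0 Strict DTD does not allow, yet the parser accepts it because it only checks well-formedness.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a non-validating XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Ill-formed: the <p> element is never closed.
ILL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>Unclosed paragraph</body></html>"
)

# Well-formed but not valid XHTML 1.0 Strict: a block element inside a paragraph.
WELL_FORMED_BUT_INVALID = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p><div>Block inside a paragraph</div></p></body></html>"
)

print(is_well_formed(ILL_FORMED))
print(is_well_formed(WELL_FORMED_BUT_INVALID))
```

Catching the second kind of error takes a validator that reads the DTD, whichever of HTML or XHTML you write.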

  20. It’s amazing the level of debate this subject brings. It’s all about the right tool for the job. What I would like to know is what will drive the need for converting HTML to XHTML? What is XHTML? Who uses it appropriately? What does it offer over HTML in the real world? 90% of sites out there [I mean commercial - SME] will never benefit from early adoption. Unless there is a major shift in the consumption of XHTML and perhaps at some point the dropping of support for HTML [unlikely], why bother?

    As long as the doctype is strict and validated then your website will be lean, efficient and compatible.

  21. Strict versus Strict: well, then the difference is minimal, excluding MIME, rendering, processing, scripting, CSS, etc., and assuming it’s just the vanilla W3C variety and not an eXtensible enhanced version of XHTML.

    On one hand you have the standalone, well-formedness and extensibility possibilities (though vanilla Strict, as in the question posed, limits that). On the other we have User Agents that haven’t got past HTML 3.2 support. Personally I don’t use XHTML 1.0 Strict online for myself, even though I host 250+ XHTML files of various types on different sites.

    Albeit I did create an eXtensible XHTML-based XML language compatible with XHTML 1.0 Strict for a dozen live pages, though there aren’t many (if any mainstream) validating XML parsers built into the web browsers surfing the web at the moment.

  22. January 18, 2006 by draco

    Er, I have never tried doing so, but I’m just curious enough to ask, and see if anyone could enlighten me.

    Is it all right to write up a valid XHTML 1.0 Strict file and put the DTD as HTML 4.01 Strict? Will it still validate? Or must I stick with the HTML 4.01 style of no self-closing tags, and such?

    (Asking only because I’m really too used to writing XHTML and not too ready to go back to HTML.)

    Get what I mean?

  23. It does not really matter which one you choose as long as you go the strict way. It’s all about maintaining clean and readable code.

    And about thanking yourself - you’ll thank yourself as soon as you try to edit anything ;)

  24. January 18, 2006 by Tommy Olsson

    @draco: No, you can’t do that. You may get away with it if you don’t use any LINK or META elements in the HEAD, but any empty element type in the HEAD will cause validation to fail.

    The reason is that the ‘/’ character in a self-closing tag closes the tag in HTML, so the following greater-than character is a literal character. And character data is not allowed inside HEAD.

    It will probably work for elements inside BODY, because nearly all browsers parse HTML incorrectly and ignore the slash.

    A self-closing tag is not invalid HTML, it just doesn’t mean what most newbies think it does. It means the empty tag followed by a greater-than character, although - as I mentioned - practically all browsers get this wrong.
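
To make the explanation above concrete, here is a tiny Python sketch that restates it as a string operation: an XHTML-style empty tag is split into the start tag an SGML-aware HTML parser would see and the leftover character data. The function name `sgml_reading_of_self_closing_tag` is hypothetical, and this is only a restatement of the comment’s description, not an SGML parser.

```python
def sgml_reading_of_self_closing_tag(tag):
    """Split an XHTML-style empty tag the way the comment says SGML reads it.

    Returns a (start_tag, character_data) pair: the "/" ends the tag,
    so the ">" that follows it is left over as literal text.
    """
    if not (tag.startswith("<") and tag.endswith("/>")):
        raise ValueError("expected a tag written like <name ... />")
    start_tag = tag[:-2].rstrip() + ">"  # the tag as plain HTML would write it
    character_data = ">"                 # stray text, not allowed inside HEAD
    return start_tag, character_data


start_tag, leftover = sgml_reading_of_self_closing_tag(
    '<link rel="stylesheet" href="style.css" />'
)
print(start_tag)
print(repr(leftover))
```

That leftover character is why a `link` or `meta` written this way fails HTML 4.01 validation in the HEAD, as described above.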

  25. I think they are basically the same. xHTML is after all backward compatible with HTML4.

    xHTML just has some advantages over HTML4.

    • it’s forward compatible with xHTML 1.x, xHTML served as application/xhtml+xml and xHTML 2
    • it’s Well-Formed XML so can be manipulated with XSL(T)
    • it can be automatically validated. I don’t believe there is a validator that checks an HTML document for both HTML4 validity and as much xHTML compatibility as possible.

    I can’t think of any advantages of using HTML4 besides familiarity with it and someone’s nitpicking belief that one shouldn’t serve xHTML as text/html
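
As a sketch of the "well-formed XML can be manipulated with XML tools" point in the list above: Python’s standard library has no XSLT processor, so this example uses `xml.etree.ElementTree` instead to pull every link target out of an XHTML page. The function name `link_targets` and the sample page are invented for the illustration. Note that element names have to be qualified with the XHTML namespace, which ties in with the question about namespace-aware methods asked earlier in the thread.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def link_targets(xhtml_source):
    """Return the href of every <a> element in a well-formed XHTML document."""
    root = ET.fromstring(xhtml_source)
    # ElementTree spells namespaced names as "{namespace}localname".
    anchors = root.iter(f"{{{XHTML_NS}}}a")
    return [a.get("href") for a in anchors if a.get("href")]


SAMPLE_PAGE = (
    f'<html xmlns="{XHTML_NS}"><head><title>Links</title></head>'
    '<body><p><a href="/one">One</a> and <a href="/two">Two</a></p></body></html>'
)

print(link_targets(SAMPLE_PAGE))
```

The same call raises a parse error on tag soup, so this convenience only exists for pages that really are well-formed.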

  26. January 18, 2006 by mattur

    I can’t think of any advantages of using HTML4

    Leaner, meaner pages. Especially if you omit optional end tags and omit optional quotes on attributes.

    IMHO this style of coding makes coding errors harder to detect (the new professionals shouldn’t have this problem, of course), but the bandwidth and download speed benefits outweigh this. If we wish to exploit the benefits of “Web Standards”-based development, we might as well fully exploit them.

    Good HTML can be converted automagically to XHTML, should the need ever arise.

  27. January 18, 2006 by Marc Luzietti

    Until that section of the browser market using older versions of IE (6 and older) becomes insignificant, it’s HTML 4.01 Strict for me.

    So, what do y’all think of HTML 5?

    runs

  28. gerben, please read The Myth of “HTML-compatible XHTML 1.0 documents”, in http://hixie.ch/advocacy/xhtml.

    About someone’s nitpicking belief that one shouldn’t serve xHTML as text/html, it’s a fact that XHTML served as text/html is just treated as invalid HTML by all the UAs around.

    And it’s XHTML not xHTML.

  29. I use an xhtml doctype for one reason: it helps me find “errors” in my markup in a stricter way compared to html. Even if my documents are served as text/html I see no advantages in using an html doctype, so I stick to xhtml. At the moment I have no reasons to change to html.

  30. I’m actually surprised at the number of people using HTML 4.01 Strict. I’ve seen plenty of reasons why XHTML isn’t much better, but I haven’t seen any good reasons to use HTML instead. One post stated that you can have leaner code in HTML by removing the attribute quotations and optional closing tags, but I don’t believe doing that is a good idea anyway — most people agree that even if you use the HTML 4.01 Strict DocType, you should still write valid XHTML.

    I started using XHTML DocTypes as soon as I learned XHTML and haven’t had any problems with browser support (it’s always the CSS that causes problems). HTML has come to a dead end. It’s time to move on to the next thing.

  31. January 18, 2006 by mattur

    @Thomas:

    To be clear: HTML markup, even including optional end tags and attribute quoting, will still be marginally smaller than the equivalent XHTML markup.
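
The size claim above is easy to check for any given page. The Python sketch below builds one small document twice, once as HTML 4.01 Strict and once as XHTML 1.0 Strict, and compares the byte counts. The sample page and the helper name `extra_bytes` are invented for the illustration; the extra bytes come from the longer DOCTYPE, the `xmlns` and `xml:lang` attributes, and the ` /` on empty elements.

```python
HTML_401_PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    '<html lang="en"><head><title>Demo</title>'
    '<link rel="stylesheet" href="style.css"></head>'
    "<body><p>First line<br>second line</p></body></html>"
)

XHTML_10_PAGE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">'
    "<head><title>Demo</title>"
    '<link rel="stylesheet" href="style.css" /></head>'
    "<body><p>First line<br />second line</p></body></html>"
)


def extra_bytes(html_page, xhtml_page):
    """Return how many more bytes the XHTML version takes when encoded as UTF-8."""
    return len(xhtml_page.encode("utf-8")) - len(html_page.encode("utf-8"))


print(extra_bytes(HTML_401_PAGE, XHTML_10_PAGE))
```

Most of the difference is a fixed overhead per page, so on real pages it stays marginal, as the comment says.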

  32. @Sébastien, thanks for the recommendation. I have already read it and it offers a great summary of all possible errors. XHTML in theory isn’t backward compatible, but in practice, as far as I know, no browser has any problem with it (maybe some SGML browsers do).

    I admit most XHTML documents on the web would break if served as application/xhtml+xml, but I don’t think that’s a good enough reason for not using XHTML. XHTML has forced a lot of coders to no longer create tag-soup, which is a great thing.

  33. gerben,

    “XHTML has forced a lot of coders to no longer create tag-soup, which is a great thing.”

    Sure, so long as you’re aware that the XHTML you’re talking about is just plain ol’ HTML, with a few additional closing tags, or closed tags.

    So now to answer Thomas’ question: instead of writing valid XHTML (which is valid markup-wise only) and serving it as HTML to UAs that will treat it as broken HTML (because XHTML is not valid HTML [1]), why not write valid HTML and serve it as such to UAs that will treat it exactly as such?

    For historical reasons all UAs have some level of support for HTML, up to 4.01. Very few have XML support (if you include screen readers, robots and other non-browser devices) so the question is more : why switch to XHTML?

    The answer for me is that it’s great to experiment with XHTML now, but it’s not a good idea for public websites.

    [1] just try it with the W3C markup validator. Write a valid XHTML document and then validate it against an HTML DOCTYPE

  34. January 20, 2006 by draco

    Oh, and the real nightmare begins when authors start serving XHTML 1.1 as text/html too ;) I’ve seen a couple and am thoroughly amazed. I suspect these are the true “hop on the bandwagon” people, more than those who serve XHTML 1.0 (Strict or otherwise) as text/html.

  35. January 20, 2006 by Roger Johansson (Author comment)

    draco: Yeah, XHTML 1.1 as text/html is a nice one.

    At least one commercial CMS that is very popular here in Sweden does that by default. It uses invalid, transitional markup, calls it XHTML 1.1, and delivers it as text/html with an XML declaration to throw IE/Win into quirks mode (not that they know what that is). It’s a markup horror created by people who obviously know next to nothing about HTML or XHTML.

  36. In France we have the infamous skyblog.com weblogging platform, claiming hundreds of thousands of (crappy) weblogs, all XHTML 1.1 served as text/html.

    The worst part is most pages validate (which validates some people’s delusions that it’s OK to serve XHTML as HTML) despite HTML-commented PCDATA blocks, document.write a go-go, and the least semantic markup you’ll ever see in an “XHTML” page.

  37. January 21, 2006 by theUg

    I’m a bit iffy about aforementioned article ( http://hixie.ch/advocacy/xhtml ). There’s a lot of questionable arguments made there. And such things as juxtaposition of XML and SGML (aren’t XML and HTML very much based on SGML (as in subset)?) are quite nonsensical.

    And what is this about the DOCTYPE being optional for a well-formed XHTML document? I thought the spec states it is mandatory…

    However, I’ve seen this debate many a time recently, but this article is the first time I see some real reasoning behind the choice. Better than “just use this, cause it is so”.

  38. So, is it possible to just add a namespace and mime type to html 4.01 strict, so that you can then serve up pdf, specialty tags, PDA and cell phone apps?

    Cause I know that putting ‘application/xhtml+xml’ lets me add svg and VML without data islands.

    This is my assumption, is it wrong? I am new, but a vet told me to take my xhtml valid html page, change the mime type, and change the file ext to xhtml. It worked in IE5.5 and 6.0, and looked like regular xml.

    I don’t think you can start doing stuff like that with html. Why bother to use HTML Tidy later? That is a hell of a lot of work.

    You can template and all sorts of stuff straight away with xslt (or even just CSS) on xhtml strict or 1.1.

    Again, I am not that sure, but that is my take on it.

  39. I’ve happily switched from bad XHTML. What finally made me switch was Jero’s article on this subject.

    Saying that you like XHTML because it enforces strictness is stupid. It doesn’t actually enforce any strictness unless you’re using it properly. There are two problems with that:

    • Most people aren’t using it properly
    • Those who are are cutting out a large percentage of their audience.

    XHTML will be great. But the browsers aren’t ready yet.

  40. I believe that in a sophisticated production environment, where clients get to edit the source to some extent (or are able to disrupt the well-formedness with WYSIWYG or Textilising tools), HTML is a better way to go, than risking serving an invalid page as application/xhtml+xml.

  41. It’s easier to check xhtml for well-formedness, which is my over-riding concern.

    It’s then just as easy to check xhtml for validity, which is next in importance.

    As someone else posted, if you want it to be machine parsable then xhtml is the sensible option even if some browsers still treat xhtml as slightly malformed html (with no ill effects).

  42. The answer is simple:

    Use what makes your users happy.

    If you can with xhtml, then go for it. If html suits your needs (and those of your site visitors), that is the ultimate bar on which you must always be focused.

  43. What about the file endings, actually, such as .xtml? I have got it running in production with mod rewrite, yet I still have to consider whether that is correct…
