POSH - Plain Old Semantic HTML

For years I’ve been advocating the use of valid, semantic, accessible, well-structured HTML. It’s a bit of a mouthful to say, but thanks to an acronym being coined on the Microformats IRC channel almost seven months ago, I can now shorten that to POSH instead.

POSH, in case you haven’t heard of it already, is short for “Plain Old Semantic HTML”, and is obviously much quicker and easier to say than “valid, semantic, accessible, well-structured HTML”. Unfortunately POSH - semantic markup - is also something most people building websites or creating content for the Web have yet to discover. <sarcasm>Perhaps a fancy acronym will help speed up adoption.</sarcasm>

To start writing semantic markup, you need to:

  • Validate your HTML
  • Stop using tables for layout
  • Use semantic elements and attributes for their intended purpose
  • Use semantic class names and id values
  • Use as little HTML as will get the job done
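To make a couple of those points concrete, here is a rough before-and-after sketch. The content, the class name and the id value are all made up for illustration; treat it as one possible way to mark up a small news box, not the only correct one.

```html
<!-- Before: a layout table, presentational elements, and a class
     name that describes appearance rather than meaning -->
<table>
  <tr>
    <td class="big-red-text"><font size="5"><b>Latest news</b></font></td>
  </tr>
  <tr>
    <td>The new site launches next week.</td>
  </tr>
</table>

<!-- After: less markup, elements used for their intended purpose,
     and an id that says what the content is (hypothetical name) -->
<div id="news">
  <h2>Latest news</h2>
  <p>The new site launches next week.</p>
</div>
```

If the heading should later be small and green instead of big and red, only the CSS needs to change in the second version; the id still describes the content accurately.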

It would sure be nice to see widespread use of semantic HTML sooner rather than later. So get out there and make sure all the websites you design or program use semantic HTML. Teach all your colleagues and clients about semantic HTML. Use it, talk about it, blog about it. If you don’t already, start Explaining Semantic Mark-Up.

The more people use the semantics of HTML correctly, the more reason for various user agents to improve their support for it.

A few practical semantic HTML tips:

  • Use heading elements (h1 - h6) for headings, and make sure they create a logical outline of the document.
  • Use tables to mark up tabular data, and use the full set of features (caption, th, scope, headers etc.) provided by HTML 4.01 to make sure that the tables are accessible. Learn more about that in Bring on the tables.
  • When marking up a quote, use q or blockquote. There are some tips on using them in Quotations and citations: quoting text. And don’t forget to Use only block-level elements in blockquotes.
  • For images, make proper use of The alt and title attributes.
  • Use HTML list elements (ul, ol, dl) to mark up lists. More info in Tommy Olsson’s Lists and Mike Cherim’s Using HTML Lists Properly.
  • Use the em and strong elements for emphasis, not to make text bold or italic (i.e. do not mindlessly replace i and b with em and strong).
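As a minimal sketch of how several of these tips can look together, assuming an HTML 4.01 Strict page: the fragment below uses invented placeholder text, figures and file names, and it only shows the simple case where scope is enough to tie header cells to data cells.

```html
<!-- Headings form a logical outline: one h1, with h2 sections below it -->
<h1>Annual report</h1>

<h2>Sales by region</h2>

<!-- A data table: caption, header cells, and scope for accessibility -->
<table>
  <caption>Units sold per quarter (placeholder figures)</caption>
  <thead>
    <tr>
      <th scope="col">Region</th>
      <th scope="col">Q1</th>
      <th scope="col">Q2</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">North</th>
      <td>120</td>
      <td>135</td>
    </tr>
    <tr>
      <th scope="row">South</th>
      <td>98</td>
      <td>110</td>
    </tr>
  </tbody>
</table>

<!-- The alt text conveys what the image shows (hypothetical file name) -->
<p><img src="sales-chart.png" alt="Bar chart: sales rose in both regions from Q1 to Q2"></p>

<h2>What customers say</h2>

<!-- Only block-level elements directly inside blockquote -->
<blockquote>
  <p>The new ordering process saved us a lot of time.</p>
</blockquote>

<h2>Next steps</h2>

<!-- An ordered list, because the sequence matters -->
<ol>
  <li>Review the figures.</li>
  <li>Send corrections <em>before</em> Friday.</li>
</ol>
```

Note that em is used on a single word that is actually stressed when the sentence is read aloud, not as a general replacement for italics.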

Got any other tips?

Update: After reading some comments here and re-reading the article, I realise that my attempt at being humorous failed. I also realise that parts of this article were phrased so badly that my message got lost.

I don’t really mean that everybody should start talking about POSH all the time. What I was trying to say (but said badly) is that more people should think about semantics when they write HTML. I have updated the most unfortunate parts of this article to make that (hopefully) clearer.

Acronyms are often confusing, so if you don’t want to talk about POSH, then don’t. Just use and advocate semantics and structure instead. That’s what I wanted to say.

Posted on November 1, 2007 in (X)HTML

Comments

  1. Wonderful, Roger.

    Helping people, who not only create but write for websites, understand what an outline is and how to use/create an outline, is the perfect way to go.

    Write a speech. Start with an outline.

    Write a paper. Start with an outline.

    Write a book. Start with an outline.

    Write an article. Start with an outline.

    That’s usually how I help even professional writers to understand the web. It sounds insane, but the web still seems like a beast to some writers.

    Web creators can take this article to task before entering the industry and it should be a prerequisite for all webdesigners/devs, too.

    Great work.

  2. November 1, 2007 by Lauch

    Can someone create one of those nifty little buttons for POSH? We can put it right next to the W3C validation button :)

  3. Although I certainly agree that creating a unifying, simplifying acronym or logo for the semantic markup we all love is a really important idea, I wonder how much of an uphill battle you’ll be fighting.

    The key to making this work will be to get everyone behind it, and get everyone to agree on precisely what it signifies.

    Then we can set about spreading the gospel. I’ll certainly do my part if this starts to take off.

  4. November 1, 2007 by Sebastian Redl

    I agree with Lauch. Any attempt to create hype needs banners and buttons in addition to a good acronym.

  5. If everyone wrote a short post like this, it would be harder for people to miss this technique. (Maybe I’ll join you.) Not sure if @Lauch’s image idea would work.

    As far as your list of points, I think it might be useful to start seeing more descriptions/tutorials/whatever on the html quoting elements, and the differences between em/strong and i/b. I’ve run across these before (in fact, I had read one of your quote posts), but unless you really go searching, or run across them at random, you aren’t likely to encounter them. It’d be nice if there were an easy way to have people concentrate on writing to re-enforce these ideas. Something like a blog topic suggestion list, with a more formal twist (but not w3c formal). Are you aware of anything like that?

    Regardless, useful post. Thanks.

  6. Good article. I have been trying to explain POSH to a friend for a while now. I forwarded this to him; maybe it will help.

  7. That is awesome! It sums up everything I do and everything I try to get others to do so easily :)

  8. Nice writeup Roger.

    @Lauch: I like your idea, though honestly I’ve always found those buttons to be no more than visual clutter — just my 2 cents.

    But, how about POSH t-shirts?

  9. Posh Spice is SO gonna sue you Roger!

  10. “Use semantic class names and id values”

    Why? This brings the semantic Web 3.0 XML discussion to a new level of silliness. Enjoy your AIDS.

  11. November 1, 2007 by Dan Schulz

    Roger, I’d add in using the bare minimum of markup and proper source code order (header, menu[s], content, sidebar[s], and footer) as well. Far too often on various forums (such as SitePoint and DevShed for instance), I’ll see people using far too much markup to create their sites - not only will using minimal markup force these people to stop thinking visually, but it’ll also force them to truly start thinking about their sites and pages structurally as well.

  12. November 1, 2007 by Henry

    @lol….wtf? Despite your obvious level of intelligence, I’m going to go ahead and answer your query anyway;

    There is a very good case for using semantic class and id names.

    For instance, if you gave your headings a class of large-red-text (which you probably wouldn’t as you can just attach your cms to the element), that would be silly, because at some stage down the track you might want smaller green headings. It’s pretty basic common-sense stuff, which can only help you being able to understand your markup x months down the track.

  13. I’d like to second Edward Welker’s suggestion for tutorials on some of these ideas like blockquotes. I have to confess that I’m not sure what to make of your point about not using em and strong all the time. I had read several places that bold and italic should never be used, because they have no meaning and are purely visual. Thus, if something needs to be bolded or italicized, there should be a reason—that is, it adds to the meaning; therefore, em and strong should be used.

  14. November 1, 2007 by Henry

    Sorry, I meant css not cms above.

  15. November 2, 2007 by Eric TF Bat

    (i.e. do not mindlessly replace i and b with em and strong)

    Thank you! I was reading your list dreading the line that would say “don’t use <b> or <i> because they cause cooties and make your teeth fall out”. There’s entirely too much religion clouding the simple fact that, to paraphrase Freud, sometimes an italic is just an italic. You hit the nail right on the head. Bravo!

  16. q usage and blockquote usage are not equivalent. q is seriously broken in many user agents and duplicates what the languages commonly used online already have – quotation marks. Most of those have differentiable open and closing states (some variants of Swedish excepted), hence it is usually or always clear when a quotation starts or ends. q was never needed, it’s broken, and it shouldn’t be used.

  17. Seriously, do people still put up new sites with table layouts?

    Use heading elements (h1 - h6) for headings, and make sure they create a logical outline of the document.

    I think this is very important as it’s still misused and people think it’s used for adjusting the size of the font. Very good tips, sir. POSH is the way to go!

  18. Roger:

    I’m digging most of what you are saying, except that the way the word ‘semantic’ is used sets my teeth on edge. All mark up is semantic (has meaning), whether it is structural or presentational. Maybe POSH should mean Plain Old Structural HTML. That is, the context of HTML should be structural, and that using HTML for presentation (in that context) is semantically incorrect.

    Peter

  19. I’d disagree about using the q element. Too many different user agents don’t support it correctly, and no CSS or JavaScript hack can solve it in all cases.

    There are generally two workarounds people use:

    1. Use the q element as it’s supposed to be used, and then use JavaScript or IE conditional comments to add quotation marks to the select browsers that don’t display them. Of course, browser detection isn’t exactly flawless, and you also miss IE users with JavaScript disabled and users of other UIs that don’t support the necessary JavaScript.

    2. Use the q element but also write quotation marks in the markup and hide the default ones with CSS. Of course, this ignores user agents that support correct rendering of the q element but don’t support CSS, which would end up seeing two pairs of quotation marks.

    I don’t think the little semantic gain of using the q element is worth these issues.

  20. November 2, 2007 by Red Green

    Oh my lord how much longer do we have to talk about this.

  21. Until the remaining 99% get it. :)

  22. November 2, 2007 by Roger Johansson (Author comment)

    Dan Schulz:

    I’d add in using the bare minimum of markup and proper source code order (header, menu[s], content, sidebar[s], and footer) as well.

    Good point, and I agree.

    Dan:

    I have to confess that I’m not sure what to make of your point about not using em and strong all the time.

    What I mean is if you used to use b to make entire paragraphs bold, you probably shouldn’t just replace it with strong without first considering if your intent really is to emphasise the entire paragraph.

    Joe, David:

    There are arguments both ways when it comes to the q element and I never seem to be able to make up my mind about it, so maybe I shouldn’t have mentioned it here. I don’t think it’s the most important element around.

    Deaf Musician:

    Seriously, do people still put up new sites with table layouts?

    Oh yes. The vast majority of newly produced sites still use HTML and CSS like it’s 1999.

    Red Green:

    Oh my lord how much longer do we have to talk about this.

    Forever it seems.

  23. Another day, another acronym to remember … and another buzzword to throw in at marketing meetings! I’m all for it though - semantic HTML usually gets blank looks.

    Dan’s comments above are spot-on: many developers use far too much HTML, too many DIVs, and too many classes to achieve their results. POSH should be about using clean, minimized markup too.

  24. This discussion is so early May ‘07 :p Personally I think it is insulting the intelligence of web developers worldwide if it is deemed necessary to use COOL PZAZZZ acronyms to get a point across.

  25. November 2, 2007 by jim d

    Oh for the love of God. How about we just call it HTML. POSH? Are you serious? Is it not pretentious enough to say HTML? Is it too clear to people?

    How about we just go with Simple Hypertext Integration Technology.

  26. I’m with jim d - whilst I laud the sentiment, do we really need another acronym with which to dazzle clients?

    As for Dan Schulz’ comment, “proper source code order (header, menu[s], content, sidebar[s], and footer)” is yet another monumental minefield in which there is no single “right” answer.

    Surely we as web industry professionals can explain a concept like semantic HTML clearly without creating yet more acronyms in an industry already groaning under the weight of a million and one buzzwords, acronyms, and meaningless techno-babble?

  27. You know all the people who don’t know the first thing about HTML but are still hell bent on having AJAX on their site? You know who I’m talking about.

    Those are the people we can effectively target with a term like POSH.

  28. November 2, 2007 by Dan Schulz

    Andy, it’s not so much an issue of there not being a single “right” answer - it can easily be taken care of by using floats and (negative) margins. All that requires is some experience with those two properties and how they interact with each other cross-browser (and there are fewer cross-browser issues than one would think, especially once the appropriate elements have had their margins and padding zero’d out).

    Of course, you have your types like the “SEO Content First At All Costs” zealots who just don’t care since they think they know what’s best for the Web because they (falsely) believe that what they’re doing will cause their pages to magically rank better in Google and other major search engines (they won’t), as well as people who are just incapable of thinking structurally rather than visually at the current time (I admit freely to being guilty of this myself a few years ago), plus people who just don’t care either way because they believe that what they’re doing is “good enough” for them. Add in the people who honestly don’t know one way or another which is the proper technique or implementation to use, those who are dependent on frameworks to get anything done, and you’ve got a real mess on your hands.

    And that’s before you even get into the classroom (let’s face it, colleges and universities as a general rule of thumb tend to be between five and ten years behind the curve, though thankfully there are some exceptions to this).

  29. Dan,

    I think it is absolutely a question of there not being a definitive right answer.

    You make a fairly definitive statement that [header, menu[s], content, sidebar[s], and footer] is the “correct” order. I’m personally not so convinced for a number of (extensively explored) reasons:

    Firstly, expectation. Users of assistive technologies may have a certain expectation that the menu/nav comes before the content in source order, but to some degree that expectation is founded on a historical experience of badly built sites rather than any logical or rationalised process. If people started out driving cars with square wheels then they might be reticent to try out new-fangled round ones, but it may just improve the ride quality and overall experience.

    Secondly, real world testing. Extensive user testing has failed to establish categorically whether Nav->Content is empirically preferable to Content->Nav.

    I agree with you that visually it need make no difference if the developer is remotely competent and well acquainted with modern CSS techniques, and that the SEO benefits are also far from black-and-white, but I don’t find it either beneficial or desirable to blindly follow the “content first” school of thought.

    As someone with extensive experience of cross-browser, cross-platform issues and with day-to-day interaction with a dedicated User Experience team and state-of-the-art usability labs, I like to think that my development credentials are fairly well established and it really riles me when matters of opinion are presented as fact.

    In truth, we work in a very young medium which is still maturing. Whilst we can make certain fairly sweeping statements at times, I feel that a little more thought is required in some corners.

  30. Slightly off-topic, but here’s where I use <i> - when marking up a foreign language.

    I assume a screenreader will use a different accent to provide an audible cue that the language has changed, but the poor old non-disabled user is left to guess that the language has changed.

    But

    <i lang="fr">c'est la vie</i>
    

    …I suppose!

  31. November 2, 2007 by Dan Schulz

    Andy,

    Regarding my example, while users of assistive devices may have endured a history of badly coded Web sites (who hasn’t?) which had the menu before the content, there are benefits to having the menu come before the content. Not only does it give the user the option of going to another page before they get to the content, but it also gives search engines the chance to index all the pages that are linked to each other in the menu as well.

    When combined with the proper use of skip links (which I happen to be a huge advocate of), this gives the user total control over how they access the Web page - those that want to go right to the content can do so, while those who may want to switch to another page have the ability as well, without having to jump past the content to do so. It’s not just people who use screen readers or assistive devices who can benefit from this approach either. People who use text-based browsers such as Lynx or even Web capable cell phones (not everyone uses an iPhone or a mobile device with Opera Mini enabled - for example my cell phone only supports HTML) are also no longer restricted to having to waste bandwidth (and money) on scrolling from “page” to “page” just to get to the menu or the content.

    Yes, it is possible to do most of this by having the menu after the content, but if the content is of sufficient length, then the search engine spiders may not be able to get to those links, thereby preventing the menu from being indexed. If one of those pages happened to be the first page that was crawled by the spider, that site would be in very big trouble. Not only that, but mobile users would also be required to send another request to the server for the menu if they wanted to change pages as well, which translates into lost money (especially those who are using pre-paid cell phones).

    The Web may be young, but it is growing quickly, especially with more and more devices becoming connected to the network. We need to make sure that the sites we create can work properly regardless of which device is being used, what that device’s level of standards support is, as well as other factors we cannot possibly consider comprehending at this time but may become of critical importance a year or two down the road.

    Now if you’ll kindly excuse me, I’m going to stop wondering why I pulled an all-nighter (again) and try to get some sleep. (It’s past 7:30 in the morning here.)

  32. I’m quite saddened to see you advocate the POSH acronym.

    The concept of plain old semantic HTML is, of course, a very good one.

    But we don’t need another abbreviation and it is fundamentally detrimental, not beneficial, to the adoption of what it represents.

    It was a bad idea created for the wrong reasons to highlight the importance of good HTML to Microformats users who were running before they could walk.

    It is another stumbling block. Another unnecessary complication adding a layer to the barrier of entry. It’s daunting enough for the beginner to tackle phrases like HTML, CSS, semantics, separation, etc. Adding another acronym, and one that includes technical terms at that, makes things a damned sight more confusing.

    The only people getting excited about this term are the experienced HTML aficionados who see a problem that isn’t as big as they think it is.

    I hope to hell this dies a death (again, why it is so sad to see it here) for the sake of web standards.

  33. YAA - Yet another acronym :rolleyes:

  34. @Patrick,

    I get what you’re saying about acronyms, and personally I’m not a big fan of the POSH one, since it tries to be funny. On the other hand, look what the term AJAX did for already existing technologies, shortening the explanation so business and web developers alike were (sort of) talking about the same thing.

    What I’d rather oppose is “…who see a problem that isn’t as big as they think it is.”

    I’d say that the state of semantic/meaningful HTML on the web is piss-poor. It’s still such a minority of web sites with good and proper HTML code, so I would definitely argue that the word should be spread.

    In essence, I’m sure we’re on the same page, but the point is to still talk about semantics with people. Use the term POSH if you want to, meaningful HTML, semantic HTML or whatever, but please just continue to spread the knowledge and importance of using it correctly.

  35. @Robert:

    The comparison of POSH to AJAX is an obvious one, but it’s flawed. Ajax was necessary because it describes a relatively (to HTML) complex technique. POSH describes a language. POSH is doing HTML right - we have HTML, we have semantic HTML, if that phrase so floats your boat. We don’t need anything more, making things more complicated. Will we be seeing POPOSH in a few years’ time? POPOPOSH a few years later? It’s stupid.

    I’m all for spreading the word of semantic HTML, but that’s exactly why I’m so passionate about this - again, POSH is detrimental to its adoption.

    The more jargon someone comes across, the less likely they are to pursue something.

    Regardless of how big you think the problem is, this is a step backwards, not forwards.

  36. I wish we lived in a world where good ideas could be communicated without requiring cute acronyms. Alas, in the world I live in, there are a few people who know what “semantic” means, a few more who have heard of it but don’t know what it means, and a whole lot of people who can’t even hear it because it has too many syllables. Of course this problem would just pop up again when explaining what POSH stands for…

  37. Unfortunately, the first thing I thought of when I read the acronym was POS Html (use your imagination). Maybe SWAV (Semantic, Well-structured, Accessible, Valid) would work as an adjective. It doesn’t specifically target HTML but then again, these are characteristics that can apply to a variety of web languages. “Is that script SWAV?”

    Regardless, any effort in promoting web standards and better practices (whether humorous or not) is a good thing ;)

  38. November 2, 2007 by Lauch

    Well, I was surprised that nobody got my sarcasm. I was just going off of another acronym/buzzword. While I like the idea, and people really do better remembering and identifying with things that have easy to remember acronyms, the problem is that there are too many of them. This is just adding to sooo many others out there.

    In my experience with semantic html, either you get it or you don’t. You can explain the logic all you want, but if a person isn’t interested, you might as well talk about cutting your nails. I think there are two groups of people in the web design world: those who practice and preach semantics and those who want to crank out as many horrible web sites as possible.

    I’ve spoken with the latter several times, explaining myself clearly that semantic html and proper design is important. They reply that “Most people don’t care, they just want a web page quickly that flashes their business to the world.” I could preach until I’m blue in the face, and they’d have created 50 web pages using sliced together images using tables in the time I’ve spent talking.

    You’re always going to have both groups involved in the web design world. It sucks, but it’s the truth. Those of us “in the know” will realize that POSH is the way to go, and we’ll know why. And when their web pages break or turn people away, we know that those web page viewers will end up visiting our sites in the end.

  39. November 2, 2007 by D. Olsson

    Why not just use “Real HTML” ?

  40. @Patrick,

    Your hang-up seems to be only with the acronym. Fair enough. Call it whatever you want, but please keep on talking about the subject itself, because it is still very important to spread the word.

  41. November 3, 2007 by Ivor Daley

    I asked a prospective web developer if they produced semantic HTML and sent them the link to this article to make sure they knew what I meant.

    Here’s the core of their response …

    Problems with a lot of “semantically” designed websites include:

    • sometimes they are almost impossible to maintain by anyone other than the original developer

    • they require hacks to get around problems with different browsers which leads to delays in development = expense

    • may require several versions of the site to display properly in various browsers

    • end up looking the same as every other site developed by the semantically inclined developer

    • etc etc

    I could go on but I don’t think that it would make much difference. No matter what methodology is used at the end of the day so long as the site looks and works the way you want it to why would you care.

    Guess I’ll keep looking.

  42. Why old?

    Why do you call it plain OLD semantic HTML?

  43. @weakish: it’s an English saying. “Plain old” is the same as “regular” or “normal”, meaning something that’s “still the same”.

    I’m with Patrick on this one, I don’t think we need another acronym to sound more impressive during meetings or in front of clients. However it is important that we keep trying to educate people.

  44. That’s a pretty short list of some simple things to do to accomplish writing semantic markup, Roger. But you’re right. It doesn’t take much more than that. It’s almost logical I dare say. If the “POSH” acronym helps with adoption, that’s fine, POSH it is. I don’t mind a bit, all for it, in fact.

    @JackP: I agree with you regarding the use of the “i” element. I have an article in the works regarding the old “b” and “i” elements.

    @Ivor Daley: I respectfully disagree with the points raised by your prospective developer. Not to say he or she is a bad person, but they obviously don’t realize those are gross misstatements. A simple matter of ignorance.

  45. Good post, I agree that we all should use POSH, it makes sense. It does make my life sometimes very difficult - I use DotNetNuke platform and unfortunately the core of the application is not compliant. It takes a lot of effort to get it to plain old html. But when it all works what a joy.

  46. There is also the issue of accessibility: well-written code will help the site validate. In some countries commercial sites have to pass the test.

  47. I’ve been a “POSH” evangelist for at least the last few years. Right now, I’m extremely frustrated with it and I’m going back to table layout where it is absolutely easier. That’s the rule, where tables are that much simpler, table-free layout is too immature, simple as that. I’m sick to death of reading table-free evangelism that doesn’t address the fact that table-free design is a complete bitch because it has not been well thought out and implemented. The most fundamental of layout principles - getting two boxes to sit side by side with content aligned to the top and both boxes of equal height - is a flipping nightmare of crap hacks and extra markup that still doesn’t work well. It’s time to pull out the bazookas, aim them squarely at the donkeys who are not supporting forward movement in HTML - be they Microsloth or W3C with their slower than grass-growing and purity over practicality standards development - and blast them off the map. I’ve had it. I want to see a plethora of focused web sites, social justice style campaigns, protests in the streets! People should be setting up projectors at the front gates at Redmond with a standards-friendly website rendered in IE6 on huge screens and ask WHY? It’s time the web had the fundamental layout capabilities that newspapers had over a century ago! Web developers unite! I want to see ‘shit-rendering strikes’ where we turn off all our hacks for a day and make our bosses see how terrible all our sites look. Social change doesn’t happen without pressure and we need to pull our proverbial fingers out because getting them to start working on exploder again and forming WHATWG just isn’t enough.

  48. To pd: so you are calling for discrimination against some of your visitors, search engines and mobile devices by using inaccessible tables? Not exactly a new approach.

  49. at the end of the day so long as the site looks and works the way you want it to why would you care.

    There’s nothing wrong with that view. Provided all your revenue comes from users that are carbon copies of yourself, that is…

  50. I kind of get that people don’t like the POSH acronym described in this article, but it seems a bit extreme when people say they’re “saddened” or “insulted.” I work in a web development group at an institution that has yet to abandon table driven layouts and font tags. Generally speaking, I think there’s still a need within our larger community of web developers and designers to properly use HTML and if POSH is a catchphrase to promote that idea, then it’s fine by me.

  51. 50 comments and no mention of POTS, where the ‘old’ part comes from in POSH? Also, we use strong and em tags along with b and i, and the division (for us at least) is clear: strong and em for our design elements where appropriate, b and i for bolding and italicizing within our client’s content (largely educational standards documents). My idea is we can’t be sure why the content was bolded or italicized, and aren’t in a position to decide what it means semantically. Such documents may use bolding and/or italicizing for a specific purpose, it may even explicitly say bold and/or italicized words have special meaning, and so on.

  52. I was going to mention POTS (Plain Old Telephone System) which has been in common use for a long time now, but I see that I was beaten to it.

    Just to add to the BAP (Big Acronym Pile) I would like to suggest SUAVE - Semantic, Usable, Accessible, Valid Encoding. It is not specific to HTML and could equally apply to CSS.

    JackP raises an interesting point when he uses the i element to indicate a language change. What would be the correct semantic way of indicating a language change within HTML (which could occur anywhere, e.g. in the middle of a paragraph)? Would this be a semantic use of div or span, or is there a better way?

    The best acronym has to be TLA which is an IBM one (of course) that means either Two Letter Acronym or Three Letter Acronym.

  53. @ Eric TF Bat (#15)

    <i> and <b> are in fact systematically different from <em> and <strong>. The reason is that <em> and <strong> relate to screen readers and are therefore read differently: it’s the way we typically pronounce words that are bold with a “strong” voice or tone, while we emphasize words that are italic. Presentationally, however, these words don’t have to be bold or italicized at all; they can be styled however you feel… and should be.
