CMS, standards and semantics

A few years ago, web designers and developers were nearly always the ones who filled their clients’ sites with content. Sure, the content was mostly supplied by the client, as Word, Excel, Acrobat, EPS and JPEG files and so on. The web creators then adapted this content to suit the web medium, and most of the time all was fine and looked good.

Today, all but the very smallest sites have some kind of CMS running in the background. A CMS is there to let the site owner publish their own content without having to call the web agency for every change that needs to be made. That’s great. And not so great. It works very well when the publisher has some basic knowledge of the web, HTML and structured markup. Unfortunately, many people who’ve been given the rights and responsibilities to update and add content to their company’s website have no HTML knowledge whatsoever. That often leads to problems.

The web editor gets a document from marketing. A Word document. The editor opens the document in Word, copies the text, goes to the CMS in a browser (most of the time IE) and pastes. Then, if the CMS allows it, the editor starts styling headers, emphasising parts of the text, maybe changing the colour somewhere, making a paragraph italic and so on until they think it looks nice. It may actually look OK. In their specific web browser, with their specific settings. The first problem is that copy-paste from Word often brings a lot of junk code with it (two or three times the amount of actual text), making the text display differently from what was expected. The web editor then tries to use the styling tools provided by the CMS to fix the look of the text. We end up with a horrible tag soup that at the very least multiplies the size of the web page. It may also make the whole page look bad in browsers other than the one used by the web editor, and some browsers may even crash.
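To get a feel for how much of a pasted fragment is junk, you can compare the length of the visible text with the length of the whole fragment. This is only a rough sketch in Python: the names TextCollector and markup_overhead are made up for illustration, and the pasted sample is invented, loosely modelled on the kind of markup a Word paste tends to produce.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects only the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def markup_overhead(fragment):
    """Return (visible text length, total fragment length) in characters."""
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()
    visible = "".join(collector.parts)
    return len(visible), len(fragment)


# Invented sample, not taken from a real document.
pasted_sample = (
    '<p class="MsoNormal" style="margin:0cm">'
    '<span style="font-family:Arial;font-size:10.0pt">Opening hours</span>'
    "</p>"
)

visible, total = markup_overhead(pasted_sample)
print(f"{visible} characters of text in {total} characters of markup")
```

The same text wrapped in a plain <p> element would need only seven extra characters.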

The result is also focused only on how the content is presented visually, and not at all on how it is structured.

The problem is that by giving a powerful tool (visual editing in a CMS) to people who cannot wield it (web editors who have no training or maybe aren’t even interested in HTML markup) we risk total breakdown. And who will our client blame? Us, obviously.

What can we do about it then? I think that we need to educate our clients. After all, they are paying us to help them with their website. We need to make the client understand that we are the experts, and that we are there to help.

WYSIWYG editors, often created by using the RichEdit component of IE/Win, are dangerous when used by people who aren’t HTML-savvy. In my opinion it’s safer and better to remove editors of that kind and teach the client how to use a few simple HTML tags:

  • <h1> to <h6> for different heading levels instead of non-semantic markup like <font>-tags and <span>s.
  • <p> for paragraphs instead of just separating paragraphs with <br>-tags. I suppose showing how to use the class attribute may be useful since many people will want to have an intro paragraph in their documents.
  • <strong> and <em> for emphasis instead of <b> and <i>.
  • <ul>, <ol> and <li> for lists.
  • <a> for links.
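
A CMS could also check submitted content against a short list like this one before publishing it. The following is a minimal sketch in Python, not taken from any particular CMS: ALLOWED_TAGS, TagAudit and find_disallowed_tags are hypothetical names, and both sample fragments are invented for illustration.

```python
from html.parser import HTMLParser

# The tags from the list above. Treating them as a strict whitelist
# is an assumption about how a CMS might choose to enforce them.
ALLOWED_TAGS = {
    "h1", "h2", "h3", "h4", "h5", "h6",
    "p", "strong", "em", "ul", "ol", "li", "a",
}


class TagAudit(HTMLParser):
    """Records every start tag that is not in ALLOWED_TAGS."""

    def __init__(self):
        super().__init__()
        self.disallowed = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag not in ALLOWED_TAGS:
            self.disallowed.append(tag)


def find_disallowed_tags(fragment):
    """Return the disallowed tag names in the order they appear."""
    audit = TagAudit()
    audit.feed(fragment)
    audit.close()
    return audit.disallowed


# Structured markup using only the tags from the list.
clean = """<h2>Summer opening hours</h2>
<p class="intro">We are open <em>every day</em> in July.</p>
<ul>
  <li><a href="/contact/">Contact us</a></li>
  <li><strong>Closed</strong> on 1 August</li>
</ul>"""

# Presentational markup of the kind a visual editor tends to produce.
pasted = (
    '<p><font color="red"><b>Summer opening hours</b></font><br><br>'
    '<span style="font-size:10pt">We are open <i>every day</i> in July.</span></p>'
)

print(find_disallowed_tags(clean))
print(find_disallowed_tags(pasted))
```

The first fragment passes with nothing to report, while the second is flagged for its <font>, <b>, <br>, <span> and <i> tags. A check like this only says which tags were used; it does not verify that they are nested or closed correctly.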

Why do this? Well, for one thing it makes the document properly structured, which in turn will make it easier to style consistently. It also greatly reduces the risk of broken or messed-up code breaking the entire page.

I really don’t think it should be too hard to do. We shouldn’t underestimate people’s intelligence and their ability to learn. Skip back in time to the late eighties. Many, if not most, people using word processors marked up their documents by entering special codes. If people could do it then, people can do it now. And after we explain that it’s in their best interest to do so, I believe that most people (clients) would accept this way of working.

What got me thinking about this whole thing right now was reading D. Keith Robinson’s Standards, Semantic Markup, Distributed Authorship and Knowledge Management, which I recommend you read too.

Posted on July 31, 2003 in (X)HTML, Content Management, Web Standards
