HTML5 allows almost any value for the id attribute – use wisely

As I mentioned some time ago in Creating valid names with the id attribute, HTML 4.01 is pretty restrictive regarding what values are allowed for id attributes:

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens (“-”), underscores (“_”), colons (“:”), and periods (“.”).
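To make the HTML 4.01 rule concrete, here is a minimal sketch of it as a regular expression. The helper name isValidHtml4Id is my own invention for illustration and not part of any standard API.

```javascript
// Hypothetical helper: checks a string against the HTML 4.01 ID/NAME token
// rule quoted above (a letter first, then letters, digits, "-", "_", ":", ".").
function isValidHtml4Id(value) {
  return /^[A-Za-z][A-Za-z0-9_:.\-]*$/.test(value);
}

console.log(isValidHtml4Id("main-nav"));   // true
console.log(isValidHtml4Id("nav:item.1")); // true
console.log(isValidHtml4Id("#id"));        // false (does not start with a letter)
console.log(isValidHtml4Id("2col"));       // false (starts with a digit)
```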

Well, HTML5 changes that by allowing almost any value for the id attribute:

From HTML5 3.2.3.1 The id attribute:

The value must be unique amongst all the IDs in the element’s home subtree and must contain at least one character. The value must not contain any space characters.

At least one character, no spaces.
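The HTML5 rule can be sketched the same way. This is only an illustration: isValidHtml5Id is a hypothetical name, it assumes that "space characters" means space, tab, line feed, form feed and carriage return, and it does not check the uniqueness requirement, which depends on the rest of the document.

```javascript
// Hypothetical helper: at least one character, and none of the HTML5
// space characters (space, tab, LF, FF, CR). Uniqueness is not checked.
function isValidHtml5Id(value) {
  return value.length > 0 && !/[ \t\n\f\r]/.test(value);
}

console.log(isValidHtml5Id("#id"));                // true
console.log(isValidHtml5Id("div>p+p:last-child")); // true
console.log(isValidHtml5Id("café"));               // true
console.log(isValidHtml5Id(""));                   // false (empty)
console.log(isValidHtml5Id("two words"));          // false (contains a space)
```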

This opens the door for valid use cases such as using accented characters. It also gives us plenty more ammo to shoot ourselves in the foot with, since you can now use id values that will cause problems with both CSS and JavaScript unless you’re really careful.

Consider the following HTML:

<div id="#id"></div>
<div id="div>p+p:last-child"></div>

CSS selector conflicts

To target those elements with CSS, you can’t use the normal syntax:

##id {}
#div>p+p:last-child {}

Since the id values contain characters that have a predefined meaning in CSS, you’ll need to do a bit of work on your CSS selectors for them to work. One way is to use an attribute selector instead of the # shortcut:

[id="#id"] {}
[id="div>p+p:last-child"] {}

Another way is to escape the conflicting characters:

#\#id {}
#div\>p\+p\:last-child {}
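If you generate selectors from id values in a script, the escaping can be automated. Modern browsers provide CSS.escape() for this purpose. The function below is only a simplified sketch of the idea, with a made-up name (escapeIdForSelector): it backslash-escapes ASCII characters other than letters, digits, hyphens and underscores, and handles a leading digit, but it ignores other edge cases such as an id that starts with a hyphen followed by a digit.

```javascript
// Hypothetical helper: builds a "#id" selector with conflicting
// characters backslash-escaped. Simplified; not a full CSS.escape().
function escapeIdForSelector(id) {
  // Escape every ASCII character that is not a letter, digit, "_" or "-".
  const escaped = id.replace(/[^A-Za-z0-9_\-\u00A0-\uFFFF]/g, (ch) => "\\" + ch);
  // An identifier cannot start with a digit, so write it as a hex escape
  // ("\3" + digit) followed by a space that terminates the escape.
  if (/^[0-9]/.test(escaped)) {
    return "#\\3" + escaped[0] + " " + escaped.slice(1);
  }
  return "#" + escaped;
}

console.log(escapeIdForSelector("#id"));                // #\#id
console.log(escapeIdForSelector("div>p+p:last-child")); // #div\>p\+p\:last-child
console.log(escapeIdForSelector("1a"));                 // #\31 a
```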

JavaScript troubles

If you want to use a JavaScript library like jQuery to do something with your funky-named elements, this won’t work:

$("#div>p+p:last-child").css("fontSize", "2em");

Just as with CSS, you’ll need to escape some characters. Since the selector is written as a JavaScript string, each backslash must itself be doubled:

$("#div\\>p\\+p\\:last-child").css("fontSize", "2em");
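An alternative that avoids most of the escaping is the attribute selector form, just as in the CSS examples. The sketch below uses a hypothetical helper name (idAttributeSelector) and only escapes backslashes and double quotes inside the quoted value, which assumes the id contains no whitespace (true for any valid HTML5 id).

```javascript
// Hypothetical helper: builds an [id="..."] selector string.
// Only backslashes and double quotes need escaping inside the quoted value.
function idAttributeSelector(id) {
  return '[id="' + id.replace(/["\\]/g, (ch) => "\\" + ch) + '"]';
}

console.log(idAttributeSelector("div>p+p:last-child")); // [id="div>p+p:last-child"]
console.log(idAttributeSelector("#id"));                // [id="#id"]

// In a browser, the result can be handed to jQuery or querySelector:
//   $(idAttributeSelector("div>p+p:last-child")).css("fontSize", "2em");
//   document.querySelector(idAttributeSelector("#id"));
```

Note that document.getElementById("div>p+p:last-child") takes the raw id value rather than a selector, so it needs no escaping at all.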

Name elements sensibly

HTML5 gives us more options when naming elements with the id attribute. This can be useful in some cases. However, my opinion is that using characters that conflict with CSS and/or JavaScript is just asking for trouble. The same thing goes for using funky symbols instead of readily legible text. Do it only if the benefits exceed the risks. Don’t do it just because you can.

Update: Mathias Bynens explores this issue in more depth, including a demo page of examples, in The id attribute just got more classy in HTML5.

You should also take a look at Normalization in HTML and CSS for info on how different forms of Unicode can make selectors not match your id values or class names.

Posted on November 29, 2010 in HTML 5