Ampersands and validation

Anne van Kesteren writes about the XML entities that must be encoded in “Ampersands matter”. One of those entities is the ampersand, &, which needs to be written as &amp;.

Many people consider unencoded ampersands in query strings “OK” to let slip through validation. That’s up to everyone to decide for themselves. However, if you decide that encoding ampersands isn’t worth it, you may want to go to the W3C document Character entity references in HTML 4 and print it out. The section Character entity references for ISO 8859-1 characters in particular lists a lot of names you need to avoid as variable names in query strings if you want your site to work in browsers like Safari, Mozilla, Firefox, IE/Mac, and OmniWeb.

All those browsers will take an unencoded ampersand followed by an entity name and an equals sign, and replace the ampersand and the name with the corresponding character. That will make the query string look quite different from what you’re expecting.

Let’s say you have a link that looks like this:

<a href="index.html?myvar=1&cent=2&pound=2&reg=3">A link</a>

If you’re using one of the browsers mentioned earlier, check the status bar with your cursor positioned over the link in this test document, or follow the link and check the location/address field. The query string is now something your server script will have a hard time parsing properly.
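If you don’t have one of those browsers at hand, the effect can be reproduced roughly in a script. This is only a sketch: it uses `html.unescape` from Python’s standard library as a stand-in for the browser behaviour described above, since that function also resolves the old entity names when the trailing semicolon is missing. The variable names are made up for the illustration.

```python
import html

# The href value exactly as written in the example link (ampersands not encoded).
raw_href = "index.html?myvar=1&cent=2&pound=2&reg=3"

# html.unescape resolves the legacy entity names even without a trailing
# semicolon, which mimics the conversion described in this post.
decoded_href = html.unescape(raw_href)

print(decoded_href)  # index.html?myvar=1¢=2£=2®=3
```

Only `myvar` survives as a variable name. The other three have been swallowed by the characters their names stand for.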

Sure, most of the entity names aren’t that likely to be used as a variable name in a query string. But you never know, so by keeping that list handy you can make sure you avoid them.
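Instead of printing the list, you could let a script do the checking. The sketch below assumes Python, whose standard library ships the HTML 4 entity names in `html.entities.name2codepoint`. The function name `risky_parameter_names` is hypothetical, not part of any library.

```python
from html.entities import name2codepoint
from urllib.parse import parse_qsl


def risky_parameter_names(query_string):
    """Return the parameter names that are also HTML 4 entity names."""
    names = [name for name, _ in parse_qsl(query_string, keep_blank_values=True)]
    return [name for name in names if name in name2codepoint]


print(risky_parameter_names("myvar=1&cent=2&pound=2&reg=3"))
# ['cent', 'pound', 'reg']
```

Entity names are case-sensitive, so the check is deliberately an exact match.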

Of course, an alternative would be to start encoding your ampersands ;).
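If your links are generated by a script, encoding is usually a single function call. A minimal sketch, assuming Python and its standard `html.escape` function (the variable names are again just for illustration):

```python
import html

raw_href = "index.html?myvar=1&cent=2&pound=2&reg=3"

# Encode the ampersands (and any quotes) before writing the value into markup.
encoded_href = html.escape(raw_href, quote=True)
print(f'<a href="{encoded_href}">A link</a>')

# Decoding the encoded value gives back the intended query string.
assert html.unescape(encoded_href) == raw_href
```

Every `&` in the attribute value is now written as `&amp;`, so no entity name can be picked up by accident.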

Update: IE/Win behaves the same way as the browsers mentioned above. Even more reason to make sure your ampersands are encoded. Oh, and I moved the test link to a separate document, since it contains invalid and non well-formed XHTML. Not a good idea when you’re serving documents as application/xhtml+xml.

Comments

1. June 11, 2004 by Matt

I'm on Win/IE6 here, but still getting the same effect - when the world's most popular browser/OS combo falls foul of this effect, it's time to start encoding those entities!

Of course, you could not worry about it and just stick a little DOM-foolery into your onLoad event to regexp any string that starts with a question mark, ends with a quote and has ampersands in it...

2. June 14, 2004 by jim

A quick and easy solution for those times when you have to encode a lot of ampersands:

http://automaticlabs.com/products/urlcleaner

I've used this several times, it saves me from a validator headache :)

3. October 12, 2004 by Philip

Why not just use ; to separate parameters in a query string?

4. October 12, 2004 by Emily

if you're manually parsing a query string in a URL with Perl or something, do it however you want: split it with semicolons or pipes or whatever

but with PHP and others, it automatically parses query strings based on ampersands

5. October 13, 2004 by storlek

You can configure PHP to use semicolons rather than ampersands easily. In fact, you can even separate args with hyphens, if that butters your bread. Just change the values of arg_separator.input and arg_separator.output in php.ini. My configuration accepts either semicolons or ampersands on input (for compatibility), and uses semicolons in generated URLs.


About the author

Roger Johansson is a Swedish web professional specialising in web standards, accessibility, and usability.
