Contents

  1. Introduction
  2. History
  3. Web Standards
  4. Structure and presentation
  5. (X)HTML
  6. CSS
  7. Accessibility
  8. URLs
  9. References
  10. Glossary

8. URLs

This section isn’t really about web standards or accessibility, but it’s here because the way a URL is constructed can have a great effect on how well a website is indexed by search engines, and how usable it is to its visitors.

Some search engine robots don’t follow links to URLs that end in a query string. This kind of URL is very common on websites that dynamically get their content from a database, and may look like this:

http://example.com/products.asp?item=34627393474632&id=4344

The easiest way to construct a URL that is better for both search engine robots and humans is to change it to look like it is pointing to a directory. The above example would then be changed to look like this:

http://example.com/products/item/34627393474632/id/4344/

The web server then interprets the new URL and internally converts it back to the original URL, complete with the query string. At the end of this section are some links to sites where more info on how to do this can be found.
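As a rough illustration of that internal conversion, here is a minimal Python sketch. It assumes the first path segment names the script and the remaining segments come in name/value pairs. The function name `rewrite_to_query` and the hard-coded `.asp` extension are assumptions made for this example. In practice the web server's own rewrite rules usually do this mapping.

```python
from urllib.parse import urlencode


def rewrite_to_query(path: str) -> str:
    """Map a directory-style path back to a script URL with a query string.

    Hypothetical helper: assumes the layout /<script>/<name>/<value>/...
    """
    # Drop empty segments produced by leading and trailing slashes.
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("path has no segments")

    script, rest = segments[0], segments[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected name/value pairs after the script name")

    # Pair up alternating segments: (name, value), (name, value), ...
    pairs = list(zip(rest[0::2], rest[1::2]))
    target = f"/{script}.asp"  # assumed extension, taken from the example URL
    return f"{target}?{urlencode(pairs)}" if pairs else target


print(rewrite_to_query("/products/item/34627393474632/id/4344/"))
```

Rejecting paths with an odd number of segments avoids silently dropping a parameter that has a name but no value.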

Note that in 2008 Google announced that it prefers URLs with query strings to URLs that have been rewritten this way. More details are available in the Google Webmaster Central Blog article “Dynamic URLs vs. static URLs”.

An even better, but somewhat more complicated, way of changing URLs is to completely rewrite the visible URLs to make them human readable:

http://example.com/products/flowers/tulips/

The main advantages of using this kind of URL are that it becomes easier for humans to read the URL and you avoid revealing which server technology you’re using. Since the URLs don’t contain server-specific file extensions, like .asp, .cfm, .cgi or .jsp, this will also make it easier to change the technology used on the server, should that become necessary.
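A human-readable URL no longer carries the database identifiers, so the server has to look them up, which is where the extra complication comes from. The sketch below shows the idea in Python. The lookup table, the item number assigned to tulips, and the function name `resolve_readable_url` are all hypothetical. A real site would query its database instead.

```python
from typing import Optional

# Hypothetical lookup table mapping (category, product) to an item id.
# A real site would fetch this from its database.
PRODUCT_IDS = {
    ("flowers", "tulips"): "34627393474632",
}


def resolve_readable_url(path: str) -> Optional[str]:
    """Translate /products/<category>/<product>/ to an internal script URL.

    Returns None when the path is not a known product page.
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) != 3 or segments[0] != "products":
        return None

    item = PRODUCT_IDS.get((segments[1], segments[2]))
    if item is None:
        return None
    return f"/products.asp?item={item}"


print(resolve_readable_url("/products/flowers/tulips/"))
```

Because only this lookup knows about the script name, the visible URLs can stay the same if the server technology changes.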

If you choose to use query strings in your URLs, it’s important to encode any ampersands, &, as their HTML entity, &amp;. If you don’t, web browsers will interpret the ampersand as the start of an HTML character entity or reference. If the text following immediately after the ampersand matches an HTML entity, the browser will convert the URL and probably break the query string. Validating your HTML will catch these errors since unencoded ampersands are illegal.
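If your pages are generated by a program, the encoding can be left to a library instead of being done by hand. This short Python sketch uses the standard library’s `html.escape`. The helper name `link` is an assumption made for this example.

```python
from html import escape


def link(href: str, text: str) -> str:
    """Build an anchor element with the URL and link text safely escaped."""
    # quote=True also escapes quotation marks, which matters inside attributes.
    return f'<a href="{escape(href, quote=True)}">{escape(text)}</a>'


print(link("http://example.com/products.asp?item=34627393474632&id=4344", "Tulips"))
```

The ampersand in the query string comes out as `&amp;` in the markup, and the browser decodes it back to a plain `&` when it follows the link.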

Another thing worth mentioning is that you should make sure your website works both with and without www. Whichever form you prefer, it is a good idea to configure your web server to redirect all traffic from the other one, for example from http://example.com/ to http://www.example.com/ or vice versa.
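The redirect is normally set up in the web server’s configuration, but the logic is simple enough to sketch in Python. In this example the www form is assumed to be the preferred one. `CANONICAL_HOST`, `ALTERNATE_HOSTS` and `canonical_redirect` are names chosen for the illustration, and any port number in the incoming URL is ignored.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"   # assumed preferred host name
ALTERNATE_HOSTS = {"example.com"}    # hosts that should redirect to it


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALTERNATE_HOSTS:
        return None

    # Keep the path, query string and fragment so deep links survive the redirect.
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )


print(canonical_redirect("http://example.com/products/flowers/tulips/"))
```

Preserving the path and query string means visitors and search engine robots land on the same page they asked for, only under the preferred host name.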

Further reading


© Copyright Roger Johansson