Always specify character encoding

As a Mac user and a user of “alternative” web browsers I have visited many websites where text does not display correctly. In almost every case that is caused by the website’s author specifying an incorrect character encoding or not specifying a character encoding at all.

At times when I have tried bringing the problem to the attention of the developers I have been met by a lack of understanding. After all, they couldn’t see a problem in their Windows-only world. Well, that is not the only platform used on the web, so you need to tell the visitor’s browser which character encoding you are using.

David Baron’s “Why Web authors must specify character encodings” explains this well.
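
To make the failure concrete, here is a minimal Python sketch of what a visitor sees when the browser has to guess the encoding and guesses wrong. The sample word and the two encodings are assumptions picked for illustration; they are not taken from any particular site.

```python
# Illustrative only: the sample text and both encodings are assumptions.
text = "Smörgåsbord"

# The author saves the page as UTF-8 but never declares it.
raw_bytes = text.encode("utf-8")

# A browser that falls back to windows-1252 misreads each two-byte sequence
# as two separate characters.
wrong = raw_bytes.decode("cp1252")

# A browser that is told the real encoding decodes the same bytes correctly.
right = raw_bytes.decode("utf-8")

print(wrong)  # SmÃ¶rgÃ¥sbord
print(right)  # Smörgåsbord
```

The bytes on the wire are identical in both cases. Only the declared (or guessed) encoding differs, which is why the author may never notice the problem on their own machine.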

Comments

1. March 30, 2006 by Oliver Zheng

I suppose this applies to feeds as well?

2. March 30, 2006 by Douglas Clifton

Absolutely. All variants of RSS and Atom are XML languages. See Mark Pilgrim's "Determining the character encoding of a feed".
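
As a rough illustration of the idea, the Python sketch below looks for an encoding first in the HTTP Content-Type header and then in the XML declaration. It is a deliberate simplification: the function names are hypothetical, BOM handling is left out, and the real rules have more cases (including defaults that depend on the media type), so treat it as a starting point rather than a complete implementation.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document, for example <?xml version="1.0" encoding="utf-8"?>
XML_DECL = re.compile(rb'^<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']')


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def guess_feed_encoding(content_type, body):
    """Hypothetical helper: pick an encoding for a feed (simplified)."""
    charset = charset_from_content_type(content_type)
    if charset:
        # An explicit charset in the HTTP header is checked first.
        return charset.lower()
    match = XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # Assumption: fall back to UTF-8 when nothing is declared.
    return "utf-8"


feed = b'<?xml version="1.0" encoding="ISO-8859-1"?><rss version="2.0"/>'
print(guess_feed_encoding("application/rss+xml", feed))
print(guess_feed_encoding("application/rss+xml; charset=utf-8", feed))
```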

3. March 30, 2006 by Jero

But not only XML. Everything that contains plain text (so that means HTML, XML, CSS, JavaScript etc.) needs to be accompanied by a character encoding.

On my website, I described how to use the .htaccess file to automatically send a character encoding. It is a very useful Apache feature, because the character encoding is such a small thing that it is easy to forget. Specifying a default one is probably a smart thing to do.

4. March 30, 2006 by Jules

@Jero:

Question: Do web servers automatically assign a character encoding to CSS and JS files? Given that character encoding is not part of either specification, the character encoding must come from somewhere else.

5. March 31, 2006 by web

We decided on UTF-8 and have had a ton of problems with older editors and documents "slipping" out of UTF-8. It’s been a huge pain, and don’t get me started on the BOM (byte order mark)!

But agreed: telling the user how they should interpret the characters flying at their browser is a requirement. [sarcasm] Someone should make a list of all these things and create some sort of validation utility for people to use... [/sarcasm]
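
For documents that "slip" out of UTF-8, a small check can catch the problem before a file is published. The Python sketch below is only an illustration, and both function names are hypothetical: one helper removes a leading UTF-8 byte order mark, the other reports whether a byte string decodes cleanly as UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


saved_by_old_editor = "é".encode("latin-1")   # a single byte, 0xE9
saved_as_utf8 = codecs.BOM_UTF8 + "é".encode("utf-8")

print(is_valid_utf8(saved_by_old_editor))  # False
print(is_valid_utf8(saved_as_utf8))        # True
print(strip_utf8_bom(saved_as_utf8))       # b'\xc3\xa9'
```

A check like this could run over a directory of files as part of a publishing step. It will not tell you which legacy encoding a bad file is in, only that it is not valid UTF-8.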

6. March 31, 2006 by Douglas Clifton

If you are lucky enough to have edit privileges on your Apache httpd.conf file, you can add:

AddDefaultCharset utf-8

rather than having to rely on .htaccess files.

7. March 31, 2006 by Douglas Clifton

Or, in the case of Apache 2.2 users, AddDefaultCharset. ;-)

8. March 31, 2006 by Dustin Diaz

Funny. I just got a "character encoding" bug in my queue this last month.

9. March 31, 2006 by Amrit

I totally agree. I didn't pay much attention to character encoding either, although it was always there when I opened new documents in Dreamweaver to work with PHP. Then I started using Hindi for one of my literary blogs and discovered that without the proper character encoding the characters were unreadable.

10. March 31, 2006 by Ivar van Duuren

On a client website page I specified the character encoding as UTF-8 with meta http-equiv. But according to the w3.org validator, the character encoding specified by the HTTP header is ISO-8859-1. That mismatch makes the page invalid XHTML and causes it to render incorrectly.

On other pages of the same site, this problem doesn't occur. Is there another way to override the HTTP character encoding from within the HTML?

Thanks!
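
For HTML, a charset in the HTTP Content-Type header takes precedence over a meta element, so a mismatch like this has to be fixed on the server side. The Python sketch below shows one way to detect the mismatch. The function names and the regular expression are assumptions for illustration; the pattern only recognises the meta http-equiv form and is not a full HTML parser.

```python
import re
from email.message import Message

# Finds charset=... inside the content attribute of a <meta> element.
META_CHARSET = re.compile(
    rb'<meta[^>]+content=["\'][^"\'>]*charset=([A-Za-z0-9._-]+)',
    re.IGNORECASE,
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html):
    """Return the charset declared in a meta http-equiv element, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).decode("ascii").lower() if match else None


def find_mismatch(content_type, html):
    """Return (header, meta) if both are declared and differ, else None."""
    from_header = header_charset(content_type)
    from_meta = meta_charset(html)
    if from_header and from_meta and from_header != from_meta:
        return (from_header, from_meta)
    return None


page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
print(find_mismatch("text/html; charset=ISO-8859-1", page))
print(find_mismatch("text/html; charset=UTF-8", page))
```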

11. March 31, 2006 by Robert Wellock

Hmm, you may have to set the HTTP header on the server, either via PHP or .htaccess, with something along the lines of: AddCharset UTF-8 .html

12. March 31, 2006 by Andy

Recently I took over maintaining an app that had been running for a year or so without any encoding being set. As a result it looks like there is now all kinds of nastiness stored in the database. I'm not looking forward to having to clean it all up to get it running on UTF-8.
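
One common form of this damage is UTF-8 text that was at some point read as windows-1252 and stored that way. Assuming that is what happened (it may not be in this app), the Python sketch below reverses the mistake for a single value and leaves anything it cannot safely convert untouched. The function name is hypothetical, and real data should be backed up and spot-checked before any bulk repair.

```python
def repair_mojibake(value: str) -> str:
    """Undo one round of UTF-8 bytes having been decoded as windows-1252.

    Assumption: the damage is exactly that one misinterpretation.
    Values that do not fit the pattern are returned unchanged.
    """
    try:
        return value.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


print(repair_mojibake("SmÃ¶rgÃ¥sbord"))  # Smörgåsbord
print(repair_mojibake("Smörgåsbord"))    # already correct, left as it is
```

Some damaged values will not round-trip this way, for example when the original bytes included ones that windows-1252 leaves undefined, so expect a remainder that needs manual attention.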

13. May 22, 2006 by Nick Gagne

I created this character encoding tool to make the special character encoding process a little easier. No need to look up some of those obscure characters, just copy and paste.

Entity Encoding Tool
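
The linked tool itself is not reproduced here. As a rough stand-in, the Python sketch below shows the general technique using only the standard library: markup characters are escaped first, then every non-ASCII character is replaced with a numeric character reference. The function name is hypothetical.

```python
import html


def to_ascii_entities(text: str) -> str:
    """Escape markup characters, then turn non-ASCII characters into
    numeric character references such as &#233;."""
    escaped = html.escape(text)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_entities("café ©"))           # caf&#233; &#169;
print(to_ascii_entities("Tom & Jerry <é>"))  # Tom &amp; Jerry &lt;&#233;&gt;
```

Output like this is pure ASCII, so it survives a wrong encoding declaration. With a correctly declared UTF-8 page, though, most non-ASCII characters can simply be written directly.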

About the author

Roger Johansson is a Swedish web professional specialising in web standards, accessibility, and usability.
