Google valid and strict

In the comments on my article Ten reasons to learn and use web standards, someone tried to make the point that web standards are largely irrelevant by using Google's home page as an example:

Take Google's homepage. Is it inaccessible because it does not validate? Does it minimise the number of potential visitors? How much bandwidth would they save by adding a doctype? Quoting attributes? Using XHTML?

Here's a quote from my reply:

Since you mention bandwidth I just had a quick look, and I managed to recreate the Google homepage in valid HTML 4.01 Strict and dump all the layout tables. As a result I reduced the file size by a couple of hundred bytes, despite quoting the required attributes and adding a DOCTYPE and a bunch of CSS rules.

For some reason I never got around to writing anything about that Google home page remake, but Philipp Lenssen's post Google Strict vs Google Deprecated reminded me that I still had it lying around on my computer.

I opened the file and changed a few things to make it match Google's home page as it looks now (they have made some slight changes since December 2005). Quoting all attributes that require quoting and adding a DOCTYPE plus a handful of CSS rules to replace the tables, spacer GIFs and font tags actually reduced the file size. And I didn't even move the CSS and JavaScript to external files or get rid of all the inline event handlers.
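To give an idea of the kind of substitution involved (this is a made-up fragment for illustration, not Google's actual markup, and the class name and text are invented), here is roughly how a layout table with a spacer GIF and a font tag can be replaced by a DOCTYPE, a class attribute and a couple of CSS rules:

    <!-- Deprecated, table-based version (hypothetical fragment) -->
    <table cellpadding="0" cellspacing="0" border="0" width="100%">
      <tr>
        <td align="center"><font size="-1" color="#666666">Some footer text</font></td>
        <td><img src="spacer.gif" width="1" height="25" alt=""></td>
      </tr>
    </table>

    <!-- HTML 4.01 Strict version: a DOCTYPE and CSS instead of presentational markup.
         The DOCTYPE goes at the very top of the document and the style block in the
         head; they are shown next to each other here only to keep the fragment short. -->
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <style type="text/css">
      .footer { text-align: center; font-size: small; color: #666; padding-top: 25px; }
    </style>
    <p class="footer">Some footer text</p>

Multiply that kind of substitution across a whole page and the byte savings add up, even with the DOCTYPE and the style rules included.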

Anyhow, the result is a valid HTML 4.01 Strict file that is 3 902 bytes in size. Google's invalid kinda-HTML 2.something very-loose is 4 944 bytes. The valid and strict version is 1 042 bytes smaller, a 21 percent saving in bandwidth for that page.
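Spelling out the arithmetic behind that figure, using the byte counts above:

\[
4\,944 - 3\,902 = 1\,042, \qquad \frac{1\,042}{4\,944} \approx 0.21
\]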

The valid version of Google's home page looks all but identical to the invalid one in CSS-capable browsers, including Internet Explorer. There are a couple of slight, completely insignificant pixel shifts caused by browsers handling text input widths and empty table cells differently. Note that I changed the URLs in the valid version linked here to avoid generating tons of 404 errors, which makes it 168 bytes larger than it would be with Google's original URLs.

The claim that Google is using invalid markup to save bandwidth is clearly just a myth.

Whether the reason is backwards compatibility, fear of change, developer ignorance, server platform inflexibility, simple indifference, or something else, I can only guess. But it sure isn't saving them any bandwidth.

Posted on August 14, 2006 in Web Standards