Listening to marketing departments, you would think that HTML5 is poised to take over the world, but I found that the status of HTML5 versus HTML 4 and XHTML as standards in typical web developer practice is not that simple. The W3C working group is attempting to set a timeline for reaching a "last call" version of HTML5 by mid-2011. However, it is not clear how many of the desirable features, such as microdata, will be part of the HTML5 spec and how many will become separate specifications.

The working group for XHTML 2 was officially closed as of December 2010, with the intent of letting developers concentrate on HTML5 and XHTML5. It is too early to tell how many of the ideas the working group developed will live on in XHTML5. However, the RDFa (Resource Description Framework in Attributes) recommendation for expressing structured data now has a home in HTML5+RDFa.

After the ferment of competing browser feature development in the early years, it was a relief to have HTML 4.0 and XHTML 1.0 provide some degree of stability for web authors for nearly a decade. Now it appears we are in for another period of dramatic change.

What the DOCTYPE reveals

Markup languages derived from Standard Generalized Markup Language (SGML), such as XML and all versions of HTML before HTML5, use DOCTYPE declarations to associate a document type definition (DTD) with a document. DTDs use a compact formal syntax that defines exactly which elements can occur where in any SGML-compliant language. The DOCTYPE declaration, which should be the first item in a document, guides a client program such as a browser in interpreting the markup.

For a number of reasons, the developers of HTML5 have abandoned SGML compliance, and the DOCTYPE declaration in HTML5 does not cite a DTD. I found confusing recommendations on the need for a DOCTYPE declaration. The W3C working draft dated January 13, 2011 states "A DOCTYPE is a required preamble." The use of <!DOCTYPE html> (case insensitive) is recommended. Since XHTML5 requires all elements to be in lower case, the DOCTYPE for XHTML5 will be case sensitive. The idea is that this simple DOCTYPE will make browsers use "standards mode" for rendering. Other sources, such as a WHATWG page dated January 19, 2011, state that the DOCTYPE declaration is actually optional for XHTML5.
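To make the difference concrete, here is a minimal Python sketch that pulls apart a DOCTYPE declaration and reports whether it cites a DTD. The names `DOCTYPE_RE`, `parse_doctype` and `cites_dtd` are my own illustrations, not part of any specification, and the regular expression is a simplification that only handles the PUBLIC form and the bare HTML5 form.

```python
import re

# Captures the root name plus an optional PUBLIC identifier and an optional
# system identifier (the DTD URL). SYSTEM-only declarations are not handled
# in this sketch.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>]+)'
    r'(?:\s+PUBLIC\s+"(?P<public>[^"]*)"(?:\s+"(?P<system>[^"]*)")?)?'
    r'\s*>',
    re.IGNORECASE,  # the HTML serialization treats the keyword case-insensitively
)


def parse_doctype(markup):
    """Return (root, public_id, system_id), or None if no DOCTYPE is found."""
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    return match.group("root"), match.group("public"), match.group("system")


def cites_dtd(markup):
    """True when the DOCTYPE names a public identifier or a DTD URL."""
    parsed = parse_doctype(markup)
    return parsed is not None and (parsed[1] is not None or parsed[2] is not None)


# An HTML 4.01 Transitional declaration cites a DTD; the HTML5 one does not.
html401 = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
           '"http://www.w3.org/TR/html4/loose.dtd">')
html5 = '<!doctype html>'

print(parse_doctype(html401))
print(cites_dtd(html401), cites_dtd(html5))
```

For the HTML 4.01 declaration the public identifier and DTD URL are both present, while the HTML5 form yields only the root name, which is what "does not cite a DTD" means in practice.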

A little field work

Given all of the above, I thought it would be interesting to find out exactly which versions of markup are actually in use on the web today. Starting with the web crawler I wrote for my previous look at XHTML use, I collected counts of the DOCTYPE names in use on over 12,000 pages, with interesting results.

XHTML 1.0 - With a few XHTML 1.1, about 74% total.

HTML 4.0 and 4.01 - Mostly 4.01 and using the "transitional" DTD, about 15%.

Possible HTML5 and XHTML5 - As indicated by the use of "<!DOCTYPE html>", about 10%.

HTML 3 and 2 - Astonishing but true, about 1% of pages still declare these very old standards. I am guessing that either some pages have not been touched in years or people are using very out-of-date authoring tools.

Undecipherable - A bit more than 0.5% were obviously invalid declarations.
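The tally above can be approximated with a short Python sketch. This is not the crawler I used; it is a minimal illustration that assumes the page markup has already been fetched into strings, and the category labels, the `classify` and `tally` helpers, and the substring tests on the public identifier are all hypothetical simplifications.

```python
import re
from collections import Counter

# Grab everything between "<!DOCTYPE" and the first ">" (case-insensitive).
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+([^>]*)>', re.IGNORECASE)


def classify(markup):
    """Map a page's DOCTYPE to one of the rough families used in the survey."""
    match = DOCTYPE_RE.search(markup[:1024])  # the DOCTYPE should be near the top
    if match is None:
        return "none"  # assumption: pages without a DOCTYPE get their own bucket
    # Collapse whitespace and upper-case so the comparisons ignore case.
    body = " ".join(match.group(1).split()).upper()
    if body == "HTML":
        return "html5"
    if "DTD XHTML" in body:
        return "xhtml1"
    if "DTD HTML 4" in body:
        return "html4"
    if "DTD HTML 3" in body or "DTD HTML 2" in body:
        return "html2_3"
    return "undecipherable"


def tally(pages):
    """Count DOCTYPE families over an iterable of markup strings."""
    return Counter(classify(page) for page in pages)


# Tiny hand-made sample standing in for crawled pages.
sample_pages = [
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html></html>',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html></html>',
    '<!doctype html><html></html>',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html></html>',
    '<!DOCTYPE foo bar><html></html>',
]

counts = tally(sample_pages)
for family, count in counts.most_common():
    print(f"{family}: {100 * count / len(sample_pages):.1f}%")
```

Matching on fragments of the public identifier is crude, but it is enough to separate the families listed above; a real survey would also need to cope with fetch errors, character encodings and comments that precede the DOCTYPE.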

Web pages using the HTML5 and XHTML5 DOCTYPE declarations are starting to appear on the web. However, XHTML 1.0 remains the most common and HTML 4 use is still very common. I think there will be a "long tail" of HTML 4 and older documents on the web for a long time, continuing to complicate the work of browser developers.
