XHTML 1.0: Where XML and HTML meet (3/8) - exploring XML

Old HTML sins (continued)

Pay attention to whitespace handling in attribute values
User agents strip leading and trailing whitespace from attribute values and map
sequences of one or more whitespace characters (including line breaks) to a single
inter-word space (an ASCII space character for western scripts).
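
The normalization described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the code of any particular user agent; the function name normalize_attribute_whitespace and the sample value are made up for this example, and whitespace is assumed to mean the four XML whitespace characters (space, tab, carriage return, line feed).

```python
import re

def normalize_attribute_whitespace(value):
    """Collapse whitespace runs to one space, then trim both ends.

    Hypothetical helper: it mirrors the rule described in the text,
    treating space, tab, CR and LF as whitespace.
    """
    collapsed = re.sub(r"[ \t\r\n]+", " ", value)
    return collapsed.strip(" ")

# A class attribute written across two lines with uneven spacing.
normalized = normalize_attribute_whitespace("  nav   main\n\tfooter ")
print(repr(normalized))
```

Because the result differs from what was typed, it is safest not to rely on line breaks or repeated spaces inside attribute values.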

Escape or externalize script and style elements
In XHTML, the script and style elements are declared as having parsed character content.
As a result, < and & will be treated as the start of markup, and entities such as &lt;
and &amp; will be recognized by the XML processor as entity references to < and &
respectively. Wrapping the content of the script or style element within a CDATA marked
section avoids both problems: the content is passed through as literal text.

<script>
<![CDATA[
... unescaped script content ...
]]>
</script>

CDATA sections are recognized by the XML processor and appear as nodes in the Document
Object Model. An alternative is to use external script and style documents.
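
To see how an XML processor treats the three ways of writing a script body, here is a small sketch using Python's standard xml.etree.ElementTree parser. The script text and the helper name script_text are invented for the example; a real XHTML document would also carry a namespace and a DOCTYPE, which are left out here to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing both '<' and '&'.
SCRIPT_BODY = "if (a < b && c) { run(); }"

def script_text(xhtml_fragment):
    """Parse a fragment and return the text the XML processor hands on."""
    return ET.fromstring(xhtml_fragment).text

# 1. Wrapped in a CDATA section: the body is written unchanged.
cdata_form = "<script><![CDATA[" + SCRIPT_BODY + "]]></script>"

# 2. Escaped by hand: every '<' and '&' becomes an entity reference.
escaped_form = "<script>if (a &lt; b &amp;&amp; c) { run(); }</script>"

# 3. Left as-is: '<' and '&' are read as markup, so parsing fails.
raw_form = "<script>" + SCRIPT_BODY + "</script>"

try:
    script_text(raw_form)
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

print(script_text(cdata_form) == SCRIPT_BODY)
print(script_text(escaped_form) == SCRIPT_BODY)
print(raw_is_well_formed)
```

Both the CDATA form and the escaped form yield the same script text, while the raw form is rejected as not well-formed.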

Stick to the existing SGML exclusions
SGML gives the writer of a DTD the ability to exclude specific elements from being
contained within an element. Such prohibitions (called "exclusions") are not possible in
XML.
For example, the HTML 4 Strict DTD does not allow the nesting of an 'a' element within
another 'a' element. It is not possible to express this in XML. Even though these
restrictions cannot be defined in the DTD, certain elements should not be nested.
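
Since a validating parser will not catch these cases, a separate check is needed. The following is a minimal sketch of such a check, assuming a document without namespace prefixes. The table PROHIBITED and the function find_prohibited_nesting are names made up for this example; only the 'a' rule comes from the text above, while the form and label entries are taken from the list of element prohibitions in the XHTML 1.0 specification, and the table is deliberately incomplete.

```python
import xml.etree.ElementTree as ET

# Ancestor tag -> tags that must not appear anywhere inside it.
# Illustrative subset only, not the full list of prohibitions.
PROHIBITED = {
    "a": {"a"},
    "form": {"form"},
    "label": {"label"},
}

def find_prohibited_nesting(element, ancestors=()):
    """Return (ancestor_tag, descendant_tag) pairs that break PROHIBITED."""
    violations = []
    for ancestor in ancestors:
        if element.tag in PROHIBITED.get(ancestor, ()):
            violations.append((ancestor, element.tag))
    # Recurse with the current element added to the ancestor chain.
    for child in element:
        violations.extend(
            find_prohibited_nesting(child, ancestors + (element.tag,))
        )
    return violations

# The inner link sits inside 'em', so it is an indirect descendant of 'a'.
doc = ET.fromstring(
    '<p><a href="#x">outer <em><a href="#y">inner</a></em></a></p>'
)
violations = find_prohibited_nesting(doc)
print(violations)
```

The check walks the whole ancestor chain rather than just the parent, because an SGML exclusion applies at any depth of nesting.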

Use id for fragment identifiers, not name
HTML 4 defined the name attribute for the elements a, applet, form, frame, iframe, img,
and map. HTML 4 also introduced the id attribute. Both of these attributes are designed
to be used as fragment identifiers.
In XML, fragment identifiers are of type ID, and there can only be a single attribute of
type ID per element. Therefore, in XHTML 1.0 the id attribute is defined to be of type ID.
In order to ensure that XHTML 1.0 documents are well-structured XML documents, XHTML 1.0
documents must use the id attribute when defining fragment identifiers, even on elements
that historically have also had a name attribute. In XHTML 1.0, the name attribute of
these elements is formally deprecated, and will be removed in a subsequent version of
XHTML.
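
When converting older pages, one way to follow this rule is to give every named element an id as well. The sketch below assumes that reusing the name value as the id is acceptable; it does not check that the value is a legal XML ID or that it is unique in the document, both of which a real conversion would have to verify. The helper add_missing_ids and the sample markup are invented for the example.

```python
import xml.etree.ElementTree as ET

# Elements for which HTML 4 defined a name attribute (from the text above).
NAMED_ELEMENTS = {"a", "applet", "form", "frame", "iframe", "img", "map"}

def add_missing_ids(root):
    """Copy name to id where id is absent; return how many were changed.

    Hypothetical helper: no validation of ID syntax or uniqueness.
    """
    changed = 0
    for element in root.iter():
        if (
            element.tag in NAMED_ELEMENTS
            and "name" in element.attrib
            and "id" not in element.attrib
        ):
            element.set("id", element.get("name"))
            changed += 1
    return changed

doc = ET.fromstring(
    '<div><a name="top">Top</a><a id="end" name="end">End</a></div>'
)
changed = add_missing_ids(doc)

# The fragment identifier #top can now be resolved through the id attribute.
target = doc.find(".//*[@id='top']")
print(changed, target.tag, target.get("id"))
```

The name attribute is left in place here so that older browsers that only understand name still find the anchor.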