This chapter is from the book

XML Elements

Much like HTML, XML is a markup language with documents composed of tags that
"mark up" the data in a document. A typical XML document will contain
a large number of these tags, start and end tags, with data contained within the
tags. For example, if we were looking at the representation of a name in XML, we
might have

<name>John Doe</name>

Here we have two tagsthe start tag <name> and the end
tag </name>. Tags are a very important part of XML. They are what
you use to mark the beginning and ending of elements in your XML documents. The
two tags, taken together along with the content between them, constitute an XML
element.

We would actually refer to the element by the element type, which is
synonymous with the name used in the start/end tag pair. In the previous
example, we have a name element, the content of which happens to be the
name John Doe.
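To make the distinction between element type and content concrete, here is a minimal sketch that reads the same element with Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for illustration; any XML processor would report the same two pieces of information.

```python
import xml.etree.ElementTree as ET

# Parse the single-element document from the example above.
element = ET.fromstring("<name>John Doe</name>")

# The element type is the name used in the start/end tag pair.
element_type = element.tag

# The content is the character data between the two tags.
content = element.text

print(element_type)
print(content)
```

In ElementTree, the tag attribute holds the element type and the text attribute holds the character data that follows the start tag.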

Although elements are referred to by their names, or element types, an
actual element instance consists of both tags and the content nested
between them. Elements can have text content, which is called Parsed
Character Data, or PCDATA, or they can have other elements as their
content. For example, we might alter the name element to contain more
information:

<name>
<first>John</first>
<last>Doe</last>
</name>

Now we have three elementsa name element, which has as its
content the first element and the last element. The
first and last elements contain PCDATA, which represents the
actual name of the person being stored in the name element.
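The nesting described above can be sketched in the same way. This example again assumes Python's standard xml.etree.ElementTree module; the variable names are hypothetical. It walks the child elements of name and collects their element types and PCDATA.

```python
import xml.etree.ElementTree as ET

# The nested version of the name element from the example above.
document = """<name>
<first>John</first>
<last>Doe</last>
</name>"""

name = ET.fromstring(document)

# Iterating over an element yields its child elements in document order.
child_types = [child.tag for child in name]

# Each child holds PCDATA; join the pieces to rebuild the full name.
full_name = " ".join(child.text for child in name)

print(child_types)
print(full_name)
```

Here the name element has element content (first and last), while the two children carry the actual text.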

Elements in XML must be composed of both start and end tags (with one
exception for empty elements, which we will discuss later). This is one way in
which XML differs significantly from HTML. For example, in HTML, there are a
number of tags that can be used without end tags, such as <P>,
<HR>, or <BR>.

With XML, each start tag contains the name of the element type, and each end
tag contains the name of the element type as well, preceded by a / to
denote that it is an end tag. The start and end tags must match exactly; for
example, the following tags do not match:

<name>John Doe</NAME>

Unlike HTML, XML is case sensitive, so start and end tags must match in case
as well. XML might seem surprisingly strict compared to HTML;
however, this strictness helps keep your documents consistent and readable.
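One way to see this strictness in practice is to hand mismatched tags to an XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper written for this illustration, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Start and end tags match exactly, including case.
matching = is_well_formed("<name>John Doe</name>")

# The end tag differs in case, so the tags do not match.
wrong_case = is_well_formed("<name>John Doe</NAME>")

# An HTML-style tag with no end tag is also rejected.
unclosed = is_well_formed("<p>Line one<br></p>")

print(matching, wrong_case, unclosed)
```

The parser rejects both the case mismatch and the unclosed tag, where a typical HTML browser would accept the equivalent markup.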

NOTE

Current versions of HTML do allow some tags without end tags,
and HTML is also not case sensitive. However, in an effort to promote
compatibility and extensibility, the W3C is in the process of rewriting the HTML
Recommendation using XML, and the result is XHTML. XHTML requires all tags to be
properly closed, and introduces case sensitivity to HTML. There are some other
differences as well; the specifics of XHTML are covered later in Chapter 21,
"The Future of the Web: XHTML."

Shorthand for Empty Elements

There are times when you might have an element in your document that does not
contain any data. For example, in a contact document, you might have an element
for cellular phone numbers:

<cellular>312-555-1212</cellular>

This is fine, assuming that your contact has a cellular phone. However, if
they do not, then you might have a document with some empty
<cellular> elements:

<cellular></cellular>

These empty elements can be written as shown, with a start and end tag;
however, there is also a shorthand which can be used for empty elements:

<cellular/>

When you include the / character at the end of the tag, an XML processor
knows that the element is empty. Using the empty-element form can reduce
clutter in your documents, and also save time when authoring.
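As a quick check that the two forms are equivalent, the following sketch parses both with Python's standard xml.etree.ElementTree module and compares the results. The variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The same empty element, written in the long form and the shorthand form.
long_form = ET.fromstring("<cellular></cellular>")
short_form = ET.fromstring("<cellular/>")

# Both forms produce an element of the same type.
same_type = long_form.tag == short_form.tag

# Neither form has text content or child elements.
both_empty = all(
    not element.text and len(element) == 0
    for element in (long_form, short_form)
)

print(same_type, both_empty)
```

Once parsed, the two elements are indistinguishable, so the choice between the forms is a matter of authoring convenience.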