Entity versus Entity Reference

An entity is a storage unit that contains a piece of an XML document. This storage unit may be a file, a database record, an object in memory, a stream of bytes returned by a network server, or something else. It may contain an entire XML document or just a few elements or declarations.

Entity references point to these entities. There are two kinds of entity references: general entity references and parameter entity references. A general entity reference begins with an ampersand and ends with a semicolon, for instance, &amp; or &chapter1;. General entity references normally appear in the instance document. For example, you might define the chapter1 entity in the DTD like this:

<!ENTITY chapter1 SYSTEM "http://www.example.com/chapter1.xml">

Then in the document you could reference it like this:

<book> &chapter1; ... </book>

&chapter1; is an entity reference. The actual content of the document found at http://www.example.com/chapter1.xml is an entity. They are related, but they are not the same thing.
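The distinction can be made visible with a parser that reports declarations and references separately. The following is a minimal sketch using Python's standard xml.parsers.expat module; the scan function and its handler names are illustrative, not part of any API discussed here. It records the declaration of chapter1 and each reference to it, and it deliberately never fetches the entity itself, so the content at the URL is not involved at all.

```python
from xml.parsers import expat

# The declaration lives in the DTD; the reference lives in the instance document.
DOCUMENT = """<!DOCTYPE book [
  <!ENTITY chapter1 SYSTEM "http://www.example.com/chapter1.xml">
]>
<book>&chapter1;</book>"""


def scan(xml_text):
    """Return (declared, referenced) without loading any external entity."""
    declared = {}    # entity name -> system identifier taken from the DTD
    referenced = []  # system identifiers of external entities referenced in content

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Called once for each <!ENTITY ...> declaration.
        declared[name] = system_id

    def on_external_ref(context, base, system_id, public_id):
        # Called when the parser meets a reference to an external entity,
        # such as &chapter1;. Nothing is downloaded here.
        referenced.append(system_id)
        return 1  # nonzero tells expat to continue parsing

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.ExternalEntityRefHandler = on_external_ref
    parser.Parse(xml_text, True)
    return declared, referenced


declared, referenced = scan(DOCUMENT)
print("declared:", declared)
print("referenced:", referenced)
```

Both results name the same system identifier, yet neither one is the entity. The entity is whatever the server at that address would return.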

Parameter entities and parameter entity references follow the same pattern. The difference is that parameter entities contain DTD fragments instead of instance document fragments, and parameter entity references begin with a percent sign instead of an ampersand. However, the entity reference still stands in for and points to the actual entity.
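As a rough illustration of the two kinds side by side, the sketch below declares one parameter entity and one general entity in the same DTD and asks Python's xml.parsers.expat which is which. The parameter entity name inline and its value are hypothetical, chosen only for this example, as is the classify_entities helper. A reference to the parameter entity would be written %inline; and could appear only inside the DTD.

```python
from xml.parsers import expat

# "inline" is a made-up parameter entity holding a DTD fragment;
# "chapter1" is a general entity like the one declared earlier.
DOCUMENT = """<!DOCTYPE book [
  <!ENTITY % inline "em | strong">
  <!ENTITY chapter1 SYSTEM "http://www.example.com/chapter1.xml">
]>
<book/>"""


def classify_entities(xml_text):
    """Map each declared entity name to 'parameter' or 'general'."""
    kinds = {}

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # expat passes a true flag for declarations written with a percent sign.
        kinds[name] = "parameter" if is_parameter else "general"

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return kinds


print(classify_entities(DOCUMENT))
```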

XML APIs are inconsistent about whether they report entities, entity references, neither, or both. Some, like XOM, simply replace all entity references with their corresponding entities and don't tell you that anything has happened. Others, like JDOM, report only the entity references they have not resolved. Still others, such as DOM and SAX, can report both entities and entity references, although this often depends on user preferences and the abilities of the underlying parser. In any case, the five predefined entity references (&amp;, &lt;, &gt;, &quot;, and &apos;) are normally not reported.
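The same split shows up outside the Java APIs named above. As a hedged sketch, Python's standard library can stand in for both behaviors: xml.etree.ElementTree silently replaces references to internal entities, while xml.parsers.expat can be configured to hand back the unexpanded reference instead. The publisher entity and both function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# "publisher" is a hypothetical internal entity used only for illustration.
DOCUMENT = """<!DOCTYPE book [
  <!ENTITY publisher "Example Press">
]>
<book>&publisher; &amp; friends</book>"""


def resolved_text(xml_text):
    """ElementTree: references are replaced and leave no trace in the tree."""
    return ET.fromstring(xml_text).text


def reported_references(xml_text):
    """expat with expansion inhibited: return (text, unexpanded references)."""
    text_parts = []
    references = []

    def on_default(data):
        # The default handler also receives the DOCTYPE and the tags,
        # so keep only tokens that look like entity references.
        if data.startswith("&") and data.endswith(";"):
            references.append(data)

    parser = expat.ParserCreate()
    # Setting DefaultHandler (rather than DefaultHandlerExpand) stops expat
    # from expanding internal entities and passes the reference through.
    parser.DefaultHandler = on_default
    parser.CharacterDataHandler = text_parts.append
    parser.Parse(xml_text, True)
    return "".join(text_parts), references


print(resolved_text(DOCUMENT))
print(reported_references(DOCUMENT))
```

In the second function the predefined reference &amp; arrives as an ordinary ampersand in the character data rather than as a reference, which matches the usual treatment of the five predefined entity references.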