CExtensibleMarkupLanguageDocument

$Revision: 44 $

Description

This class is the mother of all XML classes. It holds things like
the element tree and settings that apply to
the entire document. It is designed to help application developers
handle XML-like data. It will parse (and construct) well-formed,
standalone XML documents. It will also allow you to loosen the
parsing rules when dealing with XML from sources you can't control.

Methods

Allows you to specify a function (and a parameter for that function) that
will be called when an element with a tag matching element_name
has been successfully parsed. The element_name comparison
is not case sensitive.
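
The case-insensitive matching described above can be sketched as follows. This is a minimal stand-in, not the WFC implementation: CallbackRegistry, ElementCallback, names_match and count_hits are hypothetical names invented for the example, and the callback signature is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback signature: receives the user-supplied parameter
// and the name of the element that was just parsed.
using ElementCallback = void (*)(void* parameter, const std::string& element_name);

// Compares two tag names without regard to ASCII case.
inline bool names_match(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
        {
            return false;
        }
    }
    return true;
}

// Stand-in registry for illustration only; not the WFC class.
class CallbackRegistry
{
public:
    void add(const std::string& element_name, ElementCallback function, void* parameter)
    {
        assert(function != nullptr);
        entries_.push_back({element_name, function, parameter});
    }

    // Calls every callback registered for parsed_name; returns how many ran.
    std::size_t execute(const std::string& parsed_name) const
    {
        std::size_t count = 0;
        for (const Entry& entry : entries_)
        {
            if (names_match(entry.name, parsed_name))
            {
                entry.function(entry.parameter, parsed_name);
                ++count;
            }
        }
        return count;
    }

private:
    struct Entry
    {
        std::string name;
        ElementCallback function;
        void* parameter;
    };
    std::vector<Entry> entries_;
};

// Example callback: counts invocations through the parameter pointer.
inline void count_hits(void* parameter, const std::string&)
{
    ++*static_cast<int*>(parameter);
}
```

Registering count_hits for "Boy" and then executing callbacks for "BOY" runs the callback once, because the comparison ignores case.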

Copies the callback functions from source to this object.
If you are a careful programmer, this is perfectly safe to do. Generally
speaking, you shouldn't have to copy the callbacks of source
because parsing should have already taken place.

If you wanted to know how many "Boy" elements there
are in the first set of characters, you would use an element name
of "Southpark.Characters.Boy". If you wanted to
know how many "Girl" elements there are in the second
set of characters, you would use this for element_name:
"Southpark.Characters(1).Girl"

Initializes the enumerator in preparation for calling GetNextCallback().
If there are no callbacks (i.e. AddCallback() has not been called),
FALSE will be returned. If there are callbacks, TRUE will be returned.

This is generally called during the parsing of a document by the
CExtensibleMarkupLanguageElement that just parsed itself. However,
you can pull an element out of the document and call
ExecuteCallbacks() yourself.

Retrieves the automatic indentation parameters. Automatic indentation does
nothing but make the XML output look pretty. It makes it easier for humans
to read. If your application is sensitive to white space, don't use automatic
indentation.

To retrieve the element for Cartman, element_name should
be "Southpark.Characters.Boy". If you want Ms. Ellen (even
though she doesn't play for the home team) you would use
"Southpark.Characters(1).Girl(1)"

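The element name syntax used above (tag names separated by periods, with an optional zero-based index in parentheses) can be sketched as a small splitter. This is only an illustration under stated assumptions: PathComponent and split_element_path are hypothetical names, a missing index is taken to mean 0 as the examples suggest, and the real parser may accept or reject other forms.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct PathComponent
{
    std::string tag;    // element tag name
    std::size_t index;  // zero-based instance among same-named siblings
};

// Splits "A.B(1).C" into components. Returns an empty vector on malformed input.
inline std::vector<PathComponent> split_element_path(const std::string& path)
{
    std::vector<PathComponent> components;
    std::size_t start = 0;
    while (start <= path.size())
    {
        std::size_t dot = path.find('.', start);
        if (dot == std::string::npos) dot = path.size();
        const std::string piece = path.substr(start, dot - start);

        PathComponent component{piece, 0};
        const std::size_t open = piece.find('(');
        if (open != std::string::npos)
        {
            // Require at least one digit followed by a closing parenthesis.
            if (piece.size() < open + 3 || piece.back() != ')') return {};
            std::size_t value = 0;
            for (std::size_t i = open + 1; i + 1 < piece.size(); ++i)
            {
                if (piece[i] < '0' || piece[i] > '9') return {};
                value = value * 10 + static_cast<std::size_t>(piece[i] - '0');
            }
            component.tag = piece.substr(0, open);
            component.index = value;
        }
        if (component.tag.empty()) return {};
        components.push_back(component);
        start = dot + 1;
    }
    // Every path that reaches this point produced at least one component.
    assert(!components.empty());
    return components;
}
```

With this sketch, "Southpark.Characters(1).Girl(1)" splits into Southpark (index 0), Characters (index 1) and Girl (index 1).
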
Retrieves the next callback. It will return TRUE if the callback has been
retrieved or FALSE if you are at the end of the list. If FALSE is returned,
all parameters are set to NULL. Callbacks are added via the
AddCallback() method.

If Parse() returns FALSE, you can call this method to find out
interesting information as to where the parse failed. This will help you
correct the XML.
If error_message is not NULL, it will be filled
with a human readable error message.
The beginning parameter is filled with the location in
the document where the element began.
The error_location parameter is filled with the location
where the parser encountered the fatal problem.

When you must convert from UNICODE to something else, this is the
code page that will be used. See the WideCharToMultiByte() Win32 API
for more information. If the code is run on a real operating system (NT), the
default code page is CP_UTF8. If you are running on a piece of crap
(Windows 95) the default code page is CP_ACP.

Sets the parsing options. This allows you to customize the parser to
be as loose or as strict as you want. The default is to be as strict
as possible when parsing. SetParseOptions() returns the previous
options. Here are the current parse options that can be set:

WFC_XML_IGNORE_CASE_IN_XML_DECLARATION - When set, this option
will allow uppercase letters in the XML declaration. For example:
<?XmL ?> will be allowed even though it does not
conform to the specification.

WFC_XML_ALLOW_REPLACEMENT_OF_DEFAULT_ENTITIES - Though the
XML specification doesn't talk about it, what should a parser do
if default entities are replaced? If you set this option, the parser
will allow replacement of the default entities. Here is a list of
the default entities:

&amp;

&apos;

&gt;

&lt;

&quot;

WFC_XML_FAIL_ON_ILL_FORMED_ENTITIES - Not yet implemented.
It will allow the parser to ignore ill formed entities such as <!ENTITY amp "&#38;">

WFC_XML_IGNORE_ALL_WHITE_SPACE_ELEMENTS - Tells the parser
to ignore elements (of type typeTextSegment) that contain
nothing but white space characters. WARNING! If you use this option, it will
not be possible to reproduce that input file exactly. Elements that contain
nothing but white spaces will be deleted from the document.

WFC_XML_IGNORE_MISSING_XML_DECLARATION - Tells the parser
to ignore the fact that the <?xml ?> element is missing.
If it was not specified in the data stream, one will be automatically
added to the document. This is the default behavior.

WFC_XML_DISALLOW_MULTIPLE_ELEMENTS - Tells the parser
to reject documents that contain more than one root element. The first
rule (Rule 1) of the XML specification
says (like Connor MacLeod of the clan MacLeod) there can be only one
element in an XML document. That element can have a billion child elements
but there can be only one root element. If this option is set
(it is not set by default), the parser will strictly enforce this rule. This rule
really gets in the way of using XML for things like log files (where you
want to open the file, append a record to it and close the file).
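
Because SetParseOptions() returns the previous options, a caller can loosen the parser for one document and restore the old settings afterwards. The sketch below shows that pattern with a stand-in class. Everything in the sketch namespace is hypothetical: the bit values are placeholders chosen for this example (the real WFC_XML_* constants may be defined differently), and the assumed default contains only the missing-declaration leniency described above.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
// Placeholder bit values chosen for this example only.
constexpr std::uint32_t IGNORE_CASE_IN_XML_DECLARATION  = 1u << 0;
constexpr std::uint32_t IGNORE_ALL_WHITE_SPACE_ELEMENTS = 1u << 1;
constexpr std::uint32_t IGNORE_MISSING_XML_DECLARATION  = 1u << 2;
constexpr std::uint32_t DISALLOW_MULTIPLE_ELEMENTS      = 1u << 3;

// Stand-in for the option handling; not the WFC document class.
class ParserSettings
{
public:
    // Stores the new options and hands back the old ones.
    std::uint32_t set_parse_options(std::uint32_t new_options)
    {
        const std::uint32_t previous = options_;
        options_ = new_options;
        return previous;
    }

    // True when every bit of the given option is currently set.
    bool is_set(std::uint32_t option) const
    {
        assert(option != 0);
        return (options_ & option) == option;
    }

private:
    // Assumed default for the sketch.
    std::uint32_t options_ = IGNORE_MISSING_XML_DECLARATION;
};
} // namespace sketch
```

A caller would keep the value returned by set_parse_options() and pass it back in once the special-case document has been parsed.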

This method is usually called by the element that cannot parse
itself. There is logic that prevents the information from being
overwritten by subsequent calls to SetParsingErrorInformation().
This means you can call SetParsingErrorInformation() as
many times as you want but only information from the first call
will be recorded (and reported via GetParsingErrorInformation())
for each call to Parse().
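
The "first call wins" behavior can be sketched with a small recorder that ignores later reports until it is reset. This is an assumption-laden illustration, not the WFC code: ParseErrorRecorder and its members are hypothetical names, locations are simplified to plain offsets, and the reset is assumed to happen at the start of each parse.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for the error bookkeeping; not the WFC document class.
class ParseErrorRecorder
{
public:
    // Assumed to be called at the start of each parse so that a new
    // first error can be stored.
    void reset()
    {
        recorded_ = false;
        message_.clear();
        element_start_ = 0;
        error_location_ = 0;
    }

    // Records the error only if none has been recorded since the last reset.
    void set(const std::string& message, std::size_t element_start, std::size_t error_location)
    {
        // Assumption of this sketch: the failure lies at or after the element start.
        assert(error_location >= element_start);
        if (recorded_) return;
        recorded_ = true;
        message_ = message;
        element_start_ = element_start;
        error_location_ = error_location;
    }

    bool has_error() const { return recorded_; }
    const std::string& message() const { return message_; }
    std::size_t element_start() const { return element_start_; }
    std::size_t error_location() const { return error_location_; }

private:
    bool recorded_ = false;
    std::string message_;
    std::size_t element_start_ = 0;
    std::size_t error_location_ = 0;
};
```

Keeping the first report is useful because the innermost element that failed usually has the most precise location; errors raised by its parents while unwinding would otherwise overwrite it.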

Sets the writing options. This allows you to customize how the
XML is written.
The default is to be as strict as possible when writing.
SetWriteOptions() returns the previous
options. Here are the current options that can be set:

WFC_XML_INCLUDE_TYPE_INFORMATION - Not yet implemented. XML
is woefully inept at handling data. It uses things called DTDs, but
they have a "the world is flat" outlook on life.
DTDs lack the ability to scope.
It would be like programming where all variables have to have unique names
no matter what function they exist in.
DTDs are used to give HTML browsers the ability to correctly
display XML.
They also give you the ability to do some lame data validation.
In the future, I will include the
ability to write type information in a programmer friendly fashion.
This type information is intended to be a programmer-to-programmer
communication medium.

WFC_XML_DONT_OUTPUT_XML_DECLARATION - This allows
you to skip writing the XML declaration when you output.
For example, this XML document:

<?xml version="1.0" ?>
<TRUTH>Sam loves Laura.</TRUTH>

Would look like this when WFC_XML_DONT_OUTPUT_XML_DECLARATION
is set:

<TRUTH>Sam loves Laura.</TRUTH>

WFC_XML_WRITE_AS_UNICODE - This
tells the document to write output as UNICODE (two bytes per
character). It will default to writing in little endian format.

WFC_XML_WRITE_AS_BIG_ENDIAN - This
tells the document to write UNICODE or UCS4 characters in
big endian format.

WFC_XML_WRITE_AS_UCS4 - This will write
the data in UCS4 (four bytes per character). The default is to
write in little endian format. For example, the < character
would come out as these bytes: 3C 00 00 00

WFC_XML_WRITE_AS_UCS4_UNUSUAL_2143 - This
will format the output in an unusual 4 bytes per character format.
For example, the < character
would come out as these bytes: 00 00 3C 00

WFC_XML_WRITE_AS_UCS4_UNUSUAL_3412 - This
will format the output in another unusual 4 bytes per character
format.
For example, the < character
would come out as these bytes: 00 3C 00 00

WFC_XML_WRITE_AS_ASCII - This
will format the output in ASCII format. This is the default.
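
The four-byte layouts listed above differ only in the order the same four octets are written. The sketch below reproduces the byte sequences given for the < character (0x3C). It is an illustration only: Ucs4Order, Bytes4 and encode_ucs4 are hypothetical names, and the digits in each order name number the octets from most significant (1) to least significant (4).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

enum class Ucs4Order
{
    BigEndian1234,     // most significant octet first
    LittleEndian4321,  // least significant octet first
    Unusual2143,
    Unusual3412
};

// Lays out one UCS4 character in the requested octet order.
inline Bytes4 encode_ucs4(std::uint32_t code_point, Ucs4Order order)
{
    // UCS4 is a 31-bit code space.
    assert(code_point <= 0x7FFFFFFFu);

    const std::uint8_t b1 = static_cast<std::uint8_t>((code_point >> 24) & 0xFF);
    const std::uint8_t b2 = static_cast<std::uint8_t>((code_point >> 16) & 0xFF);
    const std::uint8_t b3 = static_cast<std::uint8_t>((code_point >> 8) & 0xFF);
    const std::uint8_t b4 = static_cast<std::uint8_t>(code_point & 0xFF);

    switch (order)
    {
    case Ucs4Order::BigEndian1234:    return Bytes4{{b1, b2, b3, b4}};
    case Ucs4Order::LittleEndian4321: return Bytes4{{b4, b3, b2, b1}};
    case Ucs4Order::Unusual2143:      return Bytes4{{b2, b1, b4, b3}};
    case Ucs4Order::Unusual3412:      return Bytes4{{b3, b4, b1, b2}};
    }
    return Bytes4{{b1, b2, b3, b4}};
}
```

For 0x3C the little endian order yields 3C 00 00 00, the 2143 order yields 00 00 3C 00 and the 3412 order yields 00 3C 00 00, matching the examples in the option descriptions.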