This section is intended to help programmers get started using the XML
toolkit from the C language. It is not intended to be exhaustive. I hope the
automatically generated documents will provide the completeness required,
but as a separate set of documents. The interfaces of the XML parser are by
principle low level. Those interested in a higher-level API should look at
DOM.

Usually, the first thing to do is to read an XML input. The parser accepts
documents either from in-memory strings or from files. The functions are
defined in "parser.h":

xmlDocPtr xmlParseMemory(const char *buffer, int size);

Parse a document held in an in-memory buffer of size bytes.

xmlDocPtr xmlParseFile(const char *filename);

Parse an XML document contained in a (possibly compressed)
file.

The parser returns a pointer to the document structure (or NULL in case of
failure).

Invoking the parser: the push method

In order for the application to keep control while the document is being
fetched (which is common for GUI-based programs), libxml2 also provides a
push interface, as of version 1.8.3. The interface functions are
xmlCreatePushParserCtxt() and xmlParseChunk().

The HTML parser embedded into libxml2 also has a push interface; the
functions are just prefixed by "html" rather than "xml".

Invoking the parser: the SAX interface

The tree-building interface makes the parser memory-hungry, first loading
the document in memory and then building the tree itself. Reading a document
without building the tree is possible using the SAX interfaces (see SAX.h and
James Henstridge's documentation). Note also that the push interface can be
limited to SAX: just use the first two arguments of
xmlCreatePushParserCtxt().

The other way to get an XML tree in memory is by building it. Basically
there is a set of functions dedicated to building new elements. (These are
also described in <libxml/tree.h>.) For example, here is a piece of
code that produces the XML document used in the previous examples:

By including "tree.h", your code has access to the internal structure of all
the elements of the tree. The field names are kept simple: parent, children,
next, prev, properties, etc. For example, still with the previous example:

doc->children->children->children

points to the title element,

doc->children->children->next->children->children

points to the text node containing the chapter title "The Linux
adventure".

NOTE: XML allows PIs and comments to be present before the document root,
so doc->children may point to a node which is not the document Root Element;
the function xmlDocGetRootElement() was added for this purpose.

xmlSetProp() sets (or changes) an attribute carried by an ELEMENT node.
The value can be NULL.

xmlChar *xmlGetProp(xmlNodePtr node, const xmlChar *name);

This function returns a pointer to a new copy of the property
content. Note that the user must deallocate the result with xmlFree().

Two functions are provided for reading and writing the text associated
with elements:

xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar *value);

This function takes an "external" string and converts it to one
text node or possibly to a list of entity and text nodes. All
non-predefined entity references like &Gnome; will be stored
internally as entity nodes, hence the result of the function may not be
a single node.

xmlNodeListGetString() is the inverse of xmlStringGetNodeList(). It
generates a new string containing the content of the text and entity nodes.
Note the extra argument inLine: if it is set to 1, the function expands
entity references. For example, instead of returning the &Gnome; XML
encoding in the string, it substitutes its value (say,
"GNU Network Object Model Environment").