mfr.xmltok

This module reads and writes tokens from and to XML documents. Tokens are
defined for each basic structure you can find in XML.

You can read a document as tokens using the powerful tokenize function,
which calls a user-defined function for each token it encounters. You can also
use the generally more convenient XMLForwardRange class to expose a document
as a range of tokens that you can loop over easily.

You can also write a document by putting tokens into an XMLWriter range.

struct CharDataToken;

Token for regular character data in the document.

string data;

Actual text value.

struct CommentToken;

Token for a comment.

string content;

Text content of the comment.

struct PIToken;

Token for a processing instruction.

string target;

Processor target identifier.

string content;

Text content of the processing instruction.

struct CDataSectionToken;

Token for a CDATA section.

string content;

Character data content.

struct AttrToken;

Token for an attribute inside an open element tag.

string name;

Attribute's name.

string value;

Attribute's value.

struct XMLDecl;

Gives the parsed content of an XML declaration.

Note:
The tokenizer doesn't parse the XML declaration. Call readXMLDecl
before calling tokenize.

Scan the start of an XML document for an XML declaration and skip it if
found.

Params:

input

input text for the document

decl

data extracted from the XML declaration, or default values if not found.

Returns: true if an XML declaration is found, false otherwise.
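The scan-and-skip behaviour described above can be illustrated with a minimal sketch. The function skipDecl below is a hypothetical stand-in, not the module's readXMLDecl: it hands back the raw text of the declaration instead of a parsed XMLDecl, and it assumes the declaration, if present, starts at the very first character of the input.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf;

/// Hypothetical stand-in for illustration only (not readXMLDecl).
/// If `input` starts with an XML declaration, stores its raw body in
/// `declBody`, advances `input` past the closing "?>" and returns true.
bool skipDecl(ref string input, out string declBody)
{
    enum prefix = "<?xml";
    if (!input.startsWith(prefix) || input.length == prefix.length)
        return false;

    // "<?xml" must be followed by whitespace; otherwise this is an
    // ordinary processing instruction such as <?xml-stylesheet ...?>.
    immutable c = input[prefix.length];
    if (c != ' ' && c != '\t' && c != '\r' && c != '\n')
        return false;

    immutable end = input.indexOf("?>");
    if (end < 0)
        return false; // unterminated declaration: leave input untouched

    declBody = input[prefix.length .. end];
    input = input[end + 2 .. $];
    return true;
}

unittest
{
    string doc = "<?xml version=\"1.0\"?><root/>";
    string decl;
    assert(skipDecl(doc, decl));
    assert(doc == "<root/>");      // input now starts after the declaration
    assert(decl == " version=\"1.0\"");
}
```

Taking the input by reference lets the caller pass the remaining text straight to a tokenizer afterwards.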

struct DoctypeToken;

Start of a document type declaration. This token is emitted when
encountering a DOCTYPE markup declaration.

string name;

Document type name.

string pubidLiteral;

Public identifier literal.

string systemLiteral;

System identifier literal.

struct DoctypeDoneToken;

End of a document type declaration. This token is emitted when encountering
the final ">" of a DOCTYPE declaration.

Note:
For now, this token will always directly follow a DoctypeToken since
we do not currently support the internal subset. Adding support for the
internal subset in the parser will make other tokens appear between a
DoctypeToken and a DoctypeDoneToken.

struct OpenElementToken;

Indicates that an element of the given name is being opened. Attributes
follow in separate tokens.

struct OpenTagDoneToken;

Empty token indicating that we are done parsing an open tag and its
attributes.
Only used by the callback API.

struct EmptyOpenTagDoneToken;

Empty token indicating that an open tag has been closed with '/>', making it
an empty element. Used as a replacement for OpenTagDoneToken.

struct CloseElementToken;

Indicates that an element of the given name is being closed.
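To make the relationship between the element tokens concrete, here is a minimal sketch that turns a token sequence back into markup. Everything in it is a stand-in written for illustration: the structs only mirror the documented names, the name fields on OpenElementToken and CloseElementToken are assumptions, and MarkupSink is a hypothetical type, not the module's XMLWriter. No escaping is performed.

```d
// Stand-in token types mirroring the documented names. The `name` fields
// on OpenElementToken and CloseElementToken are assumptions.
struct OpenElementToken      { string name; }
struct AttrToken             { string name; string value; }
struct OpenTagDoneToken      {}
struct EmptyOpenTagDoneToken {}
struct CharDataToken         { string data; }
struct CloseElementToken     { string name; }

/// Hypothetical sink (not XMLWriter): one overload per token type.
struct MarkupSink
{
    string text;

    void put(OpenElementToken t)    { text ~= "<" ~ t.name; }
    void put(AttrToken t)           { text ~= " " ~ t.name ~ "=\"" ~ t.value ~ "\""; }
    void put(OpenTagDoneToken)      { text ~= ">"; }
    void put(EmptyOpenTagDoneToken) { text ~= "/>"; }
    void put(CharDataToken t)       { text ~= t.data; }
    void put(CloseElementToken t)   { text ~= "</" ~ t.name ~ ">"; }
}

unittest
{
    MarkupSink sink;
    // Element with one attribute and text content.
    sink.put(OpenElementToken("p"));
    sink.put(AttrToken("class", "note"));
    sink.put(OpenTagDoneToken());
    sink.put(CharDataToken("hi"));
    sink.put(CloseElementToken("p"));
    // Empty element: EmptyOpenTagDoneToken replaces OpenTagDoneToken
    // and no CloseElementToken follows.
    sink.put(OpenElementToken("br"));
    sink.put(EmptyOpenTagDoneToken());
    assert(sink.text == "<p class=\"note\">hi</p><br/>");
}
```

The sketch shows the ordering described above: an open token, then zero or more attribute tokens, then one of the two "done" tokens.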

enum ParsingState;

Parsing state flag allowing the tokenizer to stop and resume from where
it left off.

TAGS

Searching for tags.

ATTRS

Searching for attributes inside a tag.

IN_DOCTYPE

Searching for the internal subset inside a DOCTYPE declaration.

void tokenize(alias output)(string input);

bool tokenize(alias output, alias state)(ref string input);

Tokenizes the input string by calling output for each token encountered.
Stops when reaching the end of input or when output returns true.

Params:

output

alias to a callable object or overloaded function or template
function to call after each token.

state

alias to a ParsingState variable for holding the state of the
parser when tokenize returns before the input's end.

input

reference to string input which will contain the remaining text
after parsing.

Returns: true if there is still content to parse (was stopped by a callback)
or false if the end of input was reached.

Throws: on any tokenizer-level well-formedness error.

Note:
The tokenizer is not a full XML parser in the sense that it cannot
check all well-formedness constraints of an XML document.
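The stop-and-resume contract can be sketched with a toy tokenizer. The code below is not this module's implementation: miniTokenize and stopAtClose are hypothetical names, the token structs are stand-ins with assumed name fields, and the grammar is reduced to bare open tags, close tags and text. Because the toy grammar has no attributes, the state alias is left out. The callback is a template function, one of the callable forms that output accepts.

```d
import std.string : indexOf;

// Stand-in token types; the `name` fields are assumptions.
struct OpenElementToken  { string name; }
struct CloseElementToken { string name; }
struct CharDataToken     { string data; }

/// Toy tokenizer for illustration: handles only <name>, </name> and text.
/// Returns true if a callback asked to stop, false at the end of input.
/// On return, `input` holds the text that has not been consumed yet.
bool miniTokenize(alias output)(ref string input)
{
    while (input.length)
    {
        bool stop;
        if (input[0] == '<')
        {
            immutable end = input.indexOf('>');
            if (end < 0)
                throw new Exception("unterminated tag");
            string tag = input[1 .. end];
            input = input[end + 1 .. $];
            if (tag.length && tag[0] == '/')
                stop = output(CloseElementToken(tag[1 .. $]));
            else
                stop = output(OpenElementToken(tag));
        }
        else
        {
            ptrdiff_t end = input.indexOf('<');
            if (end < 0)
                end = input.length;
            stop = output(CharDataToken(input[0 .. end]));
            input = input[end .. $];
        }
        if (stop)
            return true;
    }
    return false;
}

string[] seen; // log filled by the callback below

/// Template callback: records each token and stops after a close tag.
bool stopAtClose(T)(T tok)
{
    static if (is(T == OpenElementToken))       seen ~= "open:" ~ tok.name;
    else static if (is(T == CloseElementToken)) seen ~= "close:" ~ tok.name;
    else                                        seen ~= "text:" ~ tok.data;
    return is(T == CloseElementToken);
}

unittest
{
    seen = null;
    string doc = "<a>hi</a>tail";
    assert(miniTokenize!stopAtClose(doc));  // stopped after </a>
    assert(doc == "tail");                  // remaining text is kept
    assert(!miniTokenize!stopAtClose(doc)); // resumed, reached the end
    assert(seen == ["open:a", "text:hi", "close:a", "text:tail"]);
}
```

Passing the input by reference is what makes resuming possible: the caller simply calls the tokenizer again with the same variable.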