The tokenize() function is a helper function simplifying the use of a lexer in a standalone fashion. For instance, you may have a standalone lexer where all of the functional requirements are implemented inside lexer semantic actions. A good example of this is the word_count_lexer described in more detail in the section Lex Quickstart 2 - A better word counter using Spirit.Lex.

Tokenizing the given input while discarding all generated tokens is a common application of a lexer. For this reason Spirit.Lex exposes an API function tokenize() that minimizes the required code:

    // Read input from the given file
    std::string str(read_from_file(1 == argc ? "word_count.input" : argv[1]));
    word_count_tokens<lexer_type> word_count_lexer;
    std::string::iterator first = str.begin();

    // Tokenize all the input, while discarding all generated tokens
    bool r = tokenize(first, str.end(), word_count_lexer);

This code is completely equivalent to the more verbose version shown in the section Lex Quickstart 2 - A better word counter using Spirit.Lex. The function tokenize() returns either when the end of the input has been reached (in this case the return value is true), or when the lexer could not match any of the token definitions in the input (in this case the return value is false and the iterator first points to the first character of the input that could not be matched). This overload takes the following parameters (a usage sketch follows the list):

Iterator first

The beginning of the input sequence to tokenize. The value of this
iterator is updated by the lexer; after the function returns it points
to the first character of the input that was not matched.

Iterator last

The end of the input sequence to tokenize.

Lexer const& lex

The lexer instance to use for tokenization.

Lexer::char_type const* initial_state

This optional parameter can be used to specify the initial lexer
state for tokenization.
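
The following is a minimal sketch (not part of the library documentation) showing this overload together with its optional initial_state parameter. The token definitions (word_count_tokens, lexer_type) are assumed to be those from the Quickstart 2 example, and "INITIAL" is assumed to be the lexer's default start state, so passing it here is purely illustrative:

    // Sketch: invoke tokenize() with an explicit initial state and report
    // where lexical analysis stopped, using the updated 'first' iterator
    std::string str(read_from_file(1 == argc ? "word_count.input" : argv[1]));
    word_count_tokens<lexer_type> word_count_lexer;

    std::string::iterator first = str.begin();
    if (!tokenize(first, str.end(), word_count_lexer, "INITIAL")) {
        std::cout << "Lexical analysis failed\n"
                  << "stopped at: \"" << std::string(first, str.end()) << "\"\n";
    }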

A second overload of the tokenize() function allows an arbitrary function or function object to be specified, which is called for each of the generated tokens. For some applications this is very useful, as it may avoid the need for lexer semantic actions. For an example of how to use this overload, please have a look at word_count_functor.cpp:

The main function simply loads the given file into memory (as a std::string), instantiates an instance of the token definition template using the correct iterator type (word_count_tokens<char const*>), and finally calls lex::tokenize(), passing an instance of the counter function object. The return value of lex::tokenize() will be true if the whole input sequence has been tokenized successfully, and false otherwise.

    int main(int argc, char* argv[])
    {
        // these variables are used to count characters, words and lines
        std::size_t c = 0, w = 0, l = 0;

        // read input from the given file
        std::string str(read_from_file(1 == argc ? "word_count.input" : argv[1]));

        // create the token definition instance needed to invoke the lexical analyzer
        word_count_tokens<lex::lexertl::lexer<> > word_count_functor;

        // tokenize the given string, the bound functor gets invoked for each of
        // the matched tokens
        char const* first = str.c_str();
        char const* last = &first[str.size()];

        bool r = lex::tokenize(first, last, word_count_functor,
            boost::bind(counter(), _1, boost::ref(c), boost::ref(w), boost::ref(l)));

        // print results
        if (r) {
            std::cout << "lines: " << l << ", words: " << w
                      << ", characters: " << c << "\n";
        }
        else {
            std::string rest(first, last);
            std::cout << "Lexical analysis failed\n"
                      << "stopped at: \"" << rest << "\"\n";
        }
        return 0;
    }
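
For reference, a counter function object matching the call above could look like the following sketch, modeled on the one in word_count_functor.cpp; the token ids ID_WORD, ID_EOL, and ID_CHAR are assumed to be those defined by the word_count_tokens token definitions:

    // Sketch of the counter function object (modeled on word_count_functor.cpp)
    struct counter
    {
        typedef bool result_type;   // result type expected by boost::bind

        template <typename Token>
        bool operator()(Token const& t, std::size_t& c, std::size_t& w,
            std::size_t& l) const
        {
            switch (t.id()) {
            case ID_WORD:           // matched a word
                ++w; c += t.value().size();
                break;
            case ID_EOL:            // matched a newline
                ++l; ++c;
                break;
            case ID_CHAR:           // matched any other character
                ++c;
                break;
            }
            return true;            // return false to stop tokenization early
        }
    };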

This overload of tokenize() takes the following parameters:

Iterator first

The beginning of the input sequence to tokenize. The value of this
iterator is updated by the lexer; after the function returns it points
to the first character of the input that was not matched.

Iterator last

The end of the input sequence to tokenize.

Lexer const& lex

The lexer instance to use for tokenization.

F f

A function or function object to be called for each matched token.
This function is expected to have the prototype: bool f(Lexer::token_type);.
The tokenize() function will return immediately if f returns false
(see the sketch after this list).

Lexer::char_type const* initial_state

This optional parameter can be used to specify the initial lexer
state for tokenization.
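
To illustrate the early-return behavior, here is a minimal sketch of a callback that stops tokenization after a given number of tokens; the max_tokens functor is hypothetical and not part of Spirit.Lex:

    // Hypothetical callback: stop after a fixed number of tokens by
    // returning false, which makes tokenize() return immediately
    struct max_tokens
    {
        explicit max_tokens(std::size_t limit) : left(limit) {}

        template <typename Token>
        bool operator()(Token const&)
        {
            if (0 == left)
                return false;   // tokenize() returns immediately
            --left;
            return true;        // continue tokenizing
        }

        std::size_t left;
    };

    // usage: process at most 100 tokens of the input
    // bool r = lex::tokenize(first, last, word_count_functor, max_tokens(100));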