22.7.1.4 Defining Tokens

SMIE comes with a predefined lexical analyzer which uses syntax tables
in the following way: any sequence of characters that have word or
symbol syntax is considered a token, and so is any sequence of
characters that have punctuation syntax. This default lexer is
often a good starting point but is rarely actually correct for any given
language. For example, it will consider "2,+3" to be composed
of 3 tokens: "2", ",+", and "3".
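
The behavior described above can be tried out with a short sketch.
It assumes a language whose syntax table gives both `,' and `+'
punctuation syntax (the standard syntax table treats `+' as a symbol
constituent, so the sketch sets that up explicitly), and it calls
`smie-default-forward-token', the function from smie.el that
implements the default forward lexer.

```elisp
(require 'smie)

(with-temp-buffer
  (let ((table (make-syntax-table)))
    ;; Assumption for this sketch: `,' and `+' are punctuation.
    (modify-syntax-entry ?, "." table)
    (modify-syntax-entry ?+ "." table)
    (set-syntax-table table)
    (insert "2,+3")
    (goto-char (point-min))
    ;; Each call skips whitespace and comments, then returns the
    ;; next run of word/symbol or punctuation characters.
    (list (smie-default-forward-token)
          (smie-default-forward-token)
          (smie-default-forward-token))))
;; => ("2" ",+" "3")
```

The adjacent punctuation characters are merged into the single token
",+", which is rarely what a real grammar wants.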

To describe the lexing rules of your language to SMIE, you need
two functions: one to fetch the next token, and another to fetch the
previous token. Each function will usually first skip whitespace and
comments and then look at the adjacent chunk of text to see if it
is a special token. If so, the function should move point past the
token and return a description of it. Usually this is simply the string
extracted from the buffer, but it can be anything you want.
For example:
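
What follows is a minimal sketch of such a pair of lexers, not a
definitive implementation. The `sample-' names and the list of
operators are hypothetical and chosen only for illustration; a real
mode would substitute its own tokens.

```elisp
;; Hypothetical operator set for an imaginary `sample' language.
(defvar sample-keywords-regexp
  (regexp-opt '("+" "*" "," ";" ">" ">=" "<" "<=" ":=" "=")))

(defun sample-smie-forward-token ()
  "Skip forward over the next token and return it as a string."
  ;; Skip whitespace and comments first.
  (forward-comment (point-max))
  (cond
   ;; A known operator: move past it and return its text.
   ((looking-at sample-keywords-regexp)
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; Otherwise return the run of word/symbol characters.  This is
   ;; the empty string when point is in front of a parenthesis.
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-forward "w_")
              (point))))))

(defun sample-smie-backward-token ()
  "Skip backward over the previous token and return it as a string."
  ;; Skip whitespace and comments backward.
  (forward-comment (- (point)))
  (cond
   ;; The longest operator above has two characters, so looking
   ;; back two characters is enough; GREEDY prefers ">=" over "=".
   ((looking-back sample-keywords-regexp
                  (max (point-min) (- (point) 2)) t)
    (goto-char (match-beginning 0))
    (match-string-no-properties 0))
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-backward "w_")
              (point))))))
```

A major mode would typically hand these two functions to
`smie-setup' through its `:forward-token' and `:backward-token'
keyword arguments.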

Notice how those lexers return the empty string when in front of
parentheses. This is because SMIE automatically takes care of the
parentheses defined in the syntax table. More specifically, if the lexer
returns nil or an empty string, SMIE tries to handle the corresponding
text as a sexp according to syntax tables.
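
As a small illustration of that fallback, the following sketch uses
the default lexer in a temporary buffer with the standard syntax
table. It only mimics the idea with `forward-sexp'; it does not
reproduce SMIE's internal code.

```elisp
(require 'smie)

(with-temp-buffer
  (insert "(a + b) c")
  (goto-char (point-min))
  (let* ((token (smie-default-forward-token))
         (after-token (point)))
    ;; The lexer returned "" without moving, so the parenthesized
    ;; group is skipped as a sexp using the syntax table instead.
    (forward-sexp 1)
    (list token after-token (point))))
;; => ("" 1 8)
```

The empty token leaves point where it was, and the whole
parenthesized group is then crossed in one step.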