9.8Backtracking

The regexp consists of two subregexps: a* followed by
a. The subregexp a* cannot be allowed to match
all four a’s in the text string aaaa, even though
* is a greedy quantifier. It may match only the first
three, leaving the last one for the second subregexp. This ensures
that the full regexp matches successfully.

The regexp matcher accomplishes this via a process called
backtracking. The matcher tentatively allows the greedy
quantifier to match all four a’s, but then when it becomes
clear that the overall match is in jeopardy, it backtracks to a
less greedy match of three a’s. If even this fails, as in
the call

the matcher backtracks even further. Overall failure is conceded
only when all possible backtracking has been tried with no success.

Backtracking is not restricted to greedy quantifiers.
Nongreedy quantifiers match as few instances as
possible, and progressively backtrack to more and more
instances in order to attain an overall match. There
is backtracking in alternation too, as the more
rightward alternates are tried when locally successful
leftward ones fail to yield an overall match.

Sometimes it is efficient to disable backtracking. For example, we
may wish to commit to a choice, or we know that trying alternatives is
fruitless. A nonbacktracking regexp is enclosed in
(?>...).

In this call, the subregexp ?>a+ greedily matches all four
a’s, and is denied the opportunity to backtrack. So, the
overall match is denied. The effect of the regexp is therefore to
match one or more a’s followed by something that is
definitely non-a.