The stated goal of this module is to be a drop-in replacement for re.
My hope is that some will be able to go to the top of their module and put:

try:
    import re2 as re
except ImportError:
    import re

That being said, there are features of the re module that this module may
never have. For example, RE2 does not handle lookahead assertions ((?=...)).
For this reason, the module will automatically fall back to the original re module
if there is a regex that it cannot handle.

However, there are times when you may want to be notified of a fallback. For this reason,
I’m adding the single function set_fallback_notification to the module.
It accepts one of three values: re.FALLBACK_QUIETLY (the default),
re.FALLBACK_WARNING (which issues a warning), and
re.FALLBACK_EXCEPTION (which raises an exception).
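
As a minimal sketch of how the notification setting might be wired into the import idiom shown earlier: the hasattr guard is an addition for illustration, on the assumption that a different package named re2 could be installed that lacks this function.

```python
try:
    import re2 as re
except ImportError:
    # re2 is not installed: use the standard library module unchanged.
    import re
else:
    # Ask for a warning whenever a pattern falls back to the original re.
    # The guard is an assumption for safety, in case some other package
    # named re2 (without this function) is the one that got imported.
    if hasattr(re, "set_fallback_notification"):
        re.set_fallback_notification(re.FALLBACK_WARNING)

# Ordinary patterns behave the same whichever module was imported.
match = re.search(r"a+", "caaat")
```

With re.FALLBACK_WARNING set, a pattern that RE2 cannot handle, such as one containing (?=...), should still match through the original re module, but a warning is emitted first.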

Note: The re2 module treats byte strings as UTF-8. This is fully backwards compatible with 7-bit ASCII.
However, bytes containing values larger than 0x7f are going to be treated very differently in re2 than in re.
The RE2 library quietly ignores invalid UTF-8 in input strings, and throws an exception on invalid UTF-8 in patterns.
For example, the bytes \x80\x81\x82 are three separate one-byte characters to re, but they are not valid UTF-8.
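
The difference can be illustrated without re2 itself. The sketch below uses only the standard library (Python 3 syntax) to show that re matches each high byte individually, while the same bytes fail to decode as UTF-8. It does not call re2, so it makes no claim about re2's exact output.

```python
import re

data = b"\x80\x81\x82"

# The standard re module matches byte by byte, so every byte is a "character".
per_byte = re.findall(rb".", data)

# The same bytes are not a valid UTF-8 sequence, which is why a library that
# interprets byte strings as UTF-8 cannot treat them the same way.
try:
    data.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# 7-bit ASCII decodes identically under both encodings.
ascii_same = b"plain ascii".decode("utf-8") == b"plain ascii".decode("ascii")
```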

One current issue is Unicode support. As you may know, RE2 supports UTF-8,
which is certainly distinct from unicode. Right now the module will automatically
encode any unicode string into UTF-8 for you, which is slow (it also has to
decode UTF-8 strings back into unicode objects on every substitution or split).
Therefore, you are better off using bytestrings in UTF-8 while working with RE2
and decoding them back to unicode only after everything you need done is finished.
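
A minimal sketch of that workflow, written in Python 3 syntax: encode once up front, do all the regex work on bytes, and decode once at the end. The sample text and patterns are made up for illustration.

```python
try:
    import re2 as re
except ImportError:
    import re

text = "naïve café, naïve idea"

# Encode once, before any regex work.
raw = text.encode("utf-8")

# Patterns and replacements are bytes too, so no per-call conversion is needed.
pattern = re.compile("naïve".encode("utf-8"))
raw = pattern.sub(b"simple", raw)
parts = re.split(rb",\s*", raw)

# Decode once, after everything else is finished.
result = [part.decode("utf-8") for part in parts]
```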

Performance is of course the point of this module, so it had better perform well.
Regular expressions vary widely in complexity, and the salient feature of RE2 is
that it behaves well asymptotically. That being said, for very simple substitutions,
I’ve found that occasionally Python’s regular re module is actually slightly faster.
However, when the re module gets slow, it gets really slow, while this module
buzzes along.
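
To give a feel for how the re module "gets really slow", here is a small sketch using the standard library only. The nested-quantifier pattern is a textbook backtracking case chosen for illustration; it is not taken from this module's benchmarks, and the input sizes are kept small so the script finishes quickly.

```python
import re
import timeit

# Nested quantifiers force a backtracking matcher to try a rapidly growing
# number of ways to split the run of "a"s before it can report failure.
PATTERN = re.compile(r"(a+)+$")

def time_failed_match(n, repeat=3):
    """Best-of-`repeat` time in seconds for one failed match on n 'a's plus 'b'."""
    subject = "a" * n + "b"
    return min(timeit.repeat(lambda: PATTERN.match(subject), number=1, repeat=repeat))

timings = {n: time_failed_match(n) for n in (12, 14, 16, 18)}
for n, seconds in timings.items():
    print(f"n={n:2d}  {seconds:.6f}s")
```

On a backtracking engine the time typically grows several-fold with each step in n, whereas an automaton-based engine such as RE2 is designed to stay linear in the input length.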

In the example below, I’m running the tests against 8MB of text from the colossal Wikipedia
XML file. I’m running each of them multiple times, being careful to use the timeit module.
For more details, please see the performance script.