If not in "multiline mode", must not match
any of the newline sequences.

If in "multiline mode", must match all of the
newline sequences, and \u000D\u000A (CRLF) should match as if it were a
single character. (The recommendation that CRLF match as a single character
is, however, not required for conformance to RL1.6.)

to the single bullet:

Where the 'arbitrary character pattern' matches a
newline sequence, it must match all of the newline sequences, and \u000D\u000A
(CRLF) should match as if it were a single character. (The
recommendation that CRLF match as a single character is, however, not
required for conformance to RL1.6.)
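
As a rough illustration of the single-bullet wording, the following Python sketch builds an "arbitrary character pattern" that treats CRLF as one unit. It assumes the newline sequences are LF, VT, FF, CR, NEL, LS, PS, and CRLF; the names NEWLINE, ANY, and split_units are hypothetical and chosen only for this example.

```python
import re

# Assumed newline sequences: CRLF first, then the single-code-point forms.
NEWLINE = re.compile(r"\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")

# An "arbitrary character" pattern: CRLF is tried first so that it is
# consumed as one unit; otherwise any single code point is consumed.
ANY = re.compile(r"\r\n|[\s\S]")

def split_units(text):
    """Split text into the units that ANY would consume one at a time."""
    return ANY.findall(text)

def count_newlines(text):
    """Count newline sequences, treating CRLF as a single sequence."""
    return len(NEWLINE.findall(text))
```

Listing CRLF before the character class matters: with the alternatives reversed, CR would match on its own and CRLF would be consumed as two units.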

Describe that one of the most effective ways to implement canonical equivalents
is by having a special mode that makes all matches be done on grapheme cluster
boundaries, since it avoids the reordering problems that can happen in
normalization.

[a-m \q{ch} \q{rr}] should behave like (?> ch | rr | [a-m]) as interpreted
in Perl-like regex engines: matching ch or rr and advancing by two code
points, or matching a-m and advancing by one code point, or failing to
match.
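
The three outcomes described above can be sketched as a small hand-written matcher. This is only an illustration; the function name match_set_at and its parameters are hypothetical and not part of any regex engine.

```python
def match_set_at(text, pos, strings=("ch", "rr"), lo="a", hi="m"):
    """Return the index just past the match at pos, or None on failure."""
    # The multi-code-point strings are tried first, as in (?> ch | rr | [a-m]).
    for s in strings:
        if text.startswith(s, pos):
            return pos + len(s)   # matched ch or rr: advance by two code points
    if pos < len(text) and lo <= text[pos] <= hi:
        return pos + 1            # matched a-m: advance by one code point
    return None                   # failed to match
```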

Note that "(?> ch | rr | [a-m])heese" will match "chheese" but not
"cheese"; that is, the c in [a-m] will not match once the "ch" has already
matched, because the atomic group does not backtrack into [a-m].
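
The difference can be observed in Python. The sketch below emulates the atomic group with a lookahead plus backreference, (?=(...))\1, which works in Python's re module because lookaheads are not backtracked into. Python 3.11 and later also accept (?>...) directly. The constant names are hypothetical.

```python
import re

# (?=(ch|rr|[a-m]))\1 emulates the atomic group (?>ch|rr|[a-m]):
# the lookahead commits to the first alternative that matches, and
# \1 then consumes exactly that text.
ATOMIC = re.compile(r"(?=(ch|rr|[a-m]))\1heese")

# An ordinary non-capturing group, which is free to backtrack into [a-m].
BACKTRACKING = re.compile(r"(?:ch|rr|[a-m])heese")

atomic_chheese = ATOMIC.fullmatch("chheese") is not None       # True
atomic_cheese = ATOMIC.fullmatch("cheese") is not None         # False
plain_cheese = BACKTRACKING.fullmatch("cheese") is not None    # True
```

With the backtracking group, "cheese" matches because the engine retries with c from [a-m] after ch fails to be followed by "heese".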

Matching a complemented set containing strings like \q{ch} may behave
differently in different modes: the normal mode where code points are the
unit of matching, or a mode where grapheme clusters are the unit of
matching. That is, [^ a-z \q{ch} \q{rr}] should behave like:
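
For the code point mode, one plausible reading is a negative lookahead over the members of the set followed by a single code point. The Python sketch below rests on that assumption; the names COMPLEMENT and match_complement are hypothetical. A grapheme cluster mode would need a grapheme cluster matcher such as \X, which Python's standard re module does not provide.

```python
import re

# Assumed code point mode reading of [^ a-z \q{ch} \q{rr}]:
# fail if the input at this position starts with ch, rr, or a-z;
# otherwise consume exactly one code point.
COMPLEMENT = re.compile(r"(?!ch|rr|[a-z])[\s\S]")

def match_complement(text, pos=0):
    """Return the index just past the match at pos, or None on failure."""
    m = COMPLEMENT.match(text, pos)
    return m.end() if m else None
```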

When interpreting a complex character set containing strings like \q{ch}
plus embedded complement operations, it works best to interpret the set as
if the complement were "pushed up" to the top of the expression, using the
following rewrites recursively:

Original      Rewrite
^x || y       ^(x -- y)
^x && y       y -- x
^x -- ^y      y -- x
x || ^y       ^(y -- x)
x && ^y       x -- y
^x || ^y      ^(x && y)
^x && ^y      ^(x || y)
^x -- y       ^(x || y)
^^x           x
x -- ^y       x && y

Applying these rewrites either eliminates the complement operations
completely or leaves a single remaining complement operation at the top
level. Logically, the rest of the expression is then a flat list of
characters and/or multicharacter strings, and matching strings can then be
handled as in #1 or #2 above.
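
The recursive rewriting can be sketched in Python as follows. The tuple encoding of expressions and the names REWRITES, push_up, evaluate, and has_not are hypothetical and chosen only for this illustration; evaluate uses a small finite universe purely to check that a rewrite preserves meaning.

```python
# Expressions are tuples: ("set", frozenset), ("not", e),
# ("or", a, b), ("and", a, b), ("diff", a, b).

# (op, x complemented?, y complemented?) -> (result complemented?, new op, swap operands?)
REWRITES = {
    ("or",   False, False): (False, "or",   False),
    ("or",   True,  False): (True,  "diff", False),  # ^x || y  -> ^(x -- y)
    ("or",   False, True):  (True,  "diff", True),   # x || ^y  -> ^(y -- x)
    ("or",   True,  True):  (True,  "and",  False),  # ^x || ^y -> ^(x && y)
    ("and",  False, False): (False, "and",  False),
    ("and",  True,  False): (False, "diff", True),   # ^x && y  -> y -- x
    ("and",  False, True):  (False, "diff", False),  # x && ^y  -> x -- y
    ("and",  True,  True):  (True,  "or",   False),  # ^x && ^y -> ^(x || y)
    ("diff", False, False): (False, "diff", False),
    ("diff", True,  False): (True,  "or",   False),  # ^x -- y  -> ^(x || y)
    ("diff", False, True):  (False, "and",  False),  # x -- ^y  -> x && y
    ("diff", True,  True):  (False, "diff", True),   # ^x -- ^y -> y -- x
}

def push_up(expr):
    """Return (complemented, body), where body contains no "not" nodes."""
    op = expr[0]
    if op == "set":
        return False, expr
    if op == "not":
        neg, body = push_up(expr[1])
        return (not neg), body                        # ^^x -> x
    (nx, x), (ny, y) = push_up(expr[1]), push_up(expr[2])
    neg, new_op, swap = REWRITES[(op, nx, ny)]
    return neg, ((new_op, y, x) if swap else (new_op, x, y))

def evaluate(expr, universe):
    """Evaluate an expression against a finite universe (for checking only)."""
    op = expr[0]
    if op == "set":
        return expr[1]
    if op == "not":
        return universe - evaluate(expr[1], universe)
    a, b = evaluate(expr[1], universe), evaluate(expr[2], universe)
    return {"or": a | b, "and": a & b, "diff": a - b}[op]

def has_not(expr):
    """Report whether any "not" node remains in the expression."""
    if expr[0] == "set":
        return False
    return expr[0] == "not" or any(has_not(e) for e in expr[1:])

x = ("set", frozenset({"a", "ch"}))
y = ("set", frozenset({"b", "ch"}))
negated, body = push_up(("or", ("not", x), y))
# negated is True and body is ("diff", x, y), that is, ^(x -- y)
```

The returned flag plays the role of the single complement left at the top level, and the body is then a complement-free combination of characters and strings.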