Miscellaneous Constructs in Regular Expressions

Regular expressions in the .NET Framework include three miscellaneous language constructs. One lets you enable or disable particular matching options in the middle of a regular expression pattern. The remaining two let you include comments in a regular expression.

To set or disable specific matching options for part of a regular expression, use the (?imnsx-imnsx) construct. List the options you want to enable after the question mark, and the options you want to disable after the minus sign. The following table describes each option. For more information about each option, see Regular Expression Options.

Option   Description
i        Case-insensitive matching.
m        Multiline mode.
n        Explicit captures only. (Parentheses do not act as capturing groups.)
s        Single-line mode.
x        Ignore unescaped white space, and allow x-mode comments.

Any change in regular expression options defined by the (?imnsx-imnsx) construct remains in effect until the end of the enclosing group.
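The scoping rule can be sketched in C# as follows. The class name, the two patterns, and the sample inputs are hypothetical and chosen only for illustration. In the first pattern, (?i) sits inside a group, so case insensitivity should end at that group's closing parenthesis. In the second, (?-i) turns the option off explicitly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineOptionScope
{
    // (?i) is set inside the group, so it applies to "a" only.
    public const string GroupScoped = @"^((?i)a)b$";

    // (?i) enables case insensitivity; (?-i) disables it again before "b".
    public const string Toggled = @"^(?i)a(?-i)b$";

    public static void Main()
    {
        foreach (string pattern in new[] { GroupScoped, Toggled })
        {
            foreach (string input in new[] { "ab", "Ab", "AB" })
            {
                // Report whether each sample input matches the pattern.
                Console.WriteLine("{0,-16} {1}: {2}", pattern, input,
                                  Regex.IsMatch(input, pattern));
            }
        }
    }
}
```

With both patterns, "ab" and "Ab" match, while "AB" does not, because the "b" is always compared case-sensitively.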

The following example uses the i, n, and x options in the middle of a regular expression to enable case insensitivity and explicit captures, and to ignore white space in the pattern.

The example defines two regular expressions. The first, \b(D\w+)\s(d\w+)\b, matches two consecutive words, the first beginning with an uppercase "D" and the second with a lowercase "d". The second regular expression, \b(D\w+)(?ixn) \s (d\w+) \b, uses inline options to modify this pattern, as described in the following table. A comparison of the results confirms the effect of the (?ixn) construct.

Pattern   Description
\b        Start at a word boundary.
(D\w+)    Match a capital "D" followed by one or more word characters. This is the first capturing group.
(?ixn)    From this point on, make comparisons case-insensitive, make only explicit captures, and ignore white space in the regular expression pattern.
\s        Match a white-space character.
(d\w+)    Match an uppercase or lowercase "d" followed by one or more word characters. This group is not captured because the n (explicit capture) option was enabled.
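The comparison described above can be sketched in C# roughly as follows. The two patterns come from the text. The class name, the Describe helper, and the sample input string are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InlineOptionsExample
{
    public const string Plain = @"\b(D\w+)\s(d\w+)\b";
    public const string WithOptions = @"\b(D\w+)(?ixn) \s (d\w+) \b";

    // Returns one line per match, followed by one indented line per captured group.
    public static List<string> Describe(string input, string pattern)
    {
        var lines = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern))
        {
            lines.Add(match.Value);
            // Group 0 is the whole match, so start at group 1.
            for (int i = 1; i < match.Groups.Count; i++)
                lines.Add("   Group " + i + ": " + match.Groups[i].Value);
        }
        return lines;
    }

    public static void Main()
    {
        // Sample input chosen for illustration only.
        string input = "double dare double Double a Drooling dog The Dreaded Deep";

        foreach (string pattern in new[] { Plain, WithOptions })
        {
            Console.WriteLine(pattern);
            foreach (string line in Describe(input, pattern))
                Console.WriteLine(line);
            Console.WriteLine();
        }
    }
}
```

With this input, the first pattern should match only "Drooling dog" and report two groups. The second should also match "Dreaded Deep", because the "d" is now compared case-insensitively, and should report only Group 1 for each match, because the n option stops the second pair of parentheses from capturing.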

The (?# comment) construct lets you include an inline comment in a regular expression. The regular expression engine does not use any part of the comment in pattern matching, although the comment is included in the string that is returned by the Regex.ToString method. The comment ends at the first closing parenthesis.

The following example adapts the regular expression pattern from the example in the previous section. It adds two inline comments to the regular expression to indicate whether the comparison is case-sensitive. The regular expression pattern, \b((?# case-sensitive comparison)D\w+)\s(?ixn)((?#case-insensitive comparison)d\w+)\b, is defined as follows.

Pattern                           Description
\b                                Start at a word boundary.
(?# case-sensitive comparison)    A comment. It does not affect pattern-matching behavior.
(D\w+)                            Match a capital "D" followed by one or more word characters. This is the first capturing group.
\s                                Match a white-space character.
(?ixn)                            From this point on, make comparisons case-insensitive, make only explicit captures, and ignore white space in the regular expression pattern.
(?#case-insensitive comparison)   A comment. It does not affect pattern-matching behavior.
(d\w+)                            Match an uppercase or lowercase "d" followed by one or more word characters. This group is not captured because the n (explicit capture) option is enabled.
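A minimal C# sketch of the inline comment construct follows. It assumes the pattern contains every element listed in the table above, including (?ixn). The class name and the sample input are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineCommentExample
{
    // Pattern assembled from the table rows; the (?#...) parts are comments.
    public const string Pattern =
        @"\b((?# case-sensitive comparison)D\w+)\s(?ixn)((?#case-insensitive comparison)d\w+)\b";

    public static void Main()
    {
        var rgx = new Regex(Pattern);

        // ToString returns the pattern text, comments included.
        Console.WriteLine("Pattern: " + rgx.ToString());

        // Sample input chosen for illustration only.
        string input = "double dare double Double a Drooling dog The Dreaded Deep";
        foreach (Match match in rgx.Matches(input))
        {
            // Groups.Count includes group 0 (the whole match).
            Console.WriteLine(match.Value + " (captured groups: " +
                              (match.Groups.Count - 1) + ")");
        }
    }
}
```

The comments should have no effect on matching: with this input the sketch is expected to find "Drooling dog" and "Dreaded Deep", each with one captured group, while the string returned by ToString still contains both comments.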

A number sign (#) marks an x-mode comment, which starts at the unescaped # character and continues until the end of the line. To use this construct, you must either enable the x option (through inline options) or supply the RegexOptions.IgnorePatternWhitespace value to the options parameter when instantiating the Regex object or calling a static Regex method.
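As a rough illustration of both ways to enable x-mode comments, the following C# sketch uses a hypothetical multi-line pattern (three digits, a hyphen, four digits). The pattern, class name, and input are assumptions made for this sketch and are not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndOfLineCommentExample
{
    // With the x option on, pattern white space (including line breaks) is
    // ignored, and each unescaped # starts a comment that runs to the end
    // of its line.
    public const string Pattern = @"
        \b \d{3}   # first part: three digits
        -          # a literal hyphen
        \d{4} \b   # second part: four digits
    ";

    public static void Main()
    {
        string input = "Call 555-0123 today.";

        // x option supplied through the options parameter.
        Match viaOptions = Regex.Match(input, Pattern,
                                       RegexOptions.IgnorePatternWhitespace);
        Console.WriteLine(viaOptions.Success ? viaOptions.Value : "no match");

        // Same pattern with the x option enabled inline instead.
        Match viaInline = Regex.Match(input, "(?x)" + Pattern);
        Console.WriteLine(viaInline.Success ? viaInline.Value : "no match");

        // Without the x option, the white space and # text are literals,
        // so the pattern no longer matches this input.
        Console.WriteLine(Regex.IsMatch(input, Pattern));
    }
}
```

Both of the first two calls should find "555-0123". The last call should report False, which shows that the comments are only treated as comments when the x option is in effect.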

The following example illustrates the end-of-line comment construct. It determines whether a string is a composite format string that includes at least one format item. The following table describes the constructs in the regular expression pattern: