Regular Expressions with Windows PowerShell

Windows PowerShell is a programming language from Microsoft that is primarily designed for system administration. Since PowerShell is built on top of the .NET framework, .NET's excellent regular expression support is also available to PowerShell programmers.

PowerShell -match and -replace Operators

With the -match operator, you can quickly check if a regular expression matches part of a string. E.g. 'test' -match '\w' returns $true, because \w matches the t in test.

As a side effect, a successful -match sets a special variable called $matches. This is an associative array (a hashtable) that holds the overall regex match and all capturing group matches. $matches[0] gives you the overall regex match, $matches[1] the first capturing group, and $matches['name'] the text matched by the named group "name". When the match fails, $matches is left unchanged, so it may still hold the results of an earlier match.
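A minimal sketch of reading $matches after a successful match. The date string and the group name "month" are made up for illustration.

```powershell
# Sample input and group name are illustrative only.
$found = '2024-05-17' -match '(\d{4})-(?<month>\d{2})'

$found              # True
$matches[0]         # 2024-05  (overall regex match)
$matches[1]         # 2024     (first capturing group)
$matches['month']   # 05       (named group "month")
```

Checking the result of -match before reading $matches avoids picking up values from an earlier match.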

The -replace operator uses a regular expression to search-and-replace through a string. E.g. 'test' -replace '\w', '$&$&' returns 'tteesstt'. The regex \w matches one word character. The replacement text re-inserts the regex match twice using $&. The regex and replacement must be separated by a comma. The replacement text is optional: if you omit it, or pass an empty string, the regex matches are replaced with nothing.
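The following short sketch shows both uses of -replace described above. The second input string is made up for illustration.

```powershell
# Double every word character by re-inserting the match twice with $&.
$doubled = 'test' -replace '\w', '$&$&'
$doubled    # tteesstt

# Delete all digits by replacing each match with an empty string.
$noDigits = 'a1b2c3' -replace '\d', ''
$noDigits   # abc
```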

Traditionally, regular expressions are case sensitive by default. This is true for the .NET framework too. However, it is not true in PowerShell. -match and -replace are case insensitive, as are -imatch and -ireplace. For case sensitive matching, use -cmatch and -creplace. I recommend that you always use the "i" or "c" prefix to avoid confusion regarding case sensitivity.
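A quick comparison of the three spellings, using made-up sample strings:

```powershell
$default     = 'TEST' -match  'test'   # True:  -match ignores case
$insensitive = 'TEST' -imatch 'test'   # True:  explicitly case insensitive
$sensitive   = 'TEST' -cmatch 'test'   # False: explicitly case sensitive

$replacedAll   = 'Test' -ireplace 't', 'x'   # xesx: both T and t are replaced
$replacedLower = 'Test' -creplace 't', 'x'   # Tesx: only the lowercase t is replaced
```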

The operators do not provide a way to pass options from .NET's RegexOptions enumeration. Instead, use mode modifiers in the regular expression. E.g. (?m)^test$ is the same as using ^test$ with RegexOptions.Multiline passed to the Regex() constructor. Mode modifiers take precedence over options set externally to the regex. -cmatch '(?i)test' is case insensitive, while -imatch '(?-i)test' is case sensitive. The mode modifier overrides the case sensitivity preference of the operator.
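A small sketch of mode modifiers at work. The three-line sample string is made up for illustration; `n inside a double-quoted string is PowerShell's escape for a line feed.

```powershell
$text = "first`ntest`nlast"   # three lines separated by line feeds

$plain = $text -match '^test$'       # False: ^ and $ only match at the ends of the string
$multi = $text -match '(?m)^test$'   # True:  (?m) lets ^ and $ match at line breaks

$forcedInsensitive = 'TEST' -cmatch '(?i)test'    # True:  (?i) overrides -cmatch
$forcedSensitive   = 'TEST' -imatch '(?-i)test'   # False: (?-i) overrides -imatch
```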

Replacement Text as a Literal String

The -replace operator supports the same replacement text placeholders as the Regex.Replace() function in .NET. $& is the overall regex match, $1 is the text matched by the first capturing group, and ${name} is the text matched by the named group "name".

But with PowerShell, there's an extra caveat: double-quoted strings use the dollar syntax for variable interpolation. Variable interpolation is done before the Regex.Replace() function (which -replace uses internally) parses the replacement text. Unlike Perl, $1 is not a magical variable in PowerShell. That syntax only works in the replacement text. The -replace operator does not set the $matches variable either. The effect is that 'test' -replace '(\w)(\w)', "$2$1" (double-quoted replacement) returns the empty string (assuming you did not set the variables $1 and $2 in preceding PowerShell code). Due to variable interpolation, the Replace() function never sees $2$1. To allow the Replace() function to substitute its placeholders, use 'test' -replace '(\w)(\w)', '$2$1' (single-quoted replacement) or 'test' -replace '(\w)(\w)', "`$2`$1" (dollars escaped with backticks) to make sure $2$1 is passed literally to Regex.Replace().
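The three quoting styles side by side, as a sketch. It assumes that no variables named $1 or $2 have been set and that strict mode is off. The name example at the end is made up for illustration.

```powershell
# Assumes $1 and $2 are not defined as PowerShell variables and strict mode is off.
$single  = 'test' -replace '(\w)(\w)', '$2$1'      # etts: Replace() sees $2$1
$double  = 'test' -replace '(\w)(\w)', "$2$1"      # empty: PowerShell expanded $2 and $1 first
$escaped = 'test' -replace '(\w)(\w)', "`$2`$1"    # etts: backticks keep the dollars literal

# Named groups use the ${name} placeholder, again in single quotes.
$swapped = 'John Smith' -replace '(?<first>\w+) (?<last>\w+)', '${last}, ${first}'
$swapped   # Smith, John
```

Single quotes are the simplest choice unless the replacement text also needs to include the value of a PowerShell variable.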

Using the System.Text.RegularExpressions.Regex Class

To use all of .NET's regex processing functionality with PowerShell, create a regular expression object by instantiating the System.Text.RegularExpressions.Regex class. PowerShell provides a handy shortcut if you want to use the Regex() constructor that takes a string with your regular expression as the only parameter. $regex = [regex] '\W+' compiles the regular expression \W+ (which matches one or more non-word characters) and stores the result in the variable $regex. You can now call all the methods of the Regex class on your $regex object.
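A brief sketch of a few Regex methods called on such an object. The sample strings are made up for illustration.

```powershell
$regex = [regex] '\W+'   # one or more non-word characters

$hasSeparator = $regex.IsMatch('hello, world')         # True
$firstMatch   = $regex.Match('hello, world').Value     # ', ' (comma followed by a space)
$joined       = $regex.Replace('hello, world', '_')    # hello_world
$count        = $regex.Matches('a-b_c d').Count        # 2: '-' and ' ' match, '_' is a word character
```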

Splitting a string is another common task. With the regex object we just created, we can call $regex.Split('this is a test') to get an array of all the words in the string. PowerShell 2.0 and later also provide a built-in -split operator that accepts a regular expression: 'this is a test' -split '\W+' returns the same array.

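If you want to use another constructor, you have to resort to PowerShell's New-Object cmdlet. To set the flag RegexOptions.Multiline, for example, you'd need this line of code:

$regex = New-Object System.Text.RegularExpressions.Regex ('^test$', [System.Text.RegularExpressions.RegexOptions]::Multiline)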

Using mode modifiers inside the regex is a much shorter and more readable solution, though:

$regex = [regex] '(?m)^test$'
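As a rough check that the two approaches behave the same, the sketch below builds the regex both ways and tests each against a made-up three-line string. The variable names are illustrative.

```powershell
$text = "first`ntest`nlast"   # sample input with line feeds between lines

# Option passed to the constructor via New-Object.
$viaOption = New-Object System.Text.RegularExpressions.Regex ('^test$', [System.Text.RegularExpressions.RegexOptions]::Multiline)

# Same option expressed as a mode modifier with the [regex] shortcut.
$viaModifier = [regex] '(?m)^test$'

# Without either, the anchors only match at the ends of the whole string.
$noOption = [regex] '^test$'

$viaOption.IsMatch($text)     # True
$viaModifier.IsMatch($text)   # True
$noOption.IsMatch($text)      # False
```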
