Hone your PowerShell text manipulation skills

If you are interested in task automation, then learning how to use the *-Content cmdlets for effective PowerShell text manipulation will help with advanced infrastructure management efforts.

Part of your automation work involves modifying text files. That used to mean just flat text, such as files you’d create with the Notepad application. These days, the concept of text includes CSV, HTML, JSON, XML and even Markdown files. PowerShell works with all those file types, and YAML is on the horizon for a future PowerShell Core release, but this tutorial focuses on flat text files.

There are a number of cmdlets available for PowerShell text manipulation: Add-Content, Clear-Content, Get-Content and Set-Content. Also, the Out-File cmdlet can create a text file or write to one. You need to be aware of the differences between the Windows PowerShell and PowerShell Core versions, especially with encoding, to manage your files across platforms and across applications.
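As a rough illustration of how Out-File and Clear-Content fit alongside the other cmdlets, the sketch below writes two lines, reads them back and then empties the file. The file name outfile-demo.txt and its location in the temp folder are assumptions made for this example only.

```powershell
# Hypothetical demo file in the temp folder; change the path to suit.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'outfile-demo.txt'

# Out-File creates the file, or overwrites it if it already exists.
'first line' | Out-File -FilePath $path

# -Append adds to the end of the file instead of overwriting it.
'second line' | Out-File -FilePath $path -Append

# Get-Content returns one string per line.
$before = Get-Content -Path $path

# Clear-Content removes the contents but leaves the file in place.
Clear-Content -Path $path
$stillExists = Test-Path -Path $path
$sizeAfter = (Get-Item -Path $path).Length

"Lines before: $($before.Count); file still exists: $stillExists; bytes after clearing: $sizeAfter"
```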

Creating text files in PowerShell

You can use either Add-Content or Set-Content to create a text file, but both require content. Start by importing the text after the $txt text string variable in the screenshot.

The code uses Join-String, which is a feature introduced in PowerShell Core v6.2 preview 3, to create a server name and write it to the file. You could use the following code as an alternative that works in any PowerShell version:
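A minimal sketch of one such alternative follows. It assumes a hypothetical file named servers.txt in the temp folder and made-up server names, and it builds each name with the -f format operator rather than Join-String, so it should run in Windows PowerShell as well as PowerShell Core.

```powershell
# Hypothetical file path and server-name pattern; adjust for your environment.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'servers.txt'

# Set-Content creates the file (or overwrites it) with the first line.
Set-Content -Path $path -Value 'W16DC01'

# Build each name with the -f format operator instead of Join-String,
# then append it to the file with Add-Content.
1..3 | ForEach-Object {
    $name = 'W16SRV{0:D2}' -f $_
    Add-Content -Path $path -Value $name
}

# Read the file back to confirm what was written.
$lines = Get-Content -Path $path
$lines
```

Set-Content replaces whatever the file held before, while Add-Content only appends, which is why the first line goes through Set-Content and the loop uses Add-Content.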

Use care when using different PowerShell versions

The Add-Content, Set-Content and Get-Content cmdlets have an -Encoding parameter that controls how PowerShell reads or writes the file. If you only use a single version of PowerShell and the files you create will only be read by PowerShell, then you don’t need to worry about this.

If you use multiple versions of PowerShell, if your PowerShell files will be read by another application — possibly on a different OS — or if you want to use PowerShell to read files created by other applications, then you may need to be aware of encoding.

Just to add another wrinkle to the encoding story, the default encoding changed in PowerShell Core 6.0.

Windows PowerShell uses a mixture of encodings, including ASCII and UTF-16, which may lead to issues when you try to read the files you create. For the most part, PowerShell tends to figure out the encoding for files, but other applications may not be so forgiving.

PowerShell Core 6.0 standardized on UTF-8 without a byte order mark (UTF8NoBOM) as the default encoding. The following cmdlets use UTF8NoBOM in PowerShell Core 6.0 and later: Add-Content, Export-Clixml, Export-Csv, Export-PSSession, Format-Hex, Get-Content, Import-Csv, Out-File, Select-String, Send-MailMessage and Set-Content.

New-ModuleManifest was moved to the UTF8NoBOM standard in PowerShell Core 6.1.

The PowerShell team recommends explicitly stating the encoding with the -Encoding parameter.
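The sketch below shows what stating the encoding explicitly might look like. It writes and reads a hypothetical file, encoding-demo.txt in the temp folder, and then inspects the first three bytes to see whether a UTF-8 byte order mark was written. The file name and server names are assumptions for illustration.

```powershell
# Hypothetical demo file in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.txt'

# State the encoding explicitly on every write and read.
Set-Content -Path $path -Value 'W16DC01' -Encoding UTF8
Add-Content -Path $path -Value 'W16SRV01' -Encoding UTF8
$servers = Get-Content -Path $path -Encoding UTF8

# Read the raw bytes through .NET, which behaves the same in every
# PowerShell version, and look for the UTF-8 byte order mark (EF BB BF).
$bytes = [System.IO.File]::ReadAllBytes($path)
$hasBom = $bytes.Length -ge 3 -and
          $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF

"Lines read: $($servers.Count); UTF-8 BOM present: $hasBom"
```

The same UTF8 value gives different results by version: Windows PowerShell 5.1 writes a byte order mark, while PowerShell Core 6.0 and later writes none. PowerShell Core also accepts utf8BOM and utf8NoBOM when you need to pin the choice down.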

Working with the Get-Content cmdlet parameters

Creating and modifying files is useful, but at some stage, you will need to read the contents of a file. The file with the server names is a good example of a file you create for use in your automation efforts.

The Get-Content cmdlet reads files. A typical scenario is to read the file and perform some action on each server:
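A minimal sketch of that scenario follows. The server names, the servers.txt file in the temp folder and the placeholder action (building a small report object) are all assumptions; in practice you would swap in whatever command you need to run against each server.

```powershell
# Hypothetical server list written to the temp folder for the demo.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'servers.txt'
Set-Content -Path $path -Value 'W16DC01', 'W16SRV01', 'W16SRV02'

# Get-Content emits one string per line, so each server name
# travels down the pipeline as its own object.
$report = Get-Content -Path $path | ForEach-Object {
    # Placeholder action; replace with a real per-server command.
    [pscustomobject]@{
        Server     = $_
        NameLength = $_.Length
    }
}

$report
```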

The PowerShell pipeline unravels arrays and treats each item as a separate object. If the file contents were read as a single block of text, then you’d need to perform additional processing to separate the lines of text. If you don’t want the whole file, you can use the TotalCount — aliased as Head and First — parameter to read the first n lines of the file:
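As a sketch, again using a hypothetical servers.txt in the temp folder with made-up names, reading only the first two lines might look like this:

```powershell
# Hypothetical server list written to the temp folder for the demo.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'servers.txt'
Set-Content -Path $path -Value 'W16DC01', 'W16SRV01', 'W16SRV02', 'W16SRV03'

# -TotalCount stops reading after the requested number of lines.
$firstTwo = Get-Content -Path $path -TotalCount 2

# -First and -Head are aliases for the same parameter.
$viaFirst = Get-Content -Path $path -First 2
$viaHead  = Get-Content -Path $path -Head 2

$firstTwo
```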

For large files, you may need to use the ReadCount parameter to control the number of lines sent through the pipeline at one time. The Raw parameter ignores newline characters and returns the entire contents of the file as a single string:
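The sketch below contrasts the two parameters on a small, hypothetical five-line file (lines.txt in the temp folder). A real use of ReadCount would involve a much larger file; the small one here just makes the batching visible.

```powershell
# Hypothetical five-line demo file in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'lines.txt'
Set-Content -Path $path -Value 'one', 'two', 'three', 'four', 'five'

# -ReadCount 2 sends the lines down the pipeline in batches of up to
# two, so each $_ here is a batch rather than a single line.
$batchSizes = Get-Content -Path $path -ReadCount 2 |
    ForEach-Object { $_.Count }

# -Raw returns the whole file, newlines included, as one string.
$raw = Get-Content -Path $path -Raw

"Batch sizes: $($batchSizes -join ', ')"
"Raw result type: $($raw.GetType().Name)"
```

Because -Raw keeps the newline characters inside the string, you would need to split it yourself if you later want individual lines.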