Pester is a fantastic tool to test and monitor the status of your infrastructure. There are several libraries which leverage this directly (e.g. SQLChecks, DBAChecks, Operation Validation Framework), and a growing set of resources discussing how to leverage Pester in this fashion (including a Pluralsight course and a chapter in the Pester book). If you don’t already have some tests you can run interactively (is my environment correct right now?) I’d suggest you start there first.

Once you’ve invested time into building out a library of Pester tests for your infrastructure, what you really want to do is analyse the Pester test results. There are various examples out there that discuss how to persist results to files, XML, or SQL databases - but none of these options have the advantages that shipping to Log Analytics provides - which is what we’ll discuss today. A few reasons why I think sending your results to Log Analytics is the superior choice:

Fast querying over both current and historical results

Easy sharing of results, queries, and dashboards

The ability to alert on failures (e.g. with Azure Monitor)

Pester Result Schema

While we could log an absolutely minimal object to Log Analytics, I’ve found that adding a little more structure is helpful for both debugging and analysing test results.

We’ll be building the result objects in PowerShell, and then sending the objects to be stored in a table in Log Analytics. The below table shows the property names, as well as their column name in the table in Log Analytics. Columns are suffixed with their data type when created in Log Analytics, shown below as their LA Name (Log Analytics Name).

We’re going to map the value of InvocationStartTime to the built-in field TimeGenerated. If no field is supplied, TimeGenerated defaults to ingestion time.

The great thing about the Data Collector API is that these fields are all optional, and so if you don’t want to use the full schema you don’t have to (perhaps your tests won’t use context, or you won’t care about host/target).

Some columns that deserve a little more explanation are BatchId and InvocationId.

Batches and Invocations

Most infrastructure tests I run tend to come in a format that looks something like the following (pseudocode):
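A sketch of that structure - the paths and parameter names here are illustrative, not from a specific project:

```powershell
# Pseudocode: one Invoke-Pester call per configuration file/target
foreach ($configFile in Get-ChildItem -Path ".\config" -Filter "*.json") {
    Invoke-Pester -Script @{
        Path       = ".\tests"
        Parameters = @{ ConfigFile = $configFile.FullName }
    }
}
```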

The execution of the whole script would be a Batch. Every call to Invoke-Pester is a separate Invocation, which can have zero or more test results (technically an invocation has 0..N Describes, each of which has 0..N Contexts, each of which has 0..N Tests).

Being able to look at batches & invocations will let you detect issues like:

Incomplete batches (a hard error knocked it out halfway)

Overall runtime vs. Invocation runtime (vs. test runtime)

Rather than collecting your results and posting them in one go, I would encourage you to post them after every Invoke-Pester call. There are times when your automation will fail, and having incomplete results will assist in telling you how far your batch got before failing (vs. having no results if you wait until the end to try and post them).

With batches and invocations our pseudocode now looks something like this (I’ve also included an example call to post data to Log Analytics):
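A sketch under the same assumptions as before; Send-LogAnalyticsData is a placeholder for whatever Data Collector API wrapper you use (a function that signs and POSTs the JSON payload to your workspace):

```powershell
# One BatchId per script execution, one InvocationId per Invoke-Pester call
$batchId = [guid]::NewGuid()

foreach ($configFile in Get-ChildItem -Path ".\config" -Filter "*.json") {
    $invocationId = [guid]::NewGuid()
    $invocationStartTime = (Get-Date).ToUniversalTime()

    $results = Invoke-Pester -Script @{
        Path       = ".\tests"
        Parameters = @{ ConfigFile = $configFile.FullName }
    } -PassThru

    # Post after every invocation - a later hard failure still leaves partial results
    Send-LogAnalyticsData -LogType "PesterResult" -Data $results `
        -BatchId $batchId -InvocationId $invocationId `
        -InvocationStartTime $invocationStartTime
}
```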

Working with the Pester result object

The object returned from Invoke-Pester needs a bit of work to transform it into the schema we outlined above. The $results object contains a property TestResult, which is an array of result objects (one object for every test executed).

Each result object contains information about the Describe, Context, and Test, as well as the result (pass/fail) and timing information. We use the TestResult to build our array of PesterResult objects to send to Log Analytics:
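A minimal sketch of that transformation, using the Pester v4 property names:

```powershell
# -PassThru makes Invoke-Pester return the result object instead of only writing to host
$results = Invoke-Pester -PassThru

$pesterResults = foreach ($testResult in $results.TestResult) {
    [PSCustomObject]@{
        Describe = $testResult.Describe
        Context  = $testResult.Context
        Name     = $testResult.Name
        Result   = $testResult.Result   # "Passed" / "Failed" / "Skipped"
    }
}
```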

The above example extracts only the data available from the Pester test results (fields like BatchId and Target are missing). Note that this code is still perfectly valid, and can be used to quickly get started logging results.

A more complex example

A more complete example is shown below. It is taken from SQLChecks, which iterates over configuration files and performs one call to Invoke-Pester (wrapped by Invoke-SQLChecks) per file being tested - in this case each file represents an instance of SQL Server.
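A sketch of that pattern - note that Read-SqlChecksConfig, the config properties, and Post-LogAnalyticsData are stand-ins here; check the SQLChecks repository for the exact helper names and signatures:

```powershell
$batchId = [guid]::NewGuid()

foreach ($configFile in Get-ChildItem -Path ".\config" -Filter "*.json") {
    $config = Read-SqlChecksConfig -Path $configFile.FullName
    $invocationId = [guid]::NewGuid()
    $invocationStartTime = (Get-Date).ToUniversalTime()

    # One invocation per config file; each file targets one SQL Server instance
    $results = Invoke-SqlChecks -Config $config -PassThru

    $pesterResults = foreach ($testResult in $results.TestResult) {
        [PSCustomObject]@{
            BatchId             = $batchId
            InvocationId        = $invocationId
            InvocationStartTime = $invocationStartTime
            Host                = $env:COMPUTERNAME
            Target              = $config.ServerInstance
            Describe            = $testResult.Describe
            Context             = $testResult.Context
            Name                = $testResult.Name
            Result              = $testResult.Result
        }
    }

    # Placeholder for a Data Collector API wrapper
    Post-LogAnalyticsData -LogType "PesterResult" -Data $pesterResults
}
```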

A more complex example, simplified

Because this is a lot of code to repeat everywhere you deploy SQLChecks, it has been wrapped into a function - Invoke-SqlChecksToLogAnalytics - which means you can reduce the above example to the following:
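Something like the following - the exact parameter names of Invoke-SqlChecksToLogAnalytics may differ, so consult the SQLChecks repository:

```powershell
foreach ($configFile in Get-ChildItem -Path ".\config" -Filter "*.json") {
    Invoke-SqlChecksToLogAnalytics -ConfigPath $configFile.FullName
}
```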

Querying results

Once you have some results in Log Analytics you can start to query them. You can get to the query interface via the Azure Portal; once there, we can view the most recent results with:

PesterResult_CL
| order by TimeGenerated desc
| take 100

Showing recent batches

As there might be many kinds of Pester tests being shipped, we’ll typically want to focus on a specific set - we’ll use SQLChecks as an example again. The following code will find the most recent batch from the last 7 days, and show all results:
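A sketch of that query; the column names assume the schema above, with the Log Analytics type suffixes (_g for GUIDs, _s for strings):

```kusto
// Find the BatchId of the most recent result in the last 7 days,
// then show every result from that batch
let lastBatch = toscalar(
    PesterResult_CL
    | where TimeGenerated > ago(7d)
    | top 1 by TimeGenerated desc
    | project BatchId_g);
PesterResult_CL
| where BatchId_g == lastBatch
| order by TimeGenerated desc
```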

While the most recent batch is a pretty common requirement, you may have different batch sizes (people running ad-hoc tests during the day are one example of smaller batches). One method I’ve used to find the most recent complete batch is to look for batches that contain more than N results - I know my typical SQL checks have 900 tests, so the below query lets me filter out any small ad-hoc or incomplete batches:
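One way to express that - the 800 threshold is illustrative; anything comfortably below the expected 900 and above your ad-hoc batch sizes works:

```kusto
// Most recent batch in the last 7 days with more than 800 results,
// then show just the failures from that batch
let lastFullBatch = toscalar(
    PesterResult_CL
    | where TimeGenerated > ago(7d)
    | summarize Tests = count(), Latest = max(TimeGenerated) by BatchId_g
    | where Tests > 800
    | top 1 by Latest desc
    | project BatchId_g);
PesterResult_CL
| where BatchId_g == lastFullBatch
| where Result_s == "Failed"
```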

In this specific example the result means that the trace flags configured on the server differ from the expected trace flags by a count of one.

Showing batch aggregates

To look at the overall stats (tests, passed, failed) we can group by any set of columns - in the below example we’re grouping by Target and Describe, and then ordering by the number of failed tests. This lets us quickly see which tests have failed and against what target.
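A sketch of that aggregate query (column names again assume the schema above):

```kusto
PesterResult_CL
| where TimeGenerated > ago(1d)
| summarize Tests  = count(),
            Passed = countif(Result_s == "Passed"),
            Failed = countif(Result_s == "Failed")
    by Target_s, Describe_s
| order by Failed desc
```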

In this case it looks like we have some data file space issues in addition to the trace flag problem.

Showing test or machine history

We might want to look at how a single test is performing over the estate. The below query shows the status of the Data file space used Describe by percent success (0% = all tests failed, 100% = all tests passed), split by target. Note we multiply the count by 1.0 to turn it into a float, rather than an integer (which would floor our result to always 0 or 1).
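A sketch of that query - the Describe name and the daily binning interval are illustrative:

```kusto
PesterResult_CL
| where Describe_s == "Data file space used"
// Multiply by 1.0 so the division is floating point, not integer
| summarize PercentSuccess = countif(Result_s == "Passed") * 1.0 / count()
    by Target_s, bin(TimeGenerated, 1d)
| render timechart
```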

The below graph shows an example of a few targets which have never had a failure (yay!), one target which was partially failing (note the Y axis starts at 0.5) for a long time and recently was fixed, and another which partially failed and then recovered.

Finding the longest-running Describe block

If you’re looking to performance tune your infrastructure tests, you’ll want to know where the time is being spent. This final example shows how you can find which one of the describe blocks is taking the longest time to run. The example uses the most recent batch and plots the time taken in milliseconds for each describe block.
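A sketch, assuming each result was logged with a TimeTakenMs column (a hypothetical field holding Pester's per-test Time property converted to milliseconds):

```kusto
// Total time per Describe block for the most recent batch
let lastBatch = toscalar(
    PesterResult_CL
    | where TimeGenerated > ago(7d)
    | top 1 by TimeGenerated desc
    | project BatchId_g);
PesterResult_CL
| where BatchId_g == lastBatch
| summarize TotalMs = sum(TimeTakenMs_d) by Describe_s
| order by TotalMs desc
| render barchart
```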

In this example checking for Duplicate Indexes dominates at almost 140 seconds.

Summary

Well done if you made it this far - if you’re starting from scratch with your infrastructure testing there are a lot of steps needed to get here. The good news is that once you’ve gone through all this setup for your first set of tests, onboarding and analysing the results from subsequent tests is very easy.

By having all your Pester results stored in Log Analytics you can quickly inspect the health of your estate, both now and historically. You can also share access to those test results directly (giving people the ability to write their own queries over your results), or create and share dashboards (perhaps leveraging the ability of Power BI to query Log Analytics). Other options for leveraging your results in Log Analytics include creating alerts with Azure Monitor (alert on Pester failures), or scheduling periodic reports with Flow (a daily summary of Pester results).

In the near future I’ll be showing how you can use Pester to perform data validation checks too - the results will, of course, be shipped to Log Analytics for easy querying/monitoring/alerting.