SSIS: Capturing PerfMon Counters During Package Execution

While troubleshooting SSIS package performance, having the ability to view the performance monitor counters can be handy for determining where bottlenecks may be occurring. Usually, however, this is a very reactive request. The package is slow, so perfmon counters are enabled. The package is then run and *hopefully* the issue reoccurs. A lesser known feature of the SSIS catalog from 2012 and up is the dm_execution_performance_counters function, which tracks these perfmon counters for you during package execution. This provides the ability to begin proactive monitoring of SSIS performance monitor counters, checking important perfmon metrics such as Buffers Spooled prior to being notified by management or users. It also provides an alternate way to troubleshoot and understand SSIS bottlenecks without needing to remote to the server and configure a separate perfmon trace. This blog post serves to walk through the dm_execution_performance_counters function, what it tracks, and how it can be utilized effectively within an environment.

Per BOL, the function returns statistics for executions while the package is running. This means that the information is tracked during the run, but will not return any results for an execution that has completed (failure or otherwise). The function accepts one parameter, which is the execution id of the package. All packages that are currently running can be tracked by passing in NULL. The function tracks the following counters (taken straight from BOL):

BLOB bytes read

The number of bytes of binary large object (BLOB) data that the data flow engine has read from all sources.

BLOB bytes written

The number of bytes of BLOB data that the data flow engine has written to all destinations.

BLOB files in use

The number of BLOB files that the data flow engine currently is using for spooling.

Buffer memory

The amount of memory that is in use. This may include both physical and virtual memory. When this number is larger than the amount of physical memory, the Buffers Spooled count rises as an indication that memory swapping is increasing. Increased memory swapping slows performance of the data flow engine.

Buffers in use

The number of buffer objects, of all types, that all data flow components and the data flow engine is currently using.

Buffers spooled

The number of buffers currently written to the disk. If the data flow engine runs low on physical memory, buffers not currently used are written to disk and then reloaded when needed.

Flat buffer memory

The total amount of memory, in bytes, that all flat buffers use. Flat buffers are blocks of memory that a component uses to store data. A flat buffer is a large block of bytes that is accessed byte by byte.

Flat buffers in use

The number of flat buffers that the Data flow engine uses. All flat buffers are private buffers.

Private buffer memory

The total amount of memory in use by all private buffers. A buffer is not private if the data flow engine creates it to support data flow. A private buffer is a buffer that a transformation uses for temporary work only. For example, the Aggregation transformation uses private buffers to do its work.

Private buffers in use

The number of buffers that transformations use.

Rows read

The number of rows that a source produces. The number does not include rows read from reference tables by the Lookup transformation.

Rows written

The number of rows offered to a destination. The number does not reflect rows written to the destination data store.

Out of the above, the more important metrics are the Buffer Memory and the Buffers Spooled counters. Both can be used to determine the amount of memory pressure on the machine during a package run. Sidestepping for a minute, an alternative use for this would be to monitor the rows read and rows written counters in on a QA server for testing prior to production deployment.

The function returns a table, with 3 columns: the execution_id, the Counter name, and the counter value. It also accepts a single parameter, which is the execution id of the SSIS run. Below is a sample:

Notice that the execution_id parameter is nullable. By passing in a NULL, all of the currently running executions are reported. Leveraging the knowledge above, it's possible to run this function under a couple of different scenarios. The first option would be to run the SSIS package through an execute sql script, and embed the @execution_id parameter into the transaction. This would be very handy in a case where you may want to tune or track a specific package, like I have done below:

Alternatively, you could also leverage the function to track all packages that are run by running the function on a schedule and passing in a null value. By creating a table to store the result in, it is possible to get a snapshot of all the executions that were running at a given time on the server. Below is a sample implementation script to demonstrate.

While you would not want to keep the information above for very long, reviewing it on a retrospective basis allows a quick determination of packages that may be experiencing issues during the overnight timeframe when multiple packages may be running concurrently. When attempting to performance tune long running SSIS packages, this function can save significant time and effort.