Auditing Changes in ETL with SSIS

A key component of the extract, transformation, and load (ETL) process is the cleanup, or transformation, of data while in transition from one place to another. In most cases there are restrictions - regulatory, company policy or otherwise - that require any data that is modified in the ETL pipeline to be audited as such. Even if you are not bound by auditing requirements, it is a wise practice to build an audit mechanism for any data you update or exclude as part of the ETL process. Years down the road if you or your successor is quizzed as to the reasoning certain transformations, an invaluable tool in answering those questions will be an auditing file or table. In addition, I have found that users will occasionally request access to the unmodified data as it was originally extracted from the source system, and an audit table can be useful in providing that information as well.

In this article, I'm going to demonstrate how you can include a simple auditing mechanism in your SQL Server Integration Services (SSIS). We're going to extract some data from a source table and check one of the columns against a lookup table; if no match is found, we will add a default value to the output and write that row to an audit table to indicate that we updated the column as part of the ETL process.

To get started, let's create our source table. The following code will create and populate the source table, and will create the empty destination table.

Now that the data is in place, we can create the SSIS package to perform the data manipulation. Here I've created an SSIS package with a single Data Flow Task in the Control Flow pane, and in that data flow I am attaching an OleDB Source to the OldCustomerDemographics table we just created (Figure 1).

Figure 1

Now I bring over a Lookup Transformation and add it to the data flow. This transformation will attempt to match the CustSalesRegion column from our OldCustomerDemographics table with the RegionCode column in the NewSalesRegion lookup table (Figure 2). Those rows that are matched successfully will pass through, and rows from the former that are not matched in the latter will be sent to the Error Output. In our case, we will configure the Error Output to redirect rows (Figure 3).

Figure 2

Figure 3

We'll come back and revisit the rows that are matched, but for now let's explore the rows that do not match an entry in the lookup table. In our example we will update the unmatched rows so that they belong to the NATIONAL sales region. To properly audit this activity, we will send the rows updated with the old values as well as the new values to the audit table we created above. To do this, I've added a Derived Column transform to specify the new sales region value, along with a Multicast transform so we can send the data to both the audit table and back to the primary output table (Figure 4).

Figure 4

In the OleDB Destination for the audit table, I'm wired up to the new audit table (OldCustomerDemographics_AUDIT) we created above. You can see that I've sent the new sales region value to the audit column we created (Figure 5).

Figure 5

Now that takes care of our auditing, but we still need to UNION the updated rows with those that were matched in the Lookup Transformation. Using the UNION ALL transformation, I have brought these two data streams back into one. In Figure 6, you will see that I am using the NewSalesRegion value from the Derived Column for those rows that we have opted to update.

Figure 6

Lastly, we'll send the output from the UNION ALL transform to an OleDb Destination, which is wired to the NewCustomerDemographics table we created in an earlier step. You can see in Figure 7 that there were 7 rows sent through the audit/update pipeline in SSIS. If you open SSMS and query the OldCustomerDemographics_AUDIT table as shown in Figure 8, you will see the 7 updated rows along with both old and new sales region values.

Figure 7

Figure 8

In this brief example, we can see an effective methodology for auditing data updates that occur during your ETL processing in SQL Server Integration Services.

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands
of articles and SQL scripts, a library of free eBooks, a weekly database news roundup,
a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals
that makes it such a success.