Entries in Integration Services

Overview: This is a high-level review of the terminology for configurable items, like parameters and variables, in SQL Server Integration Services 2012. This discussion is applicable to the Project Deployment Model only.

Following is a high-level flowchart of how values can be passed to parameters, variables, and connection managers in SSIS 2012. The left side represents the SQL Server Data Tools environment in Visual Studio (i.e., during development, before the project has been deployed); the right side represents the SSIS Catalog in the Management Studio environment (i.e., after deployment).

The remainder of this blog entry will discuss individual components of the above flowchart.

SSIS Project & Package Parameters

Project parameters are new with SSIS 2012. A project parameter can be shared among all of the packages in an SSIS project. You want to use a project parameter when, at run-time, the value is the same for all packages.

Package parameters are also new with SSIS 2012. A package parameter is exactly the same as a project parameter – except that the scope of a package parameter is the individual package it resides in. You want to use a package parameter when, at run-time, the value is different for each package.

Note that project parameters do *not* have an expressions property to define their value. They are intended to hold a literal value which does not change while the package executes.

Notice in the image at the top of the page that project parameters can pass a value to variables. Parameters can also pass values to all kinds of objects in SSIS – basically any property that allows an expression.
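For instance, a connection manager's ConnectionString property can be driven by project parameters via an expression. A sketch (the parameter names ServerName and DatabaseName are made up for illustration):

```
"Data Source=" + @[$Project::ServerName]
+ ";Initial Catalog=" + @[$Project::DatabaseName]
+ ";Provider=SQLNCLI11.1;Integrated Security=SSPI;"
```

The @[$Project::…] syntax is how a project parameter is referenced inside an SSIS expression; a package parameter would be referenced as @[$Package::…].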

You can think of parameters as the replacement for the package configurations which were used in earlier versions of SSIS.

SSIS Variables

There’s actually not much new with *package* variables in SSIS 2012 (other than that you can now move them, which is great). What can get confusing is that sometimes the environment variables are referred to simply as variables – so you need to be aware of the context in which variables are being discussed. (Environment variables are discussed in the next section below.)

Within a package, SSIS variables have an expression property. The expression property, and the ability to change values during the execution of a package if needed, are two fundamental ways variables differ from parameters. A variable can be scoped to the package or an object within the package (there’s no such thing as a project variable, though).

Variables often consume values from parameters within their expressions (an example of this is in the next screen shot). Variables can also communicate values to objects like Execute SQL tasks, For Each containers, Send Mail tasks, and so on.
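As a small sketch of a variable consuming a parameter, the expression property of a hypothetical variable @[User::SourceFilePath] might be defined as (RootFolder is an illustrative project parameter; \\ is the escaped backslash in the SSIS expression language):

```
@[$Project::RootFolder] + "\\Accounts.csv"
```

The variable can then feed an Execute SQL task, For Each container, Send Mail task, and so on, while the underlying folder is still controlled by the parameter at run-time.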

SSIS Environment Variables

SSIS environment variables are new with SSIS 2012. You actually interact with these in Management Studio after the project has been deployed to the SSIS Catalog. Don’t confuse these with Windows environment variables – although named the same, SSIS environment variables are different from Windows environment variables. Also don’t confuse these with “regular” variables used within SSIS packages (which are discussed in the previous section above).

An environment variable provides the flexibility to configure values for parameters and connection managers in Management Studio which are different from what was originally specified when the package was deployed. This is great functionality for the administrator of the SSIS and ETL processes.

An environment and its variables are set up in Management Studio under Integration Services Catalogs. Once set up under the Environment Properties, these variables can be associated with projects and/or packages. Since their purpose is to override parameters or connection managers, I suggest giving each one a name similar to the value it’s intended to replace – but with a prefix (such as EV) that makes it clear where the value is coming from.
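Although the setup described here happens in the Management Studio UI, the same thing can be scripted against the SSIS Catalog with T-SQL. A sketch, with made-up folder, environment, and variable names (the EV prefix follows the naming suggestion above):

```sql
-- Create an environment in the SSIS Catalog, then add an SSIS environment variable to it.
EXEC SSISDB.catalog.create_environment
     @folder_name      = N'ETL',
     @environment_name = N'Production';

EXEC SSISDB.catalog.create_environment_variable
     @folder_name      = N'ETL',
     @environment_name = N'Production',
     @variable_name    = N'EV_SourceServer',
     @data_type        = N'String',
     @sensitive        = 0,
     @value            = N'PRODSQL01';
```

Scripting the environments is handy when the same set of variables needs to be recreated across Development, QA, and Production catalogs.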

If you wish to override the value for a project parameter with an SSIS environment variable, you do this on the “Configure” menu for the project.

Overriding the value for a package parameter with an SSIS environment variable is very similar – it’s just done on the Package’s “Configure” menu instead.

Note that SSIS environment variables can specifically provide values for parameters and connection managers. SSIS environment variables do not interact directly with the variables contained inside of SSIS packages.

SSIS Project Configurations

With the new project deployment model, the concept of configurations is mostly gone. There is one exception, however. While in SQL Server Data Tools (Visual Studio), you can specify if any parameters are dependent upon a particular deployment configuration being selected.

This reference to “deployment configurations” is not the configurations you might be thinking of from previous versions of SSIS – rather, these are the deployment configurations available in the project properties. This deployment Configuration Manager has been available to manage different deployment scenarios for a long while now. The piece that is new is the ability to associate parameters with these configurations.

This functionality is only available in the Visual Studio development environment, and only applies to project & package parameters.

Part 3: Loading a Model Where Some Attributes Are Maintained Directly in MDS

Sample Model

This builds upon the model built in Part 2 (so please review that blog entry if you’ve not done so already).

Conceptually the Account Model looks like this:

The Account Entity in MDS looks like this:

In Part 2 we assumed that all of the attribute values come from a source system. However, in Part 3 we are changing that up a little. We are going to say that the Account Type is maintained directly in MDS.

When MDS is the System of Record

One of the most common use cases for MDS is to augment the data which comes from your source system(s) with additional context. This could be groupings or descriptive information not stored elsewhere.

For purposes of Part 3, the Account Type entity is maintained in MDS only. When an attribute is maintained directly in MDS, we need to alter the process described in Part 2 just a bit to ensure the values are preserved during the import process.

In Part 2 we said all the values come from the source. Put another way, all data is pushed to MDS. Conceptually, that looks like this:

However, if one or more values come from MDS, we need to add a step to retrieve those values. Otherwise, we’ll lose them. There are two reasons for this:

If we use an ImportType of 2, which allows updates to occur for existing records, then a null value in the staging table will overwrite an existing value.

The stored procedures provided to populate an MDS Model are not parameterized to allow us to specify which attribute(s) are loaded. Since all attributes are part of the import process, we need to make sure all data is present in staging to preserve the existing values.
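As a sketch, a row landing in the leaf staging table with an ImportType of 2 might look like this (the table and attribute names follow the Account model in this series, but the specific values are illustrative):

```sql
INSERT INTO stg.Account_Leaf
    (ImportType, ImportStatus_ID, BatchTag, Code, Name, AccountType, AccountClass)
VALUES
    (2,                  -- 2 = insert new members AND update existing members
     0,                  -- 0 = row is ready for the staging process
     N'12345_Account',   -- BatchTag, unique per entity/data flow
     N'1001',
     N'Cash',
     N'AT01',            -- must contain the existing MDS value; a NULL here would overwrite it
     N'AC01');
```

This is exactly why the MDS-maintained AccountType value has to be retrieved and placed in staging before the load runs.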

Think of it this way: We need to retrieve the values out of MDS – as if it were any other source system – to load staging before the stored procedure executes. Conceptually, that looks like this:

SSIS Package

We’re going to use the same SSIS Package from Part 2. (The following shows only the modifications to it, so please refer back to Part 2 for any details not listed here.)

Note the same 5 steps are present. Loading the Account Type Entity is not present in the SSIS package, because it’s maintained directly in MDS.

Step 2 is where the important change occurs. Within the data flow for the Account entity, we need to query MDS to pull back the existing values for Account Type. If we skip this step and leave it null in staging, those null values will indeed overwrite the existing values because we’re using an ImportType of 2.

How you retrieve the data out of MDS depends on your personal preference. In this example I used the Lookup data flow transformation in SSIS. I don’t want to do a cross-database join in my source query, so I matched up the data from MDS after the source data is in my SSIS pipeline.

Prerequisite for this step: A subscription view must exist for the entity to export the data out of MDS. A subscription view is what’s used to extract data out of MDS – another reminder that we’re not to interact directly with the MDS source tables. In my situation, I want a subscription view for the main Account entity – this is where the accounts are actually mapped to the values. The Account Type entity is just a distinct list of lookup values, and doesn’t give me the mappings to actual accounts.
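The source query for the Lookup transformation might look something like this. The view name is whatever you chose when the subscription view was created (AccountSubscriptionView is made up here); the [Account Type_Code] column follows MDS’s convention of exposing a domain-based attribute’s code as <AttributeName>_Code:

```sql
-- Pull the existing Account Type values out of MDS via the subscription view.
SELECT [Code]              AS Account_Code,
       [Account Type_Code] AS AccountType_Code
FROM   mdm.[AccountSubscriptionView];
```

Matching on Account_Code in the Lookup lets the pipeline carry the MDS-maintained AccountType_Code into the staging insert.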

And that’s it! Once you know to identify the source of every MDS attribute before you begin creating a staging package, it’s smooth sailing from there on.

Sample Model

First, a few quick words about the sample model we’re going to use. It’s a simple Account Model which has 3 entities. The Account Type and Account Class entities exist solely to populate the primary Account entity.

Conceptually the Account Model looks like this:

The Account Entity in MDS looks like this:

The remainder of this blog entry discusses loading the Account model at the leaf (most detailed) level. All attributes, for all entities, come from a source system.

Leaf Member Staging Tables

To support the Account model, 3 staging tables were created by MDS when the entities and attributes were set up:

Recall that the Account Class and Account Type are domain-based attributes, used as follows:

Account Type: Contains a distinct list of Account Type code and name – used as the lookup table for the main Account entity.

Account Class: Contains a distinct list of Account Class code and name – used as the lookup table for the main Account entity.

For purposes of this example, we are going to assume all of the attributes in these entities come from a source system external to MDS. (Note: Part 3 of this series looks at how to handle it if one or more attributes are maintained directly in MDS – i.e., the data doesn’t exist in a source system anywhere.)

SSIS Package

Following is an example of the flow for the SQL Server Integration Services (SSIS) package. Note that in a Production ETL environment a few things will be more dynamic, but the purpose of this example is to be simple & straightforward.

Step 1: Truncates each of the 3 staging tables. This is to eliminate the data from the last time this process ran.
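Step 1 can be a single Execute SQL Task. Assuming the default MDS 2012 naming convention for leaf member staging tables (stg.<EntityName>_Leaf), the statement might be:

```sql
-- Clear out the prior run's rows from each of the 3 leaf staging tables.
TRUNCATE TABLE stg.Account_Leaf;
TRUNCATE TABLE stg.AccountType_Leaf;
TRUNCATE TABLE stg.AccountClass_Leaf;
```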

Step 2: Data flow to populate each entity in the model.

In your source query, make sure you pull back the code field for the domain-based attributes rather than the name field. For the Account entity, we’ll only populate Account Type and Account Class fields with the code value. However, if we were populating the Account Class or Account Type entity, we would populate both code and name.

The BatchTag is a value I chose to create from an ETLExecution_id (i.e., a unique code used for ETL job management) plus a string which refers to the entity. The string is appended so that each BatchTag is unique for every data flow. MDS will generate an error if more than one batch with the same BatchTag executes at the same time. Therefore, each of the 3 data flows in my example appends a different string so they are allowed to run in parallel for speed.
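A sketch of the BatchTag expression for the Account data flow (the ETLExecution_id variable name is from my environment; adjust to yours, along with the cast length):

```
(DT_WSTR, 20) @[User::ETLExecution_id] + "_Account"
```

The other two data flows would append "_AccountType" and "_AccountClass" respectively, keeping each BatchTag unique.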

The data conversions are only needed because my source data was non-Unicode, and MDS is Unicode.

Mapping of fields into the Leaf Member Staging Table looks like this:

The Code field can be ignored if you specified that the Code be created automatically when the entity was set up (a new feature in MDS 2012).

Step 3: Execute the stored procedures which load the data from staging into the model. Note that the BatchTag needs to match what was specified in the Step 2 data flow. I’ve set mine to be unique per entity so the batches can be executed in parallel. The ? in the screen shot above signifies I’m using a parameter (which is mapped to a variable containing the ETLExecution_id); this helps the BI team managing nightly loads know which run this was associated with.

A LogFlag of 1 specifies logging is enabled. The VersionName also needs to be specified – a slight complication if you do a lot of versioning in your environment.
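Assuming the default MDS naming for the Account entity’s leaf stored procedure, the Execute SQL Task statement looks roughly like this (the ? is the query parameter mapped to my ETLExecution_id-based BatchTag variable; VERSION_1 is illustrative):

```sql
EXEC stg.udp_Account_Leaf
     @VersionName = N'VERSION_1',
     @LogFlag     = 1,    -- 1 = logging enabled
     @BatchTag    = ?;    -- unique per entity, matching the Step 2 data flow
```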

Execution of this step to load the model is what correlates to the Integration Management page in MDS:

Step 4: Validation of the Model. Until validation is performed, the new inserts and updates are there, but in an “Awaiting Validation” status. If any Business Rules have been defined for the model, they will be applied when Validation is run.

The Validation applies to the model as a whole, so it only needs to be done once regardless of how many entities we just loaded.
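Validation can also be invoked from T-SQL via the mdm.udpValidateModel procedure. A sketch, following the ID-lookup pattern shown in Books Online (the model name matches this example; verify the object names against your MDS version):

```sql
-- Look up the IDs the validation procedure expects, then validate the whole model version.
DECLARE @User_ID    int = (SELECT [ID] FROM mdm.tblUser WHERE UserName = SYSTEM_USER);
DECLARE @Model_ID   int = (SELECT TOP 1 Model_ID
                           FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                           WHERE Model_Name = N'Account');
DECLARE @Version_ID int = (SELECT MAX(ID)
                           FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                           WHERE Model_ID = @Model_ID);

EXEC mdm.udpValidateModel @User_ID, @Model_ID, @Version_ID, 1;  -- 1 = validate all members
```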

Execution of this step changes the yellow question marks to green check marks:

The 5th and final step I have in my package is to check if any errors occurred and then alert the BI Team and/or Data Steward if needed. This is done by querying the views provided by MDS for each entity.

In my process, I just do a simple query that checks if 1 or more error records exist and if so, send an email. Note the ResultSet is set to Single row. The total count is passed to a variable. If the value of the variable is >=1, the next step of generating an email will be kicked off.
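The error-check query in that Execute SQL Task is along these lines, using the MDS-generated error view for the entity (one query per entity; with ResultSet = Single row, ErrorCount maps to the SSIS variable):

```sql
-- Count error rows surfaced by the MDS staging process for the Account entity.
SELECT COUNT(*) AS ErrorCount
FROM   stg.viw_Account_MemberErrorDetails;
```

If you want to scope the count to the current run only, add a filter on the batch column your view exposes for the BatchTag you used.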

Introduction to the MDS Entity-Based Staging Structure

Each entity has its own stored procedure (udp) to load data from staging into the model.

Via the ImportType specified in the staging table, you control how new and updated members are treated. New in 2012 is the ability to update existing values, if you so choose.

If more than one batch will be running at the same time, each batch needs a unique BatchTag.

The model needs to be validated after the stored procedure to load the model is executed.

Each entity has its own view to display errors which occurred while loading the model.

The objective of this process is for us to interact with the staging table (in the stg schema), then allow the MDS-generated stored procedure to interact directly with the model (in the mdm schema).

An overview of the process is as follows:

Advantages of this new structure include:

Ability to handle updates (as well as inserts and deletes), if you choose the Import Type which permits updates.

Easier to understand ETL processing.

Much faster and more efficient ETL processing.

Security may be set up per individual staging table, if necessary. Permission to insert data to the staging table(s) is required.

Security may be set up per stored procedure, if necessary. Permission to execute the stored procedure(s) is required.

Members and attributes may be loaded in single batches related to one specific entity.

Tables, Stored Procedures, and Views in the Staging Schema

For each entity, up to 3 staging tables may exist, depending on whether consolidated members exist and whether explicit hierarchies exist. The leaf member table will always exist once the entity has been created. It is very common not to use all 3 possibilities.

Note: All names are based on the Entity name unless a different name was chosen when the entity was created.

Overview: Quick tip about resolving an error found in SQL Server Integration Services when mapping query parameters and variables.

Ever find where you so commonly make the same small mistake that you immediately recognize the error when you see it? Yup, I have one of those too. Let me walk you through it: In SSIS, I set up an Execute SQL Task. Within the SQL statement I use a ? to indicate a query parameter. Then I go to the Parameter Mapping page and map the parameter to the proper Variable Name & set the Data Type. Then I run the task to test it. Uh oh, here comes the following error:

[Execute SQL Task] Error: Executing the query "UPDATE Stage_Source.MasterAcctHist SET YTD01 = ..." failed with the following error: "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

The thing I’ve found myself missing a couple of times recently is changing the Parameter Name. Left unchanged, it defaults to “NewParameterName.” In order for SSIS to properly map the query parameter to the variable, it needs the correct naming convention. In my case, I’m using an OLE DB connection type, so the first Parameter Name in the list it’s looking for is 0 (followed by 1, 2, 3, and so on). When it’s set up, it looks like this:
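For an OLE DB connection, the statement from the error message above maps its ? placeholders by ordinal position (the WHERE clause and its column are illustrative here):

```sql
UPDATE Stage_Source.MasterAcctHist
SET    YTD01    = ?   -- Parameter Name: 0  (first ? in the statement)
WHERE  AcctYear = ?;  -- Parameter Name: 1  (second ? in the statement)
```

On the Parameter Mapping page, the rows would use Parameter Names 0 and 1 respectively, each mapped to its variable with a matching Data Type.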

The actual Parameter Name SSIS is looking for depends upon your connection type – the MSDN link below is a handy reference.

Other reasons this particular error could come up are if your Data Type is set incorrectly, or if the number of query parameters doesn’t match the number of variables mapped (in the correct order).

Description: A quick post to describe one possible resolution for: “Error 0x80070002 while preparing to load the package. The system cannot find the file specified.” This discussion applies to SQL Server 2008 R2.

Level: 101

The Error

After doing an SSIS deployment to a Production environment, I executed a package and received the following error.

Error 0x80070002 while preparing to load the package. The system cannot find the file specified.

What was troubling about this error is that the package ran fine in the BIDS environment, and it also ran fine within the Development and QA environments. So what’s up with it not running in the Production environment?

The Resolution

In my situation, the resolution was to change the Package property for “ProtectionLevel” to the “DontSaveSensitive” setting. It had been previously set to the default of “EncryptSensitiveWithUserKey.” We use package configurations to handle all data connection properties, so the setting of Don’t Save Sensitive works just fine (this is still a 2008 R2 implementation, not yet migrated to 2012 where package configurations are a thing of the past).

This is one of those cases where the error message had me puzzled a bit. It’s not that the .dtsx file wasn’t there at all; it’s that the new environment couldn’t access it considering the ProtectionLevel setting.