Sunday, February 26, 2012

There are several ways to read SharePoint List as data source in SSIS. The easiest way is using the SharePoint List Adapter. You could download it from the Codeplex site. However, you may want to have more control during extracting the list to avoid failure, to manipulate data before sent to the output rows, or you are experiencing timeout issue with lots of item in the list.

Here is the solution I used:

Add For Loop Container in Task Component with a Data Flow

Since the SharePoint list could be large, use the for loop container to loop through the paging.

Set InitExpression to the Page number @PageCount = 1

Set the EvalExpression to a Variable: @LoopNext which will be determine in the data flow.

Set the AssignExpression to be @PageCount = @PageCount + 1

Create a Script Component with Source Type to Read SharePoint List

In the data flow, add a Script Component with Source type.

Pass in @LoopNext and @SharePointPageInfo variables as Read Write Variables.

In the data flow, sent the SharePoint list items to the destination table and the error data to error table. This way the SSIS package will not fail to execute with any SharePoint Site issue. Other process could utilize the data in the error table to recover it softly.

In the Add Metadata component, add the @PageCount as an output column. It could be used for recovery or debug purpose.

Narrow Down the Needed Fields

Don’t select all the fields from the list. Sometime the list may contain binary data, meta data or image that may take long time to load. To narrow down the fields, just specify the fields in the ViewFields XML Node as:

Add CAML Query to Extract Only the Delta

To pull only the delta records in, just add filters in the query. The SharePoint list has a system field called “Modified” which will be time stamped when the record get changed. You could utilize the CAML query to get the items that have Modified Date greater than a Date value or within an arrange of date. If you don’t know the syntax of the CAML Query, you could use some SharePoint Utility such as U2U CAML Builder.

Here is a sample CAML Query that will filter the result by range of date:

Page though Large Number of List Items in the Script

Depend on the size of the item, you may set the row limit to 1000 while calling the GetListItems(). The function supports service-side paging. When the XML result returned, it includes a ListItemCollectionPositionNext attribute that contains the information to support the paging. You need to save it for the next call. Make sure you do not modify the string.

In the example below, I set the ListItemCollectionPositionNext attribute to a local variable first and the bLoopNext to true. I then save the string to a SSIS Read/Write variable to be use for next loop.

Thursday, February 16, 2012

Recently I work on a SSIS project that has a Sequence Container with lot of variables that I need for a data flow. Later on I want to change the sequence container to be a For Loop Container. It is easy to copy and paste the the data flow in the container to the other. But, how to move the SSIS variables in the visual studio?

If you are familiar with the SSIS XML code, you may be able to copy or move the variables from one XML Node section to the other without corrupt it. For others, recreating the variable with correct value or expression could be cumbersome.

Here is a easy way to handle it:

Install BIDS Helper from CodePlex site.

Select the variable you want to copy or move

Click on the Move/Copy Variables from SSIS Variables tool bar to open a new window

Click on the DTS executable that you want the variable scope to change to , then click OK

I tried to uninstall and re-install the program, but it didn’t work. I have to manually delete the directory C:\Program Files\Microsoft Analysis Services\AS OLEDB\ and repair the PowerPivot, then it works.

Same situation happened again when I downgrade my PowerPivot back to version 1 to be compatible with my client’s environment. Once I manually remove the C:\Program Files\Microsoft Analysis Services\AS OLEDB\ directory and then repair the program, it works again.

About Me

I am a Business Intelligence IT professional and currently work for Duke Energy. I enjoy working in BI. This blog was created as a way for me to share some of the challenges and ideas that I have experienced.