Step 1: Creating a new batch

This is a personal preference, so you can skip this step if you’d like.

Start by creating a Batch table. This will store the timestamp and filename for each time we run our SSIS job. Yes, this is overkill for a demo, but it is a good idea to teach good SSIS habits, too. We will also store the BatchId on each record.

-- Create the new Batch table.createtable[dbo].[Batch]([BatchId]intnotnullidentity(1,1),[StartTime]datetimeoffsetnull,[Filename]varchar(200)null,constraint[PK_dbo.Batch]primarykeyclustered(BatchId));-- Create a new stored procedure for creating a new batch entry.createprocedure[dbo].[InsertBatch]@startTimedatetimeoffset=null,@filenamevarchar(200)=nullasbeginbegintransaction;insertintodbo.Batch(StartTime,Filename)values(@startTime,@filename);selecttop(1)BatchId,StartTime,Filenamefromdbo.BatchwhereBatchId=scope_identity();committransaction;end-- Add BatchId column to Person.altertable[dbo].[Person]add[BatchId]intnotnull;-- Add BatchId column to Address.altertable[dbo].[Address]ass[BatchId]intnotnull;

Our first SSIS step is to create a new record in this table and save the ID. Add a new Execute SQL task to your workflow. I have already added an ADO.NET connection to my database. My SQL task will use this existing connection.

Next, set the input parameters for the stored procedure - StartTime and Filename.

Finally, we need to capture the result set. Create a user variable called BatchId, and save the result to that variable. We set the Result Set to “Single Row” on the general page. Here, the BatchId is the first value returned (0-indexed). We will use it later when we insert our XML data into tables.

Step 2: Delete the existing data

This one is easy. We want to prepare our tables for incoming data. Create a procedure that will wipe away all existing data.

Call this procedure with an Execute SQL task. This time, there are no parameters or results.

Step 3: Fun with XML files

Now that we have properly prepared our tables, we are ready to read our file and insert new rows. This is done with a Data Flow Task. Point your file connection to the correct place on your drive. Hit the button for “Generate XSD…”.

This will scan your XML and read every node. It will also create appropriate IDs so nodes can be linked. Next, data must be sorted. Every node must be sorted, and this XML has four such nodes - person, name, addresses, and address.

After all data has been sorted, we are ready to create joins. We need a join to turn four types of nodes into two tables. The Person join combines the person and name node types, and it looks like the following.

And here is the Address join. This combines the addresses and address nodes. Note that the addresses node contains the person_ID value, which we will need to link addresses to people.

Next up, add the BatchId value to both the Person and Address sets. This is easily accomplished with a Derived Column Transformation. This is what it looks like for Person. The Address derived column is the same.

Finally, we are ready to store our data. Create two ADO.NET Destinations, one for Person and one for Address. This is what the mappings for Person should look like. Map your in-memory variables to your database columns.

The following is the completed data flow task.

Step 4: The Final Result

Let’s fire up SQL Management Studio and check out the results.

The source code is available on Github. Both the SSIS job and the SQL database project are available for you to review.