Problems I've had, solutions I've found, along with commentary.

Month: February 2014

There seems to be some debate about the use of temporary tables in SSIS. I, for one, highly recommend using temporary tables rather than trying to pull data from different sources straight into your database production tables. Consider the following:

You are pulling data across a VPN connection when the connection suddenly fails. Perhaps a router failed or something, but now you are stuck with a partial pull.

There is some type of unexpected corrupt data that causes your flow to error out. Again, you are stuck with a partial pull.

In the above situations you’ll now be stuck with production tables in a production database that have records that need to be cleaned out. At best, you’ll have to figure out which records made it and which didn’t and then make up the difference. By using temporary tables in your production database all you have to do is wipe out the temporary tables and then start over after figuring out what the error is – the idea being that once the data makes it into your temporary tables you now have full control and no longer have to worry about VPN’s going out or data needing to be changed or cleaned as all that has already been done.

So in your SSIS package you pull data over to your database into a series of temporary tables that might in face have the same structure as your production tables just with different names. Then you pull the data from your temporary tables into your production tables. Some even go as far as to have a separate database just for temporary tables which is perfectly fine.

Remember, when pulling data from other sources, especially outside sources, unless you have some ownership over said sources you control only half of the operation. You have no guarantees as to whether the source is reliable or the data is clean. Therefore, it is very likely your dataflow may be interrupted.