Monday, December 29, 2008

Sounds simple and straight forward, right? Well, it can be, depending on what you are doing. For those of you who have experience in this domain, this method does not require you to enable xp_cmdshell. Whew! The approach below uses the SQL Server job agent to dynamically create a job, assign the package as a step, and delete the job when it is done. It also gives the job a unique name, so it can be run concurrently.

So, the major benefits of calling a package this way are the fact the xp_cmd_shell does not need to be enabled/called, and SSIS does not have to reside on the server of the client application. If you were to try to invoke an SSIS package from a client app using the Dts.Runtime, you would have to install SSIS on the client machine. This is NOT FREE! In earlier versions of SQL Server, you could redistribute the dts runtime, but not anymore with SQL Server 2008.

Monday, December 15, 2008

I was recently tasked with the objective of creating an SSIS package to perform some standard ETL types of functions. Big deal, right? What made this particular go around unique for me, was that the input file used as a data source can have different column/row delimiters and text qualifiers for each file and we do not know what the formatting is going to look like until we try to load the file. The number of columns stays the same, as do the data types of each column. Here is a simplified example:

As you can see in File 1 above, the columns are tab delimited with a carriage return and line feed at the end of the row. In File 2, the columns are comma delimited. Also, text is qualified with double quotes.

From an ETL perspective, I had two choices: I could write an SSIS package for each permutation and create a maintenance and support nightmare for myself, or find a way to dynamically set the Flat File Connection properties at runtime. I decided to opt for the latter. In earlier versions of DTS and SSIS, this was not as easy as it sounds. In SSIS 2008, the task became relatively simple.

The first thing I did was to create a Flat File Connection and point it to a sample file that represented an instance of the input format that I would need to load data from. This creates metadata behind the scenes in SSIS to represent the Flat File Connection. The trick is to modify the metadata at the runtime of the SSIS package and replace the delimiters and qualifiers with new values. But how do you get the new values? I am glad you asked.

I created a script task and added references to the script project:Microsoft.SqlServer.DTS.RuntimeWrapMicrosoft.SqlServer.DTS.ManagedDTS

Once I could reference the SSIS object models I needed, the next thing I did was to create some variables to hold the values of the delimiters and qualifiers, once I knew what they were. The variables are set by actually opening the file and reading the first row (I know, I know. There are A LOT of assumptions in doing it this way) . I called two different methods, that loop through arrays of values, until I find what I am looking for. Once I have a good idea of the input file structure, I modify the Flat File Connection with the new delimiters and qualifiers. Here is an example of the code I used in the Script Task:

//create the connection to the flatfile connection to dynamically set the propertiesRuntimeWrapper.IDTSConnectionManagerFlatFile100 flatFileConnection= Dts.Connections["InputFileFlatFileConnection"].InnerObject as RuntimeWrapper.IDTSConnectionManagerFlatFile100;RuntimeWrapper.IDTSConnectionManagerFlatFileColumns100 columns = flatFileConnection.Columns;

//establish the count of columns based on the flatfile connection_ColumnCount = columns.Count;

//iterate throught he columns and set the delimiters based on the file interrogationforeach (RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100 column in flatFileConnection.Columns){//if we are at the last column, set the row delimiterif (column == flatFileConnection.Columns[flatFileConnection.Columns.Count - 1]){delimiter = _RowDelimiter;}//we are not at the last column, set the column delimiterelse{delimiter = _ColumnDelimiter;}

/// /// Reads the first line of the file to determine the number of columns, column delimiters, and text qualifier/// /// file to interrogate/// bool indicating whether or not the interrogation was successfulprivate void interrogateFileContents(string filePath){string contents = string.Empty;System.IO.TextReader objReader = new System.IO.StreamReader(filePath);

//read the first line of the input file (assuming the first line is data)contents = objReader.ReadLine();objReader.Close();

//iterate through the collection of delimiters until we find the one that is used in the filefor (int i = 0; i <= delimiterChars.GetLength(0) - 1; i++) { columnCount = row.Split(Convert.ToChar(delimiterChars[i])).GetLength(0); if (columnCount == _ColumnCount) { _ColumnDelimiter = delimiterChars[i].ToString(); break; } } } /// /// Sets the private modular level variables that controls the text qualifier/// /// string that represents the first row of data in the fileprivate void setTextQualifier(string row){char[] qualifierChars = { '\"', '\'' };string column = row.Split(Convert.ToChar(_ColumnDelimiter)).GetValue(0).ToString();

//iterate through the collection of qualifiers to see if the text is qualified or notfor (int i = 0; i <= qualifierChars.GetLength(0) - 1; i++) { if (column.StartsWith(qualifierChars[i].ToString()) && column.EndsWith(qualifierChars[i].ToString())) { _TextQualifier = qualifierChars[i].ToString(); _IsTextQualified = true; break; } } } }}So, as you can see from the code above, I am wading through the data with character arrays and setting the Flat File Connection properties when I find a match. This code can be extended to do a lot more, but I will leave it in someone else's capable hands to implement factory patterns and the such.The point of this code is this: you do not have to create separate packages for each combo of delimiters and qualifiers.