SQL Server Integration Services - SSIS

The task will detect changes to existing files as well as new files, both actions will cause the file to be found when available. A file is available when the task can open it exclusively. This is important for files that take a long time to be written, such as large files, or those that are just written slowly or delivered via a slow network link. It can also be set to look for existing files first (1.2.4.55). The full path of the found file is returned in up to three ways: The ExecValueVariable of the task. This can be set to any String variable. The OutputVariableName when specif... [More]

Use the power of regular expressions to cleanse your data right there inside the Data Flow. This transformation includes a full user interface for simple configuration, as well as advanced features such as error output configuration. Two regular expressions are used, a match expression and a replace expression. The transformation is designed around the named capture groups or match groups, and even supports multiple expressions. This allows for rich and complex expressions to be built, all through an easy to reuse transformation where a bespoke Script Component was previously the only alter... [More]

The regular expression transformation exposes the power of regular expression matching within the pipeline. One or more columns can be selected, and for each column an individual expression can be applied. The way multiple columns are handled can be set on the options page. The AND option means all columns must match, whilst the OR option means only one column has to match. If rows pass their tests then rows are passed down the successful match output. Rows that fail are directed down the alternate output. This transformation is ideal for validating data through the use of regular expressio... [More]

As the name suggests we have taken the current Row Count Transform that is provided by Microsoft in the Integration Services toolbox and we have recreated the functionality and extended upon it. There are two things about the current version that we thought could do with cleaning up Lack of a custom UI You have to type the variable name yourself In the Row Count Plus Transformation we solve these issues for you. Another thing we thought was missing is the ability to calculate the time taken between components in the pipeline. An example usage would be that you want to know how ma... [More]

The Row Number Transformation calculates a row number for each row, and adds this as a new output column to the data flow. The column number is a sequential number, based on a seed value. Each row receives the next number in the sequence, based on the defined increment value. The final row number can be stored in a variable for later analysis, and can be used as part of a process to validate the integrity of the data movement. The Row Number transform has a variety of uses, such as generating surrogate keys, or as the basis for a data partitioning scheme when combined with the Conditional ... [More]

The Trash Destination and this article came from early experiences of using SSIS and community feedback at the time. When developing a package it is very useful to have a destination adapter that does nothing but consume rows with no setup requirement. You often want run a package part way through development, or just add a path so you can set a Data Viewer. There are stock tasks that can be used, but with the Trash Destination all columns are treated as selected automatically (usage type of read-only), so the pipeline knows they are required. It is also obvious that this is for development or... [More]

The Trace File Source adapter is a useful addition to your SSIS toolbox. It allows you to read profiler traces stored as .trc files and read them into the Data Flow. From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection this just uses the trace file.
Example Usages
Cache warming for SQL Server Analysis Services
Reading the flight recorder
Find out the longest running queries on a server
Analyze statements for CPU, memory by user or some other criteria you choose
Properties
The Trace File Source adapter has... [More]

The Checksum Transformation computes a hash value, the checksum, across one or more columns, returning the result in the Checksum output column. The transformation provides functionality similar to the T-SQL CHECKSUM function, but is encapsulated within SQL Server Integration Services, for use within the pipeline without code or a SQL Server connection. As featured in The Microsoft Data Warehouse Toolkit by Joy Mundy and Warren Thornthwaite from the Kimbal Group. Have a look at the book samples especially Sample package for custom SCD handling. All input columns are passed through the transf... [More]

Recently there was a bug raised against BIDS Helper which originated in my Expression Editor control. Thankfully the person that raised it kindly included a screenshot, so I had the error code (HRESULT 0xC0017011) and a stack trace that pointed the finger firmly at my control, but no error message text. The code itself looked fine so I searched on the error code but got no results. I’d expected to get a hit from Books Online with the Integration Services Error and Message Reference topic at the very least, but no joy. There is however a more accurate and definitive reference, namely the heade... [More]

Having finally got around to upgrading my custom components to SQL Server 2012, I thought I’d share some notes on the process. One of the goals was minimal duplication, so the same code files are used to build the 2008 and 2012 components, I just have a separate project file. The high level steps are listed below, followed by some more details. Create a 2012 copy of the project file Upgrade project, just open the new project file is VS2010 Change target framework to .NET 4.0 Set conditional compilation symbol for DENALI Change any conditional code, including assembly versio... [More]

This component needs little explanation. It generates random integer (DT_I4) and string (DT_WSTR) data and places them in the pipeline. You specify how many columns of each you would like and for any string columns you pass a fixed length value. You then need to specify how many rows in total you require to be generated.
This component is used by us to do testing of the pipeline and components downstream. Previously we would have used a script component (as a source) to generate the rows but found ourselves rewriting the code too often so created this component.
Screenshots
... [More]

I recently noticed that Red Gate have launched themselves into the SSIS component market by releasing a new Data Cleanser component, albeit in beta for now. It seems to be quite a simple component, bringing together several features that you can find elsewhere, but with a suitable level polish that you’d expect from them. String operations include find and replace with regular expressions, case formatting and trim, all of which are available today in one form or another, but will the RedGate factor appeal to people? Benefits include ease of use, all operations in one place, versus ... [More]

SQLBits 9 has now been announced as 29th September to 1st October 2011 at the Adelphi Hotel in Liverpool. This will follow the now familiar three day format, a training day, and two full days of concurrent sessions. Saturday 1st October will of course be the free community day. Despite growing the event quite dramatically over the past four years, this is something we are all very proud to have maintained and is a key factor when planning the events.
Plenty of more info to come, but in the meantime session submission is now open so why not submit an abstract?

On Wednesday 18th May 1Pm –2PM Eastern Time I am going to be helping Attunity deliver a webcast. In it I want to highlight some of the challenges we face in getting data around our environments. I want to highlight the Microsoft products that can help us realise Business Intelligence. It promises to be a good hour. Hope to see you all there. Here is the link Attunity Webcast

Whilst getting some courseware ready I was playing around writing some code and I decided to very simply show when a window starts and ends based on you asking for a TumblingWindow of n time units in StreamInsight. I thought this was going to be a two second thing but what I found was something I haven’t yet found documented anywhere until now.
All this code is written in C# and will slot straight into my favourite quick-win dev tool LinqPad
Let’s first create a sample dataset
var EnumerableCollection = new [] { new {id = 1, StartTime = DateTime.Par... [More]

I had an hour spare this afternoon so I wanted to have another play with Reactive Extensions in .Net and StreamInsight. I also didn’t want to simply use a console window as a way of gathering events so I decided to use a windows form instead.
The task I set myself was this.
Whenever I click on my form I want to subscribe to the event and output its location to the console window and also the timestamp of the event.
In addition to this I want to know for every mouse click I do, how many mouse clicks have happened in the last 5 seconds.
The second point here is really i... [More]

In his blogpost Roman from the StreamInsight team asked if we could create a Reactive Framework version of what he had done in the post using StreamInsight. For those who don’t know, the Reactive Framework or Rx to its friends is a library for composing asynchronous and event-based programs using observable collections in the .Net framework. Yes, there is some overlap between StreamInsight and the Reactive Extensions but StreamInsight has more flexibility and power in its temporal algebra (Windowing, Alteration of event headers) Well here are two alternate ways of doing what Roman... [More]

I am giving a full day seminar on StreamInsight at the SQLBits conference in Brighton, UK. The seminar is happening on 7th April 2011. Early bird discounts are available so get over to the site and register. During this day I will be explaining exactly what StreamInsight is and we’ll be looking at how to solve some very interesting business problems with streaming data. A complete rundown of what it is I will be covering is here.

I have been thinking a lot recently about what it would be like to have StreamInsight and SSIS working together. Well the CAT team have produced a paper on some of our options here. Here are some of my thoughts. There is of course a slight mismatch in their types of usage. StreamInsight is an Event Stream processing engine capable of operating on new data in the sub second timeframe. The engine allows you to do real time analytics and take decisions on events that have potentially only just happened. SSIS on the other hand is a batch processing engine. In... [More]

Whilst writing some courseware I wanted to be able to see the start and end times of Edge shaped events from within the debugger. A quick recap on Edge events At the start of the event you do not know the end time and most probably cannot work it out or you should be using one of the other shapes. You enqueue an event (Start Edge) with the start time and payload of the event. The end time of the event is set to infinity When you see the end edge come through, you enqueue another event (End Edge) with the previous start time and payload and restate the event’s end time.&... [More]