31 Days of SSIS – No More Procedures (20/31)

Time to shift off of configurations and environments and talk more about SSIS packages and best practices. If you haven’t been following this series and are just joining in, we are at day 20 of the 31 Days of SSIS. Now that is out of the way, let’s move on.

Today, we are going to discuss a recent best practice that I’ve been promoting surrounding the use of stored procedures as source objects for data flows. Or to put it more succinctly that in SSIS packages data sources should use table-values functions instead of stored procedures for the SQL Command text.

Best Practice Roots

A couple months ago, I came across the following error message:

SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available. Source: “Microsoft OLE DB Provider for SQL Server” Hresult: 0x80004005 Description: “Protocol error in TDS stream”.
SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component “OLE DB Source” (7) returned error code 0xC0202009. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.

The odd thing about this error is that it only occurred when executing the SSIS package through DTEXEC or SQL Agent. Running the package in BIDS presented no issue and the package was always successful. Redeploy the package and the same issue occurred.

To summarize what these posts talk about, stored procedures do not have a defined output schema. The output can change depending on the parameters supplied. Furthermore, the metadata that SSIS uses may not be based on the execution path that you intend to us when SSIS package executes.

Output Contract

When SSIS packages are built there needs to be a reasonable expectation that the SQL command text for the data source will provide the columns that the SSIS package requires. It would be great if stored procedures could be designed in a fashion that allowed for one and only one output. This isn’t the case and there are no indications that this will change and probably no true need for it to change.

There is, though, a defined metadata output for table-valued functions. As they say in the Highlander, “there can be only one.” Whether the table-valued function is inline or multi-line, there is always a defined set of columns that will be returned. The SSIS data source knows specifically what will be returned from the function.

It’s still true that someone can change the table-valued function and cause similar issues to occur after deployment. There is the insurance that side effects from table changes will not negatively affect the execution of the SSIS package with the obscure message above that seems more ominous that it actually is.

Best Practice Wrap-Up

Having read this, you may be wondering if this is much ado about nothing. If I hadn’t uncovered this issue a few months back I might be thinking the same time. I’ve used SSIS since SQL Server 2005 was released and that was the first time I recall having this issue.

There was time between now and when I originally encountered this issue. With one of my clients there was a major release between when I first encountered the error and now. Would you be surprise to find out that we ran into the same issue three more times? Working on these issues after a release is no picnic and had I followed the advice I give now these issues likely would not have happened.

Guess what I recommended so that we wouldn’t deal with these again in the future? Now, that you’ve heard my recommendation, what do you think?

I know this is a very old post so I won’t be too surprised to not get a reply, but I’m very curious as to why the package executed fine in BIDS but failed during DTEXEC/SQLAgent. Whilst I understand the issue, I can’t see why the method of execution would vary the outcome.

(Love the blog BTW, putting your older articles on SSIS and XQuery to good use – thanks!)

I have encountered the same issue over the years. Although some will disagree, I consider this to be a flaw in the design of SSIS, although I understand the reasoning MS may have used. They could implement a work-around, in my opinion, similar to what is available when you use stored procedures in SSRS. When you first set a data source in SSRS to use a stored proc, and click OK on the data source designer, it initially is unable to obtain column definitions and thows an error. However you can work around it by clicking the "Refresh fields" button on the tool bar. SSRS then prompts you for a test execution of the proc, and based on the results returned from the execution, gets the metadata needed and is able to populate the column definitions. From that point on, the data source works perfectly. It should be that easy or easier in SSIS.

I believe I heard that Denali will add new functionality to stored procedures to allow the "Contract" of the column definitions of data returned by the proc to be defined as part of the stored proc definition. I am not sure if this will correct the issue in SSIS however.