Thursday, 29 December 2016

Case
How do I execute a package with the new Scale Out function? I don't see any options when executing a package.

Find the Scale Out options

Solution
The new Scale Out execution is not (yet) integrated in the standard package execution window. And the Package Execution Task has not changed. Therefore it will always execute the package on the same worker as the parent package. Both will probably change within a couple CTP releases.

Catalog
If you right click on SSISDB within the catalog then you will see the new context menu item "Execute in Scale Out..."

Execute in Scale Out...

Next you can choose which packages to execute and on which worker servers.

Execute in Scale Out

After hitting the OK button no reports are shown like in the regular execution, but you can find the reports in the context menu of the Catalog.

No open report option

And in the report you can see which 'random' worker executed the particular package.

The Machine property show which worker was used

Conclusions
Nice first version of the Scale Out. Hopefully the next CTP contains a new version of the Execute Package Task and an integration of the regular execution and scale out execution. Please try it out and let me (or Microsoft) know what you think about it.

Some considerations, because the worker services uses a local system account you might want to consider changing that to a domain account or use other options like a proxy or a database user. Other concerns are the firewall if you're using a local database on the master and local paths (d:\myfiles\) on the master won't work either.

Solution
SSDT 17.0 RC1 still got some bugs, for real projects you should use SSDT 16.5 But if you want to discover for example the Scale Out function or Support for Microsoft Dynamics Online Resources of SQL VNEXT then you should/could use this version.

Three solutions (in order of recommendation):

Since this bug only occurs in SSDT, you could deploy outside SSDT with PowerShell of by just double clicking the ISPAC file. Then this error won't occur.

It's a known issue. Just wait for the next release of SSDT (early 2017).

Wednesday, 28 December 2016

Case
SQL VNext has a new Scale Out function. How does it work and how do you install and configure that?

Solution
The new Scale Out option in SSIS VNEXT gives you the ability to execute multiple packages distributed to multiple worker machines. You can select multiple packages on the master that will be executed in parallel by one or more worker machines.Machine setup
To make sense of a Scale Out you of course need multiple machines. We need a master machine and one or more worker machines. Because the master distributes the executions and doesn't execute packages it self, you may want to consider installing a worker on the same machine as the master to make use of its resources. The worker machines only have a worker installed. A SQL Server engine installation is not necessary on a worker.

Option 1: Master only distributes executions

Option 2: Master also executes packages itself

For this example I will use option 1 with a master and two separate workers on three HyperV machines. All machines are identical with Windows Server 2012 R2 with all updates installed.

Installation step 1: install SQL Server VNEXT on master
Install SQL Server VNEXT on the 'Master' machine called "SQLVNEXT_M". Below the most important steps of the installation. At the bottom all screens are shown in a movie.

We need the Database Engine to store the SSISDB

We need SSIS and the Scale Out Master

SQL Server Authentication mode is required on the SSISDB

Choose port 8391 and create a new SSL certificate

All steps

Installation step 2: configure firewall on master
Open a firewall port on the Scale Out Master. We need at least an inbound rule for port 8391 supplied as EndPoint in the previous step, but a complete list of all SQL Server ports can be found here.

Installation step 3: copy master certificate to workersWe created a new certificate during installation of the Scale Out Master. You can find it in <drive>:\Program Files\Microsoft SQL Server\140\DTS\Binn. We need that certificate during installation of the Scale Out Workers. So copy it to the worker machines.

Copy SSISScaleOutMaster.cer to worker

Copy SSISScaleOutMaster.cer to worker

Installation step 4: install SQL Server VNEXT on workersInstall SQL Server VNEXT on the 'Worker' machines called "SQLVNEXT_W1" and "SQLVNEXT_W2". Below the most important steps of the installation. At the bottom all screens are shown in a movie.

Only select SSIS and Scale Out Worker (no engine needed)

Add an EndPoint like https://SQLVNEXT_M:8391. This is the name of the
Scale Out Master machine and the port chosen during the installation of the
Scale Out Master.
The certificate is the one you copied from the Scale Out Master
in one of the previous steps.

All steps

Installation step 5: copy worker certificates to masterDuring installation of the Scale Out Worker machines, certificates where created which we need register on the machine with the Scale Out Master. With these certificates the Scale Out Master can authenticate the Scale Out Workers. You can find SSISScaleOutWorker.cer in <drive>:\Program Files\Microsoft SQL Server\140\DTS\Binn
(repeat this for all workers).

Copy SSISScaleOutWorker.cer to master

Copy SSISScaleOutWorker.cer to master

Installation step 6: install SSMS VNEXT to add Catalog on master
Now we need to add a catalog to the master. To do this you need to install SSMS VNEXT first. For this demo situation I installed SSMS VNEXT on the master machine.

Install SSMS VNEXT

Add catalog as you normally do, but notice the extra option:

Enable this server as SSIS scale out master

Add catalog

Installation step 7: install worker certificates on masterNow we need to install all Scale Out Worker Certificates on the Scale Out Master machine. They should be stored in the Trusted Root Certification Authorities. Repeat the steps below for all Worker Certificates.

Store certificates in Trusted Root Certification Authorities

All Steps

Installation step 8: enable scale out workers on master
Make sure the services SSISScaleOutWorker140 on the Worker machines are started and SSISScaleOutMaster140 on the master. Then start SSMS and connect to the SQL Server instance on the master and execute the following query:

-- Get Worker info
SELECT * FROM [SSISDB].[catalog].[worker_agents]

It could take a few minutes before the worker machines are registered. Once that happens the query should return records. Use the values from the WorkerIdAgentId column in the next Stored Procedure Call to enable the Scale Out Workers

Now you're ready to deploy your first project and execute packages with the new Scale Out function. Also see the Microsoft walkthrough for the scale out setup.Conclusion
The Master Worker setup is a great way to distribute package executions over multiple servers. When you will choose for upgrading your existing SSIS server with more memory and more cores above the new master-worker setup, probably depends on the licensing model. But when you already maxed out the hardware of your current SSIS server, then this new master-worker setup is an easy way to upgrade.
And what about a future Scale Out to Azure? If a weekly, monthly or quarterly run is taking to much time or one of your worker servers is down (for maintenance?).

Thursday, 1 December 2016

In 2006 Jamie Thomson came up with naming conventions for SSIS tasks and data flow components. These naming conventions make your packages and logs more readable. Five SQL Server versions and a decade later a couple of tasks and components were deprecated, but there were also a lot of new tasks and components introduced by Microsoft.

Together with Koen Verbeeck (B|T) and André Kamman (B|T) we extended the existing list with almost 40 tasks/components and created a PowerShell Script that should make it easier to check/force the naming conventions. This PowerShell script will soon be published at GitHub as a PowerShell module. But for now you can download and test the fully working proof of concept script. Download both ps1 files and the CSV file. Then open "naming conventions v4.ps1" and change the parameters before executing it. The script works with local packages because you can't read individual package from the catalog, but you can use a powershell script to download your packages from the catalog.

Tuesday, 25 October 2016

Case
I used PowerShell to deploy my SSIS project to the Catalog. Can I also automatically create a SQL Server Agent job with an SSIS jobstep?

SQL Agent Job for SSIS package

Solution
Yes, almost every Microsoft product supports PowerShell and SQL Server Agent is no exception. Only the SSIS specific part of the jobstep seems to be a little more difficult to handle. So for this example I first created a SQL Server Agent job(step) for an SSIS package in SSMS manually and then scripted it to see the jobstep command. This command is a long string with all SSIS specific information like the packagepath, enviroment and parameters. Below you see parts of the generated TSQL Script. We are interested in the part behind @command= in row 12:

This command string is used in the PowerShell script below, but hardcoded parts are replaced with values from the PowerShell parameters (see row 66). The rest of the script is more straightforward and easily to adjust or extend. If you're not sure about how to adjust the script then first take a look at the T-SQL script which has similar steps and with the same properties to set.

Saturday, 22 October 2016

SSIS Appetizer
I'm not sure I have a purpose for this, but did you know that you can use the cache file of the Cache Transformation (introduced in SSIS 2008) as a source file in the Raw File Source.Demo
For this demo I use two Data Flow Tasks. The first creates the cache file and the second one uses it as a source.

Two Data Flow Task

1) Create Cache
The first Data Flow has a random source (a flat file in this case) and a Cache Transformation named "CTR - Create Cache" as a destination. When you create the Cache Connection Manager, make sure to check "Use file cache" to provide a file path for the cache file. Copy the path for the next step.

The Cache Transformation and Connection Manager

2) Read Cache
The second Data Flow Task uses a Raw File Source. In the editor you can specify the location of the Raw File. Paste the path from the Cache Connection Manager (a .caw file). For demonstration purposes I added a dummy Derived Column behind it with a Data Viewer on the path between them. Now run the package a see the result. You will get some hash columns 'for free'.

Raw File Source

Please let me know in the comments if you found a good purpose for this.

Note: you can't use a raw file as a cache file unless you're able to add the extra hash columns as well.

Friday, 16 September 2016

Case
I want to extract data from a SAS database file (*.sas7bdat). How do I do that in SSIS?

Solution
This is possible but not out of the box. You need to install an extra provider to accomplish this.

1) Download SAS Provider
First you need to download and install the SAS Providers for OLE DB. There are multiple versions make sure to download the correct version (otherwise you get error messages like "This application does not support your platform"). You only need the select SAS Providers for OLE DB.

Install SAS Providers for OLE DB

2) Setup OLE DB Connection Manager
After installation the new provider will be available in OLE DB Connection Manager editor. Make sure to choose "SAS Local Data Provider X.X". This is the provider that can read SAS database files (*.sas7bdat).

SAS Local Data Provider 9.3

Second import step in the setup is to select the folder where the sas7bdat files are located. Don't select a file! All files will appear as tables in the OLE DB Source component. In my case I could leave the User name and Password fields empty because I already had access to the folder (but I'm not an SAS expert).

Fill in folderpath in Server or file name field

3) Setup OLE DB Source Component
Now you can use a regular OLE DB Source Component to extract data from SAS. However there are two concerns. When you select a table and close the editor you will get a warning that there is something wrong with the code page.

Cannot retrieve the column code page info from the OLE DB provider.

If the component supports the "DefaultCodePage" property, the code page

from that property will be used. Change the value of the property if the

current string code page values are incorrect. If the component does not

support the property, the code page from the component's locale ID will

be used.

After clicking OK there will be a warning icon in the OLE DB Source Component which you can remove by setting the "AlwaysUseDefaultCodePage" property on true.

Before and after changing AlwaysUseDefaultCodePage

The second concern is more annoying: all datatypes will be DT_SRT (ansi string) or DT_R8 (float). You cannot change this and you need to add a data conversion.
Date(times) are also numbers: dates will be a number of days after January 1 1960 and datetimes will be the number of seconds after January 1 1960 and any decimals are used for milliseconds. A Derived Column expression for date could look something like:
DATEADD("DD", (DT_I4)[mydatecolumn], (DT_DATE)"1960-01-01")

All string or float

Tip: you can also use BIML to create SSIS packages with a SAS7BDAT source.

Case
I recently created packages with a SAS source, but now I want to use the same SAS source in my BIML Script. But I'm getting an error that the Local Provider doesn't support SQL. How can I solve this?

Error 0 : Node OLE_SRC - DIM_TIJD:

Could not execute Query on Connection PROFIT1:

SELECT * FROM DIM_TIJD
The Local Provider does not currently support SQL processing.

Solution
There is NO easy solution for this. The provider doesn't support SQL Queries and that's what the BIML engine does first to get the metadata from the source table. Luckily there is a search-and-replace workaround. A lot of extra work, but still much easier then creating all packages by hand!

1) mirror database in SQL server
I used the metadata from SAS to get all tables and columns which I then used to create (empty/dummy) SQL Server tables with the same metadata as SAS (The datatype is either varchar of float). The tool to get the SAS metadata is SAS Enterprise Guide. It lets you export the metadata to for example Excel and then you can use that to create the dummy tables.

A little script created by a SAS developer to get metadata

Metadata export example in Excel

2) BIML
Instead of the SAS OleDB connection manager I used a temporary SQL Server OleDB connection manager, but I also kept the SAS OleDB connection manager in my BIML code and gave both the same name with a different number at the end (easier to replace later on).

BIML Connection Managers

Because the SAS OleDB connection manager isn't used in the BIML code it won't be created by the BIML engine. To enforce that, I used a second connections tag between </Tasks> and </Package>. It also lets me give them nearly the same GUID (easier to replace later on).

BIML Force create connection managers

The end result of the BIML script:

A whole bunch of packages that use the SQL Server database as a source (instead of SAS DB)

Two connection managers with nearly the same name and GUID (SAS OleDB and SQL OleDB)

3) Search and Replace

Now you must open all generated packages by using View Code (instead of View Designer). When all packages are opened you can use Search and Replace to change the name and GUID in all packages. Make sure you don't replace too much that could damage your generated packages. Then save all changes and close all packages. Next open your packages in the designer to view the result.

Tip: you can use also the same metadata (and a big if-then-else construction) to create a derived column in BIML that casts all float-columns to the correct datatypes (int, date, decimal, etc.).

Sunday, 31 July 2016

A while ago I did a post on the SSIS (2012/2014) Feature Pack for Azure. Now a new version has been released for SSIS 2016 and they made some minor changes.Download
If you start SSDT 2015 you will see a greyed out Azure toolbox section. If you select it you will find a download link. Download and install both the 32bit and the 64bit version.

Download link in the toolbox

Installation
Installation (of both 32 and 64bit) is very simple Next, Accept, Next, Install and Finish.

Feature Pack for Azure Setup

New toolbox items

New features: 1 compression
The Azure Blob Source and Destination now support (de)compression (GZIP/DEFLATE/BZIP2), but the upload and download task don't.

Azure Blob Source

Azure Blob Destination

New features: 2 Storage Connection Manager
The new Storage Connection Manager now also supports the new Storage Account (and also the Classic Storage Account). The old storage account uses the REST API and the new storage account uses the Azure Resource Manager (ARM) API, which is wrapped in PowerShell. Nearly the same, but managed differently. The new Storage Connection Manager also has an extra option to choose an other domain (Azure moon cake).