Set up a Azure SQL Database project

Create a project with only Azure SQL databases

Since the release of version 17.5 we have the possibility of creating and using databases on a Azure SQL Database.

Update!

Since the release of version 17.12 we have had the possibility of deploying to a Azure Analysis Services tabular model. I made a guide for it, the bit regarding I have a Azure SQL Server running is what you can use for this.

What is needed

A full SQL server is also possible, but it is not needed to connect and create to Azure.

If you want to use the scheduler service, it is also important that you install TX DWA on a server, or computer that is always on. It could be a Azure virtual machine, like i used i my own test, or another form of server.

It is also necessary to know the Azure SQL server name and have a SQL user with the rights to create databases on it.

How to set it up

The first step is to create a repository on the Azure SQL database.

Add the server name, repository name, set it to use SQL server authentication and add the SQL user.

Press create and it will then give a visual representation of what is being done.

Create a new project, make sure SSIS is not selected, add a BU and a STAGE/DSA database.

Add the server name, use SQL Server authentication and add the same user. It should not be named xxx_STAGE, as it will be used for both DSA and DWH.

Create the database.

Attach a data source. It can be anything really, but in my test case i used a NAV database, that was conveniently placed on the same Azure SQL database.

Add some tables and fields. Then create a DSA schema. This is to split up the two parts of the database in DSA and DWH. It need to be set up as the "Main default schema".

Then add the Data warehouse. It is set up so it points to the database that were created earlier. Use the same server and user account. The only exception is that the Direct Read option is used. It is important that it is set to "Matching Server and Database". In Azure SQL Database, two tables cannot transfer between each other. As it will not be able to do distributed database queries.

Before the the fields from stage to DWH are attached, The other Schema needs to be set up, so you avoid tables having the same names.

Deploy/Execute. See that the Azure SQL Database set up.

Limitations

The use of SSIS is not available.

It is not possible to create OLAP servers on Azure Analysis Services. It does not work with Azure SQL Databases yet. It is going to be added in the future.

If you want to create a DWH database and STAGE database you cannot use direct read.

Alternative ways of using Azure Databases

Perhaps you want only the DWH table to be located on Azure. This is possible, as long as you do not use Direct Read and SSIS. You can also do it the other way, where it is STAGE that are on Azure and DWH is locally.

You can also set up multiple environments, have a Dev environment locally and a Prod environment fully on Azure.

Questions

Do you have questions regarding a way to use Azure and set it up? Please add it in the comments.

Yep, I saw that. I'm looking for details on the infrastructure requirements side, such a suggested number of CPU cores & RAM for the Azure VM. I'm guessing the number of cores will correspond with how many parallel processes you need to run?

Regarding the infrastructure requirements.You can split out the threads in as many as you have cores, but it is not a limitation in itself. If you want, you can split it out into 8, even though you have 4 cores. That said it is still a good rule to follow.

In this instance where the SQL server is located in another location, i would probably use a "Compute Optimized" server. How fast is up to you, but it seems to be the best fit.

Regarding the ETL process and no SSIS. In my article i use a middle step with the direct read that get used instead of the normal transfer. Other than that there still is the normal data cleansing procedure afterwards.

There will be a new article that will be about the differences between SSIS and ADO.Net.

When not using one database and schemas to split up the layers we have to rely on ADO.NET for data transfer between database. How much performance degradation can we expect compared with using SSIS in a typical server setup?