Building SQL Server Data Warehouse in Azure

Choose your options

Azure as a cloud service is gaining market share; 13 per cent as of Q1 2018 and second only to AWS with 33 per cent according to Synergy Research Group. Many companies are hoping on-board the Azure train as the platform becomes more mature and more services become available and cost effective.

Building a data warehouse using SQL Server database on Azure is one of the many use cases that’s gaining traction.

This blog explores and compare some key differences between the 3 options to build data warehouse using SQL Server database on Azure.

Azure SQL Services

In this SaaS offering, SQL Server is offered as service. This is a managed service. One thing to keep in mind is that there is limitation of 4TB database. For database >4TB, sharding methodology is required. Sharding databases makes infrastructure and application more complex.

Azure SQL DW

This SaaS offering is for high performance data warehouse. It uses massively parallel processing (MPP). Some limitation to this is that it does not support cross database queries and limited number of concurrent queries: for Generation 1 it supports only 32 while Generation 2 it’s bumped up to max of 160. This is also the most expensive offering. One feature that help alleviate the expensive cost is that you can pause to save cost.

SQL on Azure VM

This falls under Azure IaaS offering. There 2 ways to pay for SQL Server license for this option.

Bring Your Own License (BYOL)

You can do bring your own SQL license and install on the Azure VM. There is minimum SQL Server license cost of 16 cores that Microsoft will charge for this model. Additional licenses are sold in packs of 2 cores. Microsoft also requires annual software assurance cost equal to 1/3 cost of the entire license when installing SQL Server on any public cloud.

Bundle License with Azure VM

You pay the Azure VM with SQL Server License. In this choice you also have the option to reserve ahead of time of one, two, and three years. There is incentive for cost savings for reserving longer periods.

Below is a table that summarize some key features to compare the 3 options.

~$71,394/yr. This is based on single 16 core. 132 GB memory @$8.15/hour. VM included

~69,468 /yr for Gen 1 with only 500 DWU

~$135,704/Yr for cheapest Gen2 offering of 1000 DWU

Cross Database Queries

Yes

Yes

No

Performance -massively parallel processing (MPP)

No MPP

No MPP

MPP

Pause to Save Cost

No

No

Yes

Concurrent Queries

6400

6400

32 Gen 1, while Gen2 supports 160 but cost is 10 time for cheapest offering. See Cost row.

Security -Always Encrypted (AE)

Support AE

Support AE

Do not support AE

Security -Transparent Data Encryptions (TDE)

Support TDE

Support TDE

Support TDE

Note: * All Azure services cost are based on 24/day and 365day/year metering.

Summary

When considering implementing SQL Server data warehouse on Microsoft Azure weigh the 3 options mentioned above and choose according to what fits the use case and organization. Ask the following questions:

Do you have a large amounts of data ( >4TB to be stored in database)?

Do you require highest performance, is MPP required?

Do you need the ability to do cross database queries?

How many concurrent queries does the data warehouse need to support?

What is your operational budget and how much is cost a factor in your choice?

What are the security requirements for data to be stored in data warehouse?

Besides the keys ones above, there are many additional questions (not covered here) that should be asked as part of requirement gathering to decide on most suitable and cost-effective data warehouse architecture and design.

Note: The features and prices listed are as they were at the time of this writing…