In my previous blog post I explained how we are going to leverage the power of transactional replication, to move your on-premise databases to Azure SQL databases (PaaS). This works like a charm, but you do end up with a bunch of system-generated objects in your database. So how do you get rid of these?

What “mess” is left behind after disabling replication
After you’ve completed your migration to Azure SQL DB, and cut the strings on your transactional replication, you end up with a bunch of system-generated objects. These are used by SQL Server to “replay” the changes on the subscriber (inserts, updates and deletes) that were made on the publisher. This is just an example of one of our smaller databases:

As you can see, there are 3 stored procedures per table (ins, upd, del), and some other objects used by replication.

Cleanup
Fortunately the SQL Server team gave us a way to easily clean up these objects: sp_subscription_cleanup (that, even though MS Docs says it won’t, WILL work on Azure SQL DB).

This stored procedure accepts the following input:

@publisher: The name of the SQL Server instance of the publisher
@publisher_db: The name of the publishing database
@publication: The name of the publication (this was configured in the setup/configuration of replication)

By running this stored procedure on your subscriber, it cleans up all remaining transactional replication components. But where can you find the input for this procedure?

First I was looking at MSreplication_objects, but this doesn’t work on Azure SQL db (both MS DOCS says so, and I’ve tested it). So my guess was that all the other systemobject mentioned in related articles wouldn’t work as well.

So what do you do when you can’t find the answer yourself after extensive Googling/Bing-ing? Yes, you call in reinforcements (in this case William Durkin (Website | @sql_williamd), who knows a lot about replication):

But as you can tell by the reply on William’s response, I quickly found the answer by just trying something out: MSreplication_subscriptions. You can query this table on your subscriber to retrieve the publisher- and publication-properties. Again: even though MS DOCS says it won’t work, it does work on Azure SQL db.

So by using the output of MSreplication_subscriptions on your subscriber, you can quickly find the information needed to run sp_subscription_cleanup.

Automating the script with C#
Running the script on 1 or 2 databases can be done manually. But because I migrated an entire data platform with replication, we’re talking about A LOT of databases. So instead of doing this all manually (you need to connect to every single database to check this), I decided to create a small tool to help me with this.

In all honesty, my C#-skills aren’t that great, but my response to that is: “If it looks stupid but works it ain’t stupid”.

The tool connects to the instance you provided, returns a list of databasenames, and uses this list to connect to every database individually to check for replication components:

The tool can be used to connect to any Azure SQL db instance, as long as you have 1 login/user that can connect to all database on that server. If your server supports that (it depends on the configuration you use of course), you’ll see a list that provides you with the following information:

For those of you who use Azure today, the security discussion must have been a thing on some occasion. Explaining to managers (and possibly colleagues) that Azure is a lot more secure than a(n) (on-premise) data center, and that Azure is easier to maintain and scalable. Trust me, we’ve all been there!

But besides the physical security, there’s also the digital security. In the world of today it’s easier to find a data-breach on the news, then it is to find an item about a bank robbery. So how can you secure your data in Azure in an easy but solid way, without the hassle of changing your applications?

Encryption could be one of your tools to achieve a secure infrastructure and/or applications. But encryption is a challenge for pretty much everyone. Almost every day we hear about companies not doing it right, or not doing it at all. But luckily, Azure helps us with setting this up with just the click of a button.

So how do you enable it? There are 2 ways to do so, but I’ll only show you the route via the Azure portal. Information on how to do this via T-SQL can be found here.

First, login to the Azure portal, and navigate to the database you want to encrypt. Click on “Transparent Data Encryption”, and just with a click of a button you can encrypt your data:

This will start the encryption process and, depending on the size of the database, after a while you’ll see that the data is encrypted:

This feature will allow you to encrypt your database, without any application changes. This is because the encryption and decryption is being handled in an “intermediate layer” by Azure. The data will be decrypted before returning it to the client, and the other way around it will be encrypted before it’s stored. So your applications will continue to work without any changes in the application-code or connectionstring(s) to the database(s).

Storage Accounts encryption
There is also an option to encrypt your Storage Accounts in the same way as TDE works for Azure SQL databases (without any application changes). When you enable this on your Storage Account, please remember that only the new data will be encrypted, and that the existing data won’t be encrypted until it changes. For more information on this, please read this article, and this MSDN thread.

When you’re creating a new Storage Account, you can choose to encrypt it right away:

But when you want to encrypt an existing Storage Account with data in it, you need to do it on 2 different levels (it’s a separate setting for BLOB and files):

This will encrypt your data with the same algrorithm as TDE for SQL Server will do: “All data is encrypted using 256-bit AES encryption, one of the strongest block ciphers available.” (source).

Conclusion
For us as a company, enabling this features means that all of our data is encrypted. We’re only sending and receiving data from within Azure, so the communication is also secure. And even though the majority of our data is public data (publicly available such as flight information, etc.), it’s a safe feeling to know that all our data is encrypted when stored.

In one of my previous posts, I described how to setup replication to an Azure SQL database. This works like a charm, and I still highly recommend using this when you want to migrate data from an on-premise server (or Azure VM) to a Azure SQL db (PaaS).

But in our environment, we use SQL Server 2016 and contained databases for some of our datasets. Unfortunately (but totally understandable), you can’t setup replication from a contained database. So how do you deal with this? For our use-case, I’ve written a script to automatically change the database from contained to non-contained. And because I’m probably not the only one who needs to do this, I’d like to share it with you.

Steps to take
It might sound a bit difficult, but when you think about it, it’s actually quite easy. To go from a contained database with a user to a non-contained database with a login, you need to take the following steps:

The script
Before you start using this, I want to warn you. Using this script on your server(s) or in production is at your own risk. This worked for me, but maybe it doesn’t on your server(s)!

In order to let this script work, you must be able to stop applications and queries from executing on your database. In our case, we can just stop the service(s) that use a specific database (that’s the advantage of micro services). The reason for this is that you must be able to obtain an exclusive lock on the database, in order to switch from contained to non-contained.

Conclusion
Even though I thought that using a contained database could be a big blocking factor for us in the process of migrating data to Azure, it really wasn’t that big of a deal. Especially now that we automated the process, it doesn’t add more than 5 minutes to the process of replicating data to Azure.

In the previous post I wrote, I explained how to setup replication from an on-premise SQL Server instance to an Azure SQL database. While doing this, I came across a very strange issue (or maybe even bug) when setting up replication.

The problem child
After working on reproducing the issue for a day, and trying to reduce the issue to a small-scale problem, I came to the conclusion that the problem was (probably) caused by a single primary key on a table in the database:

Creating publication & subscriber
The setup of the publication and subscriber wasn’t that difficult. As I said before, there are a few things you need to configure differently then you would do for SQL Server to SQL Server replication.

So I won’t talk you through the whole process again, but refer you to the articles instead.

Generate initial snapshot
Once the publication and subscription are in place, it’s time to generate the initial snapshot. The snapshot agent prepares the snapshot that contains the schema and data, needed to initialize the subscriber(s):

In some cases it takes a while, but in the end, I found this “warning” on my screen:

When you dig into this by opening the agents tab in the replication monitor, you see the actual error:

First I thought that the DATE column being part of the primary key was the problem. But then again, another table in a different database had the combination of a DATE and a UNIQUEIDENTIFIER as a primary key. So that couldn’t be it.

Digging deeper
During my investigation to this issue, I found the following:

– There was enough space on the disk for the snapshot
– Other database with DATE in primary key (PK) worked (even though combination was DATE + UNIQUEIDENTIFIER, instead of DATE + VARCHAR)
– Adding “-UseInprocLoader” to Snapshot Agent job doesn’t work

So after digging a bit deeper, and trying to reduce the issue to a small-scale problem, I found:

– Reproduced in an empty (new) database, with just the failing table as single object in the database –> STILL FAILS
– Changed object in test DB to use newly created IDENTITY(1,1) as PK, and added Unique Constraint to old PK columns –> WORKS
– Make DATE column part of different PK (together with IDENTITY), and no Unique Constraint –> WORKS
– Changing the PK to just the VARCHAR column –> WORKS
– Changing the PK to just the DATE column –> WORKS
– Setting database on local instance (SQL 2016 Dev) to comp. 110, just like on test –> WORKS

After a lot of different variables in the test-setup, I found out that it’s probably an old bug that wasn’t properly patched when upgrading the SQL Server engine to a newer version. Let me elaborate on that:

– The bug is reproducible on the test server, which is an upgraded engine from SQL 2012 or 2014 to SQL 2016 RTM
– The bug is reproducible on the production server, which is an upgraded engine from SQL 2014 to SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL 2014
– The bug is not reproducible on a clean install of SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL vNext CTP

Finding a work-around
Because I couldn’t find a work-around for this issue, I requested the help from Justing Langford (Blog | @JustinLangford) from Coeo. He pointed me to an article that describes a few possible work-arounds.

For me, adding a row filter for the table did the trick:

As described in the article mentioned above, adding the row filter disables the BCP partioning for this article (table), and the snapshot agent completed without any problem:

Conclusion
Although this bug should’ve been fixed ages ago, it looks like it (re)appeared again after an engine upgrade. I’m not sure how this happened, but all I know is that it took me about 3 days to find, reproduce, reduce and work around the problem.

For performance reasons we are looking for a way to split our write-operations from our read operations. A good way to do that is by duplicating the database onto another server (on-premise) or platform (Azure SQL DB). When you look at the options (Availability Groups (AG), database mirroring, clustering, replication, Azure sync groups, etc.) the easiest way to quickly duplicate a database and keep the data up-to-date at the same time is SQL Server replication.

Replication to another on-premise instance is easy. You just follow the steps in the wizard, it works out-of-the-box, and the chances of this process failing are small. With replicating data to an Azure SQL database it’s a bit more of a struggle. Just one single word took me a few HOURS of investigation and a lot of swearing…

Setting up replication and databases
Before you start with setting the whole thing up, make sure that the replication is installed on your instance. If this is not done yet, install the needed components via the installer. This component can be installed without rebooting the server, or the need to restart the instance.

If you needed to install replication, make sure you restart the SQL Server Agent. For some reason, the configuration change works on the instance without any problems, but the Agent can’t run the needed jobs until you restart it.

The next step is creating 2 databases: a publisher (on the on-premise instance) and a subscriber (an Azure SQL database):

Setup the publisher
Now that you have 2 databases, it’s time to setup the publisher. I’m not guiding you through the whole replication setup (there are plenty of tutorials out there that are WAY better than my explanation), but I’ll show you the highlights.

Here comes the part that took me a while to figure out. When you try to replicate changes to an Azure SQL database, and you use the normal connection, it will fail to connect in the end. But if you use a secure connection, replication works like a charm. So make sure you use the following server name ([Server Name].database.secure.windows.net):

Adding that single word took me an hour or so to figure out. Hopefully this will save you a few hours of investigation, looking for dodgy error messages, ending up in finding an “errorcode 40852”, and leading you to the solution.

When this is set up, select the target database (subscriber):

When you’re prompted for the distribution agent security information, make sure you use a SQL Server login that has the correct permissions on the Azure SQL database:

Start the replication
Now that everything is in place and configured, it’s time to create the initial snapshot. This snapshot will be used to create all the objects and data that are supposed to be replicated on the Azure SQL database. Open the snapshot agent by clicking “View Snapshot Agent Status” on the publication:

The snapshot agent hasn’t run yet, so press “Start” to start the process:

When this process has completed, you should see something similar like this:

Now it’s time for a final test. Insert a record on the source database (publisher):

INSERT INTO dbo.SomeObject
(SomeValue)
VALUES
('Test value')

You should see this record pop-up on the subscriber after a few seconds (default latency is 5 seconds):

SELECT *
FROM dbo.SomeObject

You can also check the “Synchronization Status”, and you should see a command being replicated:

Conclusion
Out of all the available options, I still think replication is the easiest way to replicate data into Azure. The only downside is that there doesn’t seem to be a lot of information about this, especially when you look at the “database.SECURE.windows.net” part.

If you need more than just the one-way traffic, or you want a bit more functionality, maybe replication isn’t your weapon of choice. But for me, it worked like a charm… Eventually…