We are just a bunch of geeks who love working in IT and are making the most of the transition that the industry is currently undergoing. This is the place where we learn and collaborate on our DevOps journey.

Monday, 17 October 2016

Who Ate All The Disk Space? (Yeah, It Was BizTalk)

This post is for newbie BizTalk users who have installed their environment using the Next, Next, Finish approach.

Today you have your shiny new development sandbox* with a nice chunk of disk space. Tomorrow you will have slightly less disk space. Next week a little less still. In a month, no disk space. This will be a bad place to be.

There are a number of things that eat up disk space in the world of BizTalk:

BizTalk uses a number of SQL Server databases at its heart. These databases are NOT backed up by default and you should NOT use database maintenance plans or third-party backup software to back them up. You'll see why in a minute.

BizTalk normally reads messages, processes them, delivers them and then deletes them. It keeps the databases surprisingly small. However, if your BizTalk solutions have issues, the messages will be "Suspended", which means "saved to the database until the problem is fixed". At this point your database is growing.

BizTalk normally reads messages, processes them, delivers them and then deletes them. This is great until someone asks you what happened to a message from, for example, the last fiscal month. At this stage you will turn on various tracing options, add logging to your process and add archiving ports to keep copies of messages. At this point you are likely to persist multiple copies of every message flowing through BizTalk which is going to get very big, very quickly.

Before you do any other work on your sandbox, get your house in order. I would heartily recommend performing these steps as a minimum:

If you use the BizTalk Administration Console to switch on tracking of messages for any BizTalk component, use it sparingly and then switch it off as soon as you have diagnosed your issue (later we will see how debugging works and how to avoid using tracking anyway). Never use the tracking options in your production environment.

Create a folder, for example, C:\FileDrop, to be the single location for all incoming and outgoing files on your sandbox. Then create a Windows Scheduled Task to keep that folder free of old, junk files.

Configure the BizTalk SQL Agent jobs that the BizTalk installer created and then disabled.
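The folder clean-up from the second step could be as simple as a small script run on a schedule. Here is a minimal Python sketch, assuming Python is installed on the sandbox. The function name delete_old_files, its parameters and the seven-day age used below are my own illustrative choices, not anything BizTalk provides.

```python
import time
from pathlib import Path


def delete_old_files(folder, max_age_days, dry_run=False):
    """Delete files under `folder` last modified more than `max_age_days` ago.

    Hypothetical helper for illustration only. Returns the list of paths
    that were deleted (or that would be deleted when dry_run is True).
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds in a day
    removed = []
    # Build the full list first so we are not deleting while still walking.
    for path in list(Path(folder).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

A Windows Scheduled Task could then call something like delete_old_files(r"C:\FileDrop", max_age_days=7) once a day. Run it with dry_run=True first to see what would go. The same sketch could be pointed at a backup folder.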

The rest of this post covers what to do with the BizTalk backups and why.

First of all, BizTalk uses multiple databases. There are a number of reasons for this, including the theoretical option of putting the databases on multiple servers. The downside is that a normal backup script will back up the databases sequentially, meaning you have no single point in time to recover to. The backups will be out of sync and potentially useless in the event of a disaster.

Out of the box, BizTalk provides backup jobs to work around this complexity.

Backup BizTalk Server (BizTalkMgmtDb)

DTA Purge and Archive (BizTalkDTADb)

The BizTalk backup jobs deal with this using a SQL Server feature, marked transactions, which I'll refer to as checkpoint marks. The jobs take full backups of every database on a schedule and, in between, place a single named mark in the transaction logs of all the BizTalk databases at the same moment before backing those logs up. In a disaster you restore every database to the same mark, so they all come back at the same point in time.
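To see why a shared mark matters, here is a toy Python model. It is not how SQL Server or BizTalk implement anything: ToyDatabase, its methods, the database names and the message names are all made up for illustration. Each "database" is just an ordered log, a sequential backup copies whatever is there when its turn comes, and a marked restore replays each log only as far as the shared mark.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Hypothetical stand-in for a database: an ordered log of entries."""
    name: str
    log: list = field(default_factory=list)

    def write(self, value):
        self.log.append(("write", value))

    def mark(self, mark_name):
        self.log.append(("mark", mark_name))

    def snapshot(self):
        """Everything written so far, like a backup taken right now."""
        return [value for kind, value in self.log if kind == "write"]

    def restore_to_mark(self, mark_name):
        """Replay the log and stop at the named mark."""
        restored = []
        for kind, value in self.log:
            if kind == "mark" and value == mark_name:
                return restored
            if kind == "write":
                restored.append(value)
        raise ValueError(f"mark {mark_name!r} not found in {self.name}")


def compare_backup_styles():
    msgbox = ToyDatabase("MessageBox")
    tracking = ToyDatabase("Tracking")

    # The first message touches both databases.
    msgbox.write("msg-1")
    tracking.write("msg-1")

    # Sequential backup: the first database is copied now...
    sequential_msgbox = msgbox.snapshot()

    # Marked approach: the same named mark goes into both logs together.
    for db in (msgbox, tracking):
        db.mark("BTS")

    # ...work carries on while the sequential backup is still running...
    msgbox.write("msg-2")
    tracking.write("msg-2")

    # ...and the second database is only copied afterwards.
    sequential_tracking = tracking.snapshot()

    marked = [db.restore_to_mark("BTS") for db in (msgbox, tracking)]
    return sequential_msgbox, sequential_tracking, marked
```

Calling compare_backup_styles() gives two sequential copies that disagree about msg-2, while both marked restores stop at the same place.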

This is all pre-written for you in a bunch of stored procedures which are called from the two jobs I've mentioned above. However, the job steps are not configured automatically. You need to edit the steps to fill in a couple of parameters, namely the backup location and the retention period. Here's an example of how those steps may look for the Backup BizTalk Server (BizTalkMgmtDb) job.

exec [dbo].[sp_BackupAllFull_Schedule] 'd', 'BTS' , 'E:\Backup'

exec [dbo].[sp_MarkAll] 'BTS', 'E:\Backup'

exec [dbo].[sp_DeleteBackupHistory] @DaysToKeep=2

Here we are backing up to the E:\Backup folder. We are taking a full backup every day ('d') and using 'BTS' as the name for the backups and checkpoint marks. We are also deleting backup history records more than two days old; the backup files themselves stay on disk until you remove them.

These steps will back up all of your critical, frequently changing BizTalk databases. The stored procedures take more options but for a sandbox server these should do you well. For the "less critical" tracking databases here is an example of how to configure the DTA Purge and Archive (BizTalkDTADb) job.

-- Arguments in order: live hours (0), live days (1), hard-delete days (7), backup folder
exec dtasp_BackupAndPurgeTrackingDatabase 0, 1, 7, 'E:\Backup'

These jobs will keep your databases backed up. There are now a couple of final tasks you will need to do.

Create a Windows Scheduled Task to delete old backup files

Periodically shrink the databases if required (unlikely and usually not essential)

Finally, a little tip for you. How does BizTalk know which databases to actually back up when it runs these stored procedures? It has a list, stored in the database BizTalkMgmtDb. And, even better, it has an extra list (the adm_OtherBackupDatabases table) for adding custom databases to the overall backup job. Now you can keep your own solution databases in sync with the BizTalk databases for perfect point-in-time recovery from disaster.

* If you have wedged BizTalk on a pre-loved, tried and trusted sandbox see my upcoming post, "Why Not To Wedge BizTalk Onto A Pre-loved, Tried And Trusted Sandbox"

* There are actually two jobs which deal with "priority" and "non-priority" databases.