Backup and relax?

Keeping a good backup strategy is a must for every DBA and database developer. But just creating a backup isn’t enough. Perhaps you don’t have enough storage to store the full backups of your database for over a week. Or taking a full backup of a database takes so long, it’s only possible on weekends. So what are your options?

RPO and RTO
Your whole backup strategy starts by determining the RPO (Recovery Point Objective) and RTO (Recovery Time Objective). The great Brent Ozar (Blog | @BrentO) wrote a great blog post about these terms, and explains what they mean.

Basically it means that you need to determine what maximum data loss is allowed, and from there you start creating a backup strategy. So how do you determine these? This is how RPO and RTO look like if you visualize them:

Storage
Another thing you want to consider is the storage available for your backups. Most of the time the backups will be stored on a NAS (Network-attached storage), and not on the local server, so storage isn’t a big issue in that case.

I’ve also seen companies that created the backup on the local server, and after completion copied it to a network location. In my opinion it’s only one more dependency that you could prevent, but other than that it’s a valid option.

Backup types
SQL Server supports multiple backup options. They all have their pros and cons, and give you the ability to create a backup strategy that fits your needs. In this blog post, I’m assuming the database that we work with is created as a full recovery model database.

Full backup
A full backup takes a backup of the entire database. With this backup file you’ll be able to recover the entire database, without the need of extra log files. Creating a full backup can take more time, depending on the size of the database. Let’s visualize this with an example:

Looking at the example you’ll see that every night a full backup is created. But on Friday night 8 PM the database crashes, and we need to recover from backup. The last full backup was taken 20 hours ago, so those 20 hours of data changes are lost.

Differential backup
If you have less time to spend on backing up your database every night, one of your options is to take a differential backup. A differential backup only backs up data that is changed since the last full backup. A differential backup can’t be created without taking a full backup first. If you try to create it without a full copy of the database, SQL Server will throw an error:

When you create a full backup, a backup chain is started. This means that SQL Server registers which LSN (Log Sequence Number) was added to the last backup. When you take the next backup, the backup will contain all transactions from the last LSN of the previous backup until the time of your new backup.

A backup chain can’t be started with a differential backup. Also, when you want to restore a differential backup, you need the full backup it’s based on. To put this into perspective, look at the visualization below:

At midnight on Monday we create a full backup of the database. Every other day we create a differential backup at midnight. On Friday at 8 PM the database crashes, and we need to restore a backup. All we need is the full backup from Monday morning, and differential backup 4. Although it takes less time to create a differential backup, you see that this hasn’t helped you much. You still lost 20 hours of data.

Transaction Log backup
The last major type of backup is a transaction log backup. This backup contains all transactions that were executed after the last full or differential backup were created. This gives you the opportunity to perform the so called “point-in-time recovery”.

Just like the differential backup can’t be created without a full backup, a transaction log backup can’t be created without a full or differential backup first. So the transaction log backup can’t be used to start a backup chain. Let’s take the same example, and add a transaction log backup every 15 minutes (the blue lines represents the transaction log backups):

If the database crashed on the same time as the previous examples, your data loss is slimmed down from 20 hours to a maximum of 15 minutes. In order to recover your database, you need the Full backup created on Monday, the differential backup created on Friday, and all transaction log backups created after the differential backup at midnight. So if the database crash occurs a few seconds before the next transaction log backup, the maximum data loss is 15 minutes. Again, without a full or differential backup you can’t create a transaction log backup:

Pitfalls
Whenever you create a database, backup that database, and throw away that backup file, you can create a differential or transaction log backup. SQL Server doesn’t require the last full or differential backup (in case of a transaction log backup) to be present. So remember to always check if there is a valid backup available, either on the server or on your backup location.

Backup compression
From SQL Server 2008 onward, you can use a new feature called Backup Compression. Whether or not you’re compressing your backup can also make a big difference in performance. A compressed backup is smaller, so it requires less I/O when created, and can increase backup speed significantly. On the other hand, compressing a backup increases CPU usage. So it’s a tradeoff you need to consider. But in some cases, this could solve the problem of having a shortage on storage.

Files and Filegroups backup
Creating a file or filegroup backup can be practical when performance or database size is an issue for you. Perhaps taking a backup of your 500GB databases takes to long, and you need to consider other options.

You can backup all filegroups separately, but it’s also possible to combine a number of filesgroups in a single backup. This makes it easier to balance your backups over several disks when you’d like to. But perhaps it’s easier to create a backup that consists of multiple files. This can be achieved by adding more destination files at the bottom of the “create a backup”-GUI. SQL Server than balanced the data across the files you added.

Adding more destination files to your backup can also increase performance. Jes Schultz Borland (Blog | @grrl_geek) wrote a great article about that. She tested several options to see what the impact on performance is.

Copy-only backup
One of the most important backup options (in my opinion) is the copy-only backup. This allows you to create an ad-hoc backup without breaking the backup chain.

A copy-only backup works independently from any previous backup or backup plan. So a copy-only backup will not take the last LSN into account, or store the last LSN added to the copy-only backup you’re creating. So if you have a backup plan in place, and you or one of your colleagues needs to create an ad-hoc backup, copy-only is the way to go.

Now can we relax?
The short answer is: NO! Your work has only just begun. Now that you have a backup strategy, you need to build it, test it, tune it, and cherish it. Creating the perfect backup strategy isn’t a silver bullet. Your databases change, your processes change, your colleagues change…

So when is the last time you tried to restore a backup from your automated process? You can’t remember? Time to get it done than! You know what they say: A DBA is only as good as his last restore. So if you want to keep working as a DBA for your company, start preparing a test restore now.

And then…?
Once you’ve created a backup strategy, the hard work just begins. How are you implementing your backups? Are you planning on using the default SQL Server maintenance plans? Are you building something yourself with SQL Server Agent Jobs and SSIS packages? Maybe you want to buy a solution from a specific vendor you like or know? Well, what about a free solution?

It’s a solution used by many of our community members, and won a lot of prizes over the years. Not sure if it’s safe to use? Why don’t you look at the list of companies that use his solution, or read the blog post of Jonathan Kehayias (Blog | @SQLPoolBoy) about it.

Another great resource to start from is the TechNet page about backups. This contains a lot of information about the techniques behind the backup process, and the possible pitfalls you’ll encounter.

Conclusion
When creating a backup strategy, you need to take a lot of factors into account. What kind of hardware are you working with? Is the storage you need available? Is it possible to create a full backup every night, or only on weekends?

After building your (custom) solution, you need to spend time on tuning and maintaining it. Your databases aren’t static, and will change every second, every minute, every day. So keep changing your process to perform at it’s best, and in the end, you will create your own free time to spend on cool things.

4 Responses to Backup and relax?

Well put together post and thought out. There is a lot of information that will allow users to think more on their approach and how best to handle within their own shop. Nicely done Jeff. Look forward to the next blog post.