Proactive Database Administration

Index Maintenance

When we manage databases in either the FULL or BULK_LOGGED recovery models, we need to pay particular attention to the size of the transaction log files. If our processes aren’t optimal, we can see log files grow either out of control, or beyond what we think is a reasonable size.

Virtual Log Files

As I mentioned in a previous post, the transaction log contains an ordered sequence of the physical operations that occur on a database.

What I didn’t mention is that the physical transaction log file also contains logical sections, called virtual log files (VLFs). It’s the VLFs which actually contain the physical operations I mentioned. The purpose of VLFs is to allow SQL Server to more efficiently manage the log records; specifically, to know which portions of the transaction log are used, and which aren’t.

Knowing the portions of the log that are used is important when we go to take a transaction log backup, which creates a copy of all the transaction log records since the last transaction log backup. After a transaction log backup, as long as all the log records within a VLF are not part of an active transaction (or are required for some other feature, such as replication), the VLF can be marked as unused/inactive. This allows SQL Server to reuse that portion of the log file. This process is called many names, including clearing the log and log truncation. It does not affect the physical size of the log file.

Problems only start happening if all the VLFs end up being used, and none are available1. This means the log file (the physical file) has to grow to accommodate more transactions. A physical log file growth automatically creates new unused VLFs, and so the transactions can continue.

What is causing the log to grow after I shrank it?

Any write operation in the database can potentially cause the log file to grow. The real question is: why are all the VLFs used up when that’s not what is expected?

Here are some of the problems you might encounter:

Not taking transaction log backups frequently enough, or not taking them at all. The only way to mark VLFs as inactive is to take a transaction log backup (again, only for FULL and BULK_LOGGED databases). If transaction log backups are only taken once/day, besides exposing yourself to a full day of data loss, you also need a transaction log file large enough to hold all the transactions that occur for a full day. That could be quite a bit of space! The solution is obvious, and it’s a win for recoverability, too: take transaction log backups more frequently.

Overly aggressive and/or too frequent index maintenance. I’m certainly a proponent of index maintenance, but it’s very easy to get carried away. Maintenance plans that rebuild shiny new indexes for you every night are both ridiculously easy to set up, and also ridiculously bad for the size of the transaction log. Typically, rebuilding all the indexes of a database takes more log space than the data files actually take up. If you’re dealing with databases of even small-to-moderate sizes (say, 1 GB and larger), this can add up to a tremendous amount of wasted storage space, because all that transaction log has to be backed up at some point, and the log file will likely end up larger than the data files. What I strongly recommend doing is putting in place a much more selective index maintenance process, such as Ola Hallengren’s index maintenance scripts, which I use in production.

The database is in FULL or BULK_LOGGED when it actually should be in SIMPLE. Before you go flipping the settings around (which can potentially cause data loss), go read about choosing the right recovery model for the database. Typically, I see this happen when a production backup is restored to a development environment, and a backup gets taken for some reason. And then the transaction log ends up growing, and growing, and growing as developers test and roll back their work, right up until things screech to a halt when the drive that contains the log file has filled up.

A SQL Server feature, such as replication or mirroring, is enabled. Until the transactions have been copied to the secondary server, the records cannot be cleared from the transaction log. If the secondary is slow applying changes, or if you don’t even have a secondary any more and things weren’t cleaned up correctly, this could lead to infinite transaction log growth. To solve this really depends on where the specific problem is — solve that first, and then see if there is still an issue with the transaction log.

If none of those cover your scenario, you can run the following query, which will tell you why the transaction log is still in use.

1 Extra empty log space is pre-allocated such that there is always enough space for active transactions to be able to roll back. Regardless, that space is “used” in the sense that it’s been spoken for already.

What is Fragmentation?

When the logical order of a storage unit doesn’t match the physical order on a storage device, the storage unit is said to be fragmented. In other words, if there are storage units that contain A B C D E, but the physical order is D E C B A, this means the storage device may have to do random access instead of sequential access. This can be a big deal if the device is slow at doing random accesses vs. sequential, as is the case with hard drives of the spinning/magnetic variety.

Layer 1: Fragmented Indexes

This is the most granular level. Your database’s indexes are fragmented when the pages and/or extents are out of order within the data file. The amount of fragmentation can be measured by using sys.dm_db_index_physical_stats.

Solution: Install and periodically run an intelligent index maintenance solution, such as Ola Hallengren’s.

Layer 2: Operating System Files

A database is physically persisted in files in a file system. The files are fragmented when the allocation units that make up the files are out of order in the file system. This is the exact same type of fragmentation you can encounter when saving normal files on your computer.

Potential causes: Multiple file growths on the same drive (smaller growths make things worse); creating databases on an existing drive that is already fragmented; using a file system allocation unit size smaller than 64 KB (the size of one extent).

Solution: Periodically run Windows Disk Defragmenter on the drives that store the database files (this requires shutting down the database engine instance for the duration of the maintenance).

Layer 3: Virtual Machine Disks

If your Windows instances are virtualized, there is an additional layer of potential fragmentation, because the drives in Layer 2 are abstracted — the contents of the drives themselves are files in a file system. And like Layer 2, those files can potentially become fragmented.

Potential causes: Dynamically-allocated virtual disks that grow; creating a new virtual disk on a drive that is already fragmented.

Solution: Only used fixed-size virtual disks, or use a separate area dedicated to dynamically-sized virtual disks. For the fixed-size area, if things aren’t fragmented at all, nothing needs to be done. For the dynamically-sized area, periodically run Windows Disk Defragmenter on the drives that store the virtual disk files (this may or may not require taking the virtual machine(s) offline during the process).

As you can see, there are multiple layers where fragmentation can be introduced, and they all act at the same time. A comprehensive maintenance solution should cover all the layers in your environment. Generally speaking, a layer closer to the physical device is easier to handle, and it’s best dealt with when first setting up the instance of Windows and the database server itself (aka set it and forget it).

From a database administrator perspective, the biggest takeaway is to deploy an automated intelligent index maintenance solution, and proactively grow the database files as needed in larger chunks, instead of relying on small auto-growth increments.

He makes good points, particularly about taking full advantage of the ridiculously low price of memory right now. This makes external fragmentation a thing of the past, while the server is running. SSDs make fragmentation a thing of the past, full stop, so if you’re running those, you can abandon this article now. (See you in the next paragraph!)

I want to bring up an important point that wasn’t mentioned in the article: if your SQL Server instance is turned off, deliberately or otherwise, all caches, including the buffer pool are now cold, and have to be reconstituted from somewhere — that somewhere is from disk.

This is crunch time: the data needs to be read from disk as fast as possible into the buffer pool so it can be served directly from memory the next time it’s requested. This could be the most critical period of time if you have performance SLAs to meet, because as Brent mentioned, random read performance can be orders of magnitude slower than sequential reads1. Putting in place an intelligent index maintenance process helps the system perform more sequential reads, and thus improve performance. The longer the reads take, the more your SLAs are under pressure, or perhaps broken entirely.

That brings up another important point: if you’re ever negotiating performance SLAs, make sure to take cold starts into account during the discussion. The business may be okay with relaxed SLAs in the 10 minutes following a restart, for example. But maybe they aren’t okay with that: index maintenance could play a part in meeting the business requirements without investing in more expensive technologies. In any event, if you’re going to agree to something, this is definitely one aspect you want to put in writing. Moreover, the process should be tested — now, before you agree to anything, and periodically in the future — to make sure that you can keep your word, and give you a proactive look at how long you have until the SLA can no longer be met.

The last thing you want during a failover is a flurry of heated questions from a CIO standing over your shoulder, but if you have a written document you can point to that says “this is what was agreed,” and that’s how the system is performing, that’s the end of the discussion right there.

1 The transfer rate cited — under 2 MB/s random reads — is a somewhat misleading example because no one would set up shared storage with a 4 KB stripe size. At least I hope they wouldn’t… A more typical stripe size is 64 KB, which would yield a higher transfer rate than the cited number, but would still be much slower than sequential reads.