Technical content about Microsoft data technologies. All opinions expressed are purely my own and do not reflect positions of my employer or associates.

Taming the Log-Eating Defrag Beast

SQL Server Standard Edition can be a beautiful thing. Since failover clustering, 64-bit support, and the correspondingly big memory became available in Standard Edition with SQL Server 2005, it can handle most SQL workloads demanded by a small or medium business. Unless you are running a pretty huge system that truly requires Enterprise functionality like table partitioning or asynchronous database mirroring, it's often perfectly safe to run Standard x64 on a two-socket, multicore machine with enough RAM, and save a huge amount in license costs. Most of my work has been in the small/medium business space, and to be honest I've seen more cases where people shelled out for Enterprise and didn't really need it than cases where people stretched Standard Edition past its limits (with some exceptions). The most common blunder is paying to run Enterprise Edition on a storage system that's too slow: it can be far better to run Standard on adequate hardware than Enterprise on too few disks, and cheaper to boot. But I digress.

There is a pain point with running Standard Edition systems that you want available nearly 24x7: index maintenance. An offline index rebuild is a blocking operation and can interfere with your applications even in the wee hours. Online index rebuild is an Enterprise Edition feature, and that's costly. Likewise, parallel index rebuild is Enterprise only; on Standard Edition a rebuild can use only one processor core, so it can be much slower. See http://msdn.microsoft.com/en-us/library/cc645993.aspx for a summary of Standard Edition's limitations. The bottom line: you can't rebuild indexes on Standard Edition and keep your system fully online.

It is possible to defrag indexes (and manually update statistics) instead, and to do so without blocking. But there's a challenge: index defrag (now ALTER INDEX ... REORGANIZE) can produce a huge amount of transaction log, filling LDF files and even whole log disks, so you need to pull a bit of a stunt to really make it work. Otherwise your pager is sure to buzz in the middle of the night about a full disk, or worse, and nobody likes that!
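To make the contrast concrete, here are the two forms side by side (the table and index names are made up for illustration):

```sql
-- Offline rebuild: takes blocking locks on the table while it runs.
-- ONLINE = ON is not available on Standard Edition, and neither is parallelism.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

-- Reorganize: always online and interruptible, but fully logged; it can
-- generate a large volume of transaction log on a badly fragmented index.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;
```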

Here's what I have done with decent success to make index defrag on Standard Edition really work:

Make your usual Transaction Log Backup SQL Agent job.

Make one schedule for that job on a "normal" interval like every 15 minutes. Enable this schedule.

Make a second, special schedule for the same log backup job that runs every minute. Disable this second schedule.
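The two schedules can be created through the SSMS GUI or with msdb's sp_add_jobschedule. A sketch, assuming the job is named "MyDatabase Log Backup" (all names here are placeholders):

```sql
-- Normal schedule: every 15 minutes, enabled from the start
EXEC msdb.dbo.sp_add_jobschedule
    @job_name             = N'MyDatabase Log Backup',
    @name                 = N'MyDatabase Log Backup Schedule',
    @enabled              = 1,
    @freq_type            = 4,   -- daily
    @freq_interval        = 1,
    @freq_subday_type     = 4,   -- every N minutes
    @freq_subday_interval = 15;

-- Special defrag schedule: every minute, created disabled
EXEC msdb.dbo.sp_add_jobschedule
    @job_name             = N'MyDatabase Log Backup',
    @name                 = N'MyDatabase Defrag Log Backup Schedule',
    @enabled              = 0,
    @freq_type            = 4,
    @freq_interval        = 1,
    @freq_subday_type     = 4,
    @freq_subday_interval = 1;
```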

Name the two schedules carefully, so that you can enable and disable them by name programmatically. I use the names "<mydatabase> Log Backup Schedule" and "<mydatabase> Defrag Log Backup Schedule".
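Switching a schedule on or off by name is then a one-liner with msdb's sp_update_schedule (the schedule name below is a placeholder):

```sql
-- Turn the one-minute defrag log backup schedule on
EXEC msdb.dbo.sp_update_schedule
    @name    = N'MyDatabase Defrag Log Backup Schedule',
    @enabled = 1;

-- ...and back off again when the defrag work is done
EXEC msdb.dbo.sp_update_schedule
    @name    = N'MyDatabase Defrag Log Backup Schedule',
    @enabled = 0;
```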

Create a script, like the one that follows, to defrag indexes; it should enable the one-minute log backup schedule while doing its work, then disable it afterward. As an added precaution, have the script check how full the transaction log file is before proceeding.
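The shape of such a script is roughly the following. This is a simplified sketch, not the full script: the schedule name, fragmentation thresholds, and log-fullness cutoff are all illustrative, and real use needs TRY/CATCH handling so the one-minute schedule is disabled even if the defrag fails partway through.

```sql
-- 1. Safety check: bail out if this database's log is already more than half full
DECLARE @logspace TABLE
    (DatabaseName sysname, LogSizeMB float, LogSpaceUsedPct float, Status int);
INSERT @logspace EXEC ('DBCC SQLPERF(LOGSPACE)');
IF EXISTS (SELECT 1 FROM @logspace
           WHERE DatabaseName = DB_NAME() AND LogSpaceUsedPct > 50.0)
    RETURN;

-- 2. Enable the one-minute log backup schedule
EXEC msdb.dbo.sp_update_schedule
    @name = N'MyDatabase Defrag Log Backup Schedule', @enabled = 1;

-- 3. Reorganize fragmented indexes, one at a time
DECLARE @obj int, @idx int, @sql nvarchar(max);
DECLARE frag CURSOR LOCAL FAST_FORWARD FOR
    SELECT ps.object_id, ps.index_id
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    WHERE ps.index_id > 0                           -- skip heaps
      AND ps.avg_fragmentation_in_percent > 10.0    -- illustrative threshold
      AND ps.page_count > 1000;                     -- ignore tiny indexes
OPEN frag;
FETCH NEXT FROM frag INTO @obj, @idx;
WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @sql = N'ALTER INDEX ' + QUOTENAME(i.name)
                + N' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(@obj))
                + N'.' + QUOTENAME(OBJECT_NAME(@obj)) + N' REORGANIZE;'
    FROM sys.indexes AS i
    WHERE i.object_id = @obj AND i.index_id = @idx;
    EXEC (@sql);
    FETCH NEXT FROM frag INTO @obj, @idx;
END;
CLOSE frag;
DEALLOCATE frag;

-- 4. Disable the one-minute schedule again
EXEC msdb.dbo.sp_update_schedule
    @name = N'MyDatabase Defrag Log Backup Schedule', @enabled = 0;
```

Reorganizing one index per statement keeps the units of work small, so the frequent log backups have a chance to truncate the log between indexes.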

Test very carefully: the server where you run this must have enough disk space for the log file(s) to handle a little over a minute of defrag activity before the second job schedule kicks in and starts taking log backups.

If your testing works out, then schedule this process to run more frequently than you would index rebuilds (perhaps nightly). Why? Because with more frequent runs there's less defrag work to do each time, and the amount of log generated is more likely to be sane. If you leave a long interval between defrag runs and your indexes become severely fragmented, the next run has more work to do and generates more log records.

Disclaimer: this script has worked well for me, but it's somewhat complicated, and if implemented incorrectly will cause you pain and suffering. Please carefully read and understand it, and implement only after careful testing.


Comments

Generating the ALTER INDEX commands on a pretty big database takes a very, very long time. All default parameters should be replaced with table/index IDs. Have you considered using Ola's script to maintain indexes?

Oh, never mind, I see it now ... inside the loop? Why enable it with every pass through the loop? I would have expected you to enable it, then start looping through the fragged indexes. Maybe I'm missing something else.

@Bob - you're right. It'd be clearer ahead of the loop than inside. I think I was trying to avoid repeating the conditions in the 'while' -- i.e. if (there's work to do) enable the job; while (there's work to do) process the queue -- which is probably silly.