A basic approach to SQL Server administration


Files in SQL Server need to grow as the database grows, and in very specific circumstances need to be shrunk as well, with additional maintenance. There are quite a few differences between log files and data files, so I created a separate post for Shrinking Database Data Files.

When should you shrink log files

First, if a large process that will never happen again caused your log file to grow to a ridiculous size then it’s reasonable to shrink the file down to about 150% of the most space you expect to use in the next year. Cleaning this up will also cut down on restore times.

Second, if your log file grew in small increments then you’re left with a large number of VLFs like I talked about in file growths. The only way to get rid of these is to shrink your log files down then have them grow in larger increments. When I do this, I’m also growing them manually immediately afterwards in the same script.
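As a rough sketch of that shrink-then-regrow step (the database name, file name, and sizes here are placeholders, not recommendations):

```sql
-- Hypothetical database MyDB with log file MyDB_log.
USE [MyDB];

-- Shrink the log as far as the inactive VLFs allow (target size in MB).
DBCC SHRINKFILE (MyDB_log, 1024);

-- Immediately grow it back to the intended size in a large increment,
-- so the new space is carved into fewer, larger VLFs.
ALTER DATABASE [MyDB]
    MODIFY FILE (NAME = MyDB_log, SIZE = 8000MB);
```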

What happens when you shrink log files

The log file will find a group of empty VLFs and delete them. The end, your file is shrunk.

Of course there’s a little more to it than that. The parts I’m most worried about are why did you need that space in the first place, why do you need the space freed up, and what happens when you need it again?

To understand why you needed it, we need to talk about what a log file really is for a second. It is every change made to your database since the last point in time where it didn’t need all the changes saved off. This is typically either the oldest open transaction (so it can roll back any change if the transaction fails) or, if your database is in full or bulk-logged recovery, the last time you ran a log backup. It can also be due to other functionality such as database mirroring, but you typically have an experienced DBA on hand for that.

Why is this common to do?

There is typically at least one point in every large database’s life where the log needs to be shrunk for one of a few reasons.

Going off of the two scenarios above, the first two reasons below are cases where a large process ran that won’t run again. The third reason I go over here is small VLFs.

1. Log backups weren’t set up

The first large process is when a database is set up in full or bulk-logged recovery, but the log backups weren’t set up until the drive filled. This is because the log needs to be backed up before it can be flushed, which means it will grow until it runs out of space. It’s common to miss this when the backup plan needs you to specify which databases to back up, as opposed to doing all databases. Once you take the first log backup it leaves you with a huge log file where you don’t expect to hit that size again, and it’s ok to shrink it.

2. Runaway transaction

The second large process is when someone runs something that didn’t go as planned. A transaction, even an implied transaction for a single statement, stayed open forever, or close enough to forever to cause us a headache. This caused a ton of excessive growth in the log files, and that space will only ever be used when someone says “oops” again. In this case, evaluate if you’ll need that space for another process before you shrink because someone will say “oops” again someday. If the drive is dedicated to log files and you only have one large database, consider shrinking it just enough to clear the warnings in your monitoring software.

3. VLF cleanup

The final reason is because you have too many VLFs. When a log file is created or grows it splits up the new space into smaller chunks called Virtual Log Files that SQL Server will cycle through when writing logs. If the database is growing in small increments it creates tons of these, slowing SQL Server down. The point here is that you resolve the issue by shrinking the log (delete smaller VLFs), then grow the log to the desired size in larger chunks (create bigger VLFs). The details on finding what size your VLFs are, what size new VLFs will be, and just about everything else VLF related is in my post Database Log VLFs.

What happens when log files grow?

If you’re going to shrink a log file then you’re increasing the chances that it will have to grow again, so we need to talk about growth for a second.

When a data file grows it can often take advantage of instant file initialization where it just reserves unallocated space on the disk, making it a quick process. Without this being turned on or if you’re using Transparent Data Encryption, SQL Server would have to zero out the disks to get rid of data that used to be there. Kimberly Tripp (b|t) goes into a lot more details than this in her post Instant Initializations – What, Why and How?.
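If you’re unsure whether instant file initialization is enabled, SQL Server 2016 SP1 and later expose it directly; this check is a sketch, and the column only exists on those versions:

```sql
-- 'Y' means the engine can skip zeroing new data file space.
-- On older versions, enable trace flags 3004 and 3605 and check
-- the error log after a test growth instead.
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE 'SQL Server (%';
```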

Log files can’t use this feature. They write those 0’s every time the log grows, and SQL Server is patient enough to wait for that to happen. Here’s how that went over the last time I overheard one of my users “discussing” this with my biggest server.

SQL Server: My log’s full, hold on while I get some more space.

User: I don’t care, I just want my query to finish.

SQL Server: I’m told to grab 8,000 MB at a time, please be patient.

User: Not a problem, there’s 8,000 MB right there, please hurry.

SQL Server: I only use clean space, hold on while I ask the OS to write about 8,388,608,000 zeroes for me. This isn’t a problem, I’m patient.

User: What? NO! Well, at least that gives me enough time to open a ticket for the DBA.

Ticketing System: Hello again, I haven’t seen you for several minutes.

The moral of this story is that you don’t want autogrowth to happen. You want it turned on for emergencies and one-time growths, but it’s better to grow manually in a maintenance window. Here’s how that maintenance window conversation goes:

DBA: I see you’ve hit 90% utilization on your log a couple times, how about we add 8,000 MB.

SQL Server: Oh, I didn’t notice, but I can add that space now. I only use clean space, hold on while I ask the OS to write about 8,388,608,000 zeroes for me. This isn’t a problem, I’m patient.

User: Zzzzzzzz….

DBA: There’s no rush, everyone can use the existing log space while you work on that.

Ticketing System: Zzzzzzz…..

The 8,000 MB I used here isn’t a made-up number, it’s a best practice due to VLF sizes on larger databases. Most DBAs set autogrowth to less than 8,000 because they’d rather deal with smaller VLFs than irate users, and I’d recommend 1,000 MB for autogrowth on larger databases. To be fair, that number drops to 500 MB starting in SQL 2012, but that’s still a lot of waiting during peak hours.

On smaller databases you aren’t going to want to grow log files by 8,000 MB off hours, or even 1,000 MB for autogrowth. The growth should be something rather significant for the database yet size appropriate, and I typically like to do close to 50% of the current file size. So, if I have a 500 MB log, I’ll set it to 250 MB autogrowth and do manual growths of the same size. For smaller databases where 50% log growth would be less than 100 MB, I’m not always worried about manually growing the log files since there are diminishing returns on my time and effort.
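For example, a 500 MB log on a smaller database could be set up like this (names are illustrative; raising SIZE manually and setting FILEGROWTH are the two separate knobs discussed above):

```sql
-- Manual growth: bump the log from 500 MB to 750 MB in one step,
-- and set autogrowth to the same 50%-of-original increment.
ALTER DATABASE [SmallDB]
    MODIFY FILE (NAME = SmallDB_log, SIZE = 750MB, FILEGROWTH = 250MB);
```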

My post Database Log VLFs gets into the details of why those are best practices and what it means to you.

Entry-Level Posts

Tim Ford (b|t) noticed how we moved to advanced topics and asked us to remember where we came from in his Entry-Level Content Challenge. This is my second post to take on his challenge, and I hope others will take him up on this!



Virtual Log Files (VLFs) split a physical database log file into smaller segments, which are required for how log files work in the background. These are created automatically, but automatically doesn’t always mean perfect. Here’s a practical view of what you need to know and how you can set them up properly.

What VLFs Do

My initial view of how a transaction log worked was that each change was written to a log file, and, as the documentation of those changes was no longer required, it was deleted from the log file. I think this is a common way to view it, and it’s close to the truth. However, it’s not close enough for a conversation on VLFs.

What really happens is that changes are written to the first VLF, which is just a segment of the log file. When that VLF fills up, it moves on to the next VLF in a systematic order.

Each change made to the database is assigned a Log Sequence Number (LSN), and SQL Server keeps track of the oldest LSN it still needs for any purpose. This purpose can be many things, but is typically the oldest LSN of the following:

Oldest LSN on a log backup (full or bulk-logged recovery)

LSN at the start of the oldest active transaction

LSN last replicated when using certain types of replication

Here’s my extremely complicated script to tell you what that reason is for your database:

SELECT name
, log_reuse_wait_desc
FROM sys.databases

If the last LSN in a VLF is older than what you need to keep, that VLF can be reused. This makes it very efficient because it becomes something like a set-based operation for SQL Server: the entire VLF is cleared and marked for reuse at once instead of line-by-line.

To help you picture how the VLFs work, someone at Microsoft drew you a picture on Transaction Log Physical Architecture. Then you can look at the results of DBCC LogInfo, and it will make a lot more sense when you see a VLF on each line there along with its status.
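To see them yourself, DBCC LOGINFO works everywhere, and SQL Server 2016 SP2 and later add a documented DMV; either way you get one row per VLF:

```sql
-- Older versions: one row per VLF for the current database.
DBCC LOGINFO;

-- SQL Server 2016 SP2+: the same information from a documented DMV.
-- vlf_status 2 means the VLF is active and can't be cleared yet.
SELECT vlf_begin_offset, vlf_size_mb, vlf_status
FROM sys.dm_db_log_info(DB_ID());
```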

Only One Log File, Dedicated Drive

Adding more than one log file on a single database isn’t going to help your performance. SQL Server is writing to a single VLF at a time, and that single VLF is part of a single log file. If you have multiple log files then you’re writing to one while the other sits idle. Once you fill up the last VLF on the first file it starts writing to the second file while the first sits idle. If these are on separate drives that means each drive has to be able to handle the I/O, but it’s hit or miss if you’re using the I/O which leads to wasted resources and inconsistent performance.

As for how you’re writing to the log, most of the work done is writing to the tail end of it, so spinning disks do really well just keeping the head in one place and writing to the file. That is, unless you have other types of files on this disk, so the head writes a little to the log, jumps over to write a little to a data file, writes a bit more to the log, then jumps over to read a couple indexes. This is a lot of where the recommendation to keep your data and log files on separate disks comes from, but there is more to it than I’ll get into here.

However, we’re going to use RAID for redundancy making the drives too large for just a log, then we’ll put it on a SAN with a write cache, and do so many other things to make it more complicated. If a server is extremely stressed and highly critical, look into dedicated spindles, SSDs, or other options. On general shared disk arrays, it’s nice if you can have an array dedicated to just log files, even if it’s log files for multiple servers.

VLF Size Matters

When you’re writing to VLFs there can be two issues. First, the VLFs are very small and SQL Server is jumping all over the place maintaining small files and figuring out where it should write the next entry – this is common and a big performance hit. Second, the VLFs are too large and SQL Server is reserving a ton of space when only a small part of the tail end of the VLF is in use, then trying to clear it all at once – this is less common and less of an issue.

In addition to this, VLFs each need to be processed when restoring or recovering a database. Recovering is part of restarting SQL services, so you should be hitting this issue at least once a month with your Windows Updates. Every VLF adds some overhead to this process, and a huge number adds a lot of overhead to lengthen this process.

Details of this are on the Microsoft Customer Service and Support blog post How a log file structure can affect database recovery time, and it includes this eye-opening quote, “The first phase of recovering a database is called discovery where all the VLFs are scanned (in serial and single threaded fashion) before actual recovery starts.”

Are your log files pre-sized to take up the entire drive even though they only ever use 5% of that space? Is that hurting you on recovery?

How Big Are My VLFs?

I have a script for that. This is set to filter only logs that I want to look at, but you can comment out the WHERE clause on the final statement to see it all. Also, it would be extremely rare for me to look at individual VLFs, so this is only looking at the sums and averages for each file.
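The script itself isn’t reproduced in this copy; a minimal sketch of the same idea, using sys.dm_db_log_info (SQL Server 2016 SP2+) for per-database counts and averages, might look like this:

```sql
-- Sums and averages VLFs per database; the HAVING clause plays the role
-- of the WHERE filter mentioned above. Comment it out to see everything.
SELECT d.name AS database_name
    , COUNT(*)            AS vlf_count
    , SUM(li.vlf_size_mb) AS total_log_mb
    , AVG(li.vlf_size_mb) AS avg_vlf_mb
FROM sys.databases d
CROSS APPLY sys.dm_db_log_info(d.database_id) li
GROUP BY d.name
HAVING COUNT(*) > 50
ORDER BY vlf_count DESC;
```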

What’s the Right Size?

The most common thing you’ll see on the internet is people talking about having too many with the limit being 1,000 VLFs and ideally at 50 VLFs. I’m a little different and focus more on size than number. The concept I’m going for is that excessive context switching is what causes me grief, and a reasonable size VLF will keep that under control.

Now there’s no magic number that will work noticeably better in all cases, so I won’t give you static numbers and say they’re magic. What I can say is that too small is basically fragmentation with the overhead of context switching, too many increases recovery time, too big is huge chunks, and too few could cause extra growths as unused space is still being reserved.

It’s easy to visualize how smaller and excessive numbers of VLFs can cause issues, but that doesn’t mean huge VLFs are the way to go. Tammy Richter Jones (b) gets into details with her post If > 50 VLFs are bad, then 16 is good, right?. I’d love to see more on the effects of huge VLFs to have a more complete understanding of whether a 1 TB log file is better off with 50 20 GB VLFs or 2,000 500 MB VLFs. I’m not convinced that >50 is bad.

Personally, I like formulas a lot more than static numbers, especially since they allow me to avoid extremes without stressing out. The formulas I used broke it down into three categories:

Below 50 VLFs, this is a reasonable number, they get a free ride no matter how big they are.

Between 50 and 250 VLFs, the average size in MB has to be at least 1/3 of the count of VLFs.

For a file with 100 VLFs, the average VLF has to be over 33 MB.

Over 250 VLFs, the average size in MB has to be at least 1/2 of the count of VLFs.

For a file with 500 VLFs, the average VLF has to be over 250 MB.

This is good enough for almost any database. I don’t care about there being a couple small VLFs mixed in there because I’ll spend most of my time in the large ones. I don’t care about the numbers getting larger because the average VLF is large enough to avoid too much context switching. You’re not out to micromanage, so take a relaxed approach to this unless you have a reason to not be relaxed.

What’s Default?

Remember how I said there were no magic numbers? Well, that doesn’t mean there aren’t wrong numbers.

By default, every database is based off of model, which has a 1 MB log file growing in 10% increments. So 1/10th of a MB for the first growth, which is just a tiny fragment. If you leave it this way on large databases you could end up with a 1 TB log trying to grow 100 GB at a time, which your users will notice.

By default, you get the worst of both worlds! The goal here is to avoid extremes, so you’ll want to change the growth rates away from the default.

How Do I Change the Size?

There’s only one way to change the size of your VLFs. Delete them and recreate them.

For every database EXCEPT TempDB, this means shrinking the logs (it deletes VLFs to get to the size you want), then growing the logs again. Both manual and automatic growths split the new physical space into VLFs; how many VLFs each growth creates depends on your version of SQL Server.

So if you have too many VLFs you can shrink your log file. It will delete VLFs that are not in use to make it smaller. Then grow the file in increments that give you VLFs sized how you want them. If you’re going for 500 MB VLFs then you grow your file 8,000 MB at a time.

The VLFs that weren’t deleted in this process, because they were in use or you didn’t shrink the file as far as you could have, will not be affected. This means you’ll have different sized VLFs throughout your file, but that doesn’t really matter. What does matter is that you don’t have any ridiculously large VLFs and you’re spending most of your time in properly sized VLFs.

Here’s how I do it. Shrink the log as much as possible. Shrinking is never as easy as it should be, so verify it shrank a significant amount. If it didn’t, take log backups and try again, repeating this process a handful of times if needed. Then, if I want the log to be 32,000 MB, I just grow it by 8,000 MB 4 times.
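Spelled out for a hypothetical MyDB with a 32,000 MB target (the backup path and file names are placeholders):

```sql
USE [MyDB];

-- Free up VLFs first, then shrink as far as possible.
BACKUP LOG [MyDB] TO DISK = N'X:\Backup\MyDB_log.trn';  -- placeholder path
DBCC SHRINKFILE (MyDB_log, 1);  -- repeat backup + shrink if it barely moved

-- Grow by 8,000 MB four times so each growth creates 16 VLFs of 500 MB.
ALTER DATABASE [MyDB] MODIFY FILE (NAME = MyDB_log, SIZE = 8000MB);
ALTER DATABASE [MyDB] MODIFY FILE (NAME = MyDB_log, SIZE = 16000MB);
ALTER DATABASE [MyDB] MODIFY FILE (NAME = MyDB_log, SIZE = 24000MB);
ALTER DATABASE [MyDB] MODIFY FILE (NAME = MyDB_log, SIZE = 32000MB);
```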

Assuming you were able to shrink this down to 10 VLFs remaining and you want it to be 32,000 MB, this would give you 74 VLFs with all new ones being 500 MB. It goes over the 50 VLF limit I’ve seen elsewhere, but I’m very happy with these sizes and numbers.

With 8,000 MB growths the new 2014 calculation won’t kick in until the log was already 64,000 MB, and at that point an 8,000 MB VLF probably isn’t a bad thing.

Autogrowth Rates

I recommend manually growing larger log files between 8,000 and 16,000 MB at a time, but not autogrowths. The process that causes a file to grow, and any other process that needs that space, will sit around and wait for an autogrowth to complete. That could take a minute on decent disks, which is longer than many timeouts I’ve worked with.

Manually growing files is best, but almost every database will rely on autogrowth. Know your timeouts, know how long it takes to grow a file, and size it accordingly. I like to do 1,000 MB here, but will go smaller if I have to. If you can’t make a decent-sized growth based on this, be more paranoid about manually growing files while leaving autogrowth set to smaller sizes.

The reason for 1,000 MB is that this is the largest size that will still give you 8 VLFs instead of being split into 16 VLFs, with the actual limit being at 1,024 MB. If you did 2,000 MB, you’d still end up with 125 MB VLFs. You can get that up to 250 MB each at 4,000 MB, but you’re telling your users to wait while 4,000 MB is zeroed out before they can move on. Although the calculation changes for larger logs on SQL 2014, I’d still stick with 1,000 MB for a good autogrowth size.

Note, there is a resolved bug that would cause issues when log files grew in 4 GB increments, so you’ll see a lot of people using 8,000 MB instead of 8,192 MB. We aren’t using an affected patch level anymore, but DBAs are made to be paranoid. We’re not just lazy at math, although this doesn’t rule out that possibility.

TempDB’s log is recreated at its configured size every time SQL Server starts, so TempDB never has too many VLFs on startup, but is this too few? What if TempDB’s log is 40 GB, do you want a 2.5 GB VLF? Is setting up TempDB to have 5 log files that are 8,000 MB each so it starts up with 500 MB VLFs a better idea?

Seriously, someone answer this one, I’m curious myself.

Changes Have Risk

The script above creates scripts to make changes. It’s not perfectly safe to make those changes. Test it, understand it, and be careful running it.

One of the most important pieces is that the scripts it generates will try to shrink the log as much as possible before growing it again. In terms of cleaning up VLFs, this is perfect. In terms of trying to run this during the day, it risks transactions failing because they’re trying to grab the next VLF as you’re running a shrink job to delete all unused VLFs.

You might be able to get away with this during the day most of the time, but it’s best to wait for a maintenance window to be safe.

Tony Rogerson (b|t) at the time of this writing has only put one post on his new blog, Transaction Log Concepts: Part 1. If this is his typical quality of work, we have a lot to look forward to. I’m personally looking forward to the other 3 parts of this 4 part series.

Test It

If your database is absolutely critical to the point that squeezing an extra couple milliseconds of performance throughout the day would be noticed or if a server coming back online 15 seconds faster would save thousands of dollars, test it. See how it performs in load tests and service restarts using your test servers and your database.

If you have a 1 TB log file and verified it needs to be about that big, what happens when you size it so you have 2,000 VLFs that are 500 MB each? What happens when you have 100 VLFs that are 10 GB each? Did it change recovery, any replication you’re doing, backup times, etc.?

Know that this can make a difference and that this is just another configuration change you can test out. Crazy numbers on either side can be bad, and perfect numbers in the middle aren’t going to come from some stranger on the internet that never saw your server.


I see TempDB using more memory than I feel it should and found a way to resolve it. Previously I dove into technical details and questioning what I saw in my post TempDB Memory Leak? which is great for proving and debugging an issue. Now I want to step back and talk about it from a practical perspective.

To keep it practical, I’m going to walk through a single server, step-by-step. What I saw that caught my eye and how I applied my workaround. This is a real, production server with screen shots from Idera Diagnostic Manager for both before and after comparisons.

What I’m Seeing

A large portion of my buffer pool is being utilized by TempDB pages which are unallocated on disk. The easiest way to see this is this quick query:
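The quick query didn’t survive into this copy of the post; a sketch of the idea, totaling buffer pool usage by database so TempDB’s share stands out, would be something like this (telling allocated from unallocated pages takes the longer query in my Query the Buffer Pool post):

```sql
-- Buffer pool usage per database, in MB (pages are 8 KB).
SELECT CASE database_id
         WHEN 32767 THEN 'ResourceDb'
         ELSE DB_NAME(database_id)
       END AS database_name
    , COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;
```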

Since these are all pointing to real, but unallocated TempDB pages on disk, I was able to develop a workaround of simply having fewer TempDB pages on disk. AKA, I made TempDB data files smaller.

Sizing TempDB

Here we’re talking about being hurt by TempDB being too large, but we can’t lose sight of TempDB being too small hurting you as well. Specifically, when your server is the busiest with a small TempDB it will have to stop and grow the files, then deal with any fragmentation that process caused. So, while I’m not going to recommend filling your dedicated drive, I’m not going to recommend making TempDB tiny, either.

My recommendation is to monitor your server, find out how much space TempDB uses, and make your TempDB significantly larger than that. The longer you have watched your server, the better. The goal is still to avoid almost any growth possible, but I’m adding in a “don’t go crazy” clause.

You can do this yourself as I talked about in Monitoring Database and Table Sizes using the function call FILEPROPERTY(<filename>, 'SpaceUsed'). If you do it this way, make sure you have this running for a minimum of several weeks, at least once an hour, day and night. It’s possible and not difficult to do, but it’s not recommended for servers that have a reasonable monitoring system already in place.
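A sketch of that check, run against TempDB (FILEPROPERTY returns the count of 8 KB pages in use for a logical file name):

```sql
USE tempdb;

-- Compare each data file's allocated size to the space actually used.
SELECT name
    , size * 8 / 1024 AS size_mb
    , FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024 AS used_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
```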

If you’re doing this to your production servers you should already have monitoring software that’s watching file sizes for you, even if it’s only tracking this in the background with no obvious way to find the information in the GUI. This is the situation I find myself in using Idera Diagnostic Manager. The data’s there, but the GUI is set up for more immediate concerns. I need this query to get what I want out of it.

My recommendation is to take the peak size outside of a one-time process and make the total TempDB data files at least 150% of that size. If you have 4 TempDB data files and the peak size used is 4 GB, 150% of that is 6 GB, so make each file at least 1.5 GB. Then set autogrowth to something reasonable because one-time processes will happen.

My Before Observations

All of the screen shots and trending information are from Idera Diagnostic Manger, but you should be able to get the same basic information out of any monitoring software.

I had a server with rather low PLE and a lot of physical I/O.

The physical I/O is difficult to see here because this server has a lot of writes. When you look closer you can see that it rather regularly went to 400 page reads per second.

All of those reads caused lots of waits on the server. In my 12-hour sample period I saw a total of 34,000. The units shown for this field are fine for comparison reasons, but they appear to be a total of the ms/s on the chart for values collected every 6 minutes, not the total waits. You have to query the tables behind Idera Diagnostic Manager to get the real values. In this case, the total waits were 138 minutes for shared and 49 for exclusive latches, for a total of 187 minutes of waiting for the 12 hours.

Just seeing this I wanted to request more memory on the VM, after all it only had 16 GB. However, I made it a rule to never add memory without justifying what’s in memory first. To do that, I used my query on Query the Buffer Pool, and this was the first line in the results:

Of the 8.5 GB of memory used by the buffer pool, about 6 GB of that was used by unallocated space in TempDB. I can accept some because of how TempDB caches temp tables and other stuff, but to have 70% of the buffer pool used up while I’m seeing low PLE and high I/O is not acceptable.

My Fix

I had the files set to practically fill the dedicated drive according to best practices, so each of the 4 files was 7,000 MB adding up to almost 28 GB on a 30 GB drive.

Using the query in the Sizing TempDB section above, this server has never used more than 1 GB of data in TempDB at one time. While that makes for some interesting questions on how it’s using 6 GB of cache, it does give me a lot of room to reduce the size.

There were no complaints about performance due to what this server is being used for, so I didn’t try to shrink TempDB. Instead I took the patient approach and resized TempDB, which took effect the next time SQL Services restarted. Then I waited for our scheduled server reboots for Windows Updates.

Note that my recommendation above would have been to make the files at least 1.5 GB and I made them 2 GB here. I’m not worried about exact numbers, I’m worried about being reasonable sizes that also avoid growths. With peak observed usage at 1 GB and TempDB sized at 2 GB, any growths should be very rare.
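The resize itself is a handful of ALTER DATABASE statements; logical file names vary by server, so treat these as placeholders:

```sql
-- Four data files at 2 GB each with 512 MB autogrowth.
-- A smaller SIZE on TempDB takes effect at the next service restart.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 2048MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = temp2,   SIZE = 2048MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = temp3,   SIZE = 2048MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = temp4,   SIZE = 2048MB, FILEGROWTH = 512MB);
```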

Overall, the data files will grow at 512 MB at a time, which, when it happens, is large enough to avoid extreme fragmentation yet small enough to avoid extreme waits. This should only happen during “I didn’t mean to do that” or “how’d it get that execution plan” moments, but it will happen.

My After Observations

After the change I waited a couple days to let the cache stabilize after a reboot. We rebooted on Saturday, and all of these numbers are from the Tuesday after.

When I looked again, TempDB was using 1/10 of the space it was before for unallocated pages in memory. It’s not perfect, but it’s much better.

TempDB using less cache allowed PLE be a lot more stable, regularly staying well above 2,000. Using Jonathan Kehayias’s formula, PLE should be above 1,050 on a server with 14 GB of cache, so this is looking good.

With data staying in cache more, physical I/O also dropped. Before it was normal to see spikes to 400 page reads per second, now you’re seeing about half of that.

Less activity also meant fewer waits. The couple spikes still went almost as high, but those are things you’ll fix more through tuning than memory management. The totals at the bottom are what I’m focused on right now; the PageIOLatch_xx waits combined added up to about 11,000 for the day, so about 1/3 of what it was before.

Again, I don’t like how Idera Diagnostic Manager calculates the totals. When I queried the tables I saw that there was 60 minutes of waits on shared latches and 13 minutes on exclusive latches, for a total of 73 minutes. This is in comparison to 138 + 49 = 187 minutes before.

The Downside

Even with the new, smaller size, TempDB is a lot larger than it needs to be so it would be very rare to see any growths in the near future. However, I have to keep in mind that the load on my servers will always be increasing, so I may end up hitting autogrowth down the road. Then I’ll reboot the servers each month with Windows Updates, it’ll shrink TempDB back down for me, and it will have to grow again the next month.

Because of this, I’ll need to visit every one of my servers from time to time (semi-annually?) to see if this is occurring and increase the size of TempDB if it is. I’ll use the same script for determining the peak sizes because I have software that’s capturing that already, but I could find other methods to get that information if I needed to.

The only other downside I see is with servers that hit a high peak TempDB usage overnight then never come close to it again during the day. This workaround won’t do much for those servers. You need that space overnight, so trying to make the size smaller than what it needs at 2 AM will just make everything worse. However, knowing about this will drive you nuts enough to look into the overnight processes to see if you can lower peak usage. Perhaps you can spread out some concurrent tasks or tune some bad queries that never mattered due to their timing. It’s not a bad thing to do this work, it’s just that this probably wasn’t a good time for something to climb up your priority list.

Take a Look

Take a look at your servers. How much cache is TempDB using? Does TempDB ever even use half of the space allocated to it? Is it causing issues on your servers?

Never make a change just because you saw someone say it’s a good idea, especially if they’ve never seen your servers before. Look at the numbers on your servers for yourself, ask questions to understand it, and make the best decision for those specific servers.

If you do make this change, check TempDB before your next reboot. Did it have to grow? If it did, you need to change the size manually so it doesn’t have to hit an autogrowth every month. If you shrunk it down to 2 GB total then it grew to 2.5 GB, consider making it 3.5 GB to avoid future growths while keeping the files reasonably sized.

If this was an issue on your servers and you feel SQL Server shouldn’t have functioned this way, please visit the Connect Item on this. An up-vote is great, a comment or anything else is even better.


All bloggers started as beginners. We picked things up, learned them, and got to the point where we wanted to start sharing our knowledge. The problem is that most of us moved on to talking about advanced topics and left people to fend for themselves on the difficult entry-level learning curve. My blog, Simple SQL Server, was actually created with me saying I was going to fill that gap, right up until the time I figured out the best way to learn advanced topics was to write things down as I learned.

Entry-Level Content Challenge

It’s very simple. Create one blog post a month that someone just learning your job could understand and benefit from, use the #EntryLevel tag on your blog and notification on Twitter, and link to Tim’s original post.

This is for everyone, I don’t care what job you have. I’m a SQL Server DBA, Tim’s a SQL Server DBA, but every profession has beginners that could use some help.

We’re Outnumbered

One of the things, if not THE biggest thing, you hope to do with a blog is to make a difference. Then you talk about advanced topics to help other people near your level keep moving forward. You also realize that teaching the details on these topics accelerates your learning and confidence faster than you thought possible. It’s great; I recommend everyone does it.

However, there are a couple problems with this logic. By the time people feel confident enough to blog, they aren’t talking about the fundamentals anymore. Also, for every advanced person in any field, there are many, many more entry-level people who could benefit from learning the fundamentals properly from the start.

Bloggers aren’t talking to the largest portion of their potential audience, who just happen to be going through the most difficult part of the learning curve.

It’s the Right Thing To Do

In no specific order, here’s why it’s the right thing to do. This list contains selfish reasons, moral reasons, and everything in-between.

You want to reach a larger audience, and there’s more people who need this content.

You were a beginner once and learned off of others.

You wish you had learned the fundamentals properly from the start, but you didn’t always get that chance.

You learn everything you teach better, even if you consider it simple.

Tim challenged you, and it’s hard to say no to a challenge.

You always wanted to make a difference.

New Bloggers

This is also a great opportunity for new bloggers. Talk about subjects you know well, even if you’re not at the level you feel comfortable teaching advanced topics.

The content is the only difficult part of blogging. Setting up a blog and creating a post is practically identical to setting up a new email account and sending an email. If you want to get into blogging and need help, let me know.

If Einstein Can Do It…

Einstein said “If you can’t explain it simply, you don’t understand it well enough.” I took this quote to heart enough to name my blog after it. Try explaining the fundamentals simply; you’ll see how much your understanding of the topic has to increase to accomplish that feat.


It’s easy to query XEvents to see some of the basic info in deadlocks, including the system_health session which is already capturing this information by default.

This script reads from system_health by default, but changing a single parameter at the start lets it read from any active session. The script pulls the file name automatically, which is why the session has to be active; otherwise you’d have to make many changes to the script.
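The general shape of a script like this is sketched below. It is a simplified version, not the full script: it finds the file target’s path from the DMVs for an active session, then reads deadlock events out of the .xel files. The session name is a parameter, and the XPath values follow the standard system_health deadlock event layout:

```sql
-- Sketch: read deadlock graphs from an active XEvent session's file target.
-- Defaults to system_health; change @SessionName to read another active session.
DECLARE @SessionName sysname = N'system_health';
DECLARE @FileName nvarchar(260);

-- The session must be running so the file path can be read from its target.
SELECT @FileName = CAST(t.target_data AS xml)
        .value('(EventFileTarget/File/@name)[1]', 'nvarchar(260)')
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t
    ON s.address = t.event_session_address
WHERE s.name = @SessionName
  AND t.target_name = 'event_file';

-- Assumes the default file naming (name_0_nnn.xel); wildcard it to read
-- all rollover files instead of just the current one.
SET @FileName = LEFT(@FileName, CHARINDEX(N'_0_', @FileName) - 1) + N'*.xel';

SELECT x.event_xml.value('(event/@timestamp)[1]', 'datetime2') AS event_time,
       x.event_xml.query('event/data[@name="xml_report"]/value/deadlock') AS deadlock_graph
FROM sys.fn_xe_file_target_read_file(@FileName, NULL, NULL, NULL) f
CROSS APPLY (SELECT CAST(f.event_data AS xml)) AS x(event_xml)
WHERE x.event_xml.value('(event/@name)[1]', 'varchar(50)') = 'xml_deadlock_report';
```

The deadlock_graph column is the same XML you’d get from a deadlock graph file, so you can save it off or open it in SSMS.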

System_health session

The system_health XEvent session is turned on by default, captures deadlocks, and is limited to four 5 MB rollover files. You can change all three of those things, but it’s not common for people to do so.

Yes, you’re limited to more recent events, but the information is there. If you find the limitations are too much, you have a couple options that should be considered in this order.

Is anything else capturing deadlocks? For me, I have Idera Diagnostic Manager capturing deadlocks as well for my production servers. This is my main source for looking at deadlocks, so I actually rarely use system_health for this purpose in production. The first goal is to avoid doing anything additional.

Consider setting system_health to use bigger files (not too big) or have more rollover files. The more you do the more resources it will take to read it all at once, so don’t go crazy. However, if you have 5 days of data when you wish you had 7, it’s not horrible to double the size of data you retain. The second goal is to increase disk usage before you add more overhead.
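If you do go the bigger-files route, the event_file target has to be dropped and re-added to change its settings; a sketch with example sizes (the existing .xel files stay on disk, but the session starts writing a new file):

```sql
-- Sketch: double system_health retention by re-creating its file target.
-- Defaults are max_file_size = 5 (MB) and max_rollover_files = 4.
ALTER EVENT SESSION [system_health] ON SERVER
    DROP TARGET package0.event_file;

ALTER EVENT SESSION [system_health] ON SERVER
    ADD TARGET package0.event_file
    (SET filename = N'system_health.xel',
         max_file_size = 10,        -- MB per file (example value)
         max_rollover_files = 8);   -- number of rollover files (example value)
```

Remember the warning above: the more you retain, the more resources it takes to read it all at once.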

Consider a separate XEvent session. Ugh…I wish I didn’t have to throw this out there as an option. Love your servers, don’t add more overhead if you don’t have to. The last resort is to not end up here at your last resort.

Idera Diagnostic Manager

At the time of me writing this, I have a script to look at all of the deadlocks in Idera DM that I use quite regularly for my production servers. However, they just came out with a new major version and I want to make sure my query runs there before posting my script publicly. If you don’t care and just want the script, email me.

Viewing the Deadlock

The details of this script give you a great overview of the deadlock. You know the code involved on each side, the tables and applications involved, etc. It’s enough to look for trends, know if a deadlock was one you expected, and other basics.

However, this is just a summary list of your recent deadlocks; it’s not the details. For me, I’m very happy just looking at the XML output to see the details. This is how I look at it from XEvents and my monitoring software. That makes me odd, and I’m ok with that.


Normal DBAs will sum up shrinking a database data file in one word: NO!

I’ll talk to you about this subject because I’m not normal. Still, my goal is that you’ll go from “why can’t I shrink my files” to “I don’t want to shrink my files.”

Truth be told, even when I HAVE to shrink files, I don’t want to.

Basically, you needed the space at one point in time, so you’ll probably need it again. Unless you changed a process or something extremely rare happened, you’ll use the space again and there’s no reason to take on the issues involved with shrinking just to have a file grow again. While this advice is mostly the same for log files, it’s different enough that I made a separate post on shrinking database log files.

When should you shrink data files

First, a process changed where you’ll never need that amount of space again, such as the database used to keep 10 years of history but now an automated process keeps it pruned to 6 months. This change also left you with a very significant amount of free space that you need for another purpose.

Really, that’s about it.

What happens when you shrink data files

The most common way to shrink a file is to have it reorganize pages before releasing free space, so I’ll cover that. It basically has two steps:

1. Fragment the file as much as possible (killing performance)

2. Truncate the end of the file

By “reorganize,” it means SQL Server takes the tail end of the file, beyond the size you asked to shrink it to, finds every populated page of data after that point, and moves each of those pages to the first available spot earlier in the file. When the end of the file is empty, it truncates it.

It’s an expensive process that leaves a big mess behind, but there are times you’ll need to go through with it (kicking and screaming). If you do, schedule an index maintenance job immediately after this to undo most of the damage you just did and make sure the server still loves you.

Think about this: you’re considering kicking off a process where you have to tell the server you’re sorry afterwards. You better have a good reason for this. A 100 GB database with only 2 GB used may be a good reason to shrink. A 100 GB database with 75 GB used is normal and healthy.

If you are going to shrink, make sure the database won’t be using that space again. If you have a 100 GB database with only 2 GB used, does it populate, process, and purge 90 GB in an overnight process? Find out before you shrink.
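A quick way to find out how much of each file is actually used is to compare allocated space to used space with FILEPROPERTY; a sketch, run in the database you’re checking:

```sql
-- Sketch: allocated vs used space per data file in the current database.
-- size is in 8 KB pages, so dividing by 128 converts to MB.
SELECT name AS logical_name,
       size / 128 AS allocated_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';  -- data files only; log files are a separate topic
```

Check this at more than one time of day before deciding; a file that looks empty now may fill up overnight.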

Manual Shrinking

If I haven’t scared you off yet (and I hope I did) then here’s how you shrink.
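The command itself is DBCC SHRINKFILE, run against the one file you mean to shrink. A minimal sketch, with a hypothetical database name, logical file name, and target size:

```sql
-- Sketch: shrink a single data file to a target size, stated in MB.
-- 'YourDatabase' and 'YourDatabase_Data' are placeholders; find the real
-- logical name in sys.database_files.
USE [YourDatabase];

DBCC SHRINKFILE (N'YourDatabase_Data', 10240);  -- target size: 10 GB

-- The shrink moved pages around and left indexes heavily fragmented,
-- so schedule index maintenance immediately after this.
```

The target size is a request, not a guarantee; the file only shrinks as far as the populated pages allow.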

There is a Shrink Database option as well. I don’t use it and don’t recommend it. You know which files you want to shrink; don’t shrink them all. When this is used, it’s typically because people want to shrink their data files, but they mess up their log files while they’re at it. Be nice to yourself; don’t use Shrink Database.

What about the Auto-Shrink option

No.

Just no.

It’s not a feature, it’s a threat.

If you have a certain amount of free space, it will automatically shrink your database, causing all the problems I just discussed, and it will probably kick in during your busiest time of day. Then it won’t follow that up with index maintenance, so it leaves you hopping all over your disk to find data. The best part is that it didn’t check whether you would need that space again, so it’ll probably grow tomorrow to get that space back.
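Checking for and turning off auto-shrink is quick; a sketch with a placeholder database name:

```sql
-- Sketch: confirm whether auto-shrink is on (1 = on, 0 = off), then turn it off.
-- 'YourDatabase' is a placeholder.
SELECT DATABASEPROPERTYEX(N'YourDatabase', 'IsAutoShrink') AS is_auto_shrink;

ALTER DATABASE [YourDatabase] SET AUTO_SHRINK OFF;
```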

In the example above of a 100 GB database that only uses 2 GB during the day but populates 90 GB for overnight processing, those 2 GB will be horribly fragmented, you’ll spend the resources shrinking, then you’ll take the performance hit growing the database every night as it puts the growth anywhere it can, which typically means physical file fragmentation as well. In the meantime your monitoring software will show that you have tons of free space, right up until you get the disk-full error. You can auto-shrink that database and get all those issues, or you can just let it do its job as you do yours without a maintenance nightmare.

To be fair, we all word our feelings on this differently. I say no, Tom LaRock says never, Paul Randal says Turn It Off!, and Brent Ozar is colorful enough to come up with the term Hamster Wheel of Death.

As of the time of this writing, no one has used the term “Magical Unicorn” to describe this feature.

Sum it up

1. Don’t shrink data files.

2. If you didn’t listen to #1, why?

3. Follow it up with index maintenance.

4. Every time you try to automate this process a unicorn dies.

Entry-Level Content Challenge

I’m taking Tim Ford’s (b|t) Entry-Level Content Challenge, and this is my first post with the #EntryLevel tag. The challenge is to have one post that earns that tag each month, but I hope to exceed that.


On February 6, 2016, Cleveland is hosting a free training event for SQL Server. This has a lot of the great stuff from the big, paid events, and skips some of the negatives.

There’s a great team from the North Ohio SQL Server Users Group that took on a very difficult task to put all of this together. They hand selected every presentation, being forced to turn away over half of the abstracts and many speakers. What they came up with was a variety of topics that span the breadth of SQL Server and the experience levels of the professionals working with databases.

What is this SQL Saturday?

For those of you who have never been to a SQL Saturday, here’s how this one works. Other SQL Saturdays will vary from this slightly, but not too much.

There are 6 training rooms with hour-long presentations running simultaneously in each. You move from room to room between each session choosing the ones that apply to you the most. The schedule for the sessions is posted in advance, and it’s impressive.

An optional $10 lunch, from which it’s typically very obvious that no one is making a profit, is the closest thing the event has to an entrance fee. The stuff you’d expect to cost you is provided by volunteers and sponsors who pull together the venue, presenters, and community for you.

There’s time to network with vendors, attendees, and speakers before the event and during breaks and lunch, and you’ll often end up in conversations that last through the next session. This differs from the big events in that it’s much smaller and more personable, with most attendees being relatively local. The people you meet at a regional event like this are going to be more relevant to your life because they’re part of your local community.

The event at Hyland Software concludes with prizes ranging from $100 gift cards to new laptops being handed out to random attendees. Yes, you can basically get paid to go to free training, which I still don’t quite understand.

Then there’s the after-party, also provided by the SQL Saturday volunteers and sponsors. There’s food, fun, and more time to get to know some great people.

The venue can’t hold everyone who should come, and often can’t hold everyone who wants to come. Register now to make sure there’s room for you, and make sure to cancel if your plans fall through.

What is SQL Saturday?

Not just this event, what is SQL Saturday as a whole? It’s a program specifically set up to help everyone in the community to develop.

The obvious people benefiting are the attendees, who get a chance to see all the sessions and meet the vendors. You can see from here how they would benefit, but not quite how much until you’re at the event.

The less obvious are the speakers, some speaking publicly for the first time. As one of the speakers, I can personally say that I’ve grown more as a person and as a professional than I thought possible. It’s a step I highly recommend, and one I’m very grateful to have taken with SQL Saturday.

The even less obvious are the vendors. They’re speaking to the public, not just their established customers, getting candid feedback on what works, what doesn’t work, and how they can make their offerings better. Even if they don’t make a single sale from attending the event, they still come out ahead.

Thank you:

These events are held around the world, with one or more occurring almost every weekend of the year, and most major cities host one annually. See SQL Saturday’s website for a full list of what’s coming up. If your city isn’t coming up soon, look at past events, because most cities hold theirs about the same time every year.

Who’s speaking?

Where do I start? Well, by listing EVERYONE on the current schedule, along with the blogs and twitter handles I’m aware of.

Note that the schedule may change some before the event, but it never changes too much.

What do we ask of you?

Show up.

Despite the amazing list of speakers, we’re all volunteers. That means we do it because we want to do it, and truly enjoy it. If you want to help us enjoy it even more, be involved. Talk to us and the other attendees, ask questions, and give feedback.

If you want to absolutely blow us away, get in touch afterwards and let us know the difference it made for you. Yes, we want this; it will absolutely make our day! Let the organizers know what you think of the event. Tell the presenters how you were able to use their advice at work, and even ask questions down the road. Get in touch with the vendors to discuss their products, even if you aren’t planning a purchase, to give them real feedback and see what’s out there. I used to think this was imposing, but now I realize this is one of the most awesome compliments you can give.

The most important part….

Show up!

Hope to see you there!

UPDATE – 2016-01-13 – On the waiting list

If you registered before today, you’re in. If you’re registering now, you’re on the waiting list.

If you can’t make it, please cancel your registration so someone on the waiting list can get in. We’ll miss you, but at least there will be someone else we won’t be missing.


Post navigation

Consulting

I am now working for Plus Consulting on a team that provides database performance, business intelligence, and database administration consulting services. In the past there have been requests for me to help during business hours that I had to decline, and I'm glad to say that's no longer the case.

Category Pages

I wrote a lot about these categories. Since I had trouble finding my own posts, I organized them and wrote a quick sentence or two about each one for my personal use. Then I decided I should share these pages, too.

Two pieces of advice:

Give thanks

If you benefit from something, say something. If you would speak up when something wasn't done right, you should do the same when it is done right. Saying something is easy to do, yet for some reason it's harder to follow through on than it should be.

I'm personally working on this. Specifically, if I would visit someone's cube when they didn't resolve a ticket as quickly as I expected, then I should also visit their cube to say thank you when it's done quicker or better than expected.

Give back

My career benefited greatly from the work others made public, so I thought I should do the same...it was a selfless concept. Then I realized how much I learned by putting my knowledge together in posts and presentations. I have yet to write a post or prep and deliver a presentation where I didn't learn more, even on topics I thought I knew well enough to do everything off the top of my head.