Pages

Sunday, 1 April 2012

How to Plan SQL Server Database Files

How to Plan SQL Server Database Files

One of the most important tasks you can do in SQL Server is to setup your
data and log files. Not getting these files setup correctly can be one of the
biggest causes for production problems whether it’s disk contention, space
usage, or something else. And honestly this level of planning is often
overlooked and by the time the problem is discovered the application and its
users have already suffered. So let’s get into some good discussion about how
to setup your database files.

Log files

We’re going to start with logs first because they’re the easiest and probably
the ones you’ll touch most frequently. And we need to talk about placing your
log files on disk first. In general, you’ll want to place your log files on a
different physical partition than your data files. This is for 2 reasons. The
first is disk contention. Every transaction has to write something to the log
file before it can write it to the data file and if both files are on the same
disk, then the disk arm has to work twice has hard because it has to jump over
here to write the log and then jump over there to write to the table. Putting a
log file on its own disk is also better for the performance of the log because
logs write sequentially so if the log file is on its own disk the disk arm has
very little moving to do to get to the next place it needs to write so it’s much
faster. The second reason you want data and log files on separate disks is for
recovery. If the data partition fails you’ll want the log on a separate set of
disks so you can still recover the logs and roll the last transactions forward
so you don’t lose any data.

Several log files

It’s a common misconception that you will get a performance gain by using
several log files. This is something we see quite often and it simply isn’t
true. Log files are written sequentially which means that each log file is
filled up before the next one is written to. So if you have 4 log files (Log1,
Log2, Log3, Log4), SQL Server 2008 will fill up Log1, then fill up Log2, etc.
This is different from the way data files behave and we’ll discuss that in a
minute. The only reason to have multiple log files on multiple partitions is
for space. You may need more disk than a single partition can provide. Oh, and
putting multiple log files on a single partition is just dumb. It gains you
nothing.

Data files

Now let’s talk about where you’ll place your data files. As we’ve already
said it’s a good idea to place them on a different physical partition from your
log files. And we know we already said that, but experience has shown us that we
could fill an entire page with that exact advice and a good portion of you still
won’t grasp the importance of it. So we’re really going to hammer that point
home in this article.
So while your data files are separated from your log files you may feel free
to have multiple data files on multiple physical partitions to take advantage of
performance gains. Now, we have to say again that these have to be physical
partitions because logical partitions are still on the same physical disk and
it’s the disk arm contention you’re trying to avoid. Data files, unlike log
files, use what’s known as an equal fill algorithm. This means that all of the
files are filled equally as much as possible. So let’s say you have 4 data
files (Data1, Data2, Data3, Data4). When you write to the database, SQL Server
2008 will write to these files in a round robin fashion and they should grow at
more or less the same rate. You can also place tables inside specific files so
you have a good level of control over how you split up the I/O workload in your
database. And while the number of files you need to optimize your workload is a
subject that’s up for debate in the community, for most systems it won’t make
that much difference. However, it is a good idea that no matter how many data
files you decide to have, to make them all the same size. This goes back to
that equal fill algorithm we were talking about before.

File growth

Another good topic is file growth. This is a mistake that many beginners
make. They accept the defaults for file growth and that’s just asking for
trouble. We’re going to give you some advice on how to set your file growth and
you can adjust it to suit your needs, but at least you’ll know the
arguments.

The best thing to do is to set both your data and your log files as large as
you ever want them to be. Depending on your version of SQL Server, it can be
very expensive to grow files so setting their size ahead of time can alleviate
performance problems before they even start. So if you’ve got a single
partition dedicated to your log file, then go ahead and make your log the same
size as your partition. If it’s dedicated then you’ve got nothing to lose. And
the same goes for your data files. If they’re on dedicated partitions (and they
should be), then you’ve got nothing to lose.

The next best thing is to set your files to autogrow by fairly large
predictable increments. The default autogrowth is 1MB for data files and 10%
for log files. I always grow data files by at least 1GB and often times even
more. If you’re going to suffer the expense of growing a file you don’t want to
do it 5 times a day, so make it worth your while. And 10% is too
unpredictable. As the log grows the 10% marker is going to get bigger too so
you’ll actually be growing your log more and more each time.

Set all your data files to grow at the same rate, and all of your log files
to grow at the same rate. Your data files don’t have to grow at the same rate
as your log files, but they should grow at the same rate as each
other.

Here we talked about separating your data and log files onto separate
partitions, as well as some of the theories on why we recommend the things we
do.