Performance impact: a large number of virtual log files – Part I

It is generally known that having a large number of virtual log files (VLFs) in a database transaction log is undesirable. A blog post by the Microsoft support team in Stockholm showed that a large number of virtual log files could seriously lengthen the database recovery time. Tony Rogerson also reported that lots of virtual log files were bad for triggers and log backups on SQL Server 2000.

This blog post explores two questions:

Is a large number of VLFs in a transaction log still a significant performance factor in SQL Server 2008?

Does a large number of VLFs have an adverse impact on common application operations such as INSERT, UPDATE, and DELETE?

As I’ll show in the rest of this post, the answers to both questions are yes. Let me first describe the test, and then present the test results.

The test databases

I created two databases, dbVLF_small and dbVLF_large, on the same SQL Server 2008 instance (Enterprise x64 Edition, build 10.0.1600). The host was an HP DL585 with four single-core 2.2GHz Opteron processors and 32GB of RAM (28GB of which was allocated to the buffer pool). The following table highlights the key properties of the two test databases:

Property              dbVLF_small    dbVLF_large
--------------------  -------------  -------------
Data file size        20GB           20GB
Data file location    E: drive       E: drive
Log file size         10GB           10GB
Log file location     F: drive       F: drive
Recovery mode         Full           Full
Number of VLFs        16             20,000

As you can see, the only material difference between the two databases was the number of virtual log files: dbVLF_small had 16 VLFs (a small number), and dbVLF_large had 20,000 VLFs (a large number). It's probably a bit extreme for a database to have 20,000 virtual log files, but it's not completely unreasonable if you let your database log file grow in small increments. Note that my intention is to highlight the impact in case your database does end up with a large number of VLFs.
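As a rough sketch of the dbVLF_small setup (the file paths and logical file names below are my assumptions, not taken from the post), the key point is that the full 10GB log is allocated in a single statement. In SQL Server 2008, a single log allocation larger than 1GB is carved into 16 VLFs, which matches the VLF count in the table above:

```sql
-- Illustrative only: paths and logical names are assumed.
-- Allocating the entire log up front yields a small VLF count (16 here).
CREATE DATABASE dbVLF_small
ON PRIMARY
    (NAME = dbVLF_small_data, FILENAME = 'E:\data\dbVLF_small.mdf', SIZE = 20GB)
LOG ON
    (NAME = dbVLF_small_log,  FILENAME = 'F:\log\dbVLF_small.ldf',  SIZE = 10GB);

ALTER DATABASE dbVLF_small SET RECOVERY FULL;
```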

The test table

An identical table, called customer, was created in both databases. The DDL for the customer table is as follows:
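The exact DDL did not survive in this copy of the post. Since the data was loaded per the TPC-C customer table (see below), a sketch along the lines of the TPC-C specification would look something like this (column list abbreviated, data types and index choice are my assumptions):

```sql
-- Assumed sketch of a TPC-C-style customer table; not the post's exact DDL.
CREATE TABLE customer (
    c_id          int           NOT NULL,
    c_d_id        tinyint       NOT NULL,
    c_w_id        int           NOT NULL,
    c_first       varchar(16)   NOT NULL,
    c_middle      char(2)       NOT NULL,
    c_last        varchar(16)   NOT NULL,
    c_street_1    varchar(20)   NOT NULL,
    c_city        varchar(20)   NOT NULL,
    c_state       char(2)       NOT NULL,
    c_zip         char(9)       NOT NULL,
    c_phone       char(16)      NOT NULL,
    c_since       datetime      NOT NULL,
    c_credit      char(2)       NOT NULL,
    c_credit_lim  numeric(12,2) NOT NULL,
    c_discount    numeric(4,4)  NOT NULL,
    c_balance     numeric(12,2) NOT NULL,
    c_data        char(500)     NOT NULL,
    CONSTRAINT pk_customer PRIMARY KEY CLUSTERED (c_w_id, c_d_id, c_id)
);
```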

3,000,000 rows—with data as specified for the TPC-C customer table—were bulk copied into the customer table in each database. The index was then rebuilt, and sp_spaceused showed the table size to be about 2GB.

The tests

Three simple tests were run against the customer table in each database. The common characteristic of the three tests was that each was performed as a single large transaction.
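The post does not reproduce the exact statements, but single-large-transaction tests of this shape would look roughly as follows (the predicates, the updated column, and the customer_copy staging table are illustrative assumptions):

```sql
-- Illustrative sketches of single-transaction insert/update/delete tests.
-- Each statement touches every row, so each generates a large amount of log.

BEGIN TRAN;
    INSERT INTO customer_copy        -- assumed staging table with same schema
    SELECT * FROM customer;
COMMIT TRAN;

BEGIN TRAN;
    UPDATE customer
    SET c_balance = c_balance + 10.0;
COMMIT TRAN;

BEGIN TRAN;
    DELETE FROM customer;
COMMIT TRAN;
```

Because each statement runs in one transaction, the log cannot be truncated mid-operation, and all the log writes land in the same (heavily or lightly subdivided) log file.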

The following chart summarizes the test results:

In all three cases, it was significantly (should I say dramatically?) slower to perform these common SQL operations when the database had 20,000 VLFs than when it had 16 VLFs. The insert was about four times slower, the update about eight times slower, and the delete about five times slower.

So the test results confirm that it is definitely a significant performance issue in SQL Server 2008 to have a large number of virtual log files in a transaction log. Moreover, the test results show that the problem can be felt by common SQL operations such as insert, update, and delete in large batch processes.

What does this all mean? First, pay attention to the often-repeated best practice advice: do not grow your database log files in small increments, because that is often how a large number of VLFs gets introduced into your transaction log. For instance, if you know you'll need 10GB for your log, it's best to allocate all 10GB in one fell swoop. Second, if you do find a large number of VLFs in your transaction log, it may be worth the maintenance effort to reduce the number. You can do so by first shrinking the log file to a minimum size, and then allocating the required space in one ALTER DATABASE statement. (SQL Server MVP Tibor Karaszi has a thorough discussion of shrinking log files on his site.)
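The shrink-then-regrow remedy described above can be sketched as follows (the logical log file name is an assumption; substitute the name shown by sys.database_files for your database):

```sql
-- Reduce the VLF count: shrink the log to a minimum, then regrow it
-- in a single allocation. Logical file name below is assumed.
USE dbVLF_large;

DBCC LOGINFO;   -- returns one row per VLF; note the row count before

DBCC SHRINKFILE (dbVLF_large_log, 1);   -- shrink the log toward 1MB

ALTER DATABASE dbVLF_large
    MODIFY FILE (NAME = dbVLF_large_log, SIZE = 10GB);  -- one allocation

DBCC LOGINFO;   -- row count should now be much smaller
```

Note that the shrink may not fully succeed until the active portion of the log has been freed, e.g. by a log backup, so the DBCC SHRINKFILE step may need to be repeated.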

------

By the way, the 20,000 virtual log files were generated as follows. First, create the database in the full recovery mode with a 512KB initial log file size and a 512KB file growth increment. Then, back up the database, and create a dummy table with a single char(8000) column. Finally, insert data into this table in a loop until the log file size reaches 5000MB. This is essentially the same procedure used by the Microsoft support team in Stockholm in their post.
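The procedure just described can be sketched as follows (file paths, logical names, and the loop condition are my assumptions; each 512KB growth adds a handful of new VLFs, which is how the count climbs into the tens of thousands):

```sql
-- Illustrative reconstruction of the VLF-generation procedure.
CREATE DATABASE dbVLF_large
ON PRIMARY
    (NAME = dbVLF_large_data, FILENAME = 'E:\data\dbVLF_large.mdf', SIZE = 20GB)
LOG ON
    (NAME = dbVLF_large_log,  FILENAME = 'F:\log\dbVLF_large.ldf',
     SIZE = 512KB, FILEGROWTH = 512KB);

ALTER DATABASE dbVLF_large SET RECOVERY FULL;
BACKUP DATABASE dbVLF_large TO DISK = 'F:\backup\dbVLF_large.bak';

USE dbVLF_large;
CREATE TABLE dummy (c char(8000));

-- Loop until the log has grown to ~5000MB through 512KB increments.
-- size in sys.database_files is in 8KB pages, so size/128 = MB.
WHILE (SELECT size/128 FROM sys.database_files
       WHERE name = 'dbVLF_large_log') < 5000
    INSERT INTO dummy VALUES (REPLICATE('a', 8000));

DROP TABLE dummy;
```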

To prepare for the insert, update, and delete tests, back up the log, and expand the log file to 10GB.

Linchi, I wonder if small transactions would see the same or similar reduction in throughput. If you delete/insert/update 1-100 rows at a time in a loop (with begin/commit trans??) will there be a net reduction in executions per unit time with many VLFs?

That works great. Now, when I see a large number of rows coming back, would this also indicate fragmentation of the database log? We have some servers with database files on a SAN, but currently most are on direct attached storage. In one of his previous blogs, Linchi mentioned the potential performance issue with fragmented files. I would like to get some information from SQL Server that can help me try to alleviate fragmentation and VLFs to improve performance.

Chris, you could have two forms of fragmentation problems. First, your OS files could be fragmented. This is common when DBAs leave the default settings for file size and growth, and it can have a significant effect on performance. A Windows file fragmentation checking tool is appropriate here.

The data structures inside the SQL Server files can also become fragmented, in two ways: pages can be in a less-than-optimally-full state, and pages can be out of order. These too can lead to performance problems. Much has been written, and is available online and in BOL, on diagnosing and correcting these issues.

Great post. I have never taken the time to do a comparative analysis like this for this example. Now I'll have one more tool to use when showing people why it's best to "right size" a transaction log file instead of allowing it to grow.

Where did you get the 3M rows from? Or did you use a custom routine to generate the data? I would like to leverage the same data within our environment to show what kind of performance degradation we are experiencing by running your test against our own database log files with varying numbers of VLFs.

Linchi, I may be late to the party here, but I have one glaring question about your technique. Did you reset the growth parameter on dbVLF_small to match dbVLF_large AFTER you created the 20k VLFs but BEFORE you ran the queries? My contention is that the operations were impacted by the file growth operations themselves, and not necessarily by the sheer number of VLFs. Although this doesn't explain the delete behavior.

In reply to Sean: if you read the article's paragraph starting with 'the 20,000 virtual log files were generated as follows', you'll see that all the growth was done in a loop before the timed tests, so file growth time cannot be a factor. But there is no mention of running a disk defrag after the repeated growth, and it might be interesting to see how much of the time difference was due to the larger number of VLFs and how much was due to disk fragmentation.