The Hidden Costs of Row-Level Versioning

Row-level versioning (RLV) is a powerful new technology in SQL Server 2005 that supports concurrent access to data being modified. But using RLV increases the demand on your tempdb database. I'll show you how you can mitigate RLV's costs by monitoring the use of tempdb using some new tools included in SQL Server 2005. But before I do that, let's look at some hidden uses of RLV that can require increased monitoring of tempdb and costs that you might be incurring even if you've just upgraded a database from SQL Server 2000 and don't use any of the new SQL Server 2005 features. (For more information about RLV, see "Keeping Concurrent,"October 2005,InstantDoc ID 47389 and "RLV with a View," February 2006, InstantDoc ID 48721.)

Triggers and RLV

Triggers have long been a part of SQL Server and were the only feature prior to SQL Server 2005 that provided any type of historical (or versioned) data. Triggers can access two pseudo-tables called deleted and inserted. Inside the trigger, you can access these two tables as if they were real tables, but accessing them while not in a trigger results in an unknown object error. If the trigger is a DELETE trigger, the deleted table contains copies of all the rows deleted by the operation that caused the trigger to fire. If the trigger is an INSERT trigger, the inserted table contains copies of all the rows inserted by the operation that caused the trigger to fire. And if the trigger is an UPDATE trigger, the deleted table contains copies of the old versions of the rows, and the inserted table contains all the new versions. Before SQL Server 2005, SQL Server would determine which rows were included in these pseudo-tables by scanning the transaction log for all the log records belonging to the current transaction. Any log records containing data inserted in or deleted from the table to which the trigger was tied were included in the inserted or deleted tables.

In SQL Server 2005, these pseudo-tables are created by using RLV technology. When data-modification operations are performed on a table that has a relevant trigger defined, SQL Server creates versions of the old and new data in the version store in tempdb.This occurs whether or not either of the snapshot-based isolation levels has been enabled.When a SQL Server 2005 trigger accesses the deleted table, it retrieves the data from the version store.When a trigger needs to determine which rows in the table are new rows and accesses the inserted table, SQL Server again gets the inserted table rows from the version store.

The organization of the inserted and deleted rows in the version store is a bit different than it is for rows versioned by one of the snapshot-isolation levels. As I mentioned in "Keeping Concurrent," when SQL Server determines that row versioning is needed, it adds 14 bytes to each row involved in the versioning. These 14 bytes are used as a row pointer.The row in the actual data page can point to a row in the version store, and a row in the version store can point to another row. With snapshot isolation, each row points to the next older version of the same row, and the row with a NULL for the pointer is the earliest one. With triggers, the pointer works a bit differently. Each modified row in the base table points to the deleted version of that row, but each deleted row is linked to another deleted row so that all the rows in the deleted table are linked together. Similarly, all the rows in the inserted table are linked together.

We can verify that the version store is used, even if the database isn't otherwise enabled for row versioning.The code in Listing 1 creates a copy of the HumanResources.Department table in the AdventureWorks database, then creates two triggers on the new table.(Some of the code wraps to multiple lines because of space constraints in print.)The trigger simply returns a single row containing the number of rows and the size of all the row versions in the version store. The script includes the statements to turn off row versioning in the AdventureWorks database to confirm that the triggers' use of version store doesn't depend on any database option.

Now update a single row in the Department table and notice the count of rows in the version store. There should be one row for the inserted table and one row for the deleted table.

Next, delete a single row in the Department table and notice the count of rows in the version store.

DELETE dbo.Department WHERE DepartmentID = 12

There should be only one row for the deleted table. If your version cleanup thread didn't come through before executing this DELETE, you should see a total of three rows in the version store, with two still remaining for the previous UPDATE and one for the new DELETE.

Monitoring the Version-Store Size

Because tempdb is used for the version store, applications that make heavy use of triggers in SQL Server 2000 need to be aware of possible increased demands on this database after you upgrade to SQL Server 2005.Thus, you need to appropriately monitor and manage disk space for the version store.

The version store consists of append-only store units, which I'll explain shortly.You can look at the following SQL Server Performance Monitor counters in the SQL Server:Transactions object to see how many store units you have:

version store unit count: count of version store units

version store unit creation: total number of version store units created to store row versions since the instance was started

version store unit truncation: total number of version store units removed since the instance was started

Although the SQL Server Books Online (BOL) page that describes the counters in the SQL Server:Transactions Object describes the store units as allocation units, they're nothing like the allocation units used to store regular table data. (For more information, see "Managing Data Space," June 2006, InstantDoc ID 50009.) The append-only store units don't belong to particular objects, as regular allocation units do, and are highly optimized for sequential inserts and random lookups. Each store unit can store many row versions from many different objects. If versions are to be stored either from snapshot isolation or from triggers, SQL Server creates a new store unit about every minute. Once started, a transaction will store all its versions in the same store unit no matter how long the transaction is active.

You can examine the values from Performance Monitor programmatically, by using the Dynamic Management View (DMV) sys.dm_os_performance_counters. The query in Listing 2 shows you the values for the three counters mentioned earlier.The value of version store unit count, which is shown in the query results as the current number of store units, is the difference of the other two values; that is, the total number of store units created minus the total number removed.

The size of a specific store unit depends on the number of versions generated by the transactions writing to that store unit.(Keep in mind that queries running under snapshot isolation might actually read versions from any store unit because a query might need to retrieve data modified by many different transactions.)

A background thread periodically removes store units if none of the versions they contain are needed any longer. Individual rows aren't removed, only entire units. This deallocation occurs in the order the store units were created, so that a store unit created at time T110 can't be removed prior to one created at time T100, even if no versions in the T110 store unit are needed.

In addition to the three counters I mentioned, you should also monitor these additional counters in the SQL Server: Transactions object: free Space in tempdb (KB) and version store size (KB). You can also retrieve the version store size by selecting from the sys.dm_db_file_space_usage DMV and looking at the column called version_store_reserved_page_count. This value (in pages) can be multiplied by eight to get comparable values to those displayed in the Performance Monitor, which returns values in KB.The value for version store size (KB) should in fact be very close to the value of version_store_reserved_page_count times eight.Thus, you can use either tool to monitor the space that the version store uses.

Keep an Eye on RLV

Even if you never plan to use one of the snapshot-based isolation levels, you aren't off the hook as far as row versioning is concerned. If you're upgrading a database that has any triggers on it, row versioning will be used in that process. SQL Server uses the version store in tempdb to keep track of the triggers' inserted and deleted values, and you must monitor your tempdb growth as if you were using snapshot isolation. I've showed you some of the basic counters to monitor that will help you detect the space requirements of tempdb when using RLV. Next month, I'll provide more details about monitoring the version store and demonstrate its use with triggers when snapshot isolation is enabled.

From the Blogs

Many organizations today cannot use public cloud solutions because of security concerns, administrative challenges and functional limitations. However, they still need a centralized platform where end users can conduct self-service analytics in an IT-enabled environment....More

It is crucial to move away from data and analytics stored on individual desktop computers. Today’s solutions must promote holistic, collective intelligence. The strong, continued alliance between Microsoft and Pyramid Analytics helps make all this possible....More

To become a truly data-driven enterprise, many business leaders recognize that they must extend the capabilities of self-service business intelligence (BI) and analytics to more of their business users. Many BI tools tackle part of this need, but they don’t offer a complete enterprise solution....More