What about moving LOB data?

Background

In our Internals and Performance Immersion Event (IE1) in Atlanta this past December (12/2011), a fellow SQL MVP (Aaron Nelson, @SQLvariant) started a discussion about moving LOB data. He was sure that you could move it using DROP_EXISTING but he wasn’t sure of the exact scenario. The subject didn’t really come back again but it stuck in the back of my mind until last week. During another Immersion Event (IE3 – High Availability/Disaster Recovery) where we discussed ONLINE operations and data balancing this came right back to the forefront…

QUESTION: How can I move LOB data associated with a table?

ANSWER: It depends. There are actually a couple of ways… this week we’ll tackle moving LOB data using OFFLINE operations. Next week we’ll look at how to do this using ONLINE operations (in all versions of SQL Server)!

Moving LOB Data with OFFLINE Operations

In versions prior to SQL Server 2012, an index that includes LOB columns cannot be built or rebuilt using online operations. However, even with OFFLINE index builds or rebuilds, existing LOB data is not moved. The one exception is when you change from a partitioned table to a non-partitioned table or vice versa. In honesty, the limitations and requirements around LOB data are a bit strange. When you create a table you can specify TEXTIMAGE_ON. This allows the LOB data for that table to reside on a filegroup other than where the data resides. However, this option (TEXTIMAGE_ON) is NOT available for the CREATE INDEX statement. So, where does the LOB data go? It depends on how you execute the statement!

To show you the behavior, I’ll create a simple scenario around a table called LOBTest. Here’s the definition for LOBTest:

CREATE TABLE [dbo].[LOBTest](

[c1] [int] IDENTITY(1,1) NOT NULL,

[c2] [char](8000) NULL,

[c3] [varchar](max) NULL

)

This table will reside in the filegroup that is the database’s default filegroup. If you’re unsure of what the default filegroup is, use SELECT * FROM sys.filegroups and see which filegroup has a value of 1 for the ier New">is_default column. Instead of allowing a table to be created on the default filegroup you can specify the filegroup by using ON filegroupname. For example,

CREATE TABLE [dbo].[LOBTest](

[c1] [int] IDENTITY(1,1) NOT NULL,

[c2] [char](8000) NULL,

[c3] [varchar](max) NULL

) ON [Filegroup1]

Additionally, you can use TEXTIMAGE_ON to define the text/image (LOB) data to reside elsewhere – from the data. For example,

>CREATE TABLE [dbo].[LOBTest](

[c1] [int] IDENTITY(1,1) NOT NULL,

[c2] [char](8000) NULL,

[c3] [varchar](max) NULL

) ON [Filegroup1]

TEXTIMAGE_ON [LOBFilegroup]

In this case, the data resides on Filegroup1 and the LOB data on LOBFilegroup. Without a clustered index, this table is an unstructured HEAP on Filegroup1. If you decide later that you want to cluster the table (which literally organizes the data based on the clustering key), then the “data” of the table will be moved and structured based on the clustered index definition. If you create the following clustered index then the data will reside on Filegroup2 but the LOB data will still reside on LOBFilegroup.

CREATE UNIQUE CLUSTERED INDEX [LOBTestPK]

ON [dbo].[LOBTest] (c1)

ON [Filegroup2]

NOTE: This article is NOT about choosing a clustering key or why/when it’s appropriate. However, this is generally a good idea. Beware, a poorly chosen clustering key can cause numerous negative performance problems in the database. Check out some of my other articles on this topic for more help on how to create an appropriate clustered index. You want to choose the key wisely!

What if you want to move the LOBData to another filegroup? The TEXTIMAGE_ON option is not allowed during a CREATE INDEX statement. So, if you want to move the LOB data to another filegroup – which is different than the data, then “game over” there is NO way to do this without exporting/importing or using INSERT/SELECT or SELECT INTO. Of these, the best option is INSERT/SELECT as you’ll need to first define the table (and you can specific exactly where you want both the data and LOB data will reside) before you copy it over. But, this is single, large transaction and while you’re migrating there will have to be some downtime to switch everything (by dropping the original table and renaming the newly created table). And, I haven’t even started to mention what happens if this table is being referenced by others. That further complicates this process.

However, there is an interesting trick, one that does NOT require dropping/recreating keys but will still be an OFFLINE operation in versions prior to SQL Server 2012. It’s to create the table on a partition scheme using the DROP_EXISTING option. This moves the data and the LOB data to the SAME filegroup (you do NOT have an option to separate them) but you can still move them and with little work. What will be required are a partition function and a partition scheme. The data does not have to be physically partitioned but the data will only move when the table/clustered index goes from partitioned to non-partitioned or vice versa OR from one partition scheme to another.

(1) Create a partition function using the primary key column’s data type. Add only one boundary point that’s much higher than any existing values in the table. You want to do this so that all of the data ends up on the same filegroup and in the same index structure(s). Choose a right-based partition function so that all of your data (lower than the boundary point value specified) ends up in the first partition.

(2) Create a partition scheme with two filegroups specified. The first partition (the one where the data will reside) should be naming the filegroup to which you want to move data. The second filegroup can just be PRIMARY as no data will reside there.

(3) Rebuild the clustered index on the new partition scheme using DROP_EXISTING.

(4) Create a new table with the same definition on the filegroup where the data resides.

(5) Switch out the partition into the standalone table.>

(6) Drop the partitioned table

Conversely, instead of running steps 4-5-6 you could rebuild the clustered index AGAIN on a new filegroup. This would move the data and the LOB data but not the nonclustered indexes. To move the nonclustered indexes you need to do this using DROP_EXISTING as well.

This is not something you want to first test in production so make sure that you have all of the steps correctly defined but the end result is that you CAN move LOB data without exporting/importing or using INSERT/SELECT.