[This post has been updated from a post on my old blog. The original is here.]

Right after I wrote the original post on this topic, I discovered that I had written a blog post on almost exactly the same topic about a year and a half earlier! It used a different example, and the final detail showing a way to remove forwarded records was not included, as that capability had only recently been added to the product. That previous post has also been updated and moved to this new blog site, and is available here.

I received an email regarding using ALTER TABLE to defrag a heap, and my immediate thought was that it did no such thing, because by definition, a heap cannot be fragmented. SQL Server's defrag (or REORGANIZE) operation is only concerned with logical fragmentation, which occurs when the logical and physical order do not match. And as you should know, heaps have no logical order. Heaps are just a bunch of rows.

In an index, there is an order, based on the index key(s). So suppose you have an index on LastName. SQL Server guarantees the leaf level will have all your LastName values in logical order, but does not guarantee a physical order. Logical order means that SQL Server can easily access the data in order, by following pointers. The page with "Abbott" and "Anderson" will point to the page with "Barkley" and "Bollinger", which will point to the page with "Charleston" and "Chung". So we can access the data in A-B-C order.

When you first create the index, the physical order will be as close as possible to the logical order, so the A's might be on page 112, the B's on 113, and the C's on 114. However, as you add, update, or remove data, the physical order may not be maintained. Pages may have to be split, so if the B page gets full, a new page may be allocated at page 358, and half the B's will move there. Then the first B page, 113, will point to the second B page at 358, and 358 will point to the C page, 114. Page splits due to full pages are the main cause of logical fragmentation.

When you have a heap, there is no logical ordering and there is no splitting of pages.

But I thought a bit more, and realized that there are some other issues that one might consider to be 'fragmentation' in a heap. One issue might be whether the extents belonging to the table are contiguous or not. When you look at sys.dm_db_index_physical_stats, a high value for fragment_count or a low value for avg_fragment_size_in_pages can indicate a lack of contiguousness. (And I know, the BOL page actually talks about fragmentation for heaps, but since the definition seems to imply an ordering, mentioning a next and previous page, I tend to ignore the issue.) But whether contiguousness is a good thing or a bad thing is not what I'm going to discuss here.
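If you do want to look at those two values for one of your own heaps, sys.dm_db_index_physical_stats exposes them directly. Here's a minimal query; the table name dbo.SomeHeap is just a placeholder:

```sql
-- Fragment metrics for a heap; a high fragment_count or a low
-- avg_fragment_size_in_pages suggests the heap's extents are not contiguous
SELECT index_type_desc, fragment_count, avg_fragment_size_in_pages
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.SomeHeap'), NULL, NULL, 'DETAILED');
```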

Also, whether fragmentation itself is something you have to worry about is not something I'm going to discuss here. What I really want to tell you about is something that can occur in heaps that can be MUCH MUCH worse than fragmentation, giving worse performance for a table scan than the most fragmented table you can imagine. I'm talking about forwarding pointers. A forwarding pointer appears when you update one or more variable length columns in a row in a heap and increase the size of the row so that it no longer fits on the original page. SQL Server will not split the page, as splitting never happens in heaps. Instead, the enlarged row is moved to another page, and a small forwarding pointer is left behind. The forwarding pointer is basically just the file, page and row number address of the new location. The enlarged row at the new location is called a forwarded record; it has information in the row header indicating that it has been forwarded, and it also includes a back pointer to the original page.

Let’s look at an example to see these forwarding pointers. I'll create a table with two variable length columns. After I populate the table with five rows, which will fill the page, I’ll update one of the rows to make its third column much longer. The row will no longer fit on the original page and will have to move. I can use DBCC IND (or sys.dm_db_database_page_allocations) to get the page numbers used by the table and DBCC PAGE to look at the full page.
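The script that builds the table looks something like this. The column names and sizes here are my reconstruction, chosen so that five rows just about fill one 8KB page:

```sql
USE testdb;
GO
DROP TABLE IF EXISTS bigrows;
GO
CREATE TABLE bigrows
(
    a int,              -- row identifier
    b varchar(1600),    -- filler to make each row big
    c varchar(1600)     -- the column we'll enlarge later
);
GO
-- Five rows of roughly 1600 bytes each nearly fill a single page
INSERT INTO bigrows
SELECT n, REPLICATE('a', 1600), ''
FROM (VALUES (1), (2), (3), (4), (5)) AS v(n);
GO
```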

Now let's look at the header for this page. DBCC IND will give us the page number:

DBCC IND (testdb, bigrows, -1);
GO

-- Note the FileID and PageID from the rows where PageType = 1
-- and use those values with DBCC PAGE (I got FileID 1 and PageID 164)

DBCC TRACEON(3604);
GO
DBCC PAGE(testdb, 1, 164, 1);
GO

I'm not going to step through all the output from DBCC PAGE, but note the value m_freeCnt = 11 in the page header. This means there are only 11 free bytes on the entire page. Pretty full, right?

Now let's update one of the rows. I'll change column c to be 1600 bytes instead of 0, and since there are only 11 free bytes on the page, the row will be WAY too big.

UPDATE bigrows
SET c = REPLICATE('x', 1600)
WHERE a = 3;
GO

If you look at DBCC IND again, you'll now see there is a second data page for the table. I'll let you explore the contents of the new page on your own, but I will show you what is on the original page where the forwarded row used to be:

In the DBCC PAGE output for the original page, note that the remaining record is only 9 bytes in total, and the record type indicates the row has moved: the type is FORWARDING_STUB, which means it holds a forwarding pointer.

I told you that forwarding pointers can be a really bad thing. Let's see why. I'm going to build a much bigger table now, one that is a copy of a big table in AdventureWorks. I'll then ALTER the table to add a new column and fill it with 100 bytes in every single row.

USE AdventureWorks2016;
GO

DROP TABLE IF EXISTS Details;
GO
-- Look at the fragmentation data for the Details table.
-- Note the avg_fragmentation_in_percent value
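Here's a sketch of the setup itself. I'm assuming the copy is made with SELECT INTO, which always produces a heap, and that notes is a fixed-width char(100), so the later update adds exactly 100 bytes to each row:

```sql
-- Copy a big AdventureWorks table into a new heap
SELECT * INTO dbo.Details
FROM Sales.SalesOrderDetail;
GO
-- Adding a nullable column is a metadata-only change;
-- no row grows until the column is actually updated
ALTER TABLE dbo.Details ADD notes char(100) NULL;
GO
-- Check the starting page count and fragmentation for the heap
SELECT index_type_desc, page_count, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.Details'), NULL, NULL, 'DETAILED');
GO
```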

-- The data pages are not affected until we run the following update.
-- Every row on every page will get an additional 100 bytes in the notes field
-- added to it
UPDATE dbo.Details
SET notes = 'notes';
GO

-- Note there are LOTS of forwarded records now (76945),
-- and many more pages in the table (2897)

SET STATISTICS IO ON;
GO
-- The number of reads is not just the number of pages, as we would expect
-- for a scan of a heap, but is equal to the number of pages PLUS the
-- number of forwarded records:
-- 76945 + 2897 = 79842
-- During a scan, the forwarding pointers are followed for EACH row, and then
-- SQL Server goes back to the original position to continue the scan

SELECT * FROM dbo.Details;
GO
SET STATISTICS IO OFF;
GO

So if you find a lot of forwarded records in a heap that is scanned frequently, you'll want to get rid of them. There is no background process that goes through and cleans up forwarding pointers for you. However, if a row is updated to reduce the length of the variable length column, and it becomes small enough to fit back in its original position, it will move back (assuming other rows haven't filled the space the row used to occupy). Also, if you shrink a data file, forwarding pointers can be removed, but shrinking is not recommended as a way to remove them; that is just a side effect.

Since forwarding pointers can only occur in heaps, you can get rid of them by creating a clustered index on the table. You can also get rid of them by simply rebuilding the table:
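The rebuild is a single statement:

```sql
-- Rebuild the heap in place; every row moves to freshly allocated pages
ALTER TABLE dbo.Details REBUILD;
GO
```

Because all the row addresses change, any nonclustered indexes on the heap are rebuilt as part of this operation as well.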

Notice that I still have a much larger number of pages than I did originally (2604 instead of 1067), but, after all, the rows are all much larger! More importantly, there are NO forwarded records. If I do another table scan, the number of logical reads should be the same as the number of pages.

~Kalen

[This post has been updated from a post on my old blog. The original is here.]

I have frequently heard SQL Server developers and DBAs gasp when a query plan indicates that SQL Server is performing a table scan, thinking that is the worst thing that could ever happen to a query. The truth is, it's far from the worst thing, and in addition, not all table scans are created equal.

One thing that is far worse than a table scan is executing a query that uses a nonclustered index and then looks up every single row in the table! Although that is a horrible thing to behold, it is not the topic of this post.

Today, I'm going to show you that two different table scans on the same data in a heap can give very different performance.

The behavior has to do with a technique that SQL Server uses when a row in a heap is increased in size so that it no longer fits in the original page. This usually occurs when a variable length column is updated to take more space. If SQL Server just moved the row to another page, any nonclustered indexes would have to be updated to indicate the new page address. (Remember, if the underlying table is a heap, nonclustered indexes point to the data row using an actual address.) Since there can be up to 999 nonclustered indexes on a single table, that could potentially be a LOT of work. So instead, when a row in a heap has to move, SQL Server leaves behind a forwarding pointer in place of the row that has moved. The nonclustered indexes continue to point to the old location, and SQL Server just needs one more page lookup to find the new location. If just a few rows need to be looked up, this expense is minimal and more than made up for by the savings of not having to update all the nonclustered indexes every time a row moves.

However, what happens when there are LOTS of forwarding pointers?

The metadata function sys.dm_db_index_physical_stats has a column, forwarded_record_count, that indicates how many forwarded records are in a table. For tables with clustered indexes, there will never be any forwarded records.

Let's look at an example. I'll make a copy of the Person.Address table in the AdventureWorks2014 database and add a new varchar column to it. Initially, the column takes no space.
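Here's a sketch of that setup; the column name and size are my own choices, and the final UPDATE is what forces rows off their original pages:

```sql
USE AdventureWorks2014;
GO
DROP TABLE IF EXISTS dbo.Address2;
GO
-- SELECT INTO always creates a heap
SELECT * INTO dbo.Address2
FROM Person.Address;
GO
-- Adding a nullable varchar takes no space in the existing rows
ALTER TABLE dbo.Address2 ADD notes varchar(500) NULL;
GO
-- Enlarging the column forces rows that no longer fit to move,
-- leaving forwarding pointers behind
UPDATE dbo.Address2 SET notes = REPLICATE('a', 500);
GO
-- forwarded_record_count is only populated in DETAILED (or SAMPLED) mode;
-- in the default LIMITED mode it comes back NULL
SELECT index_type_desc, page_count, forwarded_record_count
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.Address2'), NULL, NULL, 'DETAILED');
GO
```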

The output shows me I have 1670 pages in the table and 15236 forwarded records.

Let's see what happens when we read every row in the table:

SET STATISTICS IO ON;
SELECT * FROM Address2;
SET STATISTICS IO OFF;

The logical I/O value returned by STATISTICS IO tells us that instead of just reading through every page, for a total of 1670 reads, SQL Server jumps out of sequence and follows the forwarding pointer for every forwarded record. So the number of logical reads is the sum of the number of pages plus the number of forwarded records:

1670 + 15236 = 16906

I was discussing this behavior with my friend and colleague Tibor Karaszi, and he proposed an explanation. He related it to the behavior that Itzik Ben-Gan has described for why SQL Server will always follow page pointers when scanning a clustered index if consistent reads are desired. The alternative would be to read the pages in disk order, or page number order, which can be determined by examining the IAM structures for the object. For clustered tables, we need to follow the page pointers instead of the IAMs to make sure that if a row is moved by an update while the scan is occurring, we don't read the same row twice (if the row moves to a higher page number) or skip it altogether (if the row moves to a lower page number).

But what about a heap? Are there potential problems scanning a heap while updates are occurring? Could we potentially read the same row twice or skip a row, since there is no 'ordered list' to read? Tibor suggested the following:

I believe that forwarding pointers take care of just that. Because of forwarding pointers, the "root" location for a row is stable. So, even if the row moves during a scan, the "root location" (forwarding stub) is at the same position. We have concluded that the scan uses the forwarding pointers when reading the rows. This means that a scan is not sensitive to row movements during the scan. It cannot "skip" rows that are there, or read the same row twice.

So a few forwarding pointers are not a bad thing, but having lots of them can increase the work done during scans or partial scans by a considerable amount.

So how do you get rid of forwarding pointers? There are 4 ways:

1. If the row is updated so that its size decreases, AND there is still room on the page the row came from, it will be moved back. This is not dependable, so it isn't really recommended as a solution. When I updated my Address2 table, most of the forwarded records were moved back, but not all:
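One way to shrink the rows, assuming the notes column from the setup above, is simply to empty it again:

```sql
-- Shrink the rows back down; rows whose original slot still has room
-- are moved back and their forwarding stubs removed
UPDATE dbo.Address2 SET notes = '';
GO
SELECT page_count, forwarded_record_count
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.Address2'), NULL, NULL, 'DETAILED');
GO
```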

My results showed that I was still left with 175 forwarded records. This is a great improvement over 15236, but it's still a lot. In my case, such a high number of rows moved back because I hadn't done any further inserts into the table, so the space the original rows had freed was still available.

2. Forwarded records will be cleaned up when you shrink the data file. This is definitely NOT recommended as a solution; I am only mentioning it for completeness. SQL Server does so much moving of data and updating of nonclustered index pointers when shrinking a file that updating the forwarded records is hardly any extra work at all.

3. Since forwarded records only exist in heaps, you can make the table not a heap. Build a clustered index, and all the forwarded records will go away. Some people will say this is the best solution.
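For this table, a natural clustering key would be the AddressID column (the index name here is mine):

```sql
-- Converting the heap to a clustered index removes every forwarding
-- pointer, since rows in a B-tree are located by key, not by address
CREATE CLUSTERED INDEX CI_Address2 ON dbo.Address2 (AddressID);
```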

4. If you really don't want the clustered index, you can simply ALTER the table with the REBUILD option.

ALTER TABLE dbo.Address2 REBUILD;

After the REBUILD, my physical stats look like this:

All the forwarded records are gone as the data has been moved to all new pages.

Hopefully, this information will be useful to you.

~Kalen

SQL Server Internals

Kalen Delaney teaches the most advanced SQL Server courses in the world. Her 5-day course “SQL Server Internals and Tuning” is designed for advanced SQL Server professionals who want to know how SQL Server really works, in order to take maximum advantage of its potential.