Search results matching tags 'indexes', 'included columns', and 'sort'http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&tag=indexes,included+columns,sort&orTags=0Search results matching tags 'indexes', 'included columns', and 'sort'en-USCommunityServer 2.1 SP2 (Build: 61129.1)Geek City QUIZ: What happens when you change a column in an index from a key column to an included column?http://sqlblog.com/blogs/kalen_delaney/archive/2008/11/24/change-a-column-in-an-index-from-a-key-column-to-an-included-column.aspxMon, 24 Nov 2008 07:15:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:10065Kalen Delaney<P>This quiz is not really a generic question about changing an index key column to be an included column; it's about a behavior noticed by a reader in one particular query.&nbsp; Dejan Nakarada-Kordic from New Zealand sent me a very interesting puzzle. He had a reproducible query for which he thought an existing nonclustered index should be used, and it only ended up being used if the column was defined as an INCLUDED column instead of a key column. Here is the script Dejan sent me:</P>
<P><FONT face="Courier New" size=1>-- First, create the table<BR></FONT><FONT face="Courier New" size=1>IF&nbsp; EXISTS (SELECT * FROM sys.tables <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; WHERE schema_id = 1 and name = 'Table1') <BR>&nbsp;&nbsp;&nbsp;&nbsp; DROP TABLE dbo.Table1;<BR>GO<BR>CREATE TABLE dbo.Table1(<BR>&nbsp;&nbsp;&nbsp; Transaction_Serial_No [bigint IDENTITY(1,1) NOT NULL,<BR>&nbsp;&nbsp;&nbsp; Channel] char(4) NOT NULL,<BR>&nbsp;&nbsp;&nbsp; Retry_Counter int NOT NULL,<BR>CONSTRAINT PK_Table1 PRIMARY KEY CLUSTERED <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (Transaction_Serial_No)); <BR></FONT><FONT face="Courier New" size=1>GO </FONT></P>
<P><FONT face="Courier New" size=1>--populate the table with 10000 rows<BR></FONT><FONT face="Courier New" size=1>SET NOCOUNT ON;<BR></FONT><FONT face="Courier New" size=1>declare @counter int;<BR>set @counter = 0;<BR>while (@counter &lt; 10000)<BR>begin<BR>&nbsp;&nbsp;&nbsp; set @counter = @counter + 1;<BR>&nbsp;&nbsp;&nbsp; insert into&nbsp; dbo.Table1(channel, retry_counter)<BR>&nbsp;&nbsp;&nbsp; values('ch1', 0);<BR>&nbsp;&nbsp;&nbsp; insert into&nbsp; dbo.Table1(channel,&nbsp; retry_counter)<BR>&nbsp;&nbsp;&nbsp; values('ch2', 0);<BR>end;</FONT></P>
<P><FONT face="Courier New" size=1>-- Now build a nonclustered index on <BR>IF&nbsp; EXISTS (SELECT * FROM sys.indexes <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; WHERE object_id = OBJECT_ID('dbo.Table1') AND name = 'IX_Channel')<BR>&nbsp;&nbsp;&nbsp; DROP INDEX IX_Channel ON dbo.Table1;<BR>GO<BR>CREATE NONCLUSTERED INDEX IX_Channel ON dbo.Table1 (Channel, Retry_Counter);<BR>GO<BR><BR>-- Examine the query plan for this query:<BR></FONT><FONT face="Courier New" size=1>SELECT TOP(20) Transaction_Serial_No<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM dbo.Table1&nbsp; <BR>WHERE Channel = 'ch2'<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; AND Retry_Counter &lt;= 10&nbsp; <BR>ORDER BY Transaction_Serial_No;</FONT><BR></P>
<P>The plan should show you a Clustered Index Scan being performed, even though the nonclustered index appears to be a covering index. </P>
<P><A href="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_2.png"><IMG style="BORDER-RIGHT:0px;BORDER-TOP:0px;BORDER-LEFT:0px;BORDER-BOTTOM:0px;" height=67 alt=image src="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_thumb.png" width=261 border=0></A></P>
<P>The query contains all 3 columns used in the query. The columns <EM>Channel</EM> and <EM>Retry_Counter</EM> are defined index keys, and because the table has a clustered index, the clustered key <EM>Transaction_Serial_No</EM> is also part of the index. </P>
<P>If you rebuild the index to use <EM>Retry_Counter</EM> as an INCLUDED column, the plan changes.</P>
<P><FONT face="Courier New" size=1>CREATE NONCLUSTERED INDEX IX_Channel ON dbo.Table1 (Channel)<BR>&nbsp;&nbsp;&nbsp; INCLUDE ( Retry_Counter)&nbsp; WITH&nbsp;&nbsp; (DROP_EXISTING = ON);</FONT>
<P><FONT face="Courier New" size=1>-- Examine the plan again:</FONT>
<P><FONT face="Courier New" size=1>SELECT TOP(20) Transaction_Serial_No<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM dbo.Table1&nbsp; <BR>WHERE Channel = 'ch2'<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; AND Retry_Counter &lt;= 10&nbsp; <BR>ORDER BY Transaction_Serial_No;</FONT></P>
<P>The plan now shows a nonclustered index seek:</P>
<P><A href="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_4.png"><IMG style="BORDER-RIGHT:0px;BORDER-TOP:0px;BORDER-LEFT:0px;BORDER-BOTTOM:0px;" height=48 alt=image src="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_thumb_1.png" width=244 border=0></A> </P>
<P>WHY should changing the column from a key column to an INCLUDED column change the plan?</P>
<P>You can try to figure out the answer for yourself, or read on. </P>
<P>&nbsp;</P>
<P>-------------------------------------------------------------------------------------------------------------------------</P>
<P>It turns out that there are two extra factors to consider. </P>
<P>1) Although covering indexes are a great thing, and the optimizer will choose them over other possible indexes most of the time, the cost of sorting can sometimes outweigh the benefit of a covering index.</P>
<P>2) When a column is an INCLUDED column in an index, it comes after all the key columns and after the clustered index key that is always part of a nonclustered index on a table that has a clustered index.</P>
<P>So, the original index on (Channel, Retry_Counter) is the same as an index on (Channel, Retry_Counter, Transaction_Serial_No), but moving <EM>Retry_Counter</EM> to be an INCLUDED column changes the index to be on the columns (Channel,&nbsp; Transaction_Serial_No), with Retry_Counter in last place, and unsorted.</P>
<P>The query wants the data sorted on <EM>Transaction_Serial_No</EM>, and scanning the clustered index gives us the data in <EM>Transaction_Serial_No</EM> order.&nbsp; The first&nbsp; nonclustered index is sorted first by <EM>Channel</EM>, then by <EM>Retry_Counter</EM>, then by <EM>Transaction_Serial_No</EM> so that index is not useful for the sort.&nbsp; </P>
<P>The second nonclustered index is sorted first by <EM>Channel</EM>, then by <EM>Transaction_Serial_No. </EM>BUT the query has limited the data to only data with the <EM>Channel</EM> value equal to 'ch2', so since the first column values we're retrieving are all the same, returning data in the nonclustered index order will return the data in <EM>Transaction_Serial_No</EM> order and no sorting needs to be done. </P>
<P>You might notice in this data set that all the values for <EM>Retry_Counter</EM> are also identical, so you might then think we should be able to use the first nonclustered index to avoid having to sort the data, but the optimizer cannot be sure that the <EM>Retry_Counter</EM> values are all identical. The statistics might indicate that all the values are the same, but the optimizer can't be sure the statistics are 100% up to date. </P>
<P>Here are some things to try:</P>
<P>1) If you remove the ORDER BY, you will see that the first index, with no INCLUDED column, is used.</P>
<P>2) If you change the second condition to searching for a constant (AND Retry_Counter = 0), the first index can be used. If both of the first two index columns are constants in the retrieved data, it will be sorted by the third column (<EM>Transaction_Serial_No</EM>).</P>
<P>3) If you force the query to use the nonclustered index when <EM>Retry_Counter</EM> is a key, you'll see that SQL Server has to do a sort:</P>
<P><FONT face="Courier New" size=1>DROP INDEX IX_Channel ON dbo.Table1;<BR>GO<BR>CREATE NONCLUSTERED INDEX IX_Channel ON dbo.Table1 (Channel, Retry_Counter);<BR>GO<BR>SELECT TOP(20) Transaction_Serial_No<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FROM dbo.Table1 WITH (index = ix_channel)<BR>&nbsp;&nbsp;&nbsp; WHERE Channel = 'ch2'<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; AND Retry_Counter &lt;=&nbsp; 10<BR>ORDER BY Transaction_Serial_No;<BR>GO</FONT>
<P><BR><A href="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_6.png"><IMG style="BORDER-RIGHT:0px;BORDER-TOP:0px;BORDER-LEFT:0px;BORDER-BOTTOM:0px;" height=47 alt=image src="http://sqlblog.com/blogs/kalen_delaney/WindowsLiveWriter/GeekCityQUIZWhathappenswhenyouchangeacol_13A9B/image_thumb_2.png" width=244 border=0></A> </P>
<P>Notice that the SORT operation is estimated to be 95% of the total query cost, so SQL Server would definitely like to avoid that if possible.&nbsp;</P>
<P>So the issue wasn't really because of INCLUDED columns, it was because of the order the columns appeared in the index, and the fact that SQL Server wants to try to avoid sorting whenever possible.</P>
<P>Have fun!</P>
<P><FONT color=#ff00ff size=4>~Kalen</FONT></P>