Today, I would like to share one very quick tip about how to remove a bookmark lookup or RID lookup. Let us first understand what a bookmark lookup or RID lookup is. Please note that from SQL Server 2005 SP2 onwards, a bookmark lookup on a table with a clustered index is shown as a Key Lookup.

When a query requests a small number of rows, the SQL Server optimizer will try to use a non-clustered index on the column or columns in the WHERE clause to retrieve the requested data. If the query requests data from columns not present in the non-clustered index, SQL Server must go back to the data pages to get the data in those columns. Whether or not the table has a clustered index, the query still has to return to the table or clustered index to retrieve the data.

In the above scenario, if the table has a clustered index, the operation is called a bookmark lookup (or key lookup); if the table has no clustered index (it is a heap) but has a non-clustered index, it is called a RID lookup. This operation can be very expensive, because every row found in the non-clustered index costs an extra trip to the base table. To optimize a query containing a bookmark lookup or RID lookup, the lookup should be removed from the execution plan. There are two different ways to remove a bookmark/RID lookup.

Before we look at these two methods, we will create a sample table without a clustered index and simulate a RID lookup. A RID lookup is a bookmark lookup on a heap that uses a supplied row identifier (RID).
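The setup described here could look roughly like the sketch below. The table name OneIndex, the columns ID, FirstName and City, and the filter value 'Las Vegas' come from this post; the column types, the generated data, the row count and the index name IX_OneIndex_ID are assumptions made only for illustration.

```sql
USE tempdb;
GO
-- Sample heap (no clustered index yet); column types are assumed
CREATE TABLE dbo.OneIndex
(
    ID        INT          NOT NULL,
    FirstName VARCHAR(100) NOT NULL,
    City      VARCHAR(100) NOT NULL
);
GO
-- Generate 100,000 rows; about 1 row in 100 gets City = 'Las Vegas'
INSERT INTO dbo.OneIndex (ID, FirstName, City)
SELECT n.RowID,
       'Bob',
       CASE WHEN n.RowID % 100 = 1 THEN 'Las Vegas'
            WHEN n.RowID % 3 = 0   THEN 'New York'
            ELSE 'Houston' END
FROM (SELECT TOP (100000)
             ROW_NUMBER() OVER (ORDER BY a.object_id) AS RowID
      FROM sys.all_objects AS a
      CROSS JOIN sys.all_objects AS b
      ORDER BY RowID) AS n;
GO
-- On the heap, with no usable index, the plan should show a Table Scan
SELECT ID, FirstName
FROM dbo.OneIndex
WHERE City = 'Las Vegas';
GO
-- Add a clustered index on ID (index name is hypothetical)
CREATE CLUSTERED INDEX IX_OneIndex_ID ON dbo.OneIndex (ID ASC);
GO
-- The same query should now show a Clustered Index Scan
SELECT ID, FirstName
FROM dbo.OneIndex
WHERE City = 'Las Vegas';
GO
```

Run each query with "Include Actual Execution Plan" enabled in Management Studio to compare the plans.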

It is clear from the execution plan that once a clustered index is created on the table, the table scan is converted to a clustered index scan. In either case, the base table is completely scanned and there is no seek on the table.

Now, let us look at the WHERE clause of our query. From our basic observation, if we create an index on the column used in the WHERE clause, a performance improvement may be obtained. Let us create a non-clustered index on the table and then check the execution plan.

-- Create an index on column City, as it is used in the WHERE condition
CREATE NONCLUSTERED INDEX [IX_OneIndex_City] ON [dbo].[OneIndex]([City] ASC) ON [PRIMARY]
GO

After creating the non-clustered index, let us run our select statement again and check the execution plan.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

As we have an index on the column in the WHERE clause, the SQL Server query execution engine uses the non-clustered index to find the matching rows. However, the columns used in the SELECT clause are still not part of that index, and to return those columns, the engine has to go to the base table again and retrieve them. This particular behavior is known as a bookmark lookup or key lookup.
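One way to put a number on the cost of the lookup is to compare logical reads before and after each change. The sketch below uses the standard SET STATISTICS IO option and assumes the OneIndex table from this post lives in the dbo schema; the actual read counts depend on your data, so none are shown here.

```sql
-- Report logical reads per table in the Messages tab
SET STATISTICS IO ON;
GO
SELECT ID, FirstName
FROM dbo.OneIndex
WHERE City = 'Las Vegas';
GO
SET STATISTICS IO OFF;
GO
```

Running the same batch again after applying one of the methods below should show whether the extra reads caused by the lookup have gone away.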

There are two different methods to resolve this issue. I have demonstrated both methods together; however, you only need one of them to remove the key lookup. I prefer Method 2.

Method 1: Creating non-clustered cover index.

In this method, we will create a non-clustered index containing the columns used in the SELECT statement, along with the column used in the WHERE clause.
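A covering index of this kind could be written as in the following sketch. The index name IX_OneIndex_Cover is hypothetical; the key columns are simply the WHERE column followed by the SELECT columns of the query in this post.

```sql
-- All three columns are key columns of the index (index name is hypothetical)
CREATE NONCLUSTERED INDEX IX_OneIndex_Cover
    ON dbo.OneIndex (City, FirstName, ID);
GO
```

City is placed first so that the equality predicate on City can be answered with an index seek.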

Once the above non-clustered index, which covers all the columns in query, is created, let us run the following SELECT statement and check our execution plan.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

From the execution plan, we can confirm that the key lookup is removed and only an index seek is happening. As there is no key lookup, the SQL Server query engine does not have to go back to the data pages and obtains all the necessary data from the index itself.

Method 2: Creating an included column non-clustered index.

Here, we will create a non-clustered index on the column used in the WHERE clause that also includes the columns used in the SELECT statement. In this method, we will use the INCLUDE syntax introduced in SQL Server 2005. An index with included nonkey columns can significantly improve query performance when all columns in the query are present in the index.
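As a sketch of this method, the index below keeps City as the only key column and carries the SELECT columns as included nonkey columns. The index name IX_OneIndex_Include is hypothetical; the INCLUDE order (FirstName, then ID) follows what a reader quotes in the comments.

```sql
-- City is the key column; FirstName and ID are stored only at the leaf level
CREATE NONCLUSTERED INDEX IX_OneIndex_Include
    ON dbo.OneIndex (City)
    INCLUDE (FirstName, ID);
GO
```

Included columns are not part of the index key, so their order does not affect how the index is searched.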

No need to add ID to your nonclustered index… that’s redundant. Since you have a clustered index on ID, then ID will automatically be included in any nonclustered index you create (so that it could do the bookmark lookup).

So your nonclustered index examples would be to either create an index on (City, FirstName) or an index on (City) INCLUDE (FirstName).
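The second variant suggested above can be sketched as follows, assuming the table already has a clustered index on ID so that ID travels with every non-clustered index row. The index name is hypothetical.

```sql
-- ID is omitted: the clustered key is already carried in the non-clustered index
CREATE NONCLUSTERED INDEX IX_OneIndex_City_FirstName
    ON dbo.OneIndex (City)
    INCLUDE (FirstName);
GO
```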

I have a table customer in which cstid is a BIGINT identity column. As the table has an identity column, I do not want any type of index on it. With cstid being the primary key of the table, is that possible without any index?

Good post, very clear sample, but I would like to see a remark that in some situations key lookups can be the much better option, especially with very wide tables where the SELECT returns many or all of the columns. Think about a table containing 11,000,000 customers with 200 columns and an average row size of 30 KB. The SELECT needs to return the whole row. The search is on the last name, with no leading wildcard, so an index can be used. In that case, an additional small index containing only the last name and the key, followed by a key lookup, is much faster than using a wide index, even the clustered index, where every row needs several data blocks. In terms of I/O, we see differences of a factor of 500.

We have a requirement to merge six tables of 12 million rows each into a single table. All the tables contain duplicate emails, so we put a primary key on the email column using Enterprise Manager, and we are trying to merge them with the following insert query, but it is taking more than 20 hours… :-(

You are correct, however, I am going to cover that particular concept in different blog post.

This post is written to show the concept of a covering index. In the next blog post, I am going to show that, because there is a clustered index, your non-clustered index does not need to contain that key and can still be a covering index.

Why not just remove, from the query, the column that’s causing the key lookup. This is really much simpler I’m sure you’ll agree. You’ve written about best practices for database design in the past, and taught me that fewer columns per table are always better. So if you’re getting lots of key lookups, it follows that we should split up our tables and make them smaller. This is the query optimizer’s way of telling us that we have messed up our design!

Thank you for continuing to educate us on how best to use our database.

But what if you don’t need those columns. Then you can remove them. This example proves that you shouldn’t add columns to a query that you don’t need. That’s never a best practice (at least in my experience). You revealed in another post that we should keep indexes as narrow as possible, and I have been following this rule religiously! Since starting down this path I have managed to get almost every index in our database down to a single column. So far this has worked very nicely as far as I can tell, but I’m not really sure how to collect query performance numbers. Would you mind helping me so that I can send you a report and you can show readers what great progress can be made if we follow your methods.

I’m curious to know why you have included the ID column in the covering index, as well as in the nc index with the includes. I thought all columns in the Clustered index were already included in the non clustered index.

I’m confused why the best recommendation isn’t to create a clustered index. In our situation we have a 5 Billion row table without a clustered index. The RID lookups are killing us. The need for a lot of extra non-clustered indexes is also killing performance because this is on a replication server.

Finally the writes to a clustered index are dramatically less than to a non-clustered index with a HEAP.

Hi,
You say that if the table has a clustered index, it is called a bookmark lookup (or key lookup), and that if the table does not have a clustered index but has a non-clustered index, it is called a RID lookup. Then you write that, as we have an index on the WHERE clause, the SQL Server query execution engine uses the non-clustered index to retrieve data from the table. However, the columns used in the SELECT clause (ID, FirstName) are still not part of the index (meaning neither the clustered nor the non-clustered index applies to them), and to display those columns, the engine has to go to the base table again and retrieve them. So how can we say that this particular behavior is known as a bookmark lookup or key lookup?

As I can see in the above code, you select ID and then FirstName in the SELECT list, but when including these columns in the non-clustered index you have not followed the same order (you included FirstName, then ID).

Is this really fruitful to improve query performance?

As per my understanding, the columns should be included in the index in the same order in which they are selected, so that it is easier for SQL Server to get the data quickly.

About Pinal Dave

Pinal Dave is a Pluralsight Developer Evangelist. He has authored 11 SQL Server database books and 17 Pluralsight courses, and has written over 3200 articles on database technology on his blog at http://blog.sqlauthority.com. Along with 11+ years of hands-on experience, he holds a Master of Science degree and a number of certifications, including MCTS, MCDBA and MCAD (.NET). His past work experience includes Technology Evangelist at Microsoft and Sr. Consultant at SolidQ.