Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

What RDBMS are you using? SQL Server? This sounds like a good candidate for a filtered index. Does the application really do a full column select (*)? Unless your nonclustered index is covering, you'll have key lookups.
–
Thomas StringerApr 19 '13 at 19:25

A large majority of the records will meet the Null criteria specified. Would I still want to use teh filtered index for this? I thought filtered indexes were most useful if they narrowed down the amount of records by a significant margine.
–
LumpyApr 19 '13 at 19:31

The gains that you will see from having the predicate on the index as well as filtered statistics will depend largely on how filtered it is. Without seeing more information regarding your table and data, it is hard to give exact recommendations. But my comments (and answer) are food for thought.
–
Thomas StringerApr 19 '13 at 19:39

2 Answers
2

Since we are talking about the clustered index, just because you defined the CI key column as ID, you still have the DeletedDate data in the leaf data pages of the index. That's the nature of the clustered index: It is the table data.

I didn't explicitly put the key columns here (and nonkey columns through the use of the INCLUDE clause) because you didn't publish the DDL of your table.

As in my comment above to your question, the choice of key columns (not just columns, but also the order of the columns) will largely depend on your workload and the typical queries that would be using this index.

If you are looking to cover your query(ies), then you would need to ensure that the index satisfies all of the data required of the query(ies). Not to mention, if you have other WHERE clauses (besides your NULL check on DeletedDate) or joins to consider, then the order of your key columns can be the deciding factor between a scan or a seek. And even though it is filtered, and depending on how much data you have in the index, the penalty could be considerable.

Clustered index key performs best to be unique, narrow, static and ever-increasing by itself. So in this case, the inclusion of DeletedDate actually result the clustered key becomes non-unique non-static (presuming the DeletedDate value could be changed).

The INCLUDE in the non-clustered index is useful to cover the query without having to perform a key lookup to the table. However, as the INCLUDE columns only stored in the index leaf level, it does not help in searching for the values of the query predicate (in this case, NULL)

A non-clustered index key of DeletedDate allows a more effective search on the range of records that satisfy the predicate (NULL). A filtered non-clustered index could further narrow down the subset (only NULL records) and provide a better performance as well as storage for the non-clustered index.

With your description showing that the query returns all columns from the table with a single predicate, You could create a single filtered non-clustered index key for DeletedDate with WHERE DeletedDate IS NULL. Test it and examine the execution plan.