I had this case today when I had to add a new column to a table with a gazillion rows and also create an index on this column. In my mind, having a column with NULL values would make the creation of the index instantaneous. I was wrong! In order to create the index, the SQL engine still scans all the rows and for tables with a lot of rows it takes a long while. But it felt really stupid. I knew that the column was filled with NULL values, I didn't need the computing of any index when I create it, instead only on INSERT/UPDATE/DELETE operations. So I started to look into solutions.

Enter filtered indexes! In Microsoft SQL Server 2008 an option for filters on indexes was introduced. The index must not be clustered and the definition it just the same as before, only with a WHERE clause that applies the filter. This seems to be the right solution to my scenario.

In order to test this I created a table called Test with two columns, a and b, both nvarchar(255). I filled the table with ten million rows having values for a and no values for b. Then I created an index on b; it took about 30 seconds. Then I created an index on a; it took 50 seconds. I removed the indexes and created a filtered index on b on the condition that b is not null. The operation was instantaneous. Success! And here are the actual operations with more exact values (check out the comments for extra SQL tips on speed):

CREATETABLE Test(a NVARCHAR(255),b NVARCHAR(255))

DECLARE @i INT = 625000 -- ten million divided by 16

BEGINTRAN-- if not using a transaction, SQL will create a transaction per insert!

WHILE @i>0BEGIN

SET @i=@i-1INSERTINTO Test(a) VALUES(CAST(@i AS NVARCHAR(255)))

END

COMMITTRAN-- 625000 rows with a filled and b empty [8 seconds]

-- insert selects are a lot faster than the while aboveINSERTINTO Test(a)SELECT a+'x1'FROM Test -- 1250000 total rows [3 seconds to create another 625000 rows]

A lot of good things can come from using filtered indexes, like for example a unique index on non-null values (which was pretty much a pain in the ass to do otherwise), but there are also gotchas. One of the things you have to watch out for is using filtered indexes on numeric columns. In this case the SET ARITHABORT ON command must be used (or insure in some other way that the option is on for all SQL sessions - SQL Management Studio and code both!, otherwise errors might occur or the index might be ignored. Also there seem to be some bugs that have not been addressed even in SQL 2012, like when using MERGE or when using filtered indexes on column being or not being null.