Tuesday, January 03, 2012

In SQL Server, when you drop a column you would expect that you will now see the space used by that column to be free. But that is not the case as ALTER TABLE DROP COLUMN is essentially a meta operation. SQL Server marks the column as dropped. But at the physical level, no data is removed.

Infact, even after you drop a column, SQL Server will continue to use space for the dropped column.

Now the solution to this is to do an index rebuild. Or if the column being dropped is a variable length column, you could do a "DBCC CLEANTABLE". But for fixed-length columns, the only solution is to CLUSTERED INDEX REBUILD. But doing a CLUSTERED INDEX REBUILD on a huge table could take a very long time + it needs additional space. Also, if you are not on ENTERPRISE edition of SQL Server, you cannot perform ONLINE rebuild. So basically your table will be blocked for the duration of the index rebuild.

Solution if your table is partitioned and you dont want to rebuild the entire clustered index:
1. If your table is already partitioned, you can simply do a clustered index rebuild on the current/future partitions. This will ensure that all rows going forward will not take space for the dropped column.

Solution if your table is not partitioned and you dont want to rebuild the entire clustered index:
1. You can convert your table into a partitioned table. You would partition on the column on which you have your clustered index. This will only work if your clustered index is on a incrementing int/bigint/datetime field.
ie, create a staging table with the new structure; swap in the existing table into the staging table; rename the existing table to something else; rename the staging table into the existing table.
2. Now you can rebuild the clustered index on the latest partition (rather than the whole partition).

The above solutions are only valuable/useful if you are working with a really huge table (billions of rows). For anything else, it may not make any sense.