STATISTICS_NORECOMPUTE – When would anyone want to use it?

Question: Recently I noticed an option with CREATE/ALTER INDEX called STATISTICS_NORECOMPUTE. I'm not sure I understand this option or why you'd ever want to use it. Can you explain?

Answer: In general, I don't recommend this option. But, before I get to why - let me clarify what the option does. I've heard a few folks say that it stops SQL Server from updating the statistics at the time of a rebuild; that is NOT what it does. Statistics are ALWAYS updated at the time of a REBUILD; this cannot be turned off (nor would you want to).

Instead, STATISTICS_NORECOMPUTE stops the database-wide auto-updating (auto update statistics) from updating the specific statistics for an index that was created (or rebuilt) with this option turned on. NOTE: STATISTICS_NORECOMPUTE only works when creating or rebuilding an index. If you want to stop the auto-updating of a column-level statistic, you use NORECOMPUTE when creating that particular statistic.
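To make the two variations concrete, here's a minimal sketch (the table, index, and statistic names are hypothetical):

```sql
-- Disable auto-updating for the statistic tied to an index, at creation:
CREATE NONCLUSTERED INDEX IX_Sales_CustomerID
    ON dbo.Sales (CustomerID)
    WITH (STATISTICS_NORECOMPUTE = ON);

-- Or on an existing index, at rebuild time:
ALTER INDEX IX_Sales_CustomerID ON dbo.Sales
    REBUILD WITH (STATISTICS_NORECOMPUTE = ON);

-- For a column-level statistic, use NORECOMPUTE when creating it:
CREATE STATISTICS st_Sales_ProductID
    ON dbo.Sales (ProductID)
    WITH NORECOMPUTE;
```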

The more difficult question is if/when it makes sense to turn this off – for either an index or a column-level statistic. And, yes, there are some cases where this can be appropriate.

The case where it makes sense to turn this off is for tables that are large in size and have very skewed data distribution. For example, the ratio of line items per sale might be very evenly distributed between 1 and 3 items with an average of 2.1. Evenly distributed data tends to be easier to represent in a histogram (one of the pieces of information maintained in a statistic in SQL Server). However, sales per customer or sales per product might be much more uneven (and therefore skewed). And, when data distribution is skewed, then the statistics are potentially less accurate in a histogram. This is further exacerbated by the process that SQL Server may choose of sampling your data. SQL Server will choose to sample your data (rather than reading every row) when the data it has to scan (ideally, an index but sometimes the table itself) is over 2,000 pages in size.
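If you want to see whether a given statistic was built from a sample (and how aggressive the sample was), the sys.dm_db_stats_properties DMF exposes the row counts. A quick sketch (the table name is hypothetical):

```sql
-- Compare rows vs. rows_sampled for every statistic on a table:
SELECT s.name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       CAST(100.0 * sp.rows_sampled / sp.rows AS decimal(5, 2)) AS sample_pct
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Sales');
```

A sample_pct well under 100 on a skewed column is exactly the situation described above, where the histogram may misrepresent the data.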

It’s further complicated because there are statistics on indexes as well as column-level statistics. So, I thought I’d put together a quick table to help you understand the combinations:

| Statistics   | Creation         | Rebuild  | Reorganize  | Updating         |
|--------------|------------------|----------|-------------|------------------|
| Column-level | Can use sampling | n/a      | n/a         | Can use sampling |
| Index        | FULLSCAN         | FULLSCAN | Not updated | Can use sampling |
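And, if you're not sure which statistics on a table (if any) already have auto-updating turned off, the sys.stats catalog view exposes a no_recompute flag. A quick check (the table name is hypothetical):

```sql
-- List a table's statistics and whether auto-updating is disabled:
SELECT s.name,
       s.auto_created,
       s.no_recompute
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'dbo.Sales');
```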

When would you know it’s a problem?
If you run a query and you compare the estimated rows vs. the actual rows, and the estimate is significantly different from the actual, then you might have a statistics problem. Update the statistics with the following code:

UPDATE STATISTICS tablename indexname (or statisticsname)

Then, try the query again. If this solves the problem, then you know that you're not updating statistics often enough. If it does not solve the problem, then try updating the statistics with a fullscan:

UPDATE STATISTICS tablename indexname WITH FULLSCAN

If the estimates and actuals are now more accurate (and typically result in better plans), then you might want to turn off the auto-updating of that statistic and solely control its updates with a job that updates the statistic with a full scan. The positive is that your statistics are likely to be more accurate, and then your plans will be as well. The negative is that updating a statistic with a full scan can be a more costly maintenance operation. And, you have to be especially careful that your automated process doesn't get removed or disabled, leaving this statistic never to be updated again.
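The statement your scheduled job runs might look like this (the table and index names are hypothetical). One detail worth calling out: running UPDATE STATISTICS without the NORECOMPUTE option re-enables automatic updating, so a job maintaining a statistic you've deliberately turned off should repeat NORECOMPUTE:

```sql
-- Full-scan update that keeps auto-updating disabled:
UPDATE STATISTICS dbo.Sales IX_Sales_CustomerID
    WITH FULLSCAN, NORECOMPUTE;
```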

In summary, I don’t often recommend STATISTICS_NORECOMPUTE on indexes (or NORECOMPUTE on statistics) but there are cases when you might be able to better control the updating of statistics as well as allow for a FULLSCAN. In these cases (and when you’re certain that your automated routines will not be interrupted/deleted/modified), then using this option can be beneficial!