LessThanDot


The question is often asked: when will SQL Server statistics update if auto update stats is enabled?

The short answer: When auto update stats is enabled in a database, statistics will update when 20% + 500 rows have changed in the table. This change can be adding new rows, removing rows or updating rows.

If you attend sessions or read many tuning articles that involve statistics on the internet, you may have seen the statement, “20% + 500 rows” more than a few times. Some related information on when the 20% + 500 does actually come into play and how the cardinality of the table plays a role can be found in KB 195565, “Statistical maintenance functionality (autostats) in SQL Server”.

Specifically, a section extracted from that KB article:

The basic algorithm for auto update statistics is:

If the cardinality for a table is less than six and the table is in the tempdb database, auto update with every six modifications to the table.

If the cardinality for a table is greater than 6, but less than or equal to 500, update statistics every 500 modifications.

If the cardinality for a table is greater than 500, update statistics when (500 + 20 percent of the table) changes have occurred.
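The thresholds above can be sketched as a single expression. This is a minimal illustration only: the @cardinality value is a made-up row count, and the tempdb special case is omitted.

T-SQL

DECLARE @cardinality BIGINT = 11000;  -- hypothetical table row count

SELECT CASE
           WHEN @cardinality <= 500 THEN 500                      -- small tables: every 500 modifications
           ELSE 500 + CAST(@cardinality * 0.20 AS BIGINT)         -- larger tables: 500 + 20 percent
       END AS [Modifications before auto update];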

The best way to look at this is to give it a try and see if the statement is accurate.

Taking a closer look

Statistics are the lifeline of the optimizer's ability to generate an efficient method for retrieving data. The optimizer bases its cost estimates on statistics, in the form of the estimated amount of data that will be retrieved. This can mean the difference in operations such as the choice of physical join operator, the introduction of sorts, or an inadequate estimate of memory allocation needs. If statistics are outdated or missing, execution plans can be inaccurate and cause severe performance issues.

Since statistics are so critical to cost estimation and plan generation, knowing how and when statistics are updated is just as critical. This knowledge isn't trivia about SQL Server internals; it gives the person in charge of maintaining a database the power to maintain statistics correctly.

Given that statistics will be automatically updated when 20% + 500 rows have changed, we can estimate, based on the total number of rows or the expected growth, when statistics may become outdated and cause a potential issue. Imagine a table that has 5,000 rows of data in it, and that data changes often. Here, 5,000 * .2 + 500 = 1,500 rows would have to change before statistics would be updated. 1,500 out of 5,000 isn't a great deal of data when the table is changing at a high rate. Now, think of a table that has 1,000,000 rows in it. That equates to 1,000,000 * .2 + 500 = 200,500 rows before statistics will update. 200,500 is a much larger number; if the table reaches 190,000 changes quickly but then takes a long time to reach 200,500, we potentially have an issue, and the statistics could poorly represent the data in the table for an extended period.
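To see how close a given table is to that threshold, a hedged example using sys.dm_db_stats_properties (available in SQL Server 2008 R2 SP2 / SQL Server 2012 SP1 and later) can compare the modification counter against 20% + 500. The column aliases are illustrative.

T-SQL

SELECT OBJECT_NAME(s.object_id) AS [Table],
       s.name                   AS [Statistic],
       sp.rows,
       sp.modification_counter,                          -- changes since the last statistics update
       sp.rows * 0.20 + 500     AS [Approximate threshold]
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1;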

20% + 500

Let’s run an example to see if the 20% + 500 figure really is accurate. To monitor the update of statistics, Extended Events will be used. This is also a great way to monitor your systems for auto update stats firing heavily, which can be a reason to consider turning off auto update stats.
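A minimal sketch of such an Extended Events session, assuming a ring_buffer target; the session name is illustrative.

T-SQL

CREATE EVENT SESSION [MonitorAutoStats] ON SERVER
ADD EVENT sqlserver.auto_stats           -- fires when statistics are automatically created or updated
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [MonitorAutoStats] ON SERVER
STATE = START;
GO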

The test table now has 11,000 rows in it. We should be able to determine how many rows would need to change before statistics would automatically update by running the following statement.

Note: when the 20% + 500 row change count is reached, the statistics are only flagged to be updated. The actual update occurs the next time a query is issued that needs the statistics. This is a key piece of information when data is updated often but seldom read.

T-SQL

SELECT COUNT(*) * .20 + 500 [When will they update?] FROM statsupdate

This means that 2,700 rows would need to change before statistics will update. Of course, we do not have any statistics on the table at this point, because there is no clustered or nonclustered index and the table has not yet been queried. To create some statistics to monitor, create the following nonclustered index.

T-SQL

CREATE INDEX IDX_Col1 ON statsupdate (Col1)
INCLUDE (ID)
GO

Viewing the statistics area in Object Explorer in SSMS, we can see the statistics were created.

First, review the execution plan from the above query. Check the estimated versus the actual rows that were returned.

As shown, the estimated and actual are equal in this execution.
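The row-modification step itself is not reproduced here; a hedged sketch of crossing the threshold, assuming the table and column names from the index above, could look like the following. Whether an update that writes the same value counts toward the modification counter can vary, so treat this as illustrative.

T-SQL

UPDATE TOP (2800) statsupdate
SET Col1 = Col1;          -- more than the 2,700-row threshold
GO
SELECT COUNT(*) FROM statsupdate
WHERE Col1 IS NOT NULL;   -- a read that needs the statistics triggers the deferred update
GO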

Now, check to see if the auto_stats event was indeed captured.
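One way to inspect captured events, assuming an Extended Events session named MonitorAutoStats with a ring_buffer target (both names are illustrative):

T-SQL

SELECT CAST(t.target_data AS XML) AS captured_events   -- raw event XML from the ring buffer
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON s.address = t.event_session_address
WHERE s.name = 'MonitorAutoStats'
  AND t.target_name = 'ring_buffer';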

As shown, we did indeed trigger an automatic update of the statistics by reaching the change count of 2,700 rows in our test table (20% + 500).

Summary

Statistics are a crucial factor in how SQL Server and the optimizer arrive at the execution plan used to retrieve data. Knowing how configurations such as auto update statistics affect how statistics are updated and stored, and what statistics are composed of, is just as crucial, so that we can maintain them, make better decisions based on them, and troubleshoot them when potential performance issues arise.


About the Author

Ted Krueger is a SQL Server MVP and author who has been working in development and database administration, and is the owner of a successful consulting business, DataMetrics Consulting. Specialties range from high availability and disaster recovery setup and testing methods to custom assembly development for SQL Server Reporting Services. Ted blogs and is also one of the founders of the LessThanDot.com technology community. His articles focus on backup/recovery, security, SSIS, and using all of the SQL Server features available to create stable and scalable database services.