The Myth

I recently was attending a presentation where the presenter stated that the statistic with stats_id = 1 is always for the primary key (PK). The presenter made this statement based on the behavior that he has seen. However, is it true?

The relationship between index_id and stats_id

The first thing that needs understood is the relationship between indexes and statistics. In the sys.stats documentation, it states that every index has a statistics row with index_id = stats_id, as seen in this screen shot from that article:

Is the index with index_id = 1 special?

So now we need to consider whether there is any special meaning to an index with an index_id of 1? The index_id values are in the sys.indexes documentation, as seen in this screen shot:

Here we can see that index_id 1 means that the index is a clustered index. Note that there is no mention of a primary key. So where did this observed behavior come from? We need to dig a bit further into how SQL Server makes a primary key constraint.

Primary Key constraints

In the constraint documentation, in the remarks section, the second paragraph states that the primary key and unique constraints have an index created. In fact, unique indexes enforce these two constraints. At the top of this article, it discusses when those constraints create clustered or non-clustered indexes:

PK defaults to clustered. If there is a clustered index already, PK defaults to non-clustered. Note that this is the default behavior. You can specify that it should be created as non-clustered.

Now we can understand why the presenter’s observed behavior was that a PK always creates a statistic with stats_id of 1. This is the default behavior if that table doesn’t already have a clustered index.

Testing

Let’s run through a few examples showing these conditions. Let’s create a table with a million rows, and run some tests against it:

Create million row test

Transact-SQL

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

USEtempdb;

IFOBJECT_ID('dbo.Test')ISNOTNULLDROPTABLEdbo.Test;

CREATETABLEdbo.Test (

Col1INTEGERNOTNULL,

Col2INTEGERNOTNULL);

GO

WITHTens (N)AS(SELECT1UNIONALLSELECT1UNIONALLSELECT1UNIONALL

SELECT1UNIONALLSELECT1UNIONALLSELECT1UNIONALL

SELECT1UNIONALLSELECT1UNIONALLSELECT1UNIONALLSELECT1),

Hundreds(N)AS(SELECT1FROMTenst1,Tenst2),

Millions(N)AS(SELECT1FROMHundredst1,Hundredst2,Hundredst3),

Tally (N)AS(SELECTROW_NUMBER()OVER(ORDERBY(SELECTNULL))FROMMillions)

INSERTINTOdbo.Test (Col1,Col2)

SELECTN,NFROMTally;

Now that we have the test data, we will run a simple query to create a statistic:

Statistic with no PK

Transact-SQL

1

2

3

4

5

6

7

8

9

SELECT*

FROMdbo.Test

WHERECol2=3;

SELECTOBJECT_NAME(object_id)ASObjectName,

nameASStatsName,

stats_id

FROMsys.stats

WHEREOBJECT_ID=OBJECT_ID('dbo.Test');

This shows the following statistic:

This doesn’t say anything about the PK being stats_id 1 since there isn’t a PK. So, let’s create a clustered index, then add a PK:

CI and PK

Transact-SQL

1

2

3

4

5

6

7

8

9

10

-- add a CI and a PK

CREATECLUSTEREDINDEXIX1ONdbo.Test (Col2);

ALTERTABLEdbo.TestADDCONSTRAINTPK_TestPRIMARYKEY(Col1);

-- look at the stats again

SELECTOBJECT_NAME(object_id)ASObjectName,

nameASStatsName,

stats_id

FROMsys.stats

WHEREOBJECT_ID=OBJECT_ID('dbo.Test');

Here we can see that the clustered index (IX1) does have stats_id of 1, and the PK has stats_id of 3. This proves that when there is an existing clustered index, creating the PK will create a statistic with a stats_id other than 1.

Sidebar…

If you were to look at sys.indexes, you would see that these two indexes use index_id values of 1 and 3. The value 2 is skipped. It’s not because there used to be an index that was deleted after the index_id 3 index was created. It’s simply because of the relationship that index_id = stats_id, and there is already a statistic with stats_id = 2. When creating the index for the primary key, index_id 2 had to be skipped.

Now let’s recreate the test table (run the above code), and then add a non-clustered PK.

non-clustered PK

Transact-SQL

1

2

3

4

5

6

7

8

ALTERTABLEdbo.TestADDCONSTRAINTPK_TestPRIMARYKEYNONCLUSTERED(Col1);

-- look at the stats again

SELECTOBJECT_NAME(object_id)ASObjectName,

nameASStatsName,

stats_id

FROMsys.stats

WHEREOBJECT_ID=OBJECT_ID('dbo.Test');

Which returns:

This shows that the non-clustered PK (index_id > 1) created a statistics with a corresponding stats_id that is also > 1.

The only thing left to prove is that creating a PK where there is not a clustered index, and allowing the use of defaults, will create a stats_id = 1:

PK with defaults and no CI

Transact-SQL

1

2

3

4

5

6

7

8

9

ALTERTABLEdbo.TestDROPCONSTRAINTPK_Test;

ALTERTABLEdbo.TestADDCONSTRAINTPK_TestPRIMARYKEY(Col1);

-- look at the stats again

SELECTOBJECT_NAME(object_id)ASObjectName,

nameASStatsName,

stats_id

FROMsys.stats

WHEREOBJECT_ID=OBJECT_ID('dbo.Test');

This shows us that the statistic for the PK has a stats_id of 1:

Summary

When creating a primary key, it creates an underlying index and statistic for that index. The index will be the clustered index on the table if there is not a clustered index on the table already, and if the statement to create the primary key did not specify non-clustered. If there is an existing clustered index on the table, or it was specified to create a non-clustered PK, then a non-clustered index is created for the PK. The clustered index will have an index_id (and stats_id) of 1, and all other non-clustered indexes will have a >1 value.

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.