Faster Primary Key creation or enabling

Today, I have seen an application running a script to enable primary key constraint for a long time. The question is how to make it faster. In Data warehousing systems, it is normal that we disable Constraints, drop indexes and re-enable them after loading data. So, making them faster is very important as the other jobs are dependent on the initial data load.

I know this is nothing new for a DBAs, but many developers tends to over look this.

Here is my test case, initially prepared a new table with 7 million records in it.

The normal primary key creation took 6 minutes and 35.15 seconds. To create primary key it took 6.35 minutes!, I felt that is way too high. So, how to improve the constraint creation timings. The primary key creation to got two steps

A unique index creation

Enabling the constraint

The maximum time took was to create the index not to enable the primary key. If we make the index creation faster, all set. So, I decided to break the constraint creation into 2 parts a) Index creation b) Enabling primary key. Considering the type of tables – daily data load, can be re-created easily etc we need not keep recovery is in mind – if we lose the table we will create it. Also, we have enough CPU resources while other jobs are waiting to complete the data load.

As the system resources under utilised, I will use enough parallelism

As the table can be re-crated easily, I will disable redo generation while index creation.

With the above assumtpions in hand, I had created the index and primary key as follows.

Share this:

Like this:

A Primary Key constraint and a Clustered Index is not really the same in SQL Server. By default SQL Server enforces the Primary Key constraint with a Unique Clustered Index. But if you want you could instead enforce it with a Unique Non-Clustered Index. But that approach wouldn’t really make sense as a default, because you need to have specific problems to justify that approach.

It is worth noting that Oracle (11.2.0.4) still reads the whole table serially when it enables a primary key constraint despite the existence of an appropriate unique index. I don’t know why it does this. So on a huge table enabling a constraint is still long enough. But I approve that the whole process is faster when you have pre-created an index in parallel though.

I will check this, not sure why oracle has to scan the table when unique index is existing.

Vasiliy

August 24, 2017 at 3:01 am

By the way, in your example the index is not unique, so Oracle has to scan something to detect duplicates. I bet it scans the table, but it would be cheaper to scan the index. So would it be in the case of the unique index (but still it’d be more expensive than doing nothing).

Vasiliy

August 24, 2017 at 8:31 am

I am sorry. My bad. It was a flawed schema design. I blindly believed that primary key’s target columns had been constrained as not null (but they really should have been). I am currently in the middle of this process. I’ll tell if it doesn’t help (I believe it does).

In my own example when the index is not unique Oracle reads the entire index (FFS) and checks it for duplicates (SORT). As expected.