Constraints – Don’t make them DEFERRABLE or NOVALIDATE unless you need to. December 14, 2007

Back when Oracle8 was released, Oracle introduced a number of new features with regard to constraints.

The first was the option of making a constraint DEFERRABLE, meaning the policing of a constraint can be deferred until the issuing of the COMMIT, rather than being performed during the execution of each individual statement. This gave application developers more freedom in how they designed code, particularly with regard to the order in which parent and child data is inserted and manipulated.
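For example (the table, column and constraint names here are purely illustrative), a deferrable foreign key allows a child row to be inserted before its parent, with the check postponed until commit:

```sql
-- Illustrative schema only
CREATE TABLE parent (
  id NUMBER CONSTRAINT parent_pk PRIMARY KEY
);

CREATE TABLE child (
  id        NUMBER CONSTRAINT child_pk PRIMARY KEY,
  parent_id NUMBER CONSTRAINT child_fk REFERENCES parent(id)
            DEFERRABLE INITIALLY DEFERRED
);

-- With the FK deferred, the child row can be inserted first ...
INSERT INTO child VALUES (1, 100);
-- ... provided the parent row exists by the time we commit
INSERT INTO parent VALUES (100);
COMMIT;  -- the deferred constraint is checked here, and passes
```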

The second new option was the ability to enable a constraint with NOVALIDATE, meaning Oracle would enable the constraint but not bother to check existing data to ensure nothing violated the constraint.

This could be useful in a number of scenarios. For example, you have data that currently violates the constraint but have an urgent business requirement to enable the constraint ASAP to prevent further violations, with the intention of cleaning up the existing violations at some future time.

Or you know the data is OK, so to reduce the overheads associated with enabling the constraint (eg. reading all the data to check for violations), you enable the constraint “immediately” with NOVALIDATE, bypassing the “redundant” checking.
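As a sketch of both steps (table, column and constraint names are invented for illustration):

```sql
-- Stop any *new* violations immediately, without checking existing rows
ALTER TABLE bowie
  ADD CONSTRAINT bowie_flag_chk CHECK (flag IN (1, 2))
  ENABLE NOVALIDATE;

-- Later, once any old violations have been cleaned up, validate the lot
ALTER TABLE bowie MODIFY CONSTRAINT bowie_flag_chk VALIDATE;
```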

Both deferrable and novalidate constraints therefore imply there “could” be data at any given point in time that violates the constraint. Therefore Oracle also introduced the ability to have non-unique indexes (rather than unique indexes) policing either PK or Unique constraints. For deferrable or novalidate constraints, the index must in fact be non-unique, as a unique index would prevent any such (temporary) violations of PK or Unique constraints.
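To illustrate (again with invented names), a pre-existing non-unique index can police a subsequently created deferrable PK:

```sql
-- A non-unique index created up front ...
CREATE INDEX bowie_id_i ON bowie (id);

-- ... can police a deferrable PK; Oracle will use the existing
-- non-unique index rather than creating a unique one (a deferrable
-- PK in fact requires a non-unique index)
ALTER TABLE bowie
  ADD CONSTRAINT bowie_pk PRIMARY KEY (id)
  DEFERRABLE;
```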

Now, there are a number of interesting and subtle differences in the manner in which Oracle manages and processes a Unique vs. a Non-Unique index (eg. the amount of storage they use, the amount of redo they generate, the number of latches they acquire). This will be discussed in another Blog entry some other day.

Today, I just want to focus on a couple of interesting little side-effects with regard to how the CBO deals (or doesn’t deal) with NOT NULL and CHECK constraints that have been created as Deferrable or Novalidate.

In 9i, the CBO was clever enough to know that if someone searched for a NULL value but the column had a NOT NULL constraint, there couldn’t possibly be any data matching the criteria. Providing you had an index on the column, the CBO would generate an execution plan that used the index, found no NULL values and returned an empty row set accordingly. If you had no index, the CBO would be forced to use a Full Table Scan. So the CBO actually used an index in an efficient manner to search for non-existent nulls.

BUT, if the NOT NULL constraint was either deferrable or novalidated, then Oracle couldn’t know there were no nulls; there just might be some. Therefore, Oracle was forced into the FTS regardless of the existence of the constraint or index, as null values are not indexed (unless part of a concatenated index).

Since 10g, the CBO has become smarter. The NOT NULL example works in a very similar manner, except that the index is no longer required. If one searches for a NULL value on a column that has a NOT NULL constraint, the CBO automatically determines there can be no matching rows and returns the empty row set immediately with no LIOs. None, as accessing the data is simply not necessary.

BUT again, it can do so if and only if the NOT NULL constraint is validated and nondeferrable; otherwise the CBO can’t guarantee there are no nulls.
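A sketch of what this looks like in practice (object names invented; the exact plan output will vary by version and platform):

```sql
-- Assuming CODE has a validated, non-deferrable NOT NULL constraint
SELECT * FROM bowie WHERE code IS NULL;

-- A typical 10g plan places a FILTER with the predicate
-- NULL IS NOT NULL at the top. That predicate is always false,
-- so the table access below it is never performed: 0 consistent gets.
--
-- | Id  | Operation           | Name  |
-- |   0 | SELECT STATEMENT    |       |
-- |*  1 |  FILTER             |       |
-- |   2 |   TABLE ACCESS FULL | BOWIE |
--
--   1 - filter(NULL IS NOT NULL)
```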

Although we actually have applications that intentionally search for nulls on NOT NULL columns to return empty row sets, it’s not common that an application would perform such a search.

What is much more common is searching for a column value that simply doesn’t exist. If a column value doesn’t meet a business rule, it’s a good idea to police such business rules with Check constraints. 10g has extended the NOT NULL scenario to include Check constraints. If a query searches for a column value that violates a check constraint, Oracle will immediately return an empty row set without performing any LIOs.

But once again, it can only do so if the check constraint has been validated and set as nondeferrable.
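For illustration (names invented; the exact behaviour can vary by version and platform):

```sql
-- FLAG is constrained to values 1 and 2 (validated, non-deferrable)
ALTER TABLE bowie
  ADD CONSTRAINT bowie_flag_chk CHECK (flag IN (1, 2));

-- A search for a value the constraint makes impossible ...
SELECT * FROM bowie WHERE flag = 3;

-- ... can be answered from the constraint alone: the CBO adds an
-- always-false FILTER to the plan and performs no consistent gets
```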

Making constraints deferrable or enabling them with novalidate can be useful. However, if possible, ensure constraints are non-deferrable and validated, as this provides the CBO with additional information regarding the columns that it might just put to good use.

Additionally, unless there’s a requirement to the contrary, use unique indexes rather than non-unique indexes to police uniqueness. But that’s a discussion for another day …

The index doesn’t really “enforce” the PK (or Unique) constraint as such. The index is there so that Oracle can efficiently determine whether the PK has been violated or not.

Let’s say we have a table called BOWIE and it has a column called ID which has a PK constraint. The table has 1 million rows. We now insert a new row with an ID of value 1234567. How can Oracle determine whether this value already exists in the table, and hence whether the insert violates the constraint?

Well if we didn’t have the index on the ID column, Oracle would have no choice but to perform a full table scan and check each and every ID value, ensuring none of them has a value 1234567. This would obviously be a very expensive operation and it would have to perform this FTS after each and every insert (or update on the PK).

No way. Oracle does us a favour and says if you want a PK or Unique constraint, you must have an index to make the process of checking for any possible violations nice and efficient. For with an index, Oracle simply has to navigate through the branch levels until it gets to the corresponding leaf block to determine whether the new (or updated) value exists or not.

If a suitable index doesn’t already exist, Oracle will automatically create one for you as it makes no sense not to have one (database performance would be non-existent).
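As a quick sketch (table and constraint names invented), the automatically created index is visible in the data dictionary:

```sql
ALTER TABLE bowie ADD CONSTRAINT bowie_pk PRIMARY KEY (id);

-- No suitable index existed, so Oracle silently created a
-- unique index (named after the constraint) to police the PK
SELECT index_name, uniqueness
FROM   user_indexes
WHERE  table_name = 'BOWIE';
```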

Note it only has to check for the existence or otherwise of the newly inserted ID. Therefore the index it uses to perform this check could potentially itself be either a unique or non-unique index.

I think if you’re going to give this sort of advice, you should be upfront about the associated costs. In this case, the advice to use non-deferrable constraints means that indexes will be dropped without any word of warning if you ever seek to disable the constraint.

Similarly, the cost of following your advice to use unique indexes is: once you’ve issued a ‘disable constraint’ command, the constraint actually WON’T be disabled, unless you again don’t mind running off to drop some potentially very large indexes.

And the real kicker is that re-enabling the constraint or rebuilding those indexes will take exclusive table locks, perform a lot of physical I/O (probably) and clobber your buffer cache to death in the process.

Weigh that lot against the possible benefit to a few specific queries relying on a particular behaviour of the CBO and I’d suggest that the balance of convenience falls heavily on the use of non-unique indexes and thus deferrable constraints as a general rule.

I would certainly see no problem with advice along the lines of ‘replacing a deferrable with a non-deferrable constraint might prove a useful tuning technique for specific queries’. Anything more sweeping than that, however, doesn’t, I would suggest, get to the heart of the issue.

(PS: my comments relate to deferrable unique and primary key constraints, obviously… I realise you were writing about NOT NULLS and CHECK constraints, too. I think you’re right about them, but as I say, I think the general advice not to use deferrable constraints has to be modified when it comes to primaries and uniques.)

All very good and valid points. And indeed, if one were likely to consider disabling such constraints it would be something worth considering.

Now your mileage may vary but seriously, how often do you disable PK constraints? I can honestly say I can’t remember the last time I’ve had to disable or drop a PK constraint (or Unique constraint for that matter). It’s simply something I haven’t had to do or worry about; we just don’t have business requirements that require PKs to be disabled/dropped, and I look after many databases and applications. If I ever did, and I was worried about dropping my index, I’d use the KEEP INDEX clause, right …
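For example, the KEEP INDEX clause mentioned here (table and constraint names invented):

```sql
-- Disable the PK constraint but hang on to its index
ALTER TABLE bowie DISABLE CONSTRAINT bowie_pk KEEP INDEX;

-- Or, when dropping the constraint outright:
ALTER TABLE bowie DROP PRIMARY KEY KEEP INDEX;
```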

Now if one were in a similar environment, where disabling / dropping PK constraints is an extremely rare event, then perhaps the disadvantages of using non-unique indexes may have significantly more importance than the advantages.

Perhaps ” you should be upfront about the associated costs” with non-unique indexes ;)

For indeed non-unique indexes have associated costs and disadvantages.

For a start, a non-unique index may use up to 5-6% more storage than its unique equivalent. Now that may not sound like much, but that’s a potential saving of 5-6% by simply using a unique index, without any expensive periodic rebuilds, for the entire life of the index.

Perhaps even more significant is the difference in latching between a unique and a non-unique index. A non-unique index requires more than double the associated latches when accessed than a unique index does. That’s more than doubling potential contention issues, each and every time an index is accessed, when using a non-unique index vs. a unique index.

There are other subtle disadvantages as well …

Like I said, your mileage may vary, but I would much much rather configure environments such that performance can be optimised as easily and simply as possible for “normal” business critical database activities, rather than potentially reducing performance and scalability unnecessarily to cater for a specific event that may only occur as frequently as the Miami Dolphins winning the Superbowl.

Indexes get accessed and modified continuously, PK constraints very very rarely (if ever) get disabled or dropped (certainly in my experience). If you think it’s just “the possible benefit to a few specific queries” you’re missing the point.

The last point I would make is that the advantages of using non-unique indexes, as you correctly highlight, are well documented and understood by most DBAs. However, the disadvantages are not quite so well documented and understood, and if I can raise awareness of some of those issues as I plan, then all well and good.

In fact, I’m responsible for various large scale, high availability databases and applications that, quite simply, would put real people’s lives at real risk if they ever became unavailable or unusable.

It doesn’t get any more real than that…

I appreciate you still sweat over the time you accidentally dropped an index when you disabled a constraint but we simply can’t afford to make those types of mistakes. We have strict procedures and processes in place as well as a clear understanding of the consequences of these types of actions. We simply can’t afford to assume re-enabling a constraint is a 5 second job if it isn’t and must test and schedule such operations accordingly.

You may consider I’m perhaps in danger of losing touch with practical reality, but all the advice I provide is based on, and applied in, my “real-life” practical experiences. In short, I practice what I preach in real, mission critical, large scale environments each and every day, and I’ve done so for many years.

Now you may wish to advise creating all constraints as deferrable as it saves typing and will automatically create non-unique indexes that you can’t accidentally drop. I totally disagree with such advice but that’s fine.

However I fundamentally take exception to the suggestion that understanding the consequences of dropping constraints, understanding and applying the advantages of unique indexes, avoiding the unnecessary use of deferrable or novalidate constraints, understanding and recognising indexes that need rebuilding, etc. etc. are somehow nostrums when applied to the real world.

Updated: I’ve removed a reference to the comparison of practical experience between Howard and me, with apologies to Howard for making it.

What OS are you using? I reran the tests in your ‘Check Constraints with 10g’ demo and got different results. When I queried using the ‘FLAG = 3’ filter, I still got 23 consistent gets and no ‘FILTER’ operation in the execution plan.

Henry, this is precisely why I like to show examples and demos, so others can determine for themselves whether or not something works in their environment, whether something can be expanded or tested further, etc. etc.

Well, I’m afraid I do disable constraints quite a bit. I won’t say it’s something that a good application should do, but then this isn’t a good application! It is, however, a real world one.

We also do a lot of poor-quality data loads: often the quickest way of tracking down the bad records is to quickly disable the constraint, do the load, re-enable the constraint and then check out the exceptions table. I had to do that about a dozen times last week, actually. Again, the real fix is to improve the feed in the first place… but I only manage databases and systems, not wave magic perfectibility wands!
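That workflow might be sketched as follows (table and constraint names invented; the EXCEPTIONS table is the standard one created by Oracle’s utlexcpt.sql script):

```sql
-- Create the standard exceptions table first:
-- @?/rdbms/admin/utlexcpt.sql

ALTER TABLE bowie DISABLE CONSTRAINT bowie_pk;

-- ... perform the data load ...

-- Re-enabling fails if violations exist, but the offending
-- rowids are recorded in the exceptions table first
ALTER TABLE bowie
  ENABLE CONSTRAINT bowie_pk
  EXCEPTIONS INTO exceptions;

-- Track down the bad records
SELECT * FROM bowie
WHERE  rowid IN (SELECT row_id FROM exceptions);
```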

And I don’t have latch contention to speak of (latches being automatically managed these days, I’d hope not, too!)

Again, it’s a question of balance of costs/benefits. The frequency with which primary keys (say) get disabled may be quite low, or very low… but the cost of an inadvertent index drop is likely very high. Meanwhile, the frequency of index access and modification is no doubt very high, but the costs associated with extra latching or 6% extra blocks are likely (I suspect!) negligible.

Unless they aren’t, of course, in which case: fine, your advice is sensible.

It’s like seatbelts (legal requirements aside): how often have you crashed your car? A rare event (I hope!), yet the costs if it ever happens are huge. Therefore, you go to the slight inconvenience of buckling up each time, just in case.

I just think that most databases, most of the time, will find the rare potential costs of unique indexes huge and unpalatable and the common potential costs of non-unique indexes essentially bearable and of no major consequence. There will always be exceptions to that, but I think strong advice to do or not do something should be tailored at the generality, not the exception.

We could debate the proposition that non-uniqueness ought to be “the generality case” until the cows come home, of course, because it largely depends on personal experiences which no two DBAs are likely to interpret the same way; but the fact that such debate is possible is why I baulk at any advice that appears to mandate or proscribe certain things in a fairly didactic manner.

Well, that’s fine and dandy. How many on your team, Richard? How big’s your IT budget?

Right, now divide by 10 or more. *That’s* the people I have in mind. Places that don’t use words like “mission critical”, but know that having the database down during the day means lost business they can’t afford to lose.

I advise creating constraints deferrable by default unless there are good reasons not to, because it is the simplest, most automatic, least-hassle way of achieving a desirable outcome. It’s got nothing at all to do with saving a bit of typing (a disingenuous ‘dig’ on your part, I might add).

You shouldn’t, either, take exception to figments of your own imagination. The nostrums I was talking about were short, sharp, didactic advice that misses the subtlety of things and the demands of the real world. Which means all those things you list as me having called a ‘nostrum’ never got included in that description in the first place.

Your last two sentences are just very sad. You mean you’ve run out of technical arguments to advance, so you just have to resort to personal attack? Feeble.

I read both the blogs (Richard’s and Howard’s), and I sometimes wonder why people get so protective about their arguments. Why can’t they agree to disagree and not get personal in their remarks? Makes me wonder….

If you look at Henry’s comment in this thread (number 7), he got the same results as you using 10.2.0.3 on RedHat AS4. Looks like Oracle on Linux works differently in this regard than on other platforms.

Dale,
are you thinking about negative or positive performance implications?
To give an example with positive impact: We run a data synchronization interface between two databases. After passing through a staging area, all staged data gets merged into existing tables that are modeled with foreign keys.
The FK constraints are deferrable (check constraints, btw, aren’t); primarily, to import all the data on an “all-or-nothing” basis, but secondly, run time of the batch process is better with all the FK constraints checked in bulk at commit time.
So far we haven’t run into any issues with sub-optimal SQL plans related to deferred FKs.
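A sketch of that pattern (all object names invented):

```sql
-- Defer every deferrable constraint for this transaction
SET CONSTRAINTS ALL DEFERRED;

-- Merge staged rows in whatever order is convenient;
-- parent/child ordering no longer matters mid-transaction
MERGE INTO child_table t
USING staging_child s
ON (t.id = s.id)
WHEN MATCHED THEN
  UPDATE SET t.parent_id = s.parent_id
WHEN NOT MATCHED THEN
  INSERT (id, parent_id) VALUES (s.id, s.parent_id);

COMMIT;  -- all deferred FK constraints are checked in bulk here
```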

Could someone please tell me what the difference is between making a constraint deferred and making a constraint disabled?

When will an index be automatically created by the system for a column? And when we disable the constraint, will that index get dropped? After re-enabling the constraint, can we recreate the index for the column?

Brian has already answered the difference between a disabled and a deferrable constraint.

Very simplistically, an index is created when a constraint is created/enabled providing an index doesn’t already exist that could be used. An index is dropped when a constraint is dropped/disabled, providing it’s a unique index.