A comment by Robert in Part II of this series reminded me of another subtle difference between Unique and Non-Unique Indexes. Now this difference is likely to be of minimal consequence to most applications as most applications don’t generally have problems with Primary Key (PK) or Unique Key (UK) constraint violations (and if they do, this is likely to be the least of their worries). But it’s a interesting difference nonetheless, something to keep in the back of your mind and a little tit-bit to end the year on.

When a row is inserted into a table or when a PK or UK is modified, Oracle of course needs to ensure that either the PK or UK constraint is not violated. If the constraint is policed via a Unique index, as previously discussed, Oracle knows the value must and can only ever be unique and so performs the constraint validation before the Unique index is actually modified. If the PK or UK is violated, the Unique index can not possibly have been changed as all the associated index entries must always be unique and so only the undo (and redo) of the changes associated with the table data blocks are actually generated and need to be subsequently rolled back.

However, if the PK or UK constraint is policed via a Non-Unique index, the mechanism for applying the changes differs somewhat. As the index is Non-Unique, as previously discussed, Oracle is not quite so certain as to the state of play and performs the constraint validation after the associated changes are made to the Non Unique index. If the PK or UK constraint is violated, both undo and redo of the Non-Unique index has been generated and both changes to the table data blocks and the index blocks need to be rolled back.

This means there’s an extra cost associated with violating a constraint if the constraint is policed via a Non-Unique Index vs. a Unique index. When performing media recovery, it also means that there’s an additional cost associated with performing the recovery. Obviously the more frequent the constraint violations, the greater the overall penalties. Also, the larger the PK or UK values, the greater the penalties.

As mentioned, this difference in behaviour between Unique and Non-Unique Indexes is unlikely to be an issue. However, in applications or environments where there may be a significant number of such violations, it may be something to keep in the back of your mind.

To those of you who celebrate Christmas and happen to stumble occasionally across these humble Blog entries, I would just like to wish you all a wonderful, joyful Christmas and a safe and prosperous New Year.

I’ve now been doing this Blogging thing for nearly two weeks, hasn’t time just flown by !!

Reaction to the Blog has generally been really positive so thank-you all for your encouraging comments both on and offline. If you haven’t done so already, please let me know what you think about some of the things I’ve covered thus far. Has it been at all useful, interesting, thought provoking, educational, or has it generally been boring, overlong, impractical and in danger of losing touch with reality.

Are there any specific things or topics you might like covered or discussed in the future or should I just spend more time talking about Radiohead ?

The most significant difference between a Unique and a Non-Unique index is of course the simple fact that in one index, all index entries MUST be unique and in the other index there can be duplicates of an index entry.

Although an obvious distinction between the two, it’s also a crucial difference as well.

When Oracle uses a Unique Index to scan for a specific value (via an equality predicate on all indexed columns or when policing a constraint ), there can only be one of two possible results. The value can exist returning at the very most one value or the value doesn’t exist returning 0 values. That’s it, 1 row or none. The value either exists or it doesn’t.

This fact means Oracle doesn’t have to worry about a whole bunch of things when dealing with Unique indexes during equality or unique checking processes. It doesn’t have to check the next index entry just in case there’s a second or more entries with the same value. It doesn’t have to worry about the potential of having to skip across to the next leaf page if the specific value it reads happens to be the maximum value in the current leaf page. It doesn’t have to worry about pointers to these “adjacent” leaf blocks changing on it due to block splits. It doesn’t have to concern itself with potentially visiting more than the one table data block during the index access operation.

Life is simple, it’s either 1 row or none.

Not so for Non-Unique indexes. With a Non-Unique index, there are no such guarantees. With a Non-Unique index, there are 3 categories of possibilities. An index scan could return 0 rows, it could return 1 row or it could return more than one row. It could potentially need to go and visit more than the current leaf block to return all the matching rows. It could potentially need to go and visit more than one table block.

Life’s not quite so “simple” for a Non-Unique index.

Note also and most importantly that life gets no easier for a Non-Unique index that polices a PK or Unique key constraint.

Even though there’s a PK or Unique constraint on a column, to Oracle, it’s just another Non-Unique index with the same “vague” possibilities. Remember that PK and Unique constraints can be enabled with NOVALIDATE meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index. Remember constraints can be DEFERRABLE, meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index.

This means that Oracle has to concern itself with a number of additional overheads, including having to “check” the next index entry, “just in case” it matches the required index value. It has to concern itself even with the possibility of having to visit the next index leaf block, “just in case”.

You will note when Oracle performs an equality search using a Unique Index, Oracle will perform an “INDEX UNIQUE SCAN” because the index entries MUST be unique.

You will note however when Oracle performs an equality search using a Non-Unique index, even if there’s a PK or Unique constraint of the column(s), Oracle will perform an INDEX RANGE SCAN, because it needs to scan multiple index entries “just in case”.

So are there any actual implications as a result of any of this ?

Yes.

When Oracle actually reads an index and processes the associated blocks in the buffer cache(s), Oracle uses a number of latches. These latches are used primarily to “protect” memory structures from concurrent activity. Very simplistically, by grabbing a latch, Oracle effectively performs a “lock” on the associated memory structure, perform whatever activity needs to be performed and releases the latch. These latches get grabbed and released (hopefully) extremely quickly (order of 1/10s of ms), but it’s a non zero value.

The issue with latches is that they’re a point of serialisation. If two (or more) processes want a specific latch, one (or more) has to wait. Latches also burn CPU. Only a teensy weeny bit at a time but some CPU nonetheless. They burn CPU while acquiring the latch and if fail due to latch contention, while attempting again and again to acquire the latch. They also burn CPU while performing the specific operation necessary of the latch.

Basically, the more latches, the greater the potential for contention, the greater the potential for latch related wait activity and perhaps most important of all, more CPU is required. In busy systems, there can be massive numbers of latch events and the best way to tune these events is to reduce where possible the number of latches required by the database environment. It’s one of the key reasons we try and reduce LIOs in a database as much as possible, to reduce the latch and CPU load on the system.

Because of the differences highlighted between Unique and Non-Unique indexes, the number and manner of latches required between the two indexes differs. And it differs significantly …

When using a Unique Index, Oracle required 3 consistent gets (one for the index root block, one for the leaf block and one for the table block). BUT, each consistent get was a consistent gets – examination, a special type of consistent get which only requires 1 latch (rather than the standard 2 latches).

So that’s a sum of 3 latches.

However, when using a Non-Unique index, Oracle required 4 consistent gets (one for the index root block, one for the leaf block, one for the table block and an additional one to recheck the leaf block for any duplicate index entries). BUT, only the 1 consistent read (for the index root block) was actually the “cheaper” consistent gets – examination, the other 3 were the more costly 2 latch variety.

So that’s a sum of 7 latches.

3 latches for the Unique index and 7 latches for the Non-Unique index.

That’s an increase of 133.3% in latches between the two types of indexes.

Now, the height of the index will change the ratio of latch difference between the two indexes. Also, in a busy system, there could potentially be differences in the types of latches used due to the current state or additional activity in a block.

However, the potential difference in latch requirements between a Unique or Non-Unique index can be very significant. But does a few additional latches here and there really make much of a difference ?

Well, of course it depends. On small scale systems with smaller loads, fewer indexes, fewer users and excess resources, the noticeable differences may be negligible.

However, in larger scale (especially OLTP) environments, a particular index may be accessed 100s or maybe 1000s of times a second. There may be 1000s of tables with 1000s of corresponding PK and Unique constraints policed by 1000s of Unique (or Non-Unique) indexes. It’s therefore not really of question of a few latches here or there. It’s a question of potentially a very significant proportion of overall latch related overheads.

Potentially when accessed, Non-Unique indexes could be generating double the latch related overheads for equality unique scan or unique checking index activity. Remember, the best way to tune latches and reduce latch contention is to simply reduce the requirement and load for latches.

The overall reduction in CPU and latch related wait activity could be significant between Unique and Non-Unique indexes because by using Non-Unique indexes you in the order of double the latches required for such activities.

Note also this doesn’t require any special parameters to be set or special tuning or monitoring by the DBA. It simply requires using Unique indexes to police PK or Unique constraints when there are no requirements of Non-Unique indexes. You then potentially gain a benefit each and every time the index is used for unique scan accesses.

Guess what type of access is extremely common in large scale OLTP environments …

The next time you complain about high CPU consumption or high latch contention and you’re tuned the application to death, just ask yourself how many Non-unique indexes are policing your PK or Unique Key constraints …

Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought it worthy of a separate blog entry to do the answer justice. The question was:

“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”

Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.

Let me explain why.

Firstly, there is a really really good reason why Oracle doesn’t allow us to create a Unique Index in which the Partition key is not part of a Local Index. It’s called protecting us from ourselves !!

Let’s start by mentioning constraints again.

Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.

Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).

Lets say a table is Range Partitioned on column ‘A’ and we try and create a Unique Local index on just column ‘B’. Let’s assume we have (say) 500 table partitions meaning we must therefore have 500 local index partitions as well. When we insert a new value for our unique index for value B, it will attempt to do so in the corresponding local index partition as governed by the value A for the new row. However Oracle can’t just check this one index partition for uniqueness to ensure value of column B doesn’t already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to potentially have previously been inserted into any of the other 499 partitions !!

Each and every insert into our partitioned table (partitioned by column A) therefore would require Oracle to check all (say)500 index partitions each and every time to check for duplicates of column B. Again, it’s important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.

Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn’t allow us to do this. It doesn’t allow us to create a Local index in which the indexed columns does’t include the partitioning columns as well.

This is actually a good thing.

If you want to create a Unique index in a partitioned table, you MUST either add all the partitioned columns and make it part of the LOCAL unique index (so that way each and every insert would only have to check the one local partition as this value is known now it’s part of the index) or you must create it as a GLOBAL index (in which again, Oracle only has to check the one index structure).

It actually makes a lot of sense to do this.

Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.

Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.

If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying it can’t create the (by default Global index) because the useless Local Non-Unique index (from a policing the constraint point of view) already exists.

Again if you want to create a Non-Unique index to police a PK or Unique constraint you must either ensure the constraint includes all the partitioned columns in which case it can be Local or you must use a Global Non-Unique index.

In other words, the rules apply equally to both Unique and Non-Unique indexes.

So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint* to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.

Based on a great question by Alberto Dell’Era in my “Differences Between Unique/Non-Unique” blog entry (comment 9), I thought it might be a useful exercise to show how I go about confirming my understanding of a specific concept by trying to develop a little test case or demo that can illustrate the concept. My “magic incarnation” if you like ;)

The basic question was does the ROWID that constitutes an additional column in a Non-Unique index determine whether a particular row entry is the maximum or equivalent entry or not. Because by implication, this can determine and influence whether Oracle performs the generally preferred 90-10 splits rather than 50-50 block splits for indexed column values that at least equal the maximum value.

The answer is yes because the ROWID column is just another column in the index row entry and is simply treated the same. But how to actually “illustrate” and show this ?

I needed a way therefore to insert a ROWID that was always going to be the maximum ROWID value for a Non-Unique index. Then insert a whole bunch of subsequent ROWIDs of a lesser value than the maximum and inspect via index statistics whether the type of block splits changed from 90-10 to 50-50 block splits. Remember with the Object Number being equal (if it’s there at all), the next significant portion of the ROWID is the Relative File Number.

The plan was (reasonably) simple. Create a tablespace with one data file and fill it with something. Then add a second data file and use this to store the start of my table of interest (and of course create the index). This will create a whole bunch of rows with ROWIDs of a higher Relative File Number than those in the first data file. Then drop the first table and ensure the second table uses the free space created in the first data file. That way, a whole bunch of ROWIDs can be created that are less than existing ROWIDs because it would be using ROWIDs from the first data file, which has a lesser Relative File Number.

It’s the usual process I go through with these things. Find something that’s of interest, have some idea on how I think things work, come up with plans or strategies that will illustrate whether or not what I think is true (ensuring that somewhere in the process I include at least one reference to David Bowie ;). I can then later take the initial strategies and expand them for all applicable database options and features. Then see if anything changes between database versions and platforms.

The benefit of then showing these demos is that others can see exactly how I came to a conclusion, potentially try them out for oneself and perhaps see holes or flaws or shortfalls in the strategy or expand or tailor them for individual requirements or environments.

I’ve had a number of comments regarding my earlier blog entry where I recommended avoiding Deferrable and Novalidate constraints unless you need them and consider using Unique Indexes rather than Non-Unique Indexes where possible.

Why such a recommendation, aren’t Unique and Non-Unique indexes practically the same thing when it comes to policing constraints ?

Sure one index explicitly prevents the insertion of duplicates while the other doesn’t. Yes, dropping/disabling a constraint policed by an automatically created Unique index causes the index to be dropped if you forget the KEEP INDEX clause.

But that’s about it, right ?

Well, if you need a constraint to be deferrable, then you must create (either implicitly or explicitly) a Non-Unique index. If you want to enable a constraint with novalidate, then again you can only do so with a Non-Unique index in place policing the constraint.

It does all rather sound like Non-Unique indexes have all the advantages and allows for all the flexibility one could want. Non-Unique indexes allows for both deferrable and novalidate constraints, they don’t get dropped when the associated constraint is dropped / disabled and they can actually police both PK and Unique constraints.

What possible benefits are there in Unique Indexes ?

Well, providing you don’t need your constraints to be deferrable, you validate your constraints when they get created/enabled and you don’t go around dropping PK and/or Unique constraints on too regular a basis (or remember the KEEP INDEX clause if you don’t want your index dropped when you do), then there are a number of reasons why you may just want to consider using Unique indexes over Non-Unique indexes.

There are actually a number of key differences between Unique and Non-Unique indexes, both in the manner in which they’re stored by Oracle and in the manner in which they get processed.

In Part I, I’m just going to focus on the differences in how Oracle physically stores index entries.

In actual fact, there’s really no such thing as a Non-Unique index in Oracle. In order for Oracle to be able to determine the location of any specific index row entry and for Oracle to be able to determine an appropriate “order” for each index row entry, internally, Oracle coverts all Non-Unique indexes into a Unique index. It does this by using the associated ROWID of the index row entry as an additional “column”. As each ROWID is unique, this effectively makes all index entries in a Non-Unique index unique as well. Oracle uses the unique combination of the Non-Unique index value and the associated ROWID to then determine the appropriate order and hence appropriate location within the index structure in which to store the index row entry.

By Oracle making the ROWID an additional column, it also has to allocate an additional byte per index row entry in order to store the length of this column. That’s one teeny weeny little byte extra for each and every index row entry.

So what ?

Well, for indexes that don’t have a particularly large index key length, that one byte can be a significant proportion of the overall key length. Now Oracle needs to allocate 2 byes per row entry for various flags and locking information, it requires 6 bytes for the rowid and 1 byte for each column entry. That’s 9 bytes minimum plus the length of the indexed value itself.

Well how large is a typical unique index entry? Well that of course all depends and some PK / (and especially) Unique values can be quite large. But many many PK values are simply sequenced based numerical values, created nice and small so as to reduce overheads when stored in dependent child tables.

Table test1 is created with a Non-Unique Index, table test2 is created with a Unique Index. The demo shows a partial block dump of a leaf block from each index, highlighting how the Non-Unique index requires an additional byte per index row entry.

The Unique index manages to hold 533 leaf entries in the block while the Non-Unique index could only hold 500. Comparing the total sizes of the two indexes, the Unique index required 1875 leaf blocks while the Non-Unique index required 1999 leaf blocks.

That’s an increase of approximately 6.6% in leaf blocks required for the Non-Unique index to store exactly the same number of index entries as the Unique Index (in this particular example).

That’s 6.6% less storage, that’s a reduction of 6.6% in block splitting and block allocations, that’s a reduction of 6.6% in the cost of full index scans, that’s 6.6% less memory required to cache the index, etc. etc.

The point here is that these savings don’t require any expensive, periodic rebuilding of indexes. They doesn’t require any additional fancy scripts or additional monitoring and processing. The DBA doesn’t have to calculate irrelevant statistics or demand scheduled outages to claim these savings.

This a getting more “dollars for your buck” freebie from Oracle purely and simply by using a Unique index instead of an Non-Unique index.

Note also that not one or two but ALL of your numeric based PKs have the potential to get these types of savings. Obviously the larger the actual PK or Unique key values, the lesser a byte is in proportion to the overall key length and the less percentage savings.

But it’s not a bad payback for many many of your indexes, purely and simply by using Unique indexes instead of Non-unique indexes where possible …

This is but one of the benefits of using Unique Indexes. More (potentially significant) advantages to follow …

The question of whether the Buffer Cache Hit Ratio (BCHR) is a useful metric or not is one of those discussions in Oracle that seems to crop up on a regular basis. I thought it might be worth briefly mentioning the topic here.

The BCHR represents the percentage of LIOs in which the required block is already loaded in the buffer cache(s). The higher the BCHR value, the greater the percentage of blocks accessed directly from memory and the lower the subsequent ratio of physical reads. A “higher” BCHR is generally considered a good thing as it’s a good thing to find required blocks in memory (right ?), a “lower” BCHR is generally considered not such a good thing as it’s bad to perform a higher ratio of physical reads (right ?).

The first problem with the BCHR of course is that it’s often miscalculated with many not appreciating for example the impact of direct reads on the actual physical reads used in BCHR calculations.

Assuming the BCHR is correctly calculated, the second problem with the BCHR is that it represents a database wide average. Averages are always dangerous things as they generally completely hide specific details and anomalies. Unless the average represents and can guarantee some consistent database metric or behaviour, then the average quickly becomes just a number, without any actual or inherent meaning.

A specific BCHR value at any specific point in time doesn’t actually tell us anything meaningful about the performance of specific tasks within the database. A database generally performs 100s or 1000s or 10000s of tasks at any given point of time. Unless all tasks or a significant percentage of tasks exhibit the same performance issue, then a single database-wide metric will be unable to determine issues with these specific tasks. The “average” figure hides details of the specific issue.

Therefore, at what point or at what value does an average figure provide meaning ?

The third problem with the BCHR is that these specific tasks within the database sometimes perform activities that are “good” and “efficient” but can result in the BCHR either going up or down or remain unchanged. Sometimes these activities can be “bad” and “inefficient” but can also result in the BCHR either going up or down or remain unchanged. Therefore without understanding what these specific activities might be, it’s simply impossible to know whether a specific change in the BCHR is good or bad.

Let’s assume we have a BCHR of 90%. Is this a good thing or is this a bad thing ? Is database performance an issue with a BCHR at 90% or is database performance good ? Is database performance good or bad generally or are there specific processes within the database that are problematic ? The answer of course is that it entirely “depends” and a value of 90% or 95% or 50% can’t in of itself answer any of these questions.

We can have a BCHR at 90% and performance can be terrible. It can be terrible at a database wide level due to any number of problems or issues or it can be terrible for specific users using specific applications or processes.

We can have a BCHR at 90% and performance can be perfect, with all users achieving optimal response times.

A BHCR of 90% is useless on it’s own. We need to go and perform all manners of additional checks to ensure the database is “healthy”.

However, even those who claim the BCHR is a meaningful and useful tuning metric generally agree and admit the BCHR on its own has no inherent usefulness and that it needs to be used in combination with other database “checks”. They generally claim that it’s the BCHR when monitored and used from a historical point of view with fluctuations of its value over time that makes the BCHR useful.

Really ?

Let’s again assume the BCHR has been at 90% for the past few weeks (or whatever time-frame) and it has now dropped to 85%. The Quest Spotlight monitor is flashing red and obviously something has “changed”. However, has it changed for the better, for the worse or has it had no noticeable impact on the “health” of the database (perhaps simply the specific workload has changed) ?

Well again, without looking at other specific metrics, one can’t possibly answer these questions. Perhaps we have an important process or (processes) that have suddenly started performing expensive, inefficient Full Table Scans. That’s not good, as the extra logical and physical IOs have impacted response times detrimentally. Things have indeed gone worse. Or perhaps we have a process that was performing an extremely inefficient nested loop operation, reading the same cached table numerous of times that is now performing the same function much more efficiently, reducing LIOs significantly. Response times may have improved and things are indeed better. Or perhaps there’s a large batch program or report that needs to be urgently run during normal business hours that’s resulting in lots of physical IOs to the database, but is not actually impacting the more important online transactional systems. Actually, the health of the database may not have changed at all.

Therefore, having a BCHR that has dropped to 85% (or whatever value ) doesn’t actually tell us much other than something may have changed. But it may have changed for the worse or the better or not significantly at all. There’s no way of knowing without performing further database checks.

Let’s assume the BCHR has gone from 90% to 95% (or whatever higher value). The Quest Spotlight monitor is flashing nice and green but something appears to have changed. However, has it changed for the better, for the worse or has it had no noticeable impact on the “health” of the database ?

Well again, without looking at other specific metrics, one can’t possibly answer these questions. Perhaps we have a key process or (processes) that was previously performing expensive, inefficient Full Table Scans that are now performing efficient index scans. That’s good, the reduction in logical and physical IOs have impacted response times positively. Things have indeed gone better. Or perhaps we have a process that was previously performing efficiently that has suddenly started to perform extremely inefficient nested loop operations, reading the same cached table numerous of times, increasing LIOs significantly causing the overall BCHR to increase as well. Response times may have plummeted and things are much worse. Or perhaps there’s a series of large batch programs or reports that usually run during normal business hours but the reporting section are on a Xmas lunch and haven’t bothered to run them today resulting in a reduction of physical IOs to the database, but is not actually impacting the more important online transactional systems. Actually, the health of the database may not have changed.

Therefore, having a BCHR that has increased to 95% (or whatever value ) doesn’t actually tell us much other than something may have changed. But it may have changed for the worse or the better or not significantly at all. There’s no way of knowing without performing further database checks.

Let’s assume the BCHR has not changed at all and is still sitting at 90% (or whatever value). The Quest Spotlight monitor is flashing nice and green but nothings appears to have changed. However, has nothing really changed, or could things now be seriously wrong with the database ?

Well again, without looking at other specific metrics, one can’t possibly answer these questions. Perhaps we have a key process or (processes) that was previously performing expensive, inefficient Full Table Scans and are now performing efficient index scans. That’s good, the reduction in logical and physical IOs have impacted response times positively. Things have indeed gone better but because the BCHR is a database-wide metric, this improvement made have gone unnoticed. Or perhaps at the same time we have a process that was previously performing efficiently that has suddenly started to perform extremely inefficient nested loop operations, reading the same cached table numerous of times, increasing LIOs causing response times to plummet and making key business processes much worse. But again because the BCHR is a database-wide metric, the overall BCHR may not have been impacted.

Or of course, one of hundreds of things have caused serious database performance issues while the BCHR remains totally unchanged …

Therefore, having a BCHR that has remains unchanged doesn’t actually tell us much either. The database made be running better than previously, the database may be having serious problems or the “health” of the database may remain unaltered.

So, the BCHR at any specific value doesn’t tell us much without having to check other database metrics as well.

The BCHR increasing doesn’t tell us much without having to check other database metrics as well.

The BCHR decreasing doesn’t tell us much without having to check other database metrics as well.

The BCHR remaining unchanged doesn’t tell us much without having to check other database metrics as well.

Note the database metrics we would need to check (for example, those queries using the most LIOs, those queries using the most PIOs, those queries using the most CPU, those queries being executed the most, those queries with excessive response times, causes of the most significant waits in the database, specific application/process response times, etc. etc. etc…) are exactly the same in all the above scenarios.

The biggest problem of all with the BCHR is that regardless of it’s values, or whether it goes up or down or remains unchanged, we need to perform precisely the same database checks regardless as it doesn’t tell us whether the “health” of the database has improved, got worse or remains unchanged.

If we need to perform the same actions regardless of the BCHR, then I suggest the answer to whether the BCHR is useful or not is a big and resounding no.

It’s been a full 10 years since they released what is probably my favourite album of all time, the classic “OK Computer”. Since then, they’ve become one of the most influential and innovative bands around, with Thom Yorke’s distinctive vocals and Jonny Greenwood’s guitar work creating their uniquely beautiful “sound”.

A new Radiohead album is always in big event in my life and it’s been a long wait since their last album was released in 2003, “Hail To The Thief”. For the past few months I’ve been enjoying their new release “In Rainbows”.

It created quite a bit of press and controversy when Radiohead decided to release the new album as a download only release with no record label affiliation, with the buyer choosing how much to pay for it. You could pay as little (0.45p was the minimum I think) or as much as you liked. As a marketing move, it was a very very clever initiative.

It did however distract attention away somewhat from the music itself and what is quite simply a stunning collection of songs. From the bouncy opening “15 Steps”, to haunting beauty and tragedy of the closing “Videotape”, it’s arguably their best album since “OK Computer”. A wonderful surprise is the release finally of the 1997 era “Nude”, a live favourite for years which Thom has finally got right in the studio. I just love “Jigsaw Falling Into Place”, it just has so many twists and turns but the highlight for me is Weird Fishes/Arpeggi which just sends shivers down my spine every time I hear it. It’s incredibly emotional and has a profound sense of sadness but at the same time an incredible energy that perfectly encapsulates what music is all about.

As my Christmas treat (as I’ve been a really really good boy this year), I’ve got on order the “In Rainbows” box-set which includes a second CD of unreleased tracks. I can’t wait.

However, the album is released commerciallyin CD and vinyl formats via the usual channels on the 31st December. Do yourself a favour, buy “In Rainbows” and enjoy what I’m sure will be considered a truly classic album in the years ahead.

Back when Oracle8 was released, Oracle introduced a number of new features with regard to constraints.

The first was the option of making a constraint DEFERRABLE, meaning the policing of a constraint can be deferred until the issuing of the COMMIT, rather than during the execution of an individual statement. This gave application developers more freedom in how they designed code, particularly with regard to the order in which parent – child data is inserted and manipulated.

The second new option was the ability to enable a constraint with NOVALIDATE, meaning Oracle would enable the constraint but not bother to check existing data to ensure nothing violated the constraint.

This could be useful in a number of scenarios. For example, you have data that currently violates the constraint but have urgent business requirements to enable the constraint ASAP preventing further violations, with the intention of cleaning up the existing violations at some future time.

Or you know the data is OK, so to reduce the overheads associated with enabling the constraint (eg. reading all the data to check for violations), you enable the constraint “immediately” with NOVALIDATE, bypassing the “redundant” checking.

Both deferrable and novalidate constraints therefore imply there “could” be data at any given point in time that violates the constraint. Therefore Oracle also introduced the ability to have non-unique indexes (rather than unique indexes) policing either PK or Unique constraints. For deferrable or novalidate constraints, the index must in fact be non-unique, as a unique index would prevent any such (temporary) violations of PK or Unique constraints.

Now, there are a number of interesting and subtle differences in the manner in which Oracle manages and processes a Unique vs. a Non-Unique index (eg. the amount of storage they use, the amount of redo they generate, the number of latches they acquire). This will be discussed in another Blog entry some other day.

Today, I just want to focus on a couple of interesting little side-effects with regard to how the CBO deals (or doesn’t deal)with NOT NULL and CHECK constraints that have been created as Deferrable or Novalidate.

In 9i, the CBO was clever enough to know that if someone searched for a NULL value but the column had a NOT NULL constraint, there couldn’t possibly be any data matching the criteria. Providing you had an index on the column, the CBO would generate an execution plan that used the index, found no NULL values and returned an empty row set accordingly. If you had no index, the CBO would be forced to use a Full Table Scan. So the CBO actually used an index in an efficient manner to search for non-existent nulls.

BUT, if the NOT NULL constraint was either deferrable or novalidated, then Oracle couldn’t know there were no nulls, there just might be. Therefore, Oracle was forced into the FTS regardless of the existence of the constraint or index, as null values are not indexed (unless part of a concatenated index).

Since 10g, the CBO has become smarter. The NOT NULL example works in a very similar manner, except that the index is no longer required. If one searches for a NULL value on a column that has a NOT NULL constraint, the CBO automatically determines there can be no matching rows and returns the empty row set immediately with no LIOs. None, as accessing the data is simply not necessary.

BUT again, it can only do so if and only if the NOT NULL constraint is validated and nondeferrable, otherwise the CBO can’t guarantee no nulls.

Although we actually have applications that intentionally search for nulls on NOT NULL columns to return empty row sets, it’s not common that an application would perform such a search.

What is much more common is searching for a column value that simply doesn’t exist. If a column value doesn’t meet a business rule, it’s a good idea to police such business rules with Check constraints. 10g has extended the NOT NULL scenario to include Check constraints. If a search attempts to search for a column value that violates a check constraint, Oracle will immediately return an empty row set without performing any LIOs.

But once again, it can only do so if the check constraint has been validated and set as nondeferrable.

Making constraints deferrable or enabling them with novalidate can be useful. However, if possible, ensure constraints are not deferrable and validated as this provides the CBO with additional information regarding the columns that it might just put to good use.

Additionally, unless there’s a requirement to the contrary, use unique indexes rather than non-unique indexes to police uniqueness. But that’s a discussion for another day …

Outlier values are basically values that sit way way outside the standard range of a column’s normal value range.

Data can be a funny thing and sometimes there are values that are naturally “exceptional”. However, very commonly, outlier values are used by applications to represent bizarre default values, to avoid confusion with legitimate values. For example, I look after an application that uses the American Date Of Independence as it’s “default” date.

Usually, these weird outlier values are used to avoid nulls values, as nulls can be problematic and can not be indexed (well actually you can index a null column but we’ll leave that for another blog entry).

However, outlier values while (maybe) solving one problem, can introduce some very significant problems in return.

Firstly, the CBO “hates” outlier values as it potentially totally screws up the CBO’s selectivity calculations. The selectivity of a range scan is basically calculated by the CBO to be the number of values in the range of interest divided by the full range of possible values (IE. the max value minus the min value). Therefore if this calculation is invalidated by a massive and disprotionate “hole” in the full range of possible values, the CBO can get things horribly wrong.

Additionally, indexes “hate” outlier values as it prevents Oracle using the 90-10 block split to keep indexes nice and compact and is forced to use 50-50 block splits instead. Basically a 90-10 block split is considered if and only if the index entry to be inserted is equal or greater than the current maximum value. An outlier value that is also the maximum value, usually means monotonically increasing values (such as sequences, dates, etc.) don’t actually insert the maximum value. Therefore, not only do indexes perform 50-50 splits but this 50% of free space is never used, as all new values are all almost, but not quite, maximum values.

Hundreds of years ago, most people thought the world was flat and if you travelled far enough, you’ll simply fall off the edge of the world. It seemed like a reasonable thing to believe, I mean the world kinda looks flat doesn’t it? It just didn’t make sense to think otherwise. Also, as most people didn’t really travel very far in their lives, it wasn’t a big deal to most people. The chances of falling off the world were remote. Very few actually pushed the boundaries to find out the truth; whether the world was flat or not just wasn’t relevant or important to most people.

Well about 10 – 15 years or so ago, most people thought Oracle was “flat”. OK, not flat exactly, but certainly that a segment with one extent was better for performance than say 20 extents, that having a PCTINCREASE of 1 reduced fragmentation, that a high BCHR meant all was likely well with the database, that indexes should be rebuilt on a regular basis etc. etc. I mean it all sounds like it makes sense doesn’t it? Many new to Oracle came from other databases and technologies where these sorts of things were perfectly acceptable practices in their old environments, so it made sense they were necessary in Oracle too. They were also all based or founded on some level of fact or half truths but half truths have a habit of becoming full truths and over time get set in stone as cold, hard facts.

In those days, it really wasn’t much of an issue anyways. Databases were relatively small, anything over a few GB was considered enormous so rebuilding and reorganising things, although inconvenient, was all very plausible. Very few databases were truly 24 x 7 with most having those long, dark, lonely nights fully available to play around with the database, without disrupting business processes.

There was no Internet in those days or it was just emerging and so communication in all things Oracle was mainly through books and conferences where relatively few “experts” dominated. There were very very few that pushed the boundaries, most people were content to simply read and hear they were spending those dark nights doing something constructive. Everyone suggested rebuilding indexes on a regular basis was a good thing and improved performance, so it must be right, right ?

However, things changed. Over time, databases kept getting bigger and bigger with hardware not keeping up with shrinking maintenance windows. Organisations demanded more and more availability and those lonely nights became as mission critical as the traditional 9am-5pm.

The Internet became a hugely important communication vehicle with people able to ask questions and get instant answers from the other side of the world. Suddenly someone could make an erroneous comment and instantly someone could correct them.

This enabled a whole new wave of Oracle professionals to make an impact. The previously few recognised authors and noted experts were swamped by a whole new breed of Oracle expert, who were able to easily communicate and share their knowledge through websites and a mass of Oracle related forums. They weren’t satisfied with simply rehashing Oracle manuals and books, but treated truly understanding Oracle as a scientist would try and truly understand their specialised field. Jonathan Lewis, Steve Adams, Tom Kyte, Anjo Kolk, Mogens Norgaard etc. become well known in the Oracle sphere and began changing how the majority saw and understood Oracle. People questioned authority and began testing and proving things for themselves.

Rebuilding indexes on a regular basis not only became questionable (as did all the other myths), it become unfeasible and impracticable. A recent survey by the IOUG suggested 35% of Oracle shops surveyed had at least one database over 1TB. DBAs started looking after more and more databases, looking after bigger and bigger databases, looking after more and more complex database environments.

Do we really need to rebuild these damn indexes ? As these myths were investigated, tested, analysed and pulled part, the answer become more and more obvious.

No.

It simply required a clear and accurate understanding of how indexes actually behaved for the truth to be known, as researched and documented by the new breed. It also required enough Oracle people needing a solution to their maintenance issues, needing to actually justify business unavailability, for actual facts to start replacing myths.

Most indexes never need to be rebuilt. Never, ever.

Most.

So is the answer above really no ? Is the world truly not flat and safe to walk on for as long as we could want ?

If you were to blindly walk (for example) to the Grand Canyon in the USA, someone could indeed plunge off the “end of the world” to their doom. Some indexes if not rebuilt could cause performance issues if not rebuilt.

Here lies one of the main reasons why these Oracle myths never seem to die, they’re often based on half truths and it’s determining which half is which that can be problematic (although the ratio is more accurately 999-1 than 50-50). Determining that 0.1% is not necessarily a straight forward process, it can’t always be determined by a simple formula or rule of thumb.

Also human nature can be a funny thing. People are often either too lazy or too busy or both to want a complex answer to a complex question. People love easy solutions, even if it’s not actually the right solution. The path of least resistence is often the most traveled path, even if it leads to a sudden drop at a cliff’s edge …

But why in this information age, is finding the answer of what is truth and what is myth so problematic ? With so much good, clear, accurate information out there, with so many respected and knowledgeable experts out there spreading a true understanding of Oracle, why can the truth still be so hard to determine ?

The answer is “Information Pollution“.

Historically, there has been so much inaccurate information written and published about Oracle, there have been so many “wrongs” written as “rights”, that technical accuracy can often be “clouded” by this pollution. Also, some of the older breed simply refuse to accept today’s realities. They desperately cling to old habits else fear losing creditability.

Oracle Corportation itself must accept some of the blame. They’ve historitically documented things incorrectly, they still have metalink notes that are inaccurate, host forums that spread misinformation. Although Oracle produces incrediably detailed documentation and resource rich web sources, Oracle itself is one of the big “Information Polluters”.

So, there are still many people who spread myths. There are still many people who think all indexes should be rebuilt regularly if they experience lots of DML, that indexes should be rebuilt if they’re have a height greater than some level.

There a literally thousands of Oracle newbies coming into the scene each year. How can they tell what is right from wrong (or half right from half wrong) when they have no prior knowledge or experience with Oracle ?

Not easily.

And if they pick the wrong source and learn inaccurate information and adopt inefficient processes, guess what ? They in turn become information polluters and so the clouds potentially thicken …

And so these Oracle myths continue to persist and never seem to quite disappear. So is Oracle in danger of “Global Warming” ?

No, because thankfully, the amount of good information far outweighs the bad. Oracle as a community is producing far more “Oxygen” than “Carbon-Dioxide”. The number of people who think one extent is better or that indexes need constant rebuilding, as percentage of the whole, is dwindling and dwindling over time. More and more people appreciate the world really is a lovely round planet because they have to know. Just as they have to travel to the other side of the world, they have to maintain larger and larger databases with less and less disruption to their businesses.

Importantly, there’s also a very easy and generally very accurate way to determine the right from wrong, the wheat from the chafe.

Simply prove it.

If something within Oracle (or anything really) is right and accurate and correct, then it can generally be proven. The Jonathan Lewis and Steve Adams and Tom Kytes of this world not only accurately document how an Oracle process works and functions, they generally prove it with clear, consistent, repeatable test cases. They don’t simply say a=b they actually demonstrate and prove it.

Whereas those that deal in myths and inaccuracies by definition are unable to prove it. They makes claims and suggestions and generalisations but lack that one essential ingredient that differentiates them from those that deal with facts and technical accuracies.

Proof.

And without proof, one can clearly pick out the “Information Polluters”. The next time you read something from somewhere, just ask yourself this one simple question. Are they simply stating something as fact or are they actually attempting to prove something as fact. Can that proof be copied across into your particular environment ? Is it still accurate ? Can you extend and expand on the claims and is it still accurate ?

An answer in the positive is a hell of a lot better than an answer that consists of nothing more than a “because it just is” …

Why are there so many Oracle related myths ?

Global Warming !!

Is it too late ?

Not if we meet the Oracle Kyoto Treaty and demand clean oxygen-based proofs ;)

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled uninspired responses which considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.

New in 11g are “Invisible Indexes”, which are basically indexes that exist and are maintained by Oracle but are “invisible” to the CBO. Specific sessions can be set to see these invisible indexes as necessary.

Potentially useful if one has a problematic (and very large) index causing performance issues that you want to make invisible until the specific issue is addressed without the expensive of having to drop and latter recreate the index. Also useful if you want to introduce a new index but want it to be invisible until it’s been given a workout first in a specific “test” session.

Occasionally, as I come across interesting Oracle Database related issues, I’ll post my thoughts and opinions and who knows what else and perhaps, just maybe, others may find it interesting or useful as well.

However, will try and focus on the specific topic of Oracle Indexes as it’s such a huge and important area in any Oracle database design. It’s also an area in which there is much confusion, more than it’s far share of myths and popular misconceptions and one in which DBAs, Developers, Database Designers and the like all have a key role to play.

Let the fun begin …

search

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle Corporation.

Contact Details

If you wish to contact me directly, please do so at richard.foote@bigpond.com