There’s been an interesting recent discussion on the OTN Database forum regarding “Index blank blocks after large update that was rolled back“. Setting aside the odd scenario of updating a column that previously had 20 million distinct values to the same value on a 2 billion row table, the key questions raised are why the blank index leaf blocks and why the performance degradation after the update failed and rolled back.

The key point to make is that an Update is actually a delete/insert operation in the context of indexes. So if we perform a large update, all the previous indexed values are marked as deleted in the index and the new values re-inserted elsewhere within the index structure, potentially filling up a whole bunch of new leaf blocks. If we then decide to rollback the transaction (or the transaction fails and automatically rolls back), then all these newly inserted index entries are deleted potentially leaving behind now empty new leaf blocks in the expanded index structure. Here’s the thing, Oracle will roll back changes to index entries but not changes to the index structure such as block splits.

If an index scan is forced to navigate through these empty leaf blocks, this can indeed potentially have a detrimental impact on subsequent performance.

However, depending on whether the index is Unique or Non-Unique and the type of update being performed, the impact on the index could be quite different.

Now the interesting thing to note here is that for each ID value, we temporarily have the same value twice as we progress and update each ID value (for example, for ID=1, it becomes 2 which already exists. Then the previous ID=2 becomes 3 which already exists, etc.). As the index is Non-Unique, this means when we update say ID=1 to 2, we need mark as deleted the index entry with ID=1 and insert a new index entry with an ID=2. When we update the previous ID=2 to 3, we again mark as deleted the previous indexed value of 2 and insert a new index entry of 3. Etc. Etc.

As we only have 10% of free space available in the index before the update, by updating all rows in this fashion, it means we have to keep performing 50-50 block splits to fit in the new index entries in the corresponding leaf blocks. This effectively results in the index doubling in size as we now have twice the number of index entries (with the previous index entries now marked as deleted).

However, having now performed all these index block splits, if we now roll back the update transaction, it simply means that all the new index entries are deleted and the delete byte removed from the previously deleted entries, with the index structure retaining its newly bloated size. The resultant index block splits are not rolled back. If we look at a new index tree dump:

As all the inserts now occurred in the right-hand most side of the index, Oracle allocated a bunch of new index leaf blocks via 90-10 block splits to store all the new index entries. After the rollback however, all these new entries were removed leaving behind nothing but these new empty leaf blocks which are still part of the overall index structure.

Query performance now depends on what part of the index we need to access.

If we just want to select a single value, then no problem as the ID column is effectively unique and we just need to generally access down to the one leaf block:

4 consistent gets is about as good as it gets for a non-unique Blevel 1 index.

Larger index range scan might need to access additional leaf blocks as they now only contain 1/2 the number of index entries than before, although the additional overhead of such scans would still likely be minimal as most of the work is associated with visiting the table blocks.

One of the worst case scenarios would be having to now plough through all these empty leaf blocks as with the following search for the max ID value:

Oracle uses the Index Full (Min/Max) Scan by starting with the right-most leaf block but as it’s empty, Oracle is forced to make its way across through all the empty leaf blocks until it finally hits upon the first non-empty leaf block that contains the max ID. The excessive 32 consistent gets is due to having to access all these new empty blocks.

In Part I, I discussed how global indexes can now be asynchronously maintained in Oracle 12c when a table partition is dropped or truncated. Basically, when a table partition is dropped/truncated with the UPDATE GLOBAL INDEXES clause, Oracle simply keeps track of the object numbers of those table partitions and ignores any corresponding rowids within the index during subsequent index scans. As such, these table partition operations are very fast and efficient as the global indexes are not actually maintained during the partition operation, but importantly, continue to remain in a usable state.

If we look at a partial 11g global index block dump after dropping a table partition (eg. the MUSE_ID_I in the previous demo):

We notice that all index entries that reference the dropped table partition are marked as deleted. They all have a D (deleted) flag set and have been locked by the drop table partition transaction in ITL slot 2. So prior to Oracle 12c, to update global indexes on the fly was a relatively expensive operation as it required all the associated index entries to be deleted from the global indexes.

However, if we look at a block dump of the same index in an Oracle 12c database following a table partition being dropped:

We notice there are no deleted index entries, the index remains totally untouched by the drop table partition operation. So the good news is that dropping/truncating a table partition while updating global indexes is extremely fast and efficient while the indexes remain hunky dory as subsequent index range scans can ignore any rowids that don’t reference existing table partitions of interest.

However, the bad news is that during subsequent index DML operations, Oracle does not know which index entries are valid and which are not and so the space used by these “orphaned” index entries can not be automatically reclaimed and reused as it can with conventionally deleted index entries. Therefore, we need some other way to clean out the orphaned index entries.

There are a number of possible ways to do this. One way is to simply rebuild the global index (or index partition):

Effective, but relatively expensive as this requires the entire index structure to be rebuilt from scratch. Depending on the scale and distribution of the orphaned index entries, another possibly cheaper alternative is to use the new CLEANUP coalesce clause:

This visits each index leak block and removes all the orphaned index entries as part of the coalesce process. Note this is a more “powerful” version of coalesce as a standard coalesce is not aware of orphaned index entries and will only coalesce the index without actually removing the orphaned index.

Yet another possible option is to simply wait for the PMO_DEFERRED_GIDX_MAINT_JOB job to run (scheduled by default during the 2am maintenance window) to clean out orphaned index entries from all currently impacted global indexes. Yet another alternative is to manually run the dbms_part.cleanup_gidx procedure which is in turn called by this job:

We notice the last index partition has now been cleaned out and no longer has orphaned index entries.

So with the new asynchronous global index maintenance capabilities of the Oracle 12c database, we can perform a much faster and more efficient drop/truncate table partition operation while keeping our global indexes in a usable state and leave the tidying up of the resultant orphaned index entries to another time and method of our convenience.

Previously, I discussed how an ALTER INDEX … COALESCE is going to be less expensive in terms of using resources than an equivalent ALTER INDEX … SHRINK SPACE COMPACT (or ALTER INDEX … SHRINK SPACE) as the Coalesce doesn’t have to concern itself with ensuring all leaf blocks at the physical end of the index segment have all been moved to allow for the storage to be de-allocated from the index segment. If you just want to de-fragment an index and not necessarily reduce the overall space allocated to the segment, use Coalesce rather than the Shrink options as it’s cheaper.

But what about an ALTER INDEX … REBUILD, when, if ever, should it be used ?

Well the answer is as with most things Oracle, it depends.

Scenario One.

We have a table and the application deletes historical data but in a manner in which leaf blocks are not being entirely emptied. Basically, older stuff is removed, but it’s only removed in a random manner, from approximately the “earlier” 10% of the table. The index is sequenced which means only those leaf blocks in the “left-most” 10% of the index structure are impacted but all this deleted space is “deadwood” as new index entries are only being inserted into the “right-most” part of the index.

Note that basically 90% of the index is fine and very well utilised, it’s only 10% of the index that’s problematic. Of the problem 10% of leaf blocks, there’s plenty of free or deleted space, with many leaf blocks almost but not quite empty.

Coalesce (and indeed Shrink) will basically run through these 10% of fragmented leaf blocks and will merge the index row entries into as few leaf blocks as possible. With the 90% of blocks that are fine, Coalesce will basically read and then ignore them from any processing as there’s nothing that can be done for them.

Rebuild on the other hand will take an entirely different approach. It will (generally) read the entire existing index structure and will build a brand new, bright and shining index segment. As part of this process, it will rebuild the entire index, it has no choice (assuming the index isn’t partitioned, but that’s another story) and will rebuild the 90% of the index that was actually perfect to begin with. Rebuilding 90% of something that doesn’t need rebuilding doesn’t sound particularly efficient and indeed it isn’t. As a result, the index rebuild will use substantially more resources and generate substantially more redo than an equivalent Coalesce (or Shrink Space).

Scenario Two.

We have an application that deletes data and it deletes data throughout the entire index structure. The deletes are significant with a substantial proportion of the overall rows having been deleted. Additionally, the table is not going to be repopulated with anything like the same volume of data or it won’t be repopulated for a substantial period of time. As such, all this deleted index space is “deadwood” as it’s not going to be used any time soon, if at all.

Now typically in this sort of scenario, it’s of course the table as much as the associated indexes that needs to be rebuilt. That’s a key point. However, maybe Full Table Scans are not an issue for this table so the wasted space in the table is not of urgent concern. Maybe the table in not in an ASSM tablespace or in a database that supports a Table Shrink command and maybe moving the table is not an immediate option due to availability concerns. For whatever reason (or lack of reason), the index needs to be de-fragmented.

Note it’s the entire index that’s problematic here and there could be portions of the index that have very few remaining index entries.

Now poor Coalesce (and indeed Shrink) has a bit of an issue here. They both merge index entries from two blocks into the one block where it can. However, if leaf blocks are really empty, these merged index entries may in turn be merged and moved again with index entries from yet another leaf block. And maybe yet again with another leaf block. And again and again … So a specific index entry may actually be moved into several different leaf blocks during the entire process. Each of these moves requires resources and generates redo and takes time.

Now the rebuild has an entirely different approach. As mentioned, it will basically (generally) read the entire exisiting index structure and will build a brand new one, but importantly as it does so will only have to locate a specific index entry once and once only. Also, as it’s the entire index structure that’s problematic, there’s no issue with fixing the entire index, as it’s all “broken”.

As a result of only having to deal with an existing index entry the once vs. the Coalesce which may relocate a specific index entry many times, the index rebuild is going to be substantially more efficient and potentially use significantly less resources and generate less redo.

Basically, Coalesce is particularly efficient and uses less resources when the percentage of the overall index structure that’s problematic and fragmented is relatively small (less than approximately 20-25% of leaf blocks). Rebuild is particularly efficient when the percentage of the overall index structure that’s problematic and fragmented is relatively large and the average degree of fragmentation within an index leaf block is relatively high. Note Pre 10g, an index needed to have at least 50% free space less pctfree in neighbouring leaf blocks for a Coalesce to be effective.

Now Rebuild (and Rebuild Online) potentially have locking implications that need to be considered although as we’ll see later, 11g has addressed some of these issues …

As already discussed, ALTER INDEX COALESCE in 10g onwards works in a very similar manner to ALTER INDEX SHRINK SPACE.

However, there are a number of key differences.

The first thing to point out is that each command has a slightly different purpose.

Coalesce is designed specifically to reduce fragmentation within an index but not to deallocate any freed up blocks which are placed on the freelist and recycled by subsequent block splits.

Shrink is designed specifically to reduce the overall size of an index segment, resetting the High Water Mark (HWM) and releasing any excess storage as necessary.

The key difference being that Shrink must reorganise the index leaf blocks in such a way that all the freed up, now empty blocks are all grouped together at “one end” of the index segment. All these blocks can then be deallocated and removed from the index segment. This means that specific leaf block entries must be removed from these specific blocks, in order to free up the leaf blocks in this manner.

Although Coalesce in 10g performs the operation in a similar manner to that of the Shrink Space, it can be more “lazy” in how it deals with the subsequent empty blocks and places then on the segment freelist as necessary.

COALESCE and SHRINK SPACE COMPACT are logically equivalent commands. Both options will “defragment” an index by “merging” index entries where possible thus reducing the number of blocks within the logical index structure. Both will result in the same number of leaf blocks within the index and both will result in the index height not being changed.

However, there are two key differences.

1) The SHRINK SPACE COMPACT option has the disadvantage of being more expensive to process as it has to concern itself with ensuring all necessary blocks can be emptied from the physical “end” of the index segment to be subsequently deallocated. This will result in more undo and redo being generated during the defragmentation of the index than would have been generated by the same corresponding COALESCE command.

2) The SHRINK SPACE COMPACT option has the advantage of being able to immediately deallocate the empty blocks, thereby reducing the actual size of the index segment by issuing a subsequent SHRINK SPACE option (although of course this can be performed in the one step by issuing SHRINK SPACE in the first place). However, the COALESCE option will not be able to just deallocate the free space. A subsequent Index SHRINK SPACE command on a previously coalesced index will require additional undo and redo than that of a previously “Shrunk” index as the necessary empty blocks are removed from the freelist and redistributed to allow for the de-allocation of blocks and the resetting of the High Water Mark of the index segment.

Note also that the Shrink option can only be used in Automatic Segment Space Management (ASSM) tablespaces.

Use Coalesce when the intent is to just defragment an index, knowing that the freed leaf blocks will be recycled by subsequent block splits, as it uses less resources than an equivalent Index Shrink Space.

Use Shrink Space when the intent is to reduce the actual storage allocated to an index, for example in the scenario where a table has permanently reduced its size and the index is unlikely to reuse the freed storage.

I thought it might be worth mentioning some interesting changes in the manner in which the ALTER INDEX … COALESCE command works since Oracle 10g.

Basically the purpose of the COALESCE option is to reduce free space within the Leaf Blocks of an index. This is achieved by effectively performing a Full Index Scan of the leaf blocks, comparing the free space available in neighbouring blocks. In 9i, the basic method was to logically start with the left most leaf block and see if it could be coalesced or merged with the 2nd left most block. This required the sum of used space within these 2 blocks to be less than 100% of a block less the PCTFREE value. If so, the contents were merged with the contents of one block placed in the other and with the now empty leaf block removed from the index structure and placed on the index freelist.

It then looked at the 2nd leaf block (which might now be the first block if previously coalesced) and 3rd leaf blocks to see if these could be coalesced. If so they were merged and the empty block placed on the freelist.

And so on and so on until all leaf blocks had been traversed and all possible leaf blocks coalesced.

Note branch blocks are not directly merged during this process, except to be updated with modified pointer information if a leaf block coalesce had taken place. However, if enough leaf blocks are removed such that the branch block contains no more pointers to leaf blocks (or other intermediate branch blocks), it’s also removed from the index structure. However, there must always be at least one branch block from each level remaining hence the height of an index always remains the same during a coalesce operation.

Note if no leaf block had 50% or more free space, nothing would be coalesced as no two consecutive leaf blocks would have sufficient free space in which to be coalesced.

In 10g, the Coalesce operation has been modified somewhat.

An index no longer requires the sum of used space plus PCTFREE in adjacent blocks to be less than 100% of a block be effectively coalesced. For example, the free space in a block can be 25% in one leaf block and just 25% in the adjacent block (hence the combined used space alone being 150% of a block) and 10g can effectively coalesce these leaf blocks together.

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled uninspired responses which considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.