Updates and Indexes Part II (Down Is The New Up) February 9, 2009

In Updates and Indexes Part I, I described how there’s no such things as an “update” operation as such on a index and that an update is effectively a delete followed by an insert operation.

I also showed how Oracle only marks index entries as deleted and doesn’t physically delete the index entry at the time of the transaction.

This leads some folk into (incorrectly) thinking indexes that experience lots of update (or delete) operations need to be frequently rebuilt as the delete space might accumulate and waste space within the index structure over time .

However I also showed how deleted space is actually generally automatically reused by Oracle. For example, all it takes is one subsequent tiny little insert operation into a leaf block for all deleted entries within the leaf block to be automatically cleaned out by Oracle.

Therefore just because an index experiences lots of update activity doesn’t necessarily mean the index needs to be rebuilt. Generally, all deleted space is reused and is effectively nothing but free and available space within the index.

To emphasise this key point and attempt to really get the message across, I thought it might be worth going through a little demo of an index that does indeed experience lots of update operations and see what impact it actually has on the index.

Let’s first just create a simple little table that will have two numeric columns:

SQL> create table bowie (id number, value number);

Table created.

Now let’s populate the first column with a monotonically increasing unique identifier and the second column with a random number:

Specifically note the number of leaf blocks and the overall btree space usage of the index. Also note that as the index has only just been created, we currently have no deleted index entries.

OK, next we’re going to update the index column with a new random number, one row at a time, for effectively 50% of all rows in the entire table. The table currently has 50,000 rows and we will update every other row (that’s 25,000 rows for those mathematically challenged) with a new random value. Now that will be effectively 25,000 separate delete operations and 25,000 separate insert operations.

There are some folk who would think such an exercise would result in 25,000 deleted index entries which need to be cleaned up at some point via an index rebuild.

There are some folk who would think updating 50% of all index column values would result in so much wasted space that the index would grow in an inefficient manner in order to store all these deleted index entries, likely causing performance issues.

There are some folk who think that such a regular and high proportion of updates on an index column in a table is a clear indication that such an index should be rebuilt on a regular basis.

Well, let’s see what happens. First, let’s update every other row in the entire table with a new random number for the value column:

The first thing to note is that there aren’t actually 25,000 deleted index entries at all, even though we’ve just updated 25,000 index entries. There are just 60 deleted entries currently in the index. Just 60, that’s it !!

Why ?

Because the index is effectively just a random based index and while we may update (and hence effectively delete) an index entry from a specific leaf block, at some later time we’re likely to insert another index entry into this same leaf block, thereby cleaning out any deleted entries it may contain. Effectively, almost all the deleted index entries are being automatically cleaned out by subsequent random inserts throughout the index structure.

The relative handful of currently marked deleted index entries (60) happen to exist in leaf blocks that have not had a subsequent insert since the last delete operation in the specific leaf block. But even these deleted entries will eventually be cleaned out and the space reused by any other subsequent inserts in the specific leaf blocks.

The deleted space is simply not an issue, the vast majority of it has been cleaned out and reused and those entries that haven’t yet been cleaned out will likely be reused by subsequent insert operations anyways.

If we look at the actual space used by the index, after 50% of all index entries have been updated and we note that the index has not changed at all. It has exactly the same number of leaf blocks and is using exactly the same amount of btree space. (Note: because the index values are random, it’s likely that the new values will not be exactly distributed as it was previously and there might be the possibility that the odd index leaf block could fill and split as it contains more associated values than previously. Regardless, as the index grows with more index entries, this is not going to be an issue anyways).

There is nothing “wrong” or inefficient or fragmented with this index, even though we’ve just updated (and hence deleted) 50% of all its index entries.

With deleted index entries being trivial and likely to be reused at some later time anyways and with the index having the same leaf blocks and btree space as it did when the index was newly created, rebuilding such an index would be a total and utter waste of time and resources.

Again, just because an index column is frequently updated, it doesn’t necessarily mean the index is a candidate for a periodic index rebuilds. Any such suggestions are simply misguided …

Advertisements

Share this:

Like this:

Related

Richard, as a point of reference, how about showing an example where there is guaranteed wasted space? And another example where there is no wasted space at all. Then, perhaps, every other case would fall in between, which may be easier to relate to.

Richard, if the new values in the updates do not fall within the previously used value range and they leave a few entries in each block the index will grow and some space will not be reused, right? This same behavior is also observed in delete operations, take an index on a monotonically increasing date column, a process archiving rows to another system based on the date column and deleting them afterwards may leave some rows behind because of some business constraints.

If the index values are randomly inserted/updated/deleted, then generally any deleted space is simply reused.

If however, there are portions of an index that have deleted entries AND there are no subsequent inserts in these portions of the index AND the leaf blocks continue to contain some non-deleted, then yes, this deleted space will not be cleaned up.

I was careful in the post to emphasis that we are dealing with a random index and as such is unlikely to need a rebuild because of update activity.

If Yas you update the values as you’ve done AND you never update values to ranges within leaf blocks where the deletes have previously occured AND you never insert new values to ranges within leaf blocks where deletes have previously occured AND you still have some non-deleted index entries in these leaf blocks (eg. delete only every other index entry) AND you have SQL that are impacted by these deleted entries, then you have an index that is a candidate for a rebuild/coalesce/shrink as described in the above presentation.