December 27, 2010

Index Rebuilds

There are many suggestions floating around the internet about identifying which Oracle indexes to rebuild. One of these involves running the validate (or analyze index validate) command on an index and checking the resulting figures in the view index_stats to determine whether or not del_lf_rows exceeds some percentage (usually 20%) of lf_rows, and rebuilding the index if so. Since this suggestion came up, yet again, in a recent OTN thread about Oracle index rebuilding, I thought I’d write up a quick note highlighting one of the obvious flaws in the argument.
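For context, the check this suggestion leads to usually looks something like the following sketch (the index name is hypothetical, and the exact script varies from site to site):

```sql
-- A typical (flawed) "should I rebuild?" check.
-- WARNING: validate index locks the table while it runs.
validate index t1_i1;

select
        lf_rows,
        del_lf_rows,
        round(100 * del_lf_rows / lf_rows, 2)   pct_deleted
from
        index_stats
;

-- The dubious rule of thumb: rebuild if pct_deleted exceeds 20.
```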

I’m not going to bother pointing out the threat inherent in the table-locking when you use the “validate index” command (or the “analyze index validate” equivalent) but I think it’s worth making some comments about the misunderstanding built into this policy. So let’s start by building some data, creating an index on it, then deleting 90% of the data:
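The original script isn’t reproduced here, but based on the figures quoted below (40,000 rows, an index on an id column, and deletes driven by mod(id,100)) a minimal sketch of the setup might look like this; the table name, index name, and exact predicates are my assumptions:

```sql
-- Build 40,000 rows with an indexed id column.
create table t1
as
select
        rownum          id,
        rpad('x',100)   padding
from
        dual
connect by
        level <= 40000
;

create index t1_i1 on t1(id);

-- First delete: 90% of the data goes, leaving the 4,000 rows
-- where mod(id,100) is between zero and nine.
delete from t1 where mod(id,100) >= 10;
commit;

-- Second delete: 10% of the remaining 4,000 rows.
delete from t1 where mod(id,100) = 0;
commit;
```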

I haven’t bothered to collect statistics in this code as I’m not interested in execution plans, only in the amount of data deleted and what this does to the physical structure of the index. Here’s the output of my script starting from the moment just after I’ve created the index:

My first delete statement got rid of 90% of the data leaving the 4,000 rows where mod(id,100) was between zero and nine. So my second delete has eliminated 10% of the remaining 4,000 rows. Let’s see what we get when we validate the index:
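A sketch of the validation step (index name hypothetical):

```sql
-- Populate index_stats for the current session.
-- WARNING: this locks the table for the duration of the call.
validate index t1_i1;

select
        lf_rows,
        del_lf_rows,
        round(100 * del_lf_rows / lf_rows, 2)   pct_deleted
from
        index_stats
;
```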

How wonderful – by deleting a few more rows we’ve got to a state where we don’t need to rebuild the index after all!

Warning

This note is not intended to tell you when you should, or should not, rebuild an index. It is simply trying to highlight the fact that anyone who thinks that a value exceeding 20% for del_lf_rows / lf_rows is a sensible indicator does not have a proper understanding of how indexes work, and is basing their assumption on an over-simplistic model. In this case the obvious error in this model is that it allows you to miss indexes which might actually benefit from some action; but there’s also the possibility that a little random slippage in timing may persuade you to rebuild an index because you happened to check it in the window between deletion and re-use.

The problem is based on a fundamental misunderstanding, which is this: if you believe that an index entry reserves a spot in an index, leaving a little hole when it is deleted that can never be reused (unless you re-insert exactly the same value), then inevitably you will think that the del_lf_rows figure is some sort of measure of actual space that could be reclaimed.

But, as the Oracle myth busters like Richard Foote have been saying for years, that’s not how Oracle’s B-tree indexes work. When you delete an index entry, Oracle marks it as deleted but leaves it in place. When you commit your transaction Oracle does nothing to the index entry – but other processes now know that the entry can be wiped from the block allowing the space to be re-used.

This is what del_lf_rows is about – it’s the number of rows that are marked as deleted by transactions that have committed; and since the validate command can only run if it has locked the table in exclusive mode, any index entries marked for deletion will inevitably be committed deletes. So after I had deleted (and committed) 36,000 rows there were 36,000 entries in the index marked as deleted and committed; when I deleted a few more entries my second transaction wiped the deleted entries from any leaf blocks it visited, tidying the blocks (with a “heap block compress”) before marking a few more rows as deleted.

The upshot of this is that many systems (especially OLTP systems) will see del_lf_rows as a fairly small fraction of lf_rows because most systems tend to do little updates scattered randomly across lots of leaf blocks – constantly wiping out the rows marked as deleted by earlier transactions.

It’s really only in the case where a single large delete has taken place in the recent past that you’re likely to see a lot of deleted rows still in the index when you validate it – and, as most people are now aware, a process that is supposed to do a very large delete is a process that should be considered in the design phase as a candidate for special treatment, such as dropping or disabling some indexes before the delete and rebuilding them afterwards. It won’t be a process where you have to validate the index to decide roughly how much data you’ve deleted and where in the index that data was; it’s a process where you’ll know what’s going to happen before it happens.

Related

Notes 122008.1, 989093.1, 989186.1 are available for Oracle’s ideas about rebuilding indexes.
I have a question after reading note 989093.1: as long as the blevel for a particular index is not high, we can always use “coalesce” to reclaim the space that has not been reused. Is that correct?

Metalink (MOS) Doc ID 122008.1 was mentioned in a recently released book that I reviewed. My book review included the following regarding that mention:
“Page 726 states to check Metalink (MOS) Doc ID 122008.1 for “the officially authorized script to detect indexes that benefit from rebuilding.” That Metalink article states that the criteria is not valid and the script has been revised to meet “current standards and functionality.” That means that the suggested criteria for rebuilding that is printed in the book regarding 20% or more of deleted index entries or a depth of more than 4 levels is invalid, as had previously been pointed out to the book author as invalid in a couple of OTN discussion threads (reference1 reference2 reference3)”
One of Jonathan’s comments in June 2008 apparently played a role in the removal of the original version of Doc ID 122008.1, see: http://forums.oracle.com/forums/thread.jspa?messageID=4505978

I think that Doc ID 989093.1 contains potentially misleading information if someone were to quickly read that Doc ID – I think that the maintainer should better clarify that the common reasons for rebuilding indexes “Indexes are often rebuilt on a regular basis”, “clustering factor becomes out of sync”, “index becomes fragmented” are in fact refuted by the Doc ID and not listed as justifications for rebuilding indexes.

Doc ID 989186.1 is a bit interesting as it does not use the ANALYZE INDEX command. Maybe I need to spend a little more time looking at that script to determine its usefulness.

I’ve just used the feedback page on Doc ID 989093.1 to point out the ambiguity in the phrasing of the note at that point. A better introductory line would have been something like: “The most common claims used to justify rebuilding indexes are:”… The next paragraph starting “In fact … ” then flows more readily as the correction to the claims.

I don’t like describing a complex issue in a single sentence – it makes it too easy for people to jump to the wrong conclusions.

It is often the case that (a) rebuilds are a waste of time and (b) when they are not a waste of time then a coalesce is a more sensible option than a rebuild – but if you’re thinking about changing every line of code that reads “alter index … rebuild” to use a coalesce instead, take a close look at the penultimate line of the MOS note (my emphasis): “it is strongly advised not to rebuild indexes on a regular basis but instead use proper diagnostics.”
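For reference, the two operations under discussion (index name hypothetical); as the MOS note says, neither should be run routinely without proper diagnostics:

```sql
-- Full rebuild: recreates the index from scratch; relatively
-- expensive, and needs space for the new copy while it runs.
alter index t1_i1 rebuild;

-- Coalesce: merges the contents of adjacent sparse leaf blocks
-- in place, freeing emptied blocks for reuse within the index
-- without a full rebuild.
alter index t1_i1 coalesce;
```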

If you follow the “Further reading on rebuilding indexes” link at the end of the article you will find a note I have made about the cost of the coalesce command.

[…] In fact, if you want to avoid some nonsensical index rebuild criteria based on DEL_LF_ROWS, all you need to do is simply delete yet more rows, as so wonderfully explained here by Jonathan Lewis. […]