I previously discussed how Index Advanced Compression can automatically determine not only the appropriate number of columns to compress for an index overall, but the appropriate number of columns to compress within each individual leaf block of the index.

However, this doesn’t mean we can just order the columns within the index without due consideration from a “compression” perspective. As I’ve discussed previously, the column order within an index can be very important (especially with regard to the use of the index if the leading column of the index is not specified in the SQL), and it likewise matters for the possible compression capabilities of an index.

Advanced Index Compression does not change this, and if we order columns inappropriately, one of the consequences can be that the index simply can’t be compressed effectively.

But this time, I’m going to create the index with the columns the other way around from the previous post. The effectively unique ID column is now the leading column in the index, followed by the CODE column, which indeed has many duplicate values. There is a “myth” that suggests this is actually a more “efficient” way to order an index: put the column with the most distinct values first in the index. This is of course not true (yes, I’ve covered this one before as well).
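Something like the following (assuming the same demo table as the related posts; table, column and index names here are illustrative, not the original demo’s):

SQL> create index bowie_id_code_i on bowie(id, code) pctfree 0;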

As the leading column is effectively unique, we simply can’t compress this index effectively. That’s because compression requires duplicate index entries starting with at least the leading column. If the leading column has few (or no) duplicates, then by compressing the index Oracle effectively creates a prefix entry within the leaf block for each and every index entry. The whole point of index (or table) compression is to de-duplicate repeated values, but there’s nothing to de-duplicate if there are no repeating values in at least the leading column of the index.
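If we go ahead and compress the whole index anyway, a sketch of the sort of commands involved (index name again illustrative):

SQL> alter index bowie_id_code_i rebuild compress;
SQL> select index_name, leaf_blocks from user_indexes where index_name = 'BOWIE_ID_CODE_I';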

It actually results in a bigger, not smaller index. The number of leaf blocks has gone up from 2363 to 3115.

Unlike the previous post, where the columns in the index were the other way around, attempting to compress just the first column makes no difference to the inefficiency of the index compression, because the number of prefix entries created remains exactly the same:
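SQL> alter index bowie_id_code_i rebuild compress 1;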

The good thing with Advanced Index Compression is that we can simply “give it a go”, knowing it will not result in a larger index structure. If there’s nothing to compress within a leaf block, Oracle just ignores it and moves on to the next leaf block. If there’s nothing to compress at all within the index, the index remains the same as if it had not been compressed:
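A sketch of the command (COMPRESS ADVANCED LOW being the 12.1.0.2 syntax):

SQL> alter index bowie_id_code_i rebuild compress advanced low;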

So the index is now back to 2363 leaf blocks, the same as if it wasn’t compressed at all. No, it hasn’t helped, but at least it hasn’t made things worse.

So the order of the columns still plays a vital part in the “compress-ability” of the index, even with Index Advanced Compression at your disposal. If both the ID and CODE columns are referenced in your code, then having CODE as the leading column of the index would both improve the manner in which the index can be compressed and make an Index Skip Scan viable for those occasions when the CODE column is not specified.

Now, if we change the leading column and create some duplicates (in this case, by updating about 10% of the rows to have a duplicate value in the leading ID column):
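One possible way of doing this (the specifics below are illustrative; the exact update used in the demo isn’t shown):

SQL> update bowie set id = 42 where id <= 100000;
SQL> commit;
SQL> alter index bowie_id_code_i rebuild nocompress;
SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE', cascade=>true);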

With a whole bunch of IDs now having a value of 42, the non-compressed index has 2338 leaf blocks. Yes, 10% of the leading column values now have duplicates, but the other 90% of the index doesn’t and remains effectively unique. So if we try and compress this index now:
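SQL> alter index bowie_id_code_i rebuild compress;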

The compressed index now has 2941 leaf blocks and so is still larger than the 2338 leaf block non-compressed index. Yes, Oracle has compressed the 10% of the index that it could, but the inefficiencies in dealing with the other 90% have resulted in an overall larger index. So not too good really.

Again, compressing just the leading ID column doesn’t improve matters:
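SQL> alter index bowie_id_code_i rebuild compress 1;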

In fact, at 2977 leaf blocks it’s even worse than compressing the whole index, because by compressing both columns we could at least also compress the duplicate CODE values within that 10% of the index where we had duplicate ID values. By compressing just the ID column, we don’t get the benefit of compressing those duplicate CODE values. So not very good either.

In either case, compressing the index is ineffective as we end up with a bigger, not smaller index.
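So let’s try Advanced Index Compression once again (a sketch; 12.1.0.2 syntax, index name illustrative):

SQL> alter index bowie_id_code_i rebuild compress advanced low;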

We now have an index structure at just 2265 leaf blocks, which is indeed smaller than the non-compressed index (2338 leaf blocks), because Oracle can now compress just the 10% of the index where compression is effective and simply ignore the rest of the index (90%) where compression is ineffective.

The best of both worlds: Index Advanced Compression can compress just those parts of an index where it can do so effectively, while ignoring, and not making matters worse in, those parts of the index where index compression is ineffective.

I’ve discussed Index Advanced Compression here a number of times previously. It’s the really cool additional capability introduced to the Advanced Compression Option with 12.1.0.2, that not only makes compressing indexes a much easier exercise but also enables indexes to be compressed more effectively than previously possible.

Thought I might look at a multi-column index to highlight just how truly cool this new feature is in automatically managing the compression of indexes.
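To set the scene, a sketch of the sort of setup involved (all names, datatypes and row counts below are my assumptions, not the original demo’s):

SQL> create table bowie (id number, code number, name varchar2(42));
SQL> insert into bowie select rownum, mod(rownum,10)+1, 'DAVID BOWIE' from dual connect by level <= 1000000;
SQL> commit;
SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE');
SQL> create index bowie_code_id_i on bowie(code, id) pctfree 0;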

OK, the key thing to note here is that the leading CODE column in the index has only 10 distinct values and so is repeated very frequently, whereas the second ID column is effectively unique, such that the overall index entry is likewise effectively unique. I’ve created this index initially with no compression, but with a PCTFREE 0 to make the non-compressed index as small as possible.

I’ve previously discussed how index compression basically de-duplicates the indexed values by storing them in a prefix table within the index leaf block. These prefix entries are then referenced in the actual index entries, meaning it’s only necessary to store repeated values once within a leaf block. Only repeated index values within an index leaf block can therefore be effectively compressed.
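To illustrate conceptually (the real leaf block internals have more to them, so treat this purely as a simplification): with COMPRESS 1 on a (CODE, ID) index, leaf block entries such as

(5,100005) (5,100015) (5,100025) …

are stored as a single prefix entry for the repeated CODE value, referenced by each suffix:

prefix 0: (5)
0(100005) 0(100015) 0(100025) …

so the CODE value 5 is physically stored just once within the leaf block.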

In this example, it would be pointless to compress both indexed columns, as this would only result in a unique prefix entry for each and every index entry, given that the ID column is effectively unique. In fact, the overhead of having a prefix entry for each and every index entry would actually result in a larger, not smaller, overall index structure.

To show how compressing the whole index would be a really dumb idea for this particular index:
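SQL> alter index bowie_code_id_i rebuild compress;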

The COMPRESS option basically compresses the whole index and we note that rather than creating a smaller, compressed index structure, the index is in fact bigger at 3120 leaf blocks.

However, as the leading CODE column in the index only has 10 distinct values and so is heavily repeated, it would make sense to just compress this first CODE column only in the index. This of course requires us to fully understand the data associated with the index.

We can do this by specifying just how many leading columns to compress (in this case just 1):
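SQL> alter index bowie_code_id_i rebuild compress 1;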

We note the index is indeed smaller than it was originally, now at just 2002 leaf blocks.

So this requires us to make the correct decision on how many columns of the index to compress. Getting this wrong can result in a worse, not better, overall index structure.

Now with Advanced Index Compression, we don’t have to make this decision; we can simply let Oracle do it for us. As discussed previously, Oracle can go through each leaf block and decide how best to compress it. In this case, it can automatically determine that it’s only beneficial to compress the CODE column throughout the index.
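A sketch of the one-size-fits-all alternative (COMPRESS ADVANCED LOW being the 12.1.0.2 syntax; index name illustrative):

SQL> alter index bowie_code_id_i rebuild compress advanced low;

Next, let’s change the data and introduce some duplicates into the ID column. The exact update used isn’t shown here, but something like the following would give roughly 10% of rows a repeated ID value of 42:

SQL> update bowie set id = 42 where id <= 100000;
SQL> commit;
SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE', cascade=>true);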

OK, so for about 10% of rows, the ID column value is now indeed repeated, with the value 42. However, for the remaining 90% of rows (and hence index entries), the ID column remains effectively unique. So we have this 10% portion of the index where the ID is heavily repeated with the value 42, but everywhere else within the index the ID remains effectively unique.
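If we again compress just the leading CODE column (sketch):

SQL> alter index bowie_code_id_i rebuild compress 1;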

OK, just compressing the CODE column has indeed resulted in a smaller index structure (just 1977 leaf blocks) as it did before.

Without Advanced Index Compression, we have the option to not compress the index at all (the middle result), to compress both columns (the worst result) or to compress just the leading column (the best result). It’s an all or nothing approach to index compression, with the one method decided at the overall index level.

What we don’t have is the option to compress just the leading column where that makes sense, while also compressing both columns within just the 10% portion of the index where that in turn makes sense (where we have lots of repeating 42 values for ID).

We do have this option though with Advanced Index Compression, and indeed Oracle performs it automatically in just those leaf blocks where it’s beneficial, because the decision on how to compress an index is made not at the overall index level but at the individual leaf block level. As such, Advanced Index Compression has the potential to compress an index in a manner that was simply not possible previously:
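A sketch of the command once again:

SQL> alter index bowie_code_id_i rebuild compress advanced low;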

We notice the index is now even smaller, at just 1941 leaf blocks, than when we compressed just the leading column, as Oracle now also compresses the ID column within just that 10% of the index where we have the repeating ID values.

I can’t emphasise enough just how cool this feature is !!

In fact, I would recommend something I don’t usually recommend, and that is rebuilding all your indexes at least once (where you know the leading column has some repeated values) with the Advanced Index Compression option, so that all such indexes can be compressed in their optimal manner.

Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID 122008.1), which, let’s just say, I didn’t view in a particularly positive light :) Mainly because it gave dubious advice, which included that indexes should be rebuilt if:

Deleted entries represent 20% or more of current entries

The index depth is more than 4 levels

It then detailed a script that ran a Validate Structure across all indexes in the database that didn’t belong to either the SYS or SYSTEM schema.

This script basically read through and sequentially locked all tables (maybe multiple times) in the database, in order to list indexes that might not actually need a rebuild, while potentially missing some that do. I could write a script that achieved much the same result with far less overhead. For example, SELECT index_name FROM dba_indexes WHERE index_name LIKE 'A%' AND owner NOT IN ('SYS', 'SYSTEM') would achieve a very similar result🙂

Thankfully, note 122008.1 was eventually removed from My Oracle Support (MOS) some time ago, interestingly soon after I discussed the ramifications of this script in my Oracle Index seminars🙂

I recently stumbled upon another related note on MOS regarding index rebuilds, Index Rebuild, the Need vs the Implications (Doc ID 989093.1). Although not perfect (for example, while it mentions that ANALYZE INDEX VALIDATE STRUCTURE can now be performed online, doing so means INDEX_STATS is not populated, making the online option a little pointless in this context), it is a significant improvement on the previous note and certainly well worth a read for Oracle newbies.

It also references a script to investigate a B-tree index structure (Doc ID 989186.1) that doesn’t rely on the Validate Structure of an index, making it far less problematic to use, while also keeping a useful history of index characteristics. Also worth checking out.

A common question that gets asked is does Oracle access the index itself or the parent table during an index rebuild to extract the necessary data for the index ? Thought it might be worth a blog post to discuss.

Now if the index is currently in an UNUSABLE state, then Oracle clearly can’t use the existing index during the index rebuild operation. So we’ll assume both table and index are hunky dory.

OK, to set up the first demo (using 11.2.0.1), we create and populate a table and index, with the index being somewhat smaller than the parent table, as is most common:
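A sketch of the sort of setup involved (the table is named BOWIE as per the demo, but the column definitions, row counts and index name below are my assumptions):

SQL> create table bowie (id number, code number, name varchar2(100));
SQL> insert into bowie select rownum, mod(rownum,100), 'THE RISE AND FALL OF ZIGGY STARDUST' from dual connect by level <= 1000000;
SQL> commit;
SQL> create index bowie_id_i on bowie(id);
SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE', cascade=>true);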

As is common, the table is somewhat larger than the corresponding index.

Now in my first demo, I’m just going to perform a normal offline index rebuild. I’ll however trace the session to see what might be happening behind the scenes (the good old alter session set events '10046 trace name context forever, level 12'; still does the job). I’ll also flush the buffer cache to ensure the trace file shows me which blocks from which object get accessed.
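The steps look something like this (index name illustrative):

SQL> alter system flush buffer_cache;
SQL> alter session set events '10046 trace name context forever, level 12';
SQL> alter index bowie_id_i rebuild;
SQL> alter session set events '10046 trace name context off';

-- the obj# on the multi-block read wait lines in the trace identifies which segment was actually read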

There’s lots of information of interest in the resultant trace file, well, for someone with an unhealthy interest in Oracle indexes anyways🙂 However, the portion of direct interest in this discussion is the relatively extensive section of multi-block read wait events, which shows which object Oracle accesses in order to read the necessary data for the index rebuild.

We can see that Oracle has accessed the data from the index itself, using multi-block reads. As the index is the smallest segment that contains the necessary data, Oracle can very efficiently read all the required data (the expensive bit) from the index itself, perform a sort of all the data (as a multi-block read will not return the data in index order) and complete the rebuild process relatively quickly. Note the table is locked throughout the entire index rebuild operation, preventing DML operations on the table/index, and so for an offline index rebuild, Oracle can access the index segment without complication.

I’m going to repeat the same process but this time perform an Online index rebuild operation:
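Again, a sketch of the steps (same tracing as before, names illustrative):

SQL> alter system flush buffer_cache;
SQL> alter index bowie_id_i rebuild online;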

This time, the much larger BOWIE parent table has been accessed. So with an online rebuild, Oracle is forced to use the parent table to access the data for the rebuild, due to the concurrency issues associated with changes being permitted on the underlying table/index during the rebuild process. So although an online index rebuild has availability advantages, it comes at the cost of having to access the parent table, which can result in many additional I/O operations. If you don’t have availability concerns, an offline index rebuild is probably going to be the more efficient option.

In fact, Oracle can be quite clever in deciding which object to access with an offline rebuild …

In this next example, I’m going to create another table/index, only this time the index is somewhat larger than the parent table. This scenario is less common but certainly possible depending on circumstances:
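One way to manufacture such a scenario might be something like the following (a sketch only; I’m guessing at the specifics, all names and settings are illustrative):

SQL> create table bowie2 (id number, code number);
SQL> insert into bowie2 select rownum, mod(rownum,100) from dual connect by level <= 1000000;
SQL> commit;
SQL> create index bowie2_id_i on bowie2(id) pctfree 90;           -- deliberately bloated: larger than its table
SQL> create index bowie2_id_code_i on bowie2(id, code) pctfree 0; -- small, but still contains all the ID values
SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE2', cascade=>true);

SQL> alter system flush buffer_cache;
SQL> alter index bowie2_id_i rebuild;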

In this case, the smaller alternative index is actually accessed. So it might not be the table or the index being rebuilt that gets accessed, but the smallest segment that contains the data of interest, which in this case is another index entirely.

My final little demo brings me back to the subject of secondary indexes on Index Organized Tables (IOTs) I’ve been recently discussing. In this example, I create an IOT and a much smaller secondary index:
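A sketch of the setup (names and definitions again my own assumptions):

SQL> create table bowie_iot (id number primary key, code number, name varchar2(200)) organization index;
SQL> insert into bowie_iot select rownum, mod(rownum,100), 'HUNKY DORY' from dual connect by level <= 1000000;
SQL> commit;
SQL> create index bowie_iot_code_i on bowie_iot(code);

SQL> alter system flush buffer_cache;
SQL> alter index bowie_iot_code_i rebuild;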

In this case, we see that the much larger IOT PK segment is accessed, and not the smaller secondary index. When rebuilding the secondary index of an IOT, Oracle has no choice but to access the parent IOT PK segment itself, as the secondary index of course doesn’t contain all the information required for the rebuild. The physical guess component within the secondary index might be stale, and the only way for Oracle to determine the correct current address of all the rows is to access the IOT PK segment. This is another disadvantage of secondary indexes associated with IOTs: even offline index rebuilds must access the potentially much larger IOT PK segment in order to ensure the correctness of the physical guess components of the logical rowids.

So the general answer to whether an index rebuild accesses the table or the index is that it depends, and that it could very well be neither of them …

Excellent !! This quiz created quite a bit of debate and it was nice to sit back and read some interesting discussions.

The Clustering Factor is basically a measurement of how well aligned the data in the underlying table is in relation to the index, and is the number of table related I/Os required to read the entire table via a full index scan. A CF value approaching the number of blocks in the table suggests the data is reasonably well sorted/clustered in relation to the index (although the CF could in theory be somewhat less than the blocks in the table, of course). A CF value approaching the number of index entries suggests the data is not particularly well sorted/clustered in relation to the index, and means we may need to re-visit the same table block numerous times to get the required data, thus decreasing the efficiency of using the index, increasing the costs associated with using the index and therefore decreasing the likelihood of the CBO using the index.

So for an index rebuild to actually have an impact on the CF of an index, either the order of the rows in the table needs to change or the order of the index entries needs to change.

However, a typical index rebuild has no impact at all on the table and so can’t possibly change the order of the rows there. Additionally, no matter how fragmented or inefficient the index structure might be, an index rebuild doesn’t change the order of the index entries either, as they’re always sorted within the index in the order of the indexed columns.

Therefore an index rebuild typically has no impact at all on the CF of an index, no matter the current value of the CF.

However, there is an exception to this rule.

If we rebuild the index and change it from a NOREVERSE index to a REVERSE index, we now do change the order of the index entries. Significantly so, as the index entries are now ordered on the byte-reversed index column values. This in turn can significantly change the CF of the index.

Conversely, if we rebuild an index and change it from REVERSE to NOREVERSE, we likewise significantly change the order of the index entries and hence the value of the CF.
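For example (index name illustrative), a sketch of how to see this for yourself:

SQL> alter index bowie_id_i rebuild reverse;
SQL> exec dbms_stats.gather_index_stats(user, 'BOWIE_ID_I');
SQL> select index_name, clustering_factor from user_indexes where index_name = 'BOWIE_ID_I';

-- and back again:
SQL> alter index bowie_id_i rebuild noreverse;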

Of course, it’s always nice to see new ideas, and something I hadn’t considered was Gary Myers’ comment regarding changing the logic behind a Function-Based Index prior to a rebuild …

So the moral of this story is that no matter how poorly fragmented the index, how high or low the current CF of an index might be, rebuilding an index in order to improve the CF is a futile exercise and will change less than diddly-squat, except in the above mentioned special circumstances.

Today’s question has been prompted by various recent comments regarding the Clustering Factor (CF) of an index and how changing the CF requires a reorg of the underlying table.

It used to be quite a common myth that if the CF of an index was greater than “X”, or based on some nonsensical formula was greater than “Y”, then rebuilding the index somehow made everything better again. I believe it’s now much more commonly accepted that rebuilding an index does not change the CF of an index in any way. By rebuilding an index, the actual order of the index entries remains unchanged, as does the order of the rows within the table, and so the resultant CF can’t possibly change.

Pick any index, no matter how bad the CF or how badly fragmented the index or table, gather fresh statistics, rebuild the index (gathering fresh statistics again if a compute statistics isn’t performed as part of the rebuild), and the CF will remain exactly the same.

However, there are nearly always exceptions …

Give an example of when rebuilding an index will significantly change the actual CF of an index, assuming 100% accurate statistics are gathered before/after the index rebuild on a table with no DML.

One of the main reasons suggested for performing periodic index rebuilds is to improve “performance”. After rebuilding indexes, applications now run so much faster. Users notice a significant improvement in performance. And so on.

There are of course situations when rebuilding specific indexes can indeed improve performance, here’s but one example I’ve discussed previously.

However, the question I always ask when someone claims an index rebuild has made things run faster is simply: why ? Why is such a simple, but powerful question. Why have things improved ? What has specifically changed as a result of rebuilding the index, such that Oracle has reduced the overall work associated with performing the activity to the point that things run noticeably faster ?

Knowing why is really important because it confirms that there was indeed an improvement and that it was indeed directly associated with the index rebuild. It means that when a similar situation arises again, you know how to resolve the problem appropriately and effectively next time. Also, knowing why means you can determine the specific root cause, perhaps preventing things from deteriorating so badly in the first place, such that rebuilding the index becomes unnecessary. Prevention being the best cure …

Now the most common answer I get for why rebuilding an index has been so beneficial is that the index is now so much smaller after the rebuild, and so the overheads of reading the index have substantially reduced. If the index is smaller, one can read the same amount of index related data with fewer I/Os. Fewer I/Os means better performance.

For example, if you can reduce the index by half, you can fit the index into only half the number of leaf blocks and that’s a 50% performance improvement right there.

Well, firstly, it assumes that the index has indeed reduced by half. It would actually be a relatively unusual index that is so poorly fragmented, or has so much “wasted” space, that it could be rebuilt into only half the number of leaf blocks.

Possible but somewhat unusual.

However, it also assumes that, for a 50% performance improvement, reading the index blocks constitutes the vast majority of the overall work. Again, possible in some scenarios.

With most index related activities though, reading the index blocks actually constitutes only a small percentage of the overall work. In most cases, the index only contributes a very small percentage of the overall I/O activity. Therefore by potentially only reducing a small percentage of the overall work by rebuilding just the index, the overall impact is generally going to be minimal.

I thought I might perform some basic mathematics to illustrate and put into perspective just what little impact index rebuilding can have in improving performance, even if by doing so the index dramatically reduces in size, because the index itself actually constitutes only a small percentage of the overall costs.

Let’s say we have one of these more unusual indexes that is so poorly fragmented it has approximately 50% wasted space throughout the entire index structure. Let’s say rebuilding such an index reduces the overall size of the index by half.

Before the index rebuild, an index has 50% of wasted space and say:

Height of 3

1 Root Block

50 Intermediate Branch Blocks

20,000 Leaf blocks

After the rebuild, the index has no wasted space and has now:

Height of 3

1 Root Block

25 Intermediate Branch Blocks

10,000 Leaf Blocks

Let’s assume the table contains 2M rows and that they fit in 100,000 blocks (i.e. the index is about 1/10 that of the table and the average row size is such that we fit say 20 rows on average per block). Let’s also assume there’s nothing special about this index and that it has an “average” clustering factor of 1M, before and after the rebuild ;) 1M being somewhere in the middle of possible clustering factor values.

The first thing to note is that the height remains the same after such a rebuild, even though the index is now only half the size. The only way for the index height to reduce down from 3 would be for the contents of all the intermediate branch blocks to fit within just the one root block, and the index would need to be particularly small, and within a very narrow range of sizes, for that to be the case. Possible, but again quite unusual.

OK, let’s look at the cost of various scans before and after the rebuild, using the index costing formula I’ve discussed recently.

If we’re after just one row (a unique index perhaps), then to read the one row before the rebuild would be:
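Plugging the figures above into the standard index costing formula (cost ≈ blevel + ceil(leaf_blocks x effective index selectivity) + ceil(clustering factor x effective table selectivity)), a sketch of the sums:

One row via the index:

Before: 2 (blevel) + 1 (leaf block) + 1 (table block via the CF) = 4 I/Os
After:  2 (blevel) + 1 (leaf block) + 1 (table block via the CF) = 4 I/Os

No change at all, as the height (and hence blevel) remains the same.

For 10,000 rows, i.e. a selectivity of 0.5%:

Before: 2 + ceil(0.005 x 20000) + ceil(0.005 x 1000000) = 2 + 100 + 5000 = 5102 I/Os
After:  2 + ceil(0.005 x 10000) + ceil(0.005 x 1000000) = 2 + 50 + 5000 = 5052 I/Os

A saving of just 50 I/Os out of 5102.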

Or in percentage terms, a reduction of I/Os of approximately 1%. That’s just 1 tiny little percent …

So even for a query that accesses 10,000 rows (a reasonable number and, at 0.5%, a reasonable percentage of the overall table) via an index that has reduced in size by half (a substantial reduction), the overall number of I/Os only reduces by an unimpressive 1% in the above scenario.

Would reducing I/Os on such a query by 1% really improve performance “substantially” ? Will users really notice much of a difference ?

It’s of course all in the numbers and in how much work the actual index is performing, in how many I/Os are actually performed by the index itself and in how much of a reduction in the overall I/Os an index rebuild will actually contribute. I’ll leave it to you to plug in different ranges of selectivity to see what impact rebuilding such an index with the above characteristics might have.

The point is that for the vast majority of index related queries, most of the work is performed in getting the data out of the table, not the index.

Therefore, reducing the size of an index, even by a substantial amount, may not necessarily reduce the overall I/Os associated with a query if the index only performs a tiny fraction of all the work. You could eliminate the index entirely and, providing Oracle could still magically retrieve just the 0.5% of rows of interest in the above example, performance for such a query would be unlikely to improve “significantly”.

So not only must an index reduce in size, but the activities associated with directly reading the index must constitute a significant proportion of the overall work (e.g. index fast full scans, index only scans, etc.), or something else must have changed, for performance to have improved “significantly”.

Regular readers of this blog can no doubt pick the various inaccuracies and strawman arguments in this article. It was nice however to reminisce and take a stroll down memory lane, back to the 80’s and 90’s when many DBAs indeed did spend every Sunday afternoon rebuilding indexes and such during so-called maintenance windows.

However, if you’re like me and now work on sites where there is no such Sunday maintenance window because your users actually require and demand 24 x 7 access to their applications, because organisations still want to sell their products and services online during Sunday afternoons, because governments still want border control applications functioning on Sunday afternoons, because consumers still want access to their bank savings on Sunday afternoons, because police still need to access critical information about possible criminal activities on Sunday afternoons, because airlines still want to fly aircraft on Sunday afternoons, etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. ……..

then perhaps the article may not be that useful to you after all.

Of course, maybe you do have a maintenance window of some description, but then have that other really annoying problem associated with modern databases: them being big, really really big, and so more difficult to maintain. Back in the 80’s and 90’s, databases were relatively small by today’s standards and it was conceivable for indexes to be rebuilt with little or no thought. There was generally little point, but at least you could get away with it. However, if you’re like me and you have databases with indexes that total many terabytes, it just isn’t feasible or practical to rebuild all such indexes, even if you wanted to.

No. If you do have a maintenance window, you may want to make sure that valuable time is spent well, on activities that actually make a difference.

Of course, you may also look after databases like I do, where the percentage of indexes that actually benefit from a rebuild is not around 30% as suggested by the article, but significantly below 1%. Rebuilding 99+% of indexes for no practical gain or purpose may not be considered a “wise” use of maintenance opportunities.

One thing this article did make me ponder though is the number of database sites out there where the DBAs demand and force their businesses and associated end-users into having forced downtimes, into having their applications and business processes impacted and made unavailable (say) every Sunday morning, not because it’s actually necessary or because there’s a business or technical benefit, but just because their DBAs say or think they still need to follow the processes and maintenance activities of the 80’s and 90’s. Interesting question that …

One last point I would make with this article (which BTW will change and be modified in the future, I can guarantee that; just save a copy and see). Why would you bother validating the structure of all indexes if you plan to rebuild them all anyways ? And who is the more inept: someone in the Netherlands who attempts to run a script that validates the structure of all indexes, or someone who writes a script that attempts to validate the structure of indexes with an ALTER INDEX command😉

Personally, I spend every Sunday afternoon relaxing and enjoying the weekend, playing sport, perhaps doing a bit of gardening and on a lovely sunny day, just sitting back with family and friends enjoying a barbecue and a few cold beers.

Meanwhile, all my database applications just run happily along with no impact on business activities at all …

Time to look a little at an Index Fast Full Scan (IFFS) with respect to moving indexes into a larger block tablespace.

To start, a few points about an IFFS. When performing an IFFS, Oracle will read each and every block in the index structure, including all branch and leaf blocks (up to the HWM), using multi-block read operations. So although an IFFS has its place, it’s a relatively expensive operation, especially if the index is large, as it needs to read all the blocks in the current index structure. It’s therefore not generally considered as “common” an operation as traditional index range scans, especially in OLTP environments. Effectively, an IFFS is the index equivalent of a Full Table Scan (FTS) and is performed in a similar fashion. During an IFFS, Oracle basically treats the index as if it were a smaller version of the table from which it can extract the necessary data.

So if an IFFS were indeed very common and could be performed so much more effectively in a larger block tablespace, this begs the question of why not move tables into a larger block tablespace as well, so an FTS can enjoy the same benefits as the index. And if you move both tables and indexes into a larger block tablespace, why not create the database in the larger block size and have done with it ? Which brings us around to the question of selecting an appropriate block size for the database, which is a different discussion from the so-called benefits of just moving indexes into a larger block tablespace. For another day perhaps …

Now, although Robin’s demo clearly shows consistent gets are basically halved when an index is rebuilt with double the block size (which all sounds very impressive), what it doesn’t highlight however is that during an IFFS, approximately the same amount of data is still actually being read. Although there are certainly some efficiencies in having fewer logical reads, if each consistent get is more expensive while the overall data being read is similar, you’re simply not going to get the enormous performance boost that simply halving consistent gets might suggest. Claims that things will suddenly run twice as fast (or 120 times as fast) purely by moving indexes into a larger block tablespace are, shall we say, “exaggerated” to say the least. However, as all index blocks within the index are being read with no subsequent table accesses, an IFFS is going to be as good as it gets when we talk about any possible performance improvements from moving indexes into a larger block tablespace.

Again, it’s all very simple to “give it a go” and see for yourself exactly what improvements one might expect. Move an index into a larger block tablespace and see those IFFS fly. Using exactly the same setup as Robin’s demo in Parts I and II, let’s see the actual differences …
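A sketch of the sort of test harness involved (table, column and index names, and the hint, are illustrative; the specifics of Robin’s demo may differ). Running the IFFS query 1000 times in a PL/SQL loop:

set timing on

declare
  v_count number;
begin
  for i in 1..1000 loop
    -- force an Index Fast Full Scan of the index each iteration
    select /*+ index_ffs(b bowie_code_i) */ count(code) into v_count from bowie b;
  end loop;
end;
/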

Note that response times are roughly 11.23 secs, mostly consisting of CPU at 10.96 secs. Note also that consistent gets are 1000 x 421, as one would expect from Robin’s demo. Now let’s see how things differ when we move the index into a 16K block tablespace.
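Moving the index requires a 16K buffer cache and a 16K tablespace to already exist; a sketch (cache size, file name and all object names are illustrative):

SQL> alter system set db_16k_cache_size = 64m;
SQL> create tablespace ts_16k datafile '/u01/oradata/ts_16k01.dbf' size 1g blocksize 16k;
SQL> alter index bowie_code_i rebuild tablespace ts_16k;

We then simply re-run the same PL/SQL loop as before.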

Note there’s only a slight improvement in response times and CPU usage, of a couple of percentage points, even though the number of consistent gets has halved. Now your mileage may vary a tad, but what you will not see is things suddenly using half the resources or running twice as fast (let alone 120 times as fast). Because at the end of the day, both IFFS executions are reading and processing approximately the same amount of data.

Note also that any such possible minor improvements in IFFS performance are likely going to be cancelled out, or more likely overtaken, by the increased costs associated with more typical index range scans, due to the index not actually reducing in height as discussed in Part II. Again, what’s more common, smaller index range scans or IFFS ?

Robin’s demo and the demo above, however, assume the index is fully cached, with no physical I/O (PIO) operations being performed. Now, as an IFFS needs to read the entire index, it’s quite likely that an IFFS would require some physical I/O, especially if the index is large. Surely then, if the index is in a larger block tablespace, such physical I/Os would be more efficient, because Oracle would be able to read more data with each multi-block read, based on the larger block size ?

The answer is unfortunately no. Another (rather poor) analogy to get the point across.

You go to the bank to take some money out. You’re only allowed to take out a maximum of $1000 per day from your account. You ask for this in $50 notes and you get 20 x $50 = $1000 in total. Your mate wants to take out more money and asks for the money in $100 notes. However, as the limit is set to $1000 a day, regardless of the denomination of the bank notes, he simply gets 10 x $100 = $1000. You both end up with exactly the same amount of money, it’s just that you have more bank notes.

An Oracle multi-block read works in exactly the same manner. There’s a maximum amount of data you can read which is calculated by the default block size x db_file_multiblock_read_count. If the default block size is 8K and the db_file_multiblock_read_count = 8, then the maximum size of a multiblock read is 8K x 8 = 64K.

If you attempt to perform a multi-block read using a non-default block size, Oracle simply divides the 64K by the non-default block size to determine how many blocks to read. So if you now attempt to perform a multi-block read from a 16K tablespace, Oracle will only read 16K x 4 blocks = 64K. The I/O size is the same regardless of the block size.

And this makes perfect sense. If you’ve tuned the size of a multi-block read perfectly with the default block size in mind, why would you want to (say) suddenly double this perfect size when you double the block size in another tablespace ? If doubling the multi-block read size were to suddenly improve things, why not also double the multi-block read size for the default block size as well …

Again, it’s very easy and simple to test how changing the block size in a specific tablespace makes no difference to the behaviour of an associated multi-block read.

First flush the buffer cache and trace the session while performing a multi-block read via an IFFS, first with the default 8K block size index and then again with the index rebuilt in the 16K block tablespace:
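A sketch of the steps (names and hint illustrative):

SQL> alter system flush buffer_cache;
SQL> alter session set events '10046 trace name context forever, level 12';
SQL> select /*+ index_ffs(b bowie_code_i) */ count(code) from bowie b;
SQL> alter session set events '10046 trace name context off';

-- with the 8K index, the multi-block read waits in the trace show 8 blocks per read (8 x 8K = 64K);
-- rebuild the index in the 16K tablespace, repeat, and the same waits show 4 blocks per read (4 x 16K = 64K)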

We notice that with the 16K block index, Oracle is now only reading 4 blocks at a time, ensuring that the same 64K is read each time. So increasing the block size of an index doesn’t suddenly impact the manner or efficiency in which multi-block PIOs are performed.

Next, we’ll perform a similar test as before, comparing the difference between the 8K and 16K block index during an IFFS, but this time with PIOs introduced. There are various ways one could do this: make the buffer cache too small to fit the index, or make the index too big for the buffer cache. Keeping with Robin’s demo, I’ll use exactly the same index definitions, but this time run the following procedure in another session to constantly flush the buffer cache. So in one session:
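A sketch of the sort of flushing procedure involved (the interval is my own assumption):

begin
  loop
    execute immediate 'alter system flush buffer_cache';
    dbms_lock.sleep(1); -- pause briefly between flushes
  end loop;
end;
/

… while in a second session, the 8K and 16K IFFS tests are re-run as before.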

We now note that the 16K block index has the slightly slower response time at 36.59 secs and uses a tad more CPU at 4.23 secs. As soon as we introduce PIOs, they contribute significantly to the overall costs and may well result in the larger block index being more costly, depending on how the PIO is performed. So even in Robin’s example, depending on the nature of the PIOs, an IFFS can potentially run slower with the index in a larger block tablespace. Remembering of course that larger indexes are less likely to be fully cached and so are more likely to incur PIOs during an IFFS.

Here’s a demo I’ve used before that uses a different approach to highlight the same issue (which doesn’t rely on a different session “interfering” with things). Another excellent example is this one by Greg Rahn on this OTN thread where he shows an IFFS performing a tad slower in a larger block tablespace. As I’ve been saying, just give it a go and see for yourself.

The next time you see someone reference Robin’s demo as some sort of “proof” that by moving indexes into a larger block tablespace and (perhaps) reducing the number of consistent reads, the index will somehow be flatter and/or more efficient, just remember that the index may not necessarily be flatter, it might actually have the same height, and that subsequent index operations, including index range scans and IFFS, could very well be more expensive, not less expensive, to perform.

The next time you see someone claim that by moving indexes into a larger block tablespace, performance has suddenly improved by 100% or that things suddenly run 120 times faster, just ask one simple question. Why ? Ask what else has changed, what else might be contributing to the incredible performance improvements, because when you actually test things out for yourself, you’ll notice the results are not anywhere near as “impressive”.

In Part I, I looked at Robin Schumacher’s “classic” example of the “so-called” benefits of rebuilding an index in a larger block tablespace. I started by highlighting that simply rebuilding an index in a larger block tablespace and (say) halving the number of associated index blocks doesn’t necessarily mean the index will result in a “flatter” structure, i.e. that the index height will reduce. Robin’s demo is in fact a perfect example of this.

Let me start Part II by following this up with a little story …

“I went with a friend of mine to get some cash from the bank the other day in order to buy the latest David Bowie Box-Set. I got my cash in $50 notes but being an efficient, cost saving sort of bloke, my mate got out the same amount of cash in $100 notes. Although we both had the same amount of cash, his wallet was that little bit more compact and “efficient” than mine as he had less actual bank notes to carry.

However, when we got to the record store, we were surprised to discover that the actual “cost” of the David Bowie Box-Set was 2 bank notes of any denomination, but with no change being given. Therefore, it cost me 2 x $50 notes to make my purchase. Unfortunately for my mate, it cost him 2 x $100 for the same thing, as $100 notes were the smallest denomination he could use. Yes, I was a little bit mean not lending him some of my $50 notes, but I was a little disappointed myself for not having any $5 notes on me at the time😉 ”

OK, it’s not a perfect analogy but you perhaps get the point …

If you have to pay with a quantity of bank notes and you only have larger bank notes, you end up paying more than you would if you could only have paid with the same number of smaller denomination bank notes.

If you have to pay for an index scan in database blocks and you’re forced to use larger index blocks, you actually end up paying more if you have to read the same number of index blocks anyways.

This is one of the potential dangers of rebuilding indexes in a larger block tablespace. And like I said, Robin’s little demo is a perfect example of this. You might indeed reduce and halve the number of index blocks, but this might not be sufficient to actually flatten the index structure and reduce the actual height of the index. The height of the index after the rebuild can remain the same, so the minimum cost of performing a small range scan increases. Even if you do reduce the index height, you can still end up paying more if the savings don’t compensate enough for the additional overhead associated with having to read and process the larger index blocks.

Therefore, you potentially start paying more for smaller index range scans because you might not actually reduce the number of index blocks you visit for these types of index scans.

Taking exactly the same table/index definitions and data used to replicate Robin’s example in Part I, let’s see if there’s any difference in the costs associated with performing a number of small index range scans after rebuilding the index in a 16K block tablespace.

Again, it’s a simple case of just giving it a go and see for yourself. What resources are used if you perform a series of small index scans ? Do things really run faster ? Do we really use less resources ? Having rebuilt such an index in a larger block tablespace and halved the number of associated leaf blocks, are we really better off ?

Let’s begin by setting up the same example as Robin’s demo as before …
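A sketch of the sort of small range scan workload involved (names, row counts and predicates are my assumptions, not Robin’s actual demo):

declare
  v_name bowie.name%type;
begin
  for i in 1..100000 loop
    -- each lookup is a small index range scan via the ID index
    select name into v_name from bowie where id = ceil(dbms_random.value(0, 1000000));
  end loop;
end;
/

Run the workload with the index in the default 8K block tablespace, note the elapsed and CPU times, then rebuild the index in the 16K block tablespace (alter index bowie_id_i rebuild tablespace ts_16k;) and run it again.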

We note that elapsed times have increased and we have now increased our overall CPU consumption to 91.03 CPU seconds as well.

As we can see, there has been no advantage in rebuilding the index in the 16K block tablespace for these smaller index scans. In fact, there’s actually been an increase in the overall elapsed times and an increase in the overall CPU. Performance has not improved, but has in fact worsened overall for these queries after rebuilding the index in a larger block tablespace.

You begin to get the point …

And of course, indexes are generally far more typically used in small index range scan operations than they are in Index Fast Full Scans. Just compare the numbers of index fast full scans vs. index fetch by key operations in your databases if you want some indication.

However, Robin’s specific example used a SQL statement that performed an Index Fast Full Scan. Surely, such operations would improve dramatically if indexes were only rebuilt in a larger block tablespace ? Surely such operations would be able to read data more quickly, especially if we have to go to disk, as we would be able to read more data with each associated multiblock read if the index were built in a larger block tablespace ? Surely it would be worth all the extra effort and management considerations ?

Things will run at least 2 times faster, maybe even 150 times faster, right 😉

We’ll see how the results can be a tad “disappointing” in the next post …

Although the CREATE INDEX … ONLINE and ALTER INDEX … REBUILD ONLINE options have been available for a long while, they can still introduce locking issues in highly active databases.

Oracle requires a table lock on the index’s base table at the start of the CREATE or REBUILD process (to guarantee the data dictionary information) and a lock at the end of the process (to merge index changes made during the rebuild into the final index structure).

These locks have two implications. Firstly, if there’s an active transaction on the base table of the index being created or rebuilt at the time one of these locks is required, the indexing process will hang, which of course impacts the time it takes to complete the indexing process. However, the second, far more serious issue is that any other active transactions on the base table starting after the indexing process hangs will likewise be locked and prevented from continuing until the indexing process obtains and releases its locks. In highly concurrent environments with many transactions, this can cause serious disruptions to the response times of the impacted transactions. And depending on how long the initial locking transactions take to commit or rollback, this backlog of locked transactions can be quite significant.

Oracle11g has made some improvements in the locking implications regarding creating or rebuilding indexes online.

During the creation or rebuilding of an index online, Oracle still requires the two associated table locks on the base table at the start and end of the indexing process. If there’s an active transaction on the base table at the time one of these locks is required, the indexing process will still hang as it has done previously, until all these prior active transactions have completed. No change so far.

However, if the indexing process has been locked out and subsequent transactions relating to the base table start afterwards, these transactions will no longer in turn be locked out by the indexing table locks and are able to complete successfully. The indexing process no longer impacts other concurrent transactions on the base table; it will be the only process potentially left hanging while waiting to acquire its associated lock resource.
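A simple way to observe this 11g behaviour for yourself (a sketch; object names and values are illustrative):

-- Session 1: leave an open (uncommitted) transaction against the base table
insert into bowie values (1000001, 42, 'MAJOR TOM');

-- Session 2: the online rebuild now hangs, waiting for session 1 to commit or rollback
alter index bowie_id_i rebuild online;

-- Session 3: in 11g, this transaction is NOT blocked behind the hung rebuild and completes fine;
-- in 10g and earlier, it would have queued up behind the rebuild's table lock
insert into bowie values (1000002, 43, 'ASHES TO ASHES');
commit;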

This means it may not be quite so “risky” to urgently introduce that new index or rebuild that troublesome index during core business hours due to the reduced locking implications introduced in 11g.

Previously, I discussed how an ALTER INDEX … COALESCE is going to be less expensive in resource terms than an equivalent ALTER INDEX … SHRINK SPACE COMPACT (or ALTER INDEX … SHRINK SPACE), as the Coalesce doesn’t have to concern itself with ensuring all leaf blocks at the physical end of the index segment have been moved so that the storage can be de-allocated from the index segment. If you just want to de-fragment an index and not necessarily reduce the overall space allocated to the segment, use Coalesce rather than the Shrink options, as it’s cheaper.

But what about an ALTER INDEX … REBUILD, when, if ever, should it be used ?

Well the answer is as with most things Oracle, it depends.

Scenario One.

We have a table and the application deletes historical data but in a manner in which leaf blocks are not being entirely emptied. Basically, older stuff is removed, but it’s only removed in a random manner, from approximately the “earlier” 10% of the table. The index is sequenced which means only those leaf blocks in the “left-most” 10% of the index structure are impacted but all this deleted space is “deadwood” as new index entries are only being inserted into the “right-most” part of the index.

Note that basically 90% of the index is fine and very well utilised, it’s only 10% of the index that’s problematic. Of the problem 10% of leaf blocks, there’s plenty of free or deleted space, with many leaf blocks almost but not quite empty.

Coalesce (and indeed Shrink) will basically run through these 10% of fragmented leaf blocks and merge the index row entries into as few leaf blocks as possible. The 90% of blocks that are fine, Coalesce will basically read and then ignore, as there’s nothing that can be done for them.

Rebuild, on the other hand, takes an entirely different approach. It will (generally) read the entire existing index structure and build a brand new, bright and shining index segment. As part of this process, it will rebuild the entire index, it has no choice (assuming the index isn’t partitioned, but that’s another story), and so will rebuild the 90% of the index that was actually perfect to begin with. Rebuilding 90% of something that doesn’t need rebuilding doesn’t sound particularly efficient, and indeed it isn’t. As a result, the index rebuild will use substantially more resources and generate substantially more redo than an equivalent Coalesce (or Shrink Space).

Scenario Two.

We have an application that deletes data and it deletes data throughout the entire index structure. The deletes are significant with a substantial proportion of the overall rows having been deleted. Additionally, the table is not going to be repopulated with anything like the same volume of data or it won’t be repopulated for a substantial period of time. As such, all this deleted index space is “deadwood” as it’s not going to be used any time soon, if at all.

Now typically in this sort of scenario, it’s of course the table as much as the associated indexes that needs to be rebuilt. That’s a key point. However, maybe Full Table Scans are not an issue for this table, so the wasted space in the table is not of urgent concern. Maybe the table is not in an ASSM tablespace or in a database that supports the Table Shrink command, and maybe moving the table is not an immediate option due to availability concerns. For whatever reason (or lack of reason), the index needs to be de-fragmented.

Note it’s the entire index that’s problematic here and there could be portions of the index that have very few remaining index entries.

Now poor Coalesce (and indeed Shrink) has a bit of an issue here. They both merge the index entries of two blocks into one block where they can. However, if leaf blocks are nearly empty, these merged index entries may in turn be merged and moved again with the index entries of yet another leaf block. And maybe yet again with another leaf block. And again and again … So a specific index entry may actually be moved into several different leaf blocks during the entire process. Each of these moves requires resources, generates redo and takes time.

Now the Rebuild takes an entirely different approach. As mentioned, it will basically (generally) read the entire existing index structure and build a brand new one, but importantly, as it does so, it only has to relocate a specific index entry once and once only. Also, as it’s the entire index structure that’s problematic, there’s no issue with fixing the entire index, as it’s all “broken”.

As a result of having to deal with each existing index entry only once, vs. the Coalesce which may relocate a specific index entry many times, the index rebuild is going to be substantially more efficient, potentially using significantly fewer resources and generating less redo.

Basically, Coalesce is particularly efficient and uses fewer resources when the percentage of the overall index structure that’s problematic and fragmented is relatively small (less than approximately 20-25% of leaf blocks). Rebuild is particularly efficient when the percentage of the overall index structure that’s problematic and fragmented is relatively large and the average degree of fragmentation within the leaf blocks is relatively high. Note that pre 10g, an index needed at least 50% free space, less pctfree, in neighbouring leaf blocks for a Coalesce to be effective.
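For reference, a sketch of the various options discussed (index name illustrative):

SQL> alter index bowie_id_i coalesce;             -- de-fragment only; the cheapest option
SQL> alter index bowie_id_i shrink space compact; -- similar, but geared towards a later de-allocation
SQL> alter index bowie_id_i shrink space;         -- as above, plus the freed storage is actually de-allocated
SQL> alter index bowie_id_i rebuild;              -- recreates the entire index structure (locking implications)
SQL> alter index bowie_id_i rebuild online;       -- as above, with better availability at extra cost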

Now Rebuild (and Rebuild Online) potentially have locking implications that need to be considered although as we’ll see later, 11g has addressed some of these issues …

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled, uninspired responses, which, considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.