A recent question on the OTN Forums Reg: Index – Gathering Statistics vs. Rebuild got me thinking on a scenario not unlike the one raised in the question where a newly populated index might immediately benefit from a coalesce.

Shows us that the index is fully populated with just the default 10% pctfree of free space (most leaf blocks have an avs of 819/820 free bytes).

The forum question mentions a scenario where 70% of the table is archived away. This is done by storing in a temporary table the 30% of required data, followed by a truncate of the original table and the 30% of data being re-inserted. Something like the following:

We notice that vast areas of the index now has 50% of free space, not the default 10% it had previously.

The “problem” here is that when data was stored in the temp table, there was nothing to guarantee that the data be physically stored in the same order as the ID column. ASSM tablespaces will effectively pick random free blocks below the high water mark of the table so that although data might well be ordered within a block, the blocks within the table might not be logically ordered with respect to the data being inserted.

A simple select from the temp table displaying the “first” 20 rows of the table will illustrate this:

SQL> select * from bowie_temp where rownum <=20;
ID NAME
---------- ------------------------------------------
701717 DAVID BOWIE
701718 DAVID BOWIE
701719 DAVID BOWIE
701720 DAVID BOWIE
701721 DAVID BOWIE
701722 DAVID BOWIE
701723 DAVID BOWIE
701724 DAVID BOWIE
701725 DAVID BOWIE
701726 DAVID BOWIE
701727 DAVID BOWIE
701728 DAVID BOWIE
701729 DAVID BOWIE
701730 DAVID BOWIE
701731 DAVID BOWIE
701732 DAVID BOWIE
701733 DAVID BOWIE
701734 DAVID BOWIE
701735 DAVID BOWIE
701736 DAVID BOWIE

Note that the first selected range of values starts with 701717 and not 700001.

Therefore, when the data is loaded back within the table, the data is not necessarily ordered as it was previously and an insert into a full leaf block might not necessarily have the largest ID column currently within the table/index. That being the case, a 50-50 block split is performed rather than the 90-10 block splits that only occur when it’s the largest indexed value being inserted into a full leaf block. 90-10 block splits leaves behind nice full leaf blocks, but 50-50 block splits leaves behind 1/2 emptied blocks that don’t get filled if they don’t house subsequent inserts.

The 50-50 leaf block split is resulting in a larger index but most importantly, these areas of free space are not going to be used by subsequent monotonically increasing ID index values.

To reclaim this effectively unusable index storage, the index would benefit from an immediate coalesce.

Of course, the best cure would be to prevent this scenario from occurring in the first place.

One option would be to insert the data in a manner that ensures the effective ordering of the indexed data:

We notice by inserting the data in ID order within the table (via the “order by” clause), we guarantee that an insert into a full leaf block will indeed be as a result of the highest current index entry and that 90-10 block splits are performed. This leaves behind perfectly full leaf blocks and a nice, compact index structure. The index now only has 600 leaf blocks, significantly less than before with no “wasted” storage.

The index leaf blocks are chock-a-block full with just 5 bytes free, not enough space for another index entry.

Another alterative would of course be to make the indexes unusable before re-inserting data into the data and rebuild the index later with a pctfree of 0, a safe and appropriate value for monotonically increasing values. This has the added advantage of significantly improving the performance of the bulk insert operation.

If the data isn’t monotonically increasing, then 50-50 block splits are fine as the resultant free space would indeed be effectively used.

Update: 7 July 2015

And as Jonathan Lewis kindly reminded me, another method to avoid this issue is to simply use an insert Append. This will record all the key entries as it goes, sort them and populate the index much more efficiently in one step:

The key thing to note here is that the leaf block has two Interested Transaction List (ITL) slots, each of which use 24 bytes. Two is the default number of ITL slots in an index leaf block (index branch blocks only have 1 by default) and are used by transactions to store vital information such as transaction id, locking information, location of undo and SCN details. However, the first slot (No. 1) is only used by recursive transactions such as those required to perform index block splits and can’t be used for standard user-based transactions. I discuss this in my (in)famous Rebuilding The Truth presentation.

Now my quiz demo had two concurrent delete transactions occurring within the same leaf block(s) but with effectively just the one free ITL slot available for the two transactions. Ordinarily, Oracle would just allocate another ITL slot so both transactions can both concurrently delete the different index entries within the same leaf block. However, Oracle is unable to simply add another ITL slot in this scenario as it requires 24 bytes of free space and there is only the 10 or 11 bytes free in our leaf blocks.

In a similar scenario with a table segment, being unable to allocate another ITL slot like this would result in a nasty ITL wait event for one of the transactions. But for indexes, there is a “naturally occurring” event that results in plenty of additional free space as required.

The index block split.

So rather than have one transaction having to hang and wait for a ITL slot to become available (i.e. for the other transaction to complete), Oracle simply performs a 50-50 block split and allocates the additional ITL slot as necessary, if both transactions still occur within the same leaf block after the block split.

In my quiz demo, both delete transactions were actually performed on index entries that existed in the other half of the block split. Therefore, the number of ITL slots in the first leaf block remains at just the default two and the kdxlende value which denotes deleted index entries remains at 0.

If we look at a partial block dump of the other half of the leaf block split, now the second logical leaf block (as identified by kdxlenxt 25167140=0x1800524):

The first thing we notice is that this leaf block as three, not the default two ITL slots. As both concurrent delete transactions deleted entries from this particular leaf block, an additional ITL slot was allocated as we now have plenty of free space.

The kdxlende value is set to 2 as we now have the two index entries marked as deleted (these are index entries 220 and 221 within the block). Index entry 220 was deleted by the transaction logged in ITL slot 2 and index entry 221 was deleted by the transaction logged in the new ITL slot 3.

So having two concurrent transactions wanting to delete from the same “full” leaf block resulted in the leaf block performing a 50-50 block split with a new leaf block being added to the index in order to accommodate the additional required ITL slot.

I was very careful when deleting rows from the table to cause maximum “damage” to the index. Both delete transactions in my demo effectively deleted every 500th pair of index entries. As there was previously approx. 533 index entries per leaf block, this resulted in every leaf block in the index splitting in this exact manner as all the leaf blocks had two index entries deleted. This is why the deletes resulted in the index practically doubling in size.

The only index leaf block that didn’t have to split was the very last leaf block as it had plenty of free space (2019 bytes) to accommodate the additional ITL slot. This last leaf block only had 1995 bytes of free space after the deleted, as it lost the 24 bytes due to the additional ITL slot being allocated. You can see these numbers in the tree dumps (following is the tree dump after the delete operations):

Be very careful allocating a pctfree of 0 to indexes as it may not ultimately help in keeping the indexes as compact as you might have hoped, even if you don’t insert new index entries into the existing full portions of the index.

Thanks to all of those that had a go at the quiz and well done to those that got it right :)

One of the things I’ve seen at a number of sites is the almost fanatical drive to make indexes as small as possible because indexes that are larger than necessary both waste storage and hurt performance.

Or so the theory goes … :)

In many cases, this drives DBAs to create or rebuild indexes with a PCTFREE set to 0 as this will make the index as compact and small as possible.

Of course, this is often the very worst setting for an index to remain small because the insert of a new index entry is likely to cause a 50-50 block split and result in two 1/2 empty leaf blocks (unless the index entry is the maximum current value). Before very long, the index is back to a bloated state and in some sad scenarios, the process is repeated again and again.

A point that is often missed though is that it doesn’t even take an insert to cause the index to expand out. A few delete statements is all that’s required.

To illustrate I create my favorite little table and populate it with a few rows:

We note the index only has 19 leaf blocks and that most leaf blocks have 533 index entries and only an avs (available free space) of some 10 or 11 bytes. Only the last leaf block is partly full with some 2019 free bytes.

That’s fantastic, the index really is a small as can be. Trying to use index compression will be futile as the indexed values are effectively unique.

I’m now going to delete just a few rows. Surely deleting rows from the table (and hence entries from the index) can only have a positive impact (if any) on the index structure.

In my last post, I discussed how both 1/2 empty and totally empty leaf blocks can be generated by rolling back a bulk update operation.

An important point I made within the comments of the previous post is that almost the exact scenario would have taken place had the transaction committed rather than rolled back. A commit would also have resulted with the leaf blocks being 1/2 empty in the first example (with the previous index entries now all marked as deleted) and with effectively empty leaf blocks in the second example (with the previous leaf blocks all now containing index entries marked as deleted). The important aspect here is not the rollback but the fact that update statements result in the deletion of the previous indexed value and the re-insertion of the new value. (BTW, it’s always a useful exercise to read through the comments on this blog as this is often where some of the best learning takes place due to some of the really nice discussions) :)

That said, the previous post used a Non-Unique index. Let’s now repeat the same scenario but this time use a Unique Index instead.

So let’s start with another table with the same data but this time with a unique index on the ID column:

Now we notice a bit of a difference already. Here, the index consists of 20 leaf blocks with 513 index entries in most leaf blocks whereas the non-unique index had 21 leaf blocks and just 479 index entries per leaf block. One of the advantages of unique indexes over non-unique as I’ve discussed previously.

Let’s now perform our first bulk update where I increment the ID of each value by 1:

Now with the non-unique index, this resulted in the index doubling in size as we created an additional index entry for each and every row. After the rollback, we were effectively left with an index that not only was twice the size but had only 1/2 empty leaf blocks.

With a unique index though, things differ. The most important characteristic of a unique index of course is that each index value can only ever exist once, each index entry must be unique. So for a unique index, the rowid is not actually part of the indexed column list, but treated as additional “overhead” or metadata associated with the index entry.

When we perform our update here, we’re effectively replicating each value, except for the very last ID value where 10001 doesn’t exist. But with the first row, when the ID=1 becomes 2 after the update, we already have an index entry with an ID value of 2 (the second row). So Oracle can mark the first index entry as deleted (as ID=1 no longer exists) but rather than insert a new index entry simply update the rowid associated with the unique index entry with the ID of 2. Oracle then updates the rowid of the index entry with a value of 3 with the rowid of that previously referenced ID=2 . And so on and so on for all the other index entries except for index value 100001 which has to be inserted as it didn’t previously exist. So Oracle nicely maintains the consistency of the index during the single update operation by effectively recycling the existing index entries.

The net result is that the index remains the same size as the index entries are not reinserted as they are for a non-unique index. The effective change that occurs during this update is that the first index entry is marked as deleted and one new index entry is added at the very end.

If we look at a partial block dump of the first leaf block before the rollback operation:

We notice that the first index entry is marked as deleted (as we now no longer have an ID=1) but all the other index entries have been “recycled” with their updated rowids. Note how the rowid of the deleted index entry (01 80 01 57 00 00) is now associated with the second index entry (which is effectively now the first index entry now).

Now Oracle can’t recycle the existing index entries as the new values don’t currently exist within the index. So Oracle is indeed forced to mark all the existing index entries as deleted and insert new index entries into the index. These new index entries all exist in the right hand most side of the index, resulting in 90-10 block splits with additional index leaf blocks being added to the index. If we rollback this transaction, it will result in all the new index entries being removed, leaving behind these new empty leaf blocks just as with the non-unique index example.

The index has indeed bloated in size as a result of the update. Note that the index would be the same size had the transaction committed, except that the leaf blocks that currently contain data would effectively be empty and contain nothing but deleted index entries while the empty leaf blocks would all contain the new indexed values.

So depending on the update operation, a unique index can potentially reuse existing index entries if the new column values existed previously in other rows. If not, then the usual delete/insert mechanism applies.

There’s been an interesting recent discussion on the OTN Database forum regarding “Index blank blocks after large update that was rolled back“. Setting aside the odd scenario of updating a column that previously had 20 million distinct values to the same value on a 2 billion row table, the key questions raised are why the blank index leaf blocks and why the performance degradation after the update failed and rolled back.

The key point to make is that an Update is actually a delete/insert operation in the context of indexes. So if we perform a large update, all the previous indexed values are marked as deleted in the index and the new values re-inserted elsewhere within the index structure, potentially filling up a whole bunch of new leaf blocks. If we then decide to rollback the transaction (or the transaction fails and automatically rolls back), then all these newly inserted index entries are deleted potentially leaving behind now empty new leaf blocks in the expanded index structure. Here’s the thing, Oracle will roll back changes to index entries but not changes to the index structure such as block splits.

If an index scan is forced to navigate through these empty leaf blocks, this can indeed potentially have a detrimental impact on subsequent performance.

However, depending on whether the index is Unique or Non-Unique and the type of update being performed, the impact on the index could be quite different.

Now the interesting thing to note here is that for each ID value, we temporarily have the same value twice as we progress and update each ID value (for example, for ID=1, it becomes 2 which already exists. Then the previous ID=2 becomes 3 which already exists, etc.). As the index is Non-Unique, this means when we update say ID=1 to 2, we need mark as deleted the index entry with ID=1 and insert a new index entry with an ID=2. When we update the previous ID=2 to 3, we again mark as deleted the previous indexed value of 2 and insert a new index entry of 3. Etc. Etc.

As we only have 10% of free space available in the index before the update, by updating all rows in this fashion, it means we have to keep performing 50-50 block splits to fit in the new index entries in the corresponding leaf blocks. This effectively results in the index doubling in size as we now have twice the number of index entries (with the previous index entries now marked as deleted).

However, having now performed all these index block splits, if we now roll back the update transaction, it simply means that all the new index entries are deleted and the delete byte removed from the previously deleted entries, with the index structure retaining its newly bloated size. The resultant index block splits are not rolled back. If we look at a new index tree dump:

As all the inserts now occurred in the right-hand most side of the index, Oracle allocated a bunch of new index leaf blocks via 90-10 block splits to store all the new index entries. After the rollback however, all these new entries were removed leaving behind nothing but these new empty leaf blocks which are still part of the overall index structure.

Query performance now depends on what part of the index we need to access.

If we just want to select a single value, then no problem as the ID column is effectively unique and we just need to generally access down to the one leaf block:

4 consistent gets is about as good as it gets for a non-unique Blevel 1 index.

Larger index range scan might need to access additional leaf blocks as they now only contain 1/2 the number of index entries than before, although the additional overhead of such scans would still likely be minimal as most of the work is associated with visiting the table blocks.

One of the worst case scenarios would be having to now plough through all these empty leaf blocks as with the following search for the max ID value:

Oracle uses the Index Full (Min/Max) Scan by starting with the right-most leaf block but as it’s empty, Oracle is forced to make its way across through all the empty leaf blocks until it finally hits upon the first non-empty leaf block that contains the max ID. The excessive 32 consistent gets is due to having to access all these new empty blocks.

In Part I, I quickly ran through how to setup an encrypted tablespace using Transparent Data Encryption and to take care creating indexes outside of these tablespaces.

Another method of encrypting data in the Oracle database is to just encrypt selected columns. Although the advantage here is that we can just encrypt sensitive columns of interest (and that the data remains encrypted within the buffer cache), this method has a number of major restrictions, especially in relation to indexing.

To first set the scene, I’ll start by creating and populating an unencrypted table:

We can see the 4 columns of each row and note that the length of the ID and CODE columns are 3 and 2 bytes respectively. We can also see that the hex values of the CODE column (col 1) are all the same: c1 2b (as they all have a value of 42).

OK, time to encrypt some columns. I’ll re-create the table but this time encrypt both the ID and CODE columns using all the default settings:

We see the reason why we have fewer rows per block is that the encrypted columns have significantly increased in size. Where previously they were just 3 and 2 bytes respectively, both the ID and CODE columns are now 52 bytes in length. The actual size would in part depend on the encryption algorithm used (some algorithms round to the next 8 bytes), in this example I used the default AES192.

With AES192, the length of the column is rounded to the next 16 bytes. However, if we simply encrypt a columns as is, it would mean a column value would be encrypted to the same value when using the same encryption key. This means a malicious person could potentially attempt to reverse engineer a value by inserting known column values and seeing if the generated encrypted values are the same as those in the table. To prevent this, Oracle by default adds “Salt”, which is basically a random string, to the column value being encrypted to make it now impossible to reverse engineer the inserted value. This adds another 16 bytes to the length of the column value. If we look at the second CODE column (col 1) in the block dump, we notice they all have different encrypted values even though they all have the same actual value of 42 within the table.

So that’s 32 bytes accounted for. The remaining 20 bytes is a result of TDE adding a Message Authentication Code (MAC) to each encrypted value for integrity checking purposes.

Clearly, having columns that increase so significantly due to encryption will also have an impact on any associated indexes as they will likewise not be able to contain as many entries per index block and hence be significantly larger.

However, the more pressing issue is that by adding salt to the encryption process, there is no easy deterministic way Oracle can associate an actual indexed value with the encrypted value when accessing and modifying the index. As a result, Oracle simply doesn’t allow an index to be created on any column that has been encrypted with salt.

If we want to encrypt a column and have the column indexed, we must encrypt the column without salt. Additionally, if you want to make the index more efficient without the overheads associated with MAC integrity checks, you may also want to encrypt the columns with the NOMAC option.

We see the length of the encrypted columns has dropped back down to 16 bytes, still more than the unencrypted columns but less than the 52 bytes required for the encrypted columns with both salt and MAC enabled.

Note though that the CODE column (col 1) while encrypted all now have the same encrypted hex value (9e d8 3b 95 65 60 43 df 2c e2 b0 85 ae 5e 87 61) without the salt applied. So the encrypted data is that little bit less secure but we can now successfully create B-tree indexes on these encrypted columns:

SQL> create index bowie_id_i on bowie(id);
Index created.

This however doesn’t end the various restrictions associated with indexing column encrypted columns as we’ll see in the next post.

Database security has been a really hot topic recently so I thought I might write a few posts in relation to indexing and Transparent Data Encryption (TDE) which is available as part of the Oracle Advanced Security option.

To protect the database from unauthorized “backed-door” accesses, the data within the actual database files can be encrypted. This helps prevent privileged users such as Database and System Administrators who have direct access to these files from being able to inappropriately access confidential data by simply viewing the data directly with OS editors and the such.

There are two basic ways one can use Oracle TDE to encrypt data within an Oracle database; via Column or Tablespace encryption. Column-based encryption has a number of restrictions, especially in relation to indexing which I’ll cover in later posts. To start with, my focus will be on Tablespace encryption.

With Tablespace encryption, all objects within the tablespace are encrypted in the same manner using the same encryption key and so has far fewer restrictions and issues.

It’s relatively easy to setup up tablespace encryption although note you must have the Oracle Advanced Security Option.

First, ensure you have an entry in your sqlnet.ora that identifies the location of your Oracle Security Wallet, such as:

This wallet contains the master key used in turn to encrypt the various column and tablespace encryption keys used within the database to encrypt the actual data. Do not lose this wallet as without it, you will not be able to access any of your encrypted data !!

Next, open the database keystore using the password specified above such as:

OK, to set the scene, we’re going to first create and populate a table in an unencrypted tablespace (note I’ve created tiny tablespaces here just so I can quickly open the file with a basic windows notepad editor):

If we were to replicate this data in an unencrypted tablespace, then of course the associated data would be visible within the database data files again. And it’s of course very easy to potentially replicate table data, either accidently or maliciously, via the creation of database indexes. For example, the default tablespace of a user might be in an unencrypted tablespace or a privileged user who we are trying to protect ourselves from has the create any index privilege.

So if we create an index in an unencrypted tablespace (I’m also adding a new row to easily identify it from the previous values inserted with the initial BOWIE table):

Bowie begins by saying “I don’t know when your birthday is, but I know Ziggy does not know too.”

Now the first part is obvious, Bowie only knows the month and there are at least 2 possible dates for every month, so there is no way of Bowie knowing my birthday based on just the month. However, the second part is new information, Bowie knows that Ziggy doesn’t know either. How can he claim this ?

Well, there are 2 special dates on the list, May 19 and July 18 as these are the only dates that have a unique day. If Ziggy was given one of these days (18 or 19), he would instantly know my birthday. However, Bowie who only knows the month knows that Ziggy doesn’t know my birthday. He can only claim this if the month is not May and not July because if the month he was given was one of these months, Ziggy might then know my birthday.

So my birthday must be in either June or August and we can eliminate both May and July from further consideration.

Now Ziggy can likewise perform this logic and he now knows the month of my birthday must be either June or August. He now says “At first I don’t know when your birthday is, but now I know”.

The first part is again obvious as Bowie has already stated that Ziggy doesn’t at first know my birthday. But the second part is new information, he now knows. The only way he can now know is if my birthday is one of June 13, August 15 or August 16 as these are the only dates in June and August that have a unique day. If my birthday was on either June 14 or August 14, there is no way Ziggy could now know my birthday as they both have the same day.

So we’re now down to just these 3 possible dates.

Now Bowie can also likewise perform this logic and that my birthday must be one of June 13, August 15 or August 16. The final piece of information is Bowie saying “Then I also know when your birthday is.”. The only way he can say this is if my birthday is on June 13 as this is the only remaining date with a unique month. If my birthday had been either August 15 or August 16 then there is no way for Bowie is now know.

Therefore my birthday must indeed be June 13.

So congratulations to all of you who managed to get the correct answer :)

An Interesting Observation

Now here’s an interesting thing. 8 people got the correct answer. But equal on top of popular answers with 8 responses also was July 16. Why was July 16 such a popular answer, when the month of July was actually one of the earliest set of dates we could eliminate from consideration. It also wasn’t one of the unique day dates which is also a common answer ? What’s so special about July 16 that so many people picked this answer ?

Well, the original Singapore question I referenced in my original blog piece in which the students had to work out Cheryl’s birthday had a correct answer of, you guessed it, July 16. You see, not only did I change the names from the original question, but I also changed the dates as well. However, I made sure that July 16 was one of the possible 10 dates :)

Now I don’t know how everyone got to their answer but it strikes me as an interesting coincidence that so many people got the same answer as to the original Singapore exam question. I wonder how many people looked up the answer from the Singapore question and just assumed the same answer fit here as well ? How many people saw enough information in the question for them to assume an answer, without being careful enough regarding the actual information laid out before them ? These are two similar but yet different questions which have two different answers.

And isn’t this such a common trait, especially in the IT world when trying to diagnose a performance issue. You see a problem in which you’ve seen a similar symptom before and assume that the root cause was the same as it might have been previously, even though the information at hand (if only you looked carefully enough) is entirely different and doesn’t actually lend itself to the same root cause.

Just because the reason for lots of log file sync waits before was slow write performance to the redo log files doesn’t necessarily mean it’s the same reason today. Just because you have slow RAC cluster waits previously due to interconnect issues doesn’t necessarily mean it’s now the root cause. Quickly jumping to the wrong conclusion is such an easy mistake to make when trying to diagnose a performance problem. One needs to look carefully at all the facts on hand and with an Oracle database, there is no excuse to not look at all the relevant, appropriate facts when diagnosing a performance problem.

No, just because someone previously had a brain tumour when they had a headache doesn’t necessarily mean your headache is the result of a brain tumour :)

I enjoy solving problems and this one took me a few minutes to work it out. However, at the end of the process, it occurred to me that I used a similar process to how I’ve often solved performance issues with Oracle databases. In fact, this question kinda reminded me of a performance issue that I had only just recently been asked by a customer to help resolve.

One needs to clearly understand the question being asked. One also needs to focus and understand the data at hand. Then use a process of elimination to both rule out and just as importantly rule in possible answers (or root causes to performance issues). Eventually, one can then narrow down and pinpoint things down to a specific solution (or set of answers).

So for example, the database is running exceptionally slow globally at certain times, why ? So it looks like it’s because there are lots of cluster related waits at these times, why ? So it looks like it’s because writing to the redo logs is really slow at these times, why ? And so on and so on.

If you can work out the solution to this problem in a reasonably timely manner, then in all likelihood you have good problem solving skills and the makings of a good Oracle DBA. You just need to also like donuts and good whiskies :)

I’ve reproduced the question here, changing the names to protect the innocent.

“Bowie and Ziggy just become friends with me, and they want to know when my birthday is. I give them 10 possible dates:

May 13 May 15 May 19

June 13 June 14

July 16 July 18

August 14 August 15 August 16

I then tell Bowie and Ziggy separately the month and the day of my birthday respectively.

Bowie: I don’t know when your birthday is, but I know Ziggy does not know too.

Ziggy: At first I don’t know when your birthday is, but now I know.

Bowie: Then I also know when your birthday is.

So when is my birthday ?”

Feel free to comment on what you think the answer is but please don’t give away how you might have worked it out. For those interested (or for those that don’t check out the solution on the web first :) ), I’ll explain how to get to the answer in a few days time.

Like I said, if you get it right, you should consider a career as an Oracle DBA !! And here’s a link to an excellent whisky: Sullivans Cove Whisky :)

Just like Storage Indexes (and conventional indexes for that manner), they work best when the data is well clustered in relation to the Zone Map or index. By having the data in the table ordered in the same manner as the Zone Map, the ranges of the min/max values for each 8M “zone” in the table can be as narrow as possible, making them more likely to eliminate zone accesses.

On the other hand, if the data in the table is not well clustered, then the min/max ranges within the Zone Map can be extremely wide, making their effectiveness limited.

In my previous example on the ALBUM_ID column in my first article on this subject, the data was extremely well clustered and so the associated Zone Map was very effective. But what if the data is poorly clustered ?

To illustrate, I’m going to create a Zone Map based on the poorly clustered ARTIST_ID column, which has its values randomly distributed throughout the whole table:

Another difference between an index and Zone Map is that there can only be the one Zone Map defined per table, but a Zone Map can include multiple columns. As I already have a Zone Map defined on just the ALBUM_ID column, I can’t just create another.

So I’ll drop the current Zone Map and create a new one based on both the ARTIST_ID and ALBUM_ID columns:

As the data in the table is effectively ordered based on the ALBUM_ID column (and so is extremely well clustered in relation to this column), the min/max ranges for each zone is extremely narrow. Each zone basically only contains one or two different values of ALBUM_ID and so if I’m just after a specific ALBUM_ID value, the Zone Map is very effective in eliminating zones from having to be accessed. Just what we want.

However, if we look at the Zone Map details of the poorly clustered ARTIST_ID column (again just a partial listing):

We notice the ranges for most of the zones is extremely large, with many actually having a min value of 1 (the actual minimum) and a max of 100000 (the actual maximum). This is a worst case scenario as a specific required value could potentially reside in most of the zones, thereby forcing Oracle to visit most zones and making the Zone Map totally ineffective.

We notice we are forced to perform a very high number of consistent gets (101,614) when returning just 100 rows, much higher than the 2,364 consistent gets required to return a full 100,000 rows for a specific ALBUM_ID and not far from the 135,085 consistent gets when performing a full table scan.

We need to improve the performance of these queries based on the ARTIST_ID column …

But if we now reorganise the table so that the clustering attribute can take effect:

SQL> alter table big_bowie move;
Table altered.

We notice the characteristics of the Zone Map has change dramatically. The previously well clustered ALBUM_ID now has a totally ineffective Zone Map with all the ranges effectively consisting of the full min/max values:

Consistent gets has reduced dramatically down to just 175 from the previously massive 101,614.

As is common with changing the clustering of data, what improves one thing makes something else significantly worse. The previously efficient accesses based on the ALBUM_ID column is now nowhere near as efficient as before:

“I have a question regarding bitmapped indexes verses index compression. In your previous blog titled ‘So What Is A Good Cardinality Estimate For A Bitmap Index Column ? (Song 2)’ you came to the conclusion that ‘500,000 distinct values in a 1 million row table’ would still be a viable scenario for deploying bitmapped indexes over non-compressed b-tree indexes.

Now b-tree index compression is common, especially with the release of Advanced Index Compression how does this affect your conclusion? Are there still any rules of thumb which can be used to determine when to deploy bitmapped indexes instead of compressed b-tree indexes or has index compression made bitmapped indexes largely redundant?”

If you’re not familiar with Bitmap Indexes, it might be worth having a read of my previous posts on the subject.

Now Advanced Index Compression introduced in 12.1.0.2 has certainly made compressing indexes a lot easier and in many scenarios, more efficient than was previously possible. Does that indeed mean Bitmap Indexes, that are relatively small and automatically compressed, are now largely redundant ?

The answer is no, Bitmap Indexes are still highly relevant in Data Warehouse environments as they have a number of key advantages in the manner they get compressed over B-Tree Indexes.

Compression of a B-Tree index is performed within a leaf block where Oracle effectively de-duplicates the index entries (or parts thereof). This means that a highly repeated index value might need to be stored repeatedly in each leaf block. Bitmap index entries on the other hand can potentially span the entire table and only need to be split if the overall size of the index entries exceeds 1/2 a block. Therefore, the number of indexed values stored in a Bitmap Index can be far less than with a B-tree.

However, it’s in the area of storing the associated rowids where Bitmap Indexes can have the main advantage. With a B-tree index, even when highly compressed, each and every index entry must have an associated rowid stored in the index. If you have say 1 million index entries, that’s 1 million rowids that need to be stored, regardless of the compression ratio. With a Bitmap Index, an index entry has 2 rowids to specify the range of rows covered by the index entry, but this might be sufficient to cover the entire table. So depending on the number of distinct values being indexed in say a million row table, there may be dramatically fewer than 1 million rowids stored in the Bitmap Index.

To show how Bitmap Indexes are generally much smaller than corresponding compressed B-Tree indexes, a few simple examples.

In example 1, I’m going to create a B-Tree Index that is perfect candidate for compression. This index has very large indexed values that are all duplicates and so will compress very effectively:

In example 2, I’m going to create an index that still almost a perfect case for compressing a B-Tree Index, but far less so for a Bitmap Index. I’m going to create enough duplicate entries to just about fill a specific leaf block, so that each leaf block only has 1 or 2 distinct index values. However, as we’ll have many more distinct indexed values overall, this means we’ll need more index entries in the corresponding Bitmap Index.

So we have a relatively large indexed column that has some 1385 distinct values but each value just about fills out a compress leaf block. If we look at the compression of the index, we have reduced the index down from 9568 leaf blocks to just 1401 leaf blocks. Again, a very impressive compression ratio.

Unlike the previous example where we had just the one value, we now have some 1385 index entries that need to be created as a minimum for our Bitmap Index. So how does it compare now ?

Although the Bitmap Index is much larger than it was in the previous example, at just 464 leaf blocks it’s still significantly smaller than the corresponding compressed 1401 leaf block B-Tree index.

OK, example 3, we’re going to go into territory where no Bitmap Index should tread (or so many myths would suggest). We going to index a column in which each value only has the one duplicate. So for our 1 million row table, the column will have some 500,000 distinct values.

With relatively few duplicate column values, the compression of our B-Tree Indexes is not going to be as impressive. However, because the indexed values are still relatively large, any reduction here would likely have some overall impact:

So even in this extreme example, the Bitmap Index at 5740 leaf blocks is still smaller than the corresponding compressed B-Tree Index at 6017 leaf blocks.

In this last example 4, it’s a scenario similar to the last one, except the index entries themselves are going to be much smaller (a few byte number column vs. the 60 odd byte varchar2). Therefore, the rowids of the index entries will be a much larger proportion of the overall index entry size. Reducing the storage of index values via compression will be far less effective, considering the prefix table in a compressed index comes with some overhead.

We notice the index has actually increased in size, up to 2065 leaf blocks from 1998. The overheads of the prefix table over-ride the small efficiencies of reducing the duplicate number indexed values.

Is still smaller at 1817 leaf blocks than the best B-Tree index has to offer.

So the answer is no, Bitmap Indexes are not now redundant now we have Index Advanced Compression. In Data Warehouse environments, as long as they don’t reference column values that are approaching uniqueness, Bitmap Indexes are likely going to be smaller than corresponding compressed B-Tree indexes.

In Part I, I discussed how Zone Maps are new index like structures, similar to Exadata Storage Indexes, that enables the “pruning” of disk blocks during accesses of the table by storing the min and max values of selected columns for each “zone” of a table. A Zone being a range of contiguous (8M) blocks.

I showed how a Zone Map was relatively tiny but very effective in reducing the number of consistent gets for a well clustered column (ALBUM_ID).

In this post, we’re going to continue with the demo and look at what happens when we update data in the table with a Zone Map in place.

So lets update the ALBUM_ID column (which currently has a Zone Map defined) for a few rows. The value of ALBUM_ID was previously 1 for all these rows (the full range of values is currently between 1 and 100) but we’re going to update them to 142:

We notice the maximum is still defined as being 100. So the update on the table has not actually updated the contents of the Zone Map. So this is a big difference between Zone Maps and conventional indexes, indexes are automatically updated during DML operations, Zone Maps are not (unless the REFRESH ON COMMIT option is specified).

If we look at the state of Zone Map entries that have a minimum of 1 (the previous values of ALBUM_ID before the update):

We see the Zone Map was still used by the CBO. The number of consistent gets has increased (up from 2364 to 3238) as we now have to additional access all the blocks associated with this stale zone, but it’s still more efficient that reading all the blocks from the entire table.

If we want to remove the stale zone entries, we can refresh the Zone Map or rebuild it (for ON DEMAND refresh):

We notice nothing has appreciably changed, the Zone Map is still being used but the number of consistent gets remains the same as before. Why haven’t we returned back to our previous 2364 consistent gets ?

Well, as the range of possible values within the updated zone is now between 1 and 142, the required value of 42 could potentially be found within this zone and so still needs to be accessed just in case. We know that the value of 42 doesn’t exist within this zone, but Oracle has no way of knowing this based on the possible 1 to 142 range.

Hence Zone Maps work best when the data is well clustered and the Min/Max ranges of each zone can be used to limit which zones need to be accessed. If the data was not well clustered and the values within each zone mostly had ranges between the min and max values, then Oracle wouldn’t be able to effectively prune many/any zone and the Zone Map would be useless.

Sometimes, a few pictures (or in this case index block dumps) is better than a whole bunch of words :)

In my previous post, I introduced the new Advanced Index Compression feature, whereby Oracle automatically determines how to best compress an index. I showed a simple example of an indexed column that had sections of index entries that were basically unique (and so don’t benefit from compression) and other sections with index entries that had many duplicates (that do compress well). Advanced Index Compression enables Oracle to automatically just compress those index leaf blocks where compression is beneficial.

If we look at a couple of partial block dumps from this index, first a dump from a leaf block that did have duplicate index entries:

The red section is a portion of the index header that determines the number of rows in the prefix table of the index (kdxlepnro 1). The prefix table basically lists all the distinct column values in the leaf blocks that are to be compressed. The value 1 denotes there is actually only just the 1 distinct column value in this specific leaf block (i.e. all index entries have the same indexed value). This section also denotes how many of the indexed columns are to be compressed (kdxlepnco 1). As this index only has the one column, it also has a value of 1. Note this value can potentially be anything between 0 (no columns compressed) up to the number of columns in the index. The (Adaptive)reference tells us that Index Advanced Compression has been used and that the values here can change from leaf block to leaf block depending on the data characteristics of the index entries within each leaf block (a dump of a basic compressed index will not have the “Adaptive” reference).

The green section is the compression prefix table and details all the unique combinations of index entries to be compressed within the leaf block. As all indexed values are the same in this index (value 42, internally represented as c1 2b hex), the prefix table only has the one row. prc 651 denotes that all 651 index entries in this leaf block have this specific indexed value.

Next follows all the actual index entries, which now only consist of the rowid (the 6 byte col 0 column) as they all reference psno 0, which is the unique row id of the only row within the prefix table (row#0).

So rather than storing the indexed value 651 times, we can just store the index value (42) just the once within the prefix table and simply reference it from within the actual index entries. This is why index compression can save us storage, storing something once within a leaf block rather than multiple times.

If we now look at a partial block dump of another index leaf block within the index, that consists of many differing (basically unique) index entries:

We notice that in the red section, both kdxlepnro 0 and kdxlepnco 0 (Adaptive) have a value of 0, meaning we have no rows and no columns within the prefix table. As such, we have no prefix table at all here and that this leaf block has simply not been compressed.

If we look at the actual index entries, they all have an additional column now in blue, that being the actual indexed value as all the index values in this leaf block are different from each other. Without some form of index entry duplication, there would be no benefit from compression and Index Advanced Compression has automatically determined this and not bothered to compress this leaf block. An attempt to compress this block would have actually increased the necessary overall storage for these index entries, due to the additional overheads associated with the prefix table (note it has an additional 2 byes of overhead per row within the prefix table).

I’ll next look at an example of a multi-column index and how Index Advanced Compression handles which columns in the index to compress.

I’ve finally managed to find some free time in the evening to write a new blog piece :)

This will have to be the record for the longest time between parts in a series, having written Part IV of this Index Compression series way way back in February 2008 !! Here are the links to the previous articles in the series:

As I’ve previously discussed, compressing an index can be an excellent way to permanently reduce the size of an index in a very cost effective manner. Index entries with many duplicate values (or duplicate leading columns within the index) can be “compressed” by Oracle to reduce both storage overheads and potentially access overheads for large index scans. Oracle basically de-duplicates repeated indexed column values within each individual leaf block by storing each unique occurrence in a prefix section within the block, as I explain in the above links.

But it’s important to compress the right indexes in the right manner. If indexes do not have enough repeated data, it’s quite possible to make certain indexes larger rather than smaller when using compression (as the overheads of having the prefix section in the index block outweighs the benefits of limited reduction of repeated values). So one needs to be very selective on which indexes to compress and take care to compress the correct number of columns within the index. Oracle will only protect you from yourself if you attempt to compress all columns in a unique index, as in this scenario there can be no duplicate values to compress. This is all discussed in Part II and Part III of the series.

So, wouldn’t it be nice if Oracle made it all a lot easier for us and automatically decided which indexes to compress, which columns within the index to compress and which indexes to simply not bother compressing at all. Additionally, rather than an all or nothing approach in which all index leaf blocks are compressed in the same manner, wouldn’t it be nice if Oracle decided for each and every individual leaf block within the index how to best compress it. For those index leaf block that have no duplicate entries, do nothing, for those with some repeated columns just compress them and for those leaf blocks with lots of repeated columns and values to compress all of them as efficiently as possible.

Well, wish no more :)

With the recent release of Oracle Database 12.1.0.2, one of the really cool new features that got introduced was Advanced Index Compression. Now a warning from the get-go. The use of Advanced Index Compression requires the Advanced Compression Option and this option is automatically enabled with Enterprise Edition. So only use this feature if you are licensed to do so :)

The best way as always to see this new feature in action is via a simple little demo.

To begin, I’ll create a table with a CODE column that is populated with unique values:

So I’ve fabricated the data such that the values in the CODE column are effectively unique within 75% of the table but the other 25% consists of repeated values.

From an index compression perspective, this index really isn’t a good candidate for normal compression as most of the CODE data contains unique data that doesn’t compress. However, it’s a shame that we can’t easily just compress the 25% of the index that would benefit from compression (without using partitioning or some such).

If we create a normal B-Tree index on the CODE column without compression:

We notice that the compressed index rather than decrease in size has actually increased in size, up to 2684 leaf blocks. So the index has grown by some 25% due to the fact the index predominately contains unique values which don’t compress at all and the resultant prefix section in the leaf blocks becomes nothing more than additional overhead. The 25% section of the index containing all the repeated values has indeed compressed effectively but these savings are more than offset by the increase in size associated with the other 75% of the index where the index entries had no duplication.

However, if we use the new advanced index compression capability via the COMPRESS ADVANCED LOW clause:

We notice the index has now indeed decreased in size from the original 2157 leaf blocks down to 2054. Oracle has effectively ignored all those leaf blocks where compression wasn’t viable and compressed just the 25% of the index where compression was effective. Obviously, the larger the key values (remembering the rowids associated with the index entries can’t be compressed) and the larger the percentage of repeated data, the larger the overall compression returns.

With Advanced Index Compression, it’s viable to simply set it on for all your B-Tree indexes and Oracle will uniquely compress automatically each individual index leaf block for each and every index as effectively as it can for the life of the index.