As Exadata Storage Indexes (SI) are purely memory only structures located on the Exadata storage servers, care needs to be taken in how much memory they can potentially consume. As a result, there is a limit of 8 columns (or 8 SIs) that can be defined for a given 1M storage region at any point in time, even though SIs are much smaller structures than equivalent database indexes. As database indexes are physical constructs however, there’s no such limit on the number of indexes that can be defined for a table. This can become another key difference between SIs and database indexes.

The following table has more than 8 columns, each with differing numbers of distinct values or distributions of data:

If we look at the session statistics, we’ll notice that a SI has been created and saved us physical IOs. Note: If you follow the demo, you’ll need to keep track of these statistics after each query or simply reconnect as a new session to ensure a SI has or has not been used.

Again, a SI has been created and used here. The SI in this case has been extremely beneficial because no data exists for 42 (only the values 1 – 10 exist). However, if an existing value were to be selected, the SI would be next to useless as such a value would exist throughout all 1M storage regions. With just 10 distinct randomly distributed values, this SI has the potential to be a waste of time. But while we search for values that don’t exist, it serves a very useful purpose. An important consideration in what’s to come.

OK, this is the really important one to note. Firstly, yes again a SI has been created and used. Note though that this has not been the first SI to be created or used on this table; SIs have previously been created and used in the previous two queries on different columns. This will also not be the most recent SI to be created, more will soon follow. However, this is by far the least effective use of a SI, because of both the selectivity of the query and distribution of data. Here, only some 66MB or so of physical IOs have been saved.

We now repeat this process, running a query against a different column, being very selective and ensuring a SI is created and used. We finally reach a point when 8 SIs have been created on the table:

We notice that no bytes have been saved here via a SI. We can run this query repeatedly and the results will be the same. No SI is created and no bytes are saved. Although Oracle could potentially create a SI and save some work, the fact we already have 8 SIs created for this table means we have already reached the limit on the number of SIs that can be created for this table. 8 is it.

Let’s run another query now using yet another different column (COL12), but this time it’s again a very selective query, much more selective and efficient than the previous query based on COL4 in which a SI had been created:

This time however the number of bytes saved has again gone up from previously meaning that indeed a new SI has been created and used for column COL12. But this makes 9 SIs in total now for this table where the limit should be a maximum of 8 ?

Does this mean that a previously created SI has been dropped and replaced by this new one. If so, which SI is now gone ?

Well, let’s go back to the first SI we created and the one that hasn’t been used for the longest period of time. If we re-run the first query again:

We notice that not only has it used a SI, but it has saved the maximum amount of IO bytes possible here meaning there was no “warming up” processes happening here indicating a newly created SI. So not only did it use a SI, it clearly used a previously created one. So this SI was not obviously impacted by the creation of this “9th” SI.

However, if we run the query again that used column COL4, the query that previously used a SI but was by far the least effective in saving physical IOs:

This time we notice there’s no change to the number of bytes saved by a SI. No matter how often we run this query now, no SI is used. So the SI that was created previously is now gone as a result of creating the more effective SI on the COL12 column. Indeed there are still just the 8 SIs on this table.

So Oracle will indeed limit the number of SIs to 8 for each table/storage region. However, it’s not simply a case of “first in first served” or some such, with Oracle using (undocumented) performance metrics to determine which 8 SIs/columns to choose. This means it might be possible in some rarer scenarios where more than 8 columns get referenced in SQL statements for SIs to come and go depending on changing workloads. Another example of where database indexes may yet play a role in Exadata environments, where currently tables have more than 8+ indexed columns.

OK, my holiday to Hawaii is now slowing fading away into distant memory. Time for a new post :)

In my previous post on differences between Exadata Storage Indexes and Database Indexes Part II, I discussed how the clustering of data within the data is an important factor (pun fully intended !!) in the performance and efficiency of both types of indexes. In this post, I’ll expand a little more on this point and highlight scenarios where a normal database index can significantly improve performance but be totally ineffective as a Storage Index.

Storage Indexes can only determine where required data can’t exist within a (1M default) storage region via the associated Min/Max values of the index. If a required value can’t possibly exist within the Min/Max range boundary, the specific storage region can be ignored and not be accessed during the full scan operation. However, in some scenarios, there can be limitations in the usefulness of only having Min/Max values to work with if both Min and Max values are relatively common and randomly distributed throughout the table. This is not an uncommon scenario when dealing with columns that have data such as code status values.

In the BIG_BOWIE table I’ve been using in this series, I have a FORMAT_ID column that was initially populated with just values 2,4,6,8 and 10, evenly distributed throughout the table. I then updated a relatively small number of these values in the following manner:

We notice the CBO has used the index to efficiently retrieve just the 0.01% of required rows because of the high selectivity of the query, even though the overall Clustering Factor of the index is appalling.

It does an even better job when accessing FORMAT_ID = 5 as these specific values are all well clustered and found within in a very tight portion of the table:

At just 154 consistent gets in this example, the results return almost instantaneously. The database index is extremely effective in these examples because it can point just to the locations within the table where these relatively few rows of interest are located.

Let’s see how Storage Indexes cope in this scenario. First, we make the database index invisible:

SQL> alter index big_bowie_format_id_i invisible;
Index altered.

We now run the same query again (this can be repeated as many times as you like):

Same results when accessing the FORMAT_ID = 5. Any possible Storage Indexes have been totally ineffective and the Full Table Scan has had to read the entire table.

The result being performance is significantly slower than with the database indexes visible and in place.

Why ?

The data consists primarily of randomly distributed values 2,4,6,8 and 10, with the odd value 3 and 5 littered around. This means that if created, the Storage Index Min/Max values will effectively be 2 and 10 for all 1M storage regions throughout the entire table.

Therefore, both values 3 and 5 could possibly exist within all/any of the table storage regions as they sit inside the 2 – 10 Min/Max range boundaries. These 3 and 5 values are effectively “hidden” within the Storage Index Min/Max values making Storage Indexes totally ineffective in this scenario, as they can’t skip a thing.

A final point here. If one of these rarer values of interest were to be say value “12”, then a Storage Index could be useful when searching for this particular value as any Min/Max ranges of 2-10 that don’t therefore contain a value of 12 can be skipped. However, if values 3 and 5 were also of interest, the Storage Index would still only be useful in a subset of required cases. A database index would therefore still be required to cater all necessarily cases.

As Database indexes can directly point to indexed values of interest, this is another example of where we would likely not want to just drop a database index in Exadata.

Two posts in two days !! Well, with Christmas just around the corner, I thought I better finish off a couple of blog posts before I get fully immersed in the festive season :)

The Clustering Factor (CF) is the most important index related statistic, with the efficiency of an index performing multi-row range scans very much dependent on the CF of the index. If the data in the table is relatively well clustered in relation to the index (i.e. it has a “low” CF), then an index range scan can visit relatively few table blocks to obtain the necessary data. If the data is effectively randomised and not well clustered in relation to the index (i.e. has a “high” CF), then an index range scan has to visit many more table blocks and not be as efficient/effective as a result. The CBO will be less inclined to use such an index as a result, depending on the overall selectivity of the query.

It’s a very similar story for Exadata Storage Indexes (SI) as well. The better the table data is clustered in relation to the SIs, the more efficient and effective the SIs are likely to be in relation to being able to eliminate accessing storage regions that can’t possibly contain data of interest. By having the data more clustered (or ordered) in relation to a specific SI, the Min/Max ranges associated with the SI are more likely to be able to determine areas of the table where data can’t exist.

An index range scan access path is selected by the CBO to retrieve the 100,000 rows. At just 1590 consistent gets, with such an excellent CF, the index can very efficiently access just the necessary 100,000 rows of data. Notice that most of these consistent gets are still physical reads (1550). If we re-run the query several times and are able to cache the corresponding index/table blocks in the database buffer cache:

We notice that although a Full Table Scan is being performed, the overall performance of the query is practically identical to that of using the index. That’s because the SIs are kicking in here and by saving 1042 MB (approximately 99% of the table), Oracle only has to actually physically access a tiny 1% of the table (basically, the 1% selectivity of the query itself). SIs based on the well clustered ALBUM_ID column are therefore very effective at eliminating the access of unnecessary data.

If we now run a query based on the TOTAL_SALES column in which the data is randomly distributed all over the place and so the associated index has a very poor CF:

We notice that although only 2009 rows are retrieved with this query, 2150 consistent gets have been performed (practically 1 for each row returned) . This is somewhat more than the 1590 consistent gets of the previous example when a full 100,000 rows were returned. Using this index therefore is nowhere near as efficient/effective in retrieving data as was the index in the previous example.

If all this data can be cached in the buffer cache however, we can again improve overall execution times:

We noticed that the physical IO bytes saved by the SIs has significantly reduced from the previous example (just 72 MBs down from 1042 MBs), even though at just 2000 odd rows we require much less data than before. In this example, only approximately 7% of table storage need not be accessed, meaning we still have to access a significant 93% of the table as the required data could potentially exist throughout the majority of the table. The poor clustering of the data in relation the TOTAL_SALES column has effectively neutralised the effectiveness of the associated SIs on the TOTAL_SALES column.

Note also that the 1.25 seconds is as good as it gets when performing a FTS with the fully generated SIs in place. In this case, using a fully cached database index access path can outperform the FTS/SI combination and provide a more efficient and ultimately more scalable method of accessing this data. As the required selectively on this column is low enough to warrant the use of a database index despite the poor CF, this is again another example of an index we may not necessarily want to automatically drop when moving to Exadata.

It’s often stated that in Exadata, you don’t need conventional database indexes anymore as everything runs so damn fast that indexes are simply a waste of time and space. Simply drop all database indexes and things will run just as fast.

Well, not quite …

There are many many scenarios where database indexes are still critical for optimal performance and scalability in an Exadata environment.

Prompted by a question in an earlier post (thanks Paul), I thought I might start looking at some scenarios where dropping a database index in Exadata would not be the best of ideas.

The first example is the scenario where one wants either the minimum/maximum of a column value. As database index entries are always ordered, this can very efficiently be found by traversing down to either the first or last index leaf block within an appropriate index. There’s an execution the INDEX FULL SCAN MIN/MAX execution path by which the CBO can decide to access an index in this manner.

For example, if I create an index on the ID column of my large DWH_BOWIE table and select the MIN(ID):

We notice the query takes considerably longer and that the Storage Indexes have been unable to be of any help with not a single byte saved.

This is because all aggregation type operations are performed at the database level, not within the storage servers. Even though in theory Storage Indexes could be seen as perhaps being of value here, they are totally ignored by Oracle when retrieving either the minimum/maximum of a column value. Remember also that a key difference between a Storage Index and a Database Index is that a Storage Index does not necessarily have to exist fully for all regions of a given table.

One of these similarities is the “warming up” process that needs to take place before indexes become “optimal” after either the Storage Server (in the case of Storage Indexes) or the Database Server (in the case of Database Indexes) is restarted.

When a database server is restarted, the contents associated with the database buffer cache is lost and has to be reloaded (as well as other components of the SGA of course). This means for a period of time while the instance “warms up”, there is a spike in physical IO (PIO) activity. Index range scans, which generally greatly benefit from re-using cached data therefore need to perform additional PIOs for both the index blocks and the table blocks they reference. The overheads associated with these additional PIOs in turn slows the overall performance of (especially) these index related execution plans.

As a simple illustration, the following query is executed after a flushed buffer cache:

All the data is now cached and the overall query response time improves considerably. Once a database instance has warmed up and much of the database’s hot, data-set of interest has been cached, PIOs are minimised and performance improves accordingly.

Much has been made of the fact Storage Indexes (SIs) need to be created on the fly but in a similar manner to database indexes, they just need a period of time to warm up and be created/stored in memory to become optimal. The difference being that SIs are actually created on the fly when cached in memory unlike database indexes which physically preexist on disk.

So when an Exadata storage server is re-started, the performance associated with the SIs on the server is initially sluggish as well, while Oracle performs the additional PIOs off disk to create the necessary SIs in memory. Additionally, new queries using new column predicates will also be initially sluggish while any necessary SIs are created and cached in memory. Note that SIs are created on a 1M storage region basis so it could well take a period of time for a SI to be created across all the associated storage regions of the table. There is also a limit of 8 SIs per 1M storage region which might limit whether a SI is actually created or preserved (more on this point in a later post).

The following query is executed on a relatively new table and in a new session to capture fresh session statistics:

The amount of physical IO bytes saved by SIs increases considerably and overall performance improves as a result.

Restarting a database server will result in sluggish performance for a period until most of the database hot data is cached. This is also the case when an Exadata Storage server is re-started, in part while the necessary SIs are recreated and cached.

I’ll next discuss another similarity, that being the importance of the clustering of data within the table to the efficiency of these indexes.

But first, following is the setup for the various demos to come. I basically create one table called BIG_BOWIE that’s about 1GB in size and then simply create another table called DWH_BOWIE where the contents of this are re-insert into itself a few times to get to about a 60GB table. The various columns have differing distinct values and distributions of data.

I used an X2-2 1/2 rack as my toy and yes, once people saw the table names instantly knew who created them :)

We see that execution times reduce significantly, down to just 4 secs. So we manage to read via a FTS a 60GB table in under 4 seconds, not bad at all.

A hint that something different might have occurred here is the appearance of the “storage” predicate listing. This basically tells us that a smart scan “might” have occurred. Note though that the “reported” physical reads and consistent gets as reported by the database is basically the same as the previous run. More on this in later posts.

The reason for this improvement is due in large part to the use of Exadata Storage Indexes. As discussed previously, a Storage Index can potentially automatically avoid having to read significant portions of data by determining areas of storage that can’t possibly contain data of interest. This is a classic example of Storage Indexes in operation, putting into practice the notion that the quickest way to do something is to avoid doing as much work as possible.

But how to tell whether a Storage Index really did kick in and how much work did it actually save ?

The V$SQL view now has a number of additional columns that provides useful information:

There are also numerous new statistics which begin with ‘cell ‘ that I generally capture before and after a particular operation to see the storage related workloads. Two statistics I find particularly useful are:

The cell physical IO bytes saved by storage index statistic denotes how much storage has not had to be read due to Storage Indexes. So this tells us just how useful Storage Indexes have been in reducing physical IO operations. The cell physical IO interconnect bytes returned by smart scan statistic denotes how much data has actually been returned to the database servers as a result of a smart scan.

As the above numbers highlight (note this last query had been the only activity within the session), the Storage Indexes were highly effective and were able to physically skip reading the vast majority of the table (59,414MB) during the Full Table Scan operation and that only a relatively small amount data (26MB) had to be returned back to the database servers.

By not having to read most of the data, the resultant Full Table Scan on this relatively large table was completed not in minutes as previously, but in a matter of a few seconds.

The obvious one is that they’re both designed specifically to reduce the overheads associated with retrieving the required data out of the database. Both index structures provides a method by which Oracle can avoid having to read every row/block in a table when searching for data of interest. It’s just the actual implementation of this mechanism that differs between the two general index types as I’ve previously discussed.

The efficiency of both index types is very largely dependant upon the clustering of the indexed data within the table (i.e. the index Clustoring Factor). The better the clustering of the related indexed data within the table, the more tightly coupled the required data is likely to be and so the more efficient both index types would be in avoiding accessing unnecessary data. If the required data were to be less well clustered and randomly distributed throughout the table, the less efficient would be both index types in retrieving the necessary data. Some actual examples of this to come in future posts.

Both index types have a period of “warming up” before being fully effective. It’s simply the manner in which this warming up process occurs that differs between the two. Database indexes on a freshly bounced database server initially incur substantial physical I/Os (PIOs) until the related index/table data is cached within the database buffer cache (and indeed in the flash cache). These PIOs can significantly reduce the performance of the SQL plans utilising database indexes. Storages indexes on a freshly bounced storage server need to be recreated and can’t immediately eliminate accessing unnecessary storage regions. This can significantly reduce the performance of Full Table Scans until the associated SIs are fully created. Again, some actual demos on all this to come in future posts.

Both index types can use “Index Combine” like logic and use multiple indexes in combination with each other to further reduce the actual number of table blocks that need to be accessed to retrieve the data of interest. Storage and Bitmap database indexes are especially suited to these types of index combine operations, although B-Tree indexes can also be used in this manner.

Both Oracle index types are really quite interesting and often misunderstood and so meets the general theme of this blog, meaning I can quite happily blog about them without shocking too many people in the process :)

Let’s explore some of the key differences between Storage Indexes (SI) and Database Indexes (DI). In no particular order, they include:

SIs are structures that exist only within the storage servers of an Exadata box, while DIs logically exist and can be accessed within the database servers.

SIs are purely memory only structures while DIs are physical segments that take up storage. As such, DIs are relatively expensive to both create and subsequently maintain as they generate considerable undo and redo within the database, can cause concurrency issues and require storage resources. SIs meanwhile require no physical storage and have little impact on DML operations.

SIs are generated automatically and transparently while DIs generally need to be explicitly created (except in some scenarios such as in the creation of Primary/Unique keys when they can be implicitly created).

SIs being memory only structures are transient in that if a storage server were to be restarted, the corresponding SIs are lost and need to be re-created. Additionally, Oracle may decide to drop a SI for a particular column and create one on a different column depending on current load and conditions. DIs are permanent objects that need to be explicitly dropped (except in some scenarios such as the dropping/disabling of Primary/Unique Key constraints when they can be implicitly dropped).

SIs can be stored in memory as they contain very brief summary information, just the min/max value and a null flag for each 1MB storage region. A corresponding DI (especially a B-Tree) would generally be significantly larger as it needs to store all indexed values from the table with associated rowids (unless compressed but still likely much larger even so).

SIs can index only a portion of a table at a specific point in time as they get generated and dropped (see above). DIs index the entire table/partition (prior to 12c), unless using smarts such as decode function-based indexes (which also index the entire table but based only on the results of the function).

SIs are limited to only 8 columns whereas DIs have no such limitations per table.

SIs reference a 1MB storage region whereas a DI references a specific database block (say 8K). Therefore, a DI is more “focused” in terms of the minimum amount of data that needs to be accessed.

SIs basically work by determining which areas of storage can not possibly contain data of interest, accessing just those storage regions that might contain data of interest. Therefore, it’s quite possible for a SI to generate false positives by having to access storage that might in the end not actually contain data of interest after all. A DI meanwhile via the rowid explicitly points to the exact location of an indexed value and does not generate false positives. A bitmap index is a little different in the manner in which rowids are stored and generated (and can have 0 bits set for rows that don’t actually exist) but again do not generate false positives when actually accessing the table blocks.

SIs are only used during Smart Scan operations, which in turn are only performed during direct-reads of full scans of larger database segments (tables / indexes / materialized views and partitions thereof). Therefore SIs are only used when DIs are not.

As SIs access data during a Smart Scan, the resultant data by-passes the Database Buffer Cache and can not be re-used by subsequent database users/operations. Therefore, SI accessed data may need to be frequently physically re-accessed. DIs perform single block reads (except for Fast Full Index Scans) which are cached in the Database Buffer Cache and which can therefore be globally reused by the database. Once cached, it may be unnecessary to subsequently physically re-access the DI retrieved data.

SIs are used, even if the majority of the data needs to be accessed regardless. As SIs are only used during a FTS, the concept of only using an index when it’s the cheaper alternative doesn’t apply to SIs. If a SI can save (say) just 5% of physical I/Os during a FTS, it’s better than no savings at all. DIs meanwhile are only used when the Cost Based Optimizer (CBO) considers it the cheapest option when accessing data.

As SIs are storage based structures, the CBO has no knowledge of their existence and play no part in the CBO cost calculations. DIs are fully known to the CBO and the DI related statistics are an important factor in the CBO calculations. The CBO only determines whether a FTS is the cheaper alternative, however the decision to perform a Direct-Read operation and so potentially enable the use of SIs is a run-time decision not made by the CBO.

SIs can be effectively used for IS NULL predicates, thanks to the null existence flag component of the SI. B-Tree Indexes can’t if all indexed columns are null (as such entries are not indexed) although Bitmap indexes can.

SIs can not be used to police Primary/Unique constraints. DIs can.

SIs can not be used to avoid performance issues in relation to Foreign Keys (such as locking implications and FTS requirements on child tables when deleting parent rows). DIs can.

SIs can not avoid sort operations. DIs can as data read via an index range scan is guaranteed to be returned in the order of the index entries.

SIs can not provide additional statistical information to the CBO, such as accurate selectivity information in multi-column predicates available in concatenated index distinct keys statistics. DIs can.

SIs can not be used to efficiently access the MIN/MAX of a column. DIs can.

Function-Based SIs are not supported. Function-Based DIs are supported.

SIs can not be treated as smaller tables and used as an alternative by the CBO to access just index related data, eg. select count(*), select indexed_column, etc. as SIs do not contain all the required data and are not visible to the CBO anyways. DIs can be treated as smaller tables and accessed accordingly if appropriate.

OK, that’s enough of a list for now to get one thinking about some of these differences :)

In the following posts, I’ll go through the benefits of SIs and show examples of how they’re implemented and used by Oracle.

Thought I might discuss Exadata Storage Indexes, explore what they are, how they work and discuss their advantages and disadvantages. Following is but a brief introduction on this very cool Exadata feature.

A Storage Index basically maintains summary information about database table data down on the Exadata storage servers. The information consists of the minimum value, the maximum value and a flag to denote whether any Nulls exist for up to 8 different columns within each 1M storage region of disk.

My little diagram above attempts to describe this (click on image for a larger version).

The Storage Indexes are created automatically and transparently based on the SQL predicate information executed by Oracle and passed down to the storage servers from the database servers. Storage Indexes take up no physical storage of themselves and are built and maintained entirely in memory. As only this very basic summary information is stored for a maximum of 8 columns for each 1M of storage, Storage Indexes are very lightweight and can be created and maintained with minimal general overheads.

So how are they used ?

During an Exadata Smart Scan, Oracle can perform predicate filtering down at the storage layer and so only return just the rows of interest back up to the database. As part of this process, Oracle can use the Storage Indexes to visit just the 1M storage regions that can potentially contain rows of interest. Those storage regions that can’t possibly contain data of interest can be eliminated and not accessed at all during the Smart Scan operation.

So, in a very simple example, if we have an SQL predicate such as WHERE CODE = 5, if a corresponding Storage Index on the CODE column of the first 1M region of the table has MIN=2 and MAX=10, Oracle would need to access this portion of the table as the CODE value of interest could potentially exist here. However, if the next 1M storage region had a CODE Storage Index with a MIN=7 and MAX=12, then the CODE value of 5 can’t possibly exist within this portion of the table and can be ignored and not accessed at all during the Smart Scan.

So depending on the column, predicates and data distribution, a Storage Index can potentially eliminate having to physically access significant portions of a table during a Smart Scan. In an extreme example, on a search of a CODE value = 42 where the maximum CODE value that actually exists is say 35, a Smart Scan can perform a so-called Full Table Scan (FTS) via a Storage Index that doesn’t actually have to perform any physical I/O at all and can ignore the entire table.

The less physical I/O performed, the less work required and the faster the response time. If a FTS of say a 500GB table only had to physically read and access a few MBs here and there AND just return the data of interest back to the database servers, that would significantly improve the overall performance and overheads associated with the FTS.

Storage Indexes can be very cool indeed.

Much more to come …

search

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle Corporation.

Contact Details

If you wish to contact me directly, please do so at richard.foote@bigpond.com