Wednesday, October 1, 2008

A bitmap join index is a special kind of index as it contains values of columns from table(s) joined to a table but holds bitmaps that point to the table being joined to. So it allows under certain circumstances to avoid join operations from taking place if you are restricting your results on your table by filtering on an attribute of a joined table.

For further information regarding bitmap (join) indexes you may want to have look at the following note by Jonathan Lewis that contains links to three Word documents that describe the basic principles of bitmap (join) indexes.

As can be seen in Jonathan's document Oracle internally supports these bitmap join indexes through various constructs - amongst these a "virtual column" is added to the column list of the "indexed" table in the data dictionary that represents the values from the joined table obviously based on the data stored in the index.

So in theory gathering statistics on this "virtual column" would allow the optimizer to get a very good estimate for the number of rows that correspond to a particular value of the attribute of the joined table, which would be very helpful especially if the "joined" value distribution of this attribute is skewed.

In Data Warehouses sometimes rows of a fact table are assigned to a special value of a dimension, like "deleted", "not mapped", "default" etc. which might become at some point a predominant value when joining the dimension to the fact table. So without the additional information stored in the bitmap join index the optimizer at parse time can hardly tell from the normal table and column statistics available how many rows a join will return if you filter on an particular attribute of a dimension table. Although the statistics can tell quite precisely how many rows of the dimension table will satisfy the filter condition, this doesn't say anything about the number of rows of the fact table that will join to the corresponding primary key values of the filtered dimension table.

However, using the data that is stored in the bitmap join index this information is available as the index can tell exactly how many rows of the fact table correspond to a particular value of an attribute of a dimension table.

Unfortunately all tested Oracle versions (9.2.0.8, 10.2.0.4 and 11.1.0.6) do not allow to gather statistics on these virtual columns that are corresponding to bitmap join indexes, and even if manually crafted statistics including histograms are applied to the virtual column by using "dbms_stats.set_column_stats" the optimizer does not consider them for cardinality estimated of the corresponding row sources.

So in future this might be a useful extension of the statistics framework to fully leverage the potential power of bitmap join indexes.

A small test case run on 11.1.0.6 (Windows 32-bit) shall demonstrate the issue. Note that 9.2.0.8 and 10.2.0.4 show similar results regarding the optimizer estimates.

First suitable tables are created that represent a fact table and its corresponding dimension table.

As we can see the bitmap join index is used to avoid the join operation but the optimizer estimate is still the average value, although the bitmap join index information could be used to determine the skew.

Now let's try to get statistics information on the "virtual" column created for the bitmap join index.

Interestingly the custom column statistics for the "virtual" column are accepted by DBMS_STATS.SET_COLUMN_STATS although DBMS_STATS.GATHER_TABLE_STATS is not able to analyze this column.

But no difference regarding the estimates can be seen even with a suitable histogram in place. This shows that the optimizer doesn't evaluate the information available from the "virtual" column of the bitmap join index.

Finally let's see what happens when querying one of the other dimension values.

The estimate is still the same (the average of 100), so in summary the optimizer is not (yet) able to take advantage of the potential added value represented by the "virtual" column created for the bitmap join index.