Sunday, March 8, 2009

It seems to be a bit like the movie "Groundhog Day": Although I've already highlighted an issue with range-list subpartition pruning twice, it keeps recurring, and there are still a lot of oddities to tell about.

A recent case at a client showed that the subpartition pruning fix incorporated into the 10.2.0.4 patch set is unfortunately quite buggy.

The test case used in my previous posts therefore represents only half of the truth.

Although 10.2.0.4 correctly applies the subpartition statistics when using explicit subpartition pruning by means of the "subpartition" clause (in contrast to 10.2.0.3, which uses the partition-level statistics), it fails miserably when using "normal" partition pruning by specifying the partition keys.

Using the same test setup as in my previous post, running this simple query in the different versions reveals some interesting results:
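The original query isn't reproduced here, but a minimal sketch of the kind of statement under discussion, assuming a range-list partitioned table T with range partition key PKEY and list subpartition key X_SLICE (only X_SLICE appears in the trace output below; the other names and literals are made up), could look like this:

```sql
-- Hypothetical test query: T, PKEY and the literal values are assumptions;
-- X_SLICE is the subpartition key column mentioned in the post.
select count(*)
from   t
where  pkey    = 1     -- prunes to a single range partition
and    x_slice = 'A';  -- prunes further to a single list subpartition
```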

11.1.0.7 correctly identifies the subpartition (10 rows, 1 block). What seems odd is that for column #2 (X_SLICE) it apparently reports partition-level statistics (1010 BktCnt, Number of Distinct Values: 2), but this doesn't seem to influence the cardinality estimate significantly. When removing any histograms on the subpartition level, the cardinality estimate is correctly shown as 10.

This looks like a mess, and it is indeed. In the 10.2.0.4 patch set you'll always get an estimated cardinality of 0 (sanity-checked to 1) if you prune to a single subpartition. This will screw up almost any plan and hence potentially lead to abysmal performance.

The issue is tracked as Bug 7210921 - STATISTICS WITH SUBPARTITIONS ARE NOT CORRECTLY INTERPRETED IN EXPLAIN PLANS and Bug 7264383 - WRONG STATISTICS 0 ROWS FOR QUERY PRUNED TO SUBPARTITION ON 10.2.0.4. More information can be found in Metalink Note 728056.1. A patch for bug 7210921 can be requested and is already available for a couple of platforms.

The 10.2.0.3 output reveals that it uses the partition-level statistics and only gets it right in this particular case due to a frequency histogram on X_SLICE at the partition level. If you remove this histogram the cardinality estimate will be wrong (505).

So if you're using range-list subpartitioning with 10.2.0.4 and prune to a single subpartition, you should either fall back to the 10.2.0.3 optimizer features using the OPTIMIZER_FEATURES_ENABLE parameter/hint to avoid the mess shown above (though 10.2.0.3 can only use partition-level statistics) or get the patch mentioned installed.
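For reference, the fallback could be sketched like this (the parameter and hint syntax are standard; the table and predicates are placeholders):

```sql
-- Fall back to the 10.2.0.3 optimizer behaviour session-wide ...
alter session set optimizer_features_enable = '10.2.0.3';

-- ... or per statement via the corresponding hint
select /*+ optimizer_features_enable('10.2.0.3') */ count(*)
from   t                             -- placeholder table
where  pkey = 1 and x_slice = 'A';   -- placeholder predicates
```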

In the next part of this series I'm going to show some further oddities when using list partitioning. These apply also to 11gR1.

12 comments:

Thanks for this...we've just hit this, or something very similar. We're on 10.2.0.4 on AIX 6 and don't have that patch installed...fortunately, we're going to migrate to 11gR2 in the next few weeks, so we're not bothering with the patch...we'll retest on 11gR2 and hopefully it disappears then.

Hi Randolf, I am a regular follower of your blog, and I must thank you for the wonderful service you are doing for the Oracle fraternity. I have recently observed a different incarnation of the oddity you mentioned, in 11.2.0.2. In my case, I have a range partitioned table which is further hash subpartitioned on two other columns. So when I run a query against the table using the partitioning key columns and the subpartition key columns, the optimizer gets the cardinality right based on the partition statistics but gets the #Blks estimate completely wrong, as shown in your example for 10.2.0.4 (optimizer trace). Since #Blks is estimated as very low, serial direct path read does not come into play and hence there is no smart scan (on Exadata). I can work around the problem by setting one of the following options:

1) set _serial_direct_read = TRUE
2) set _direct_read_decision_statistics_driven to FALSE
3) setting optimizer_features_enable = 11.1.0.6

This problem is easily reproducible in any 11.2.0.2 environment. Let me know your thoughts on this.
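The three workarounds listed in the comment above could be applied on session level roughly like this (the underscore parameters are undocumented and unsupported; they should only be set after consulting Oracle Support):

```sql
-- Option 1: force serial direct path reads
alter session set "_serial_direct_read" = true;

-- Option 2: don't base the direct read decision on optimizer statistics
alter session set "_direct_read_decision_statistics_driven" = false;

-- Option 3: fall back to the 11.1.0.6 optimizer behaviour
alter session set optimizer_features_enable = '11.1.0.6';
```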

It seems the optimizer is wrongly computing #Blks as 72 by dividing the #Blks of the partition (216) by the number of partitions (3). So when we have hundreds of partitions the #Blks estimate drops too low, and as such serial direct read doesn't come into play.

If we repeat the test setting optimizer_features_enable to 11.1.0.6, we will see that the optimizer gets the correct estimates as shown below:

A few questions first: What block size do you use, and how is the tablespace you've used for testing defined: does it use uniform or system-allocated extents, and is it ASSM or MSSM?

I find the blocks reported per partition interesting - I cannot simply reproduce the 216 blocks you've shown per partition, in particular since each partition contains a different number of rows.

Now regarding your observation about the blocks reported: I'm not sure I can follow your point, because you seem to say that you want Oracle to report the 216 blocks as it does with 11.1.0.6 optimizer settings, but you prune to a *single* hash subpartition, so in fact the optimizer shouldn't assume 216 blocks but 216 divided by 4, assuming an even distribution of data by hash.

So the number of blocks assumed might be wrong, since it looks like 216 / 3 instead of 216 / 4, but 216 is surely wrong, too, and an overestimate.

By the way, since you seem to prune to a single hash subpartition - if you generate statistics on subpartition level (granularity ALL), you get the same estimates from 11.2.0.2 and 11.1.0.6 optimizer settings, and these should be the correct ones taken from the subpartition statistics.
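Gathering statistics on all levels including subpartitions (granularity ALL, as mentioned above) could look like the following; the table name is a placeholder:

```sql
begin
  dbms_stats.gather_table_stats(
    ownname     => user,
    tabname     => 'T',    -- placeholder table name
    granularity => 'ALL'   -- includes subpartition-level statistics
  );
end;
/
```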

And finally: The cardinality estimate is wrong in your test case in my opinion, since it is taken from the partition statistics, but again: You prune to a *single* hash subpartition, so the cardinality should be less than the one taken from the partition (divided by 4 should be a good starting point).

The only thing that looks suspicious to me at present is why 11.2.0.2 uses partition blocks / 3 instead of partition blocks / 4. When I try with different numbers of partitions / subpartitions it becomes clear that this indeed looks like a bug, since Oracle uses the number of partitions instead of the number of subpartitions to calculate the number of blocks from the partition statistics.

So your problem is probably caused by that bug: assuming you have many, many partitions, the block estimate will become very small. But the 11.1.0.6 approach is wrong, too, as it simply uses the partition-level stats when pruning to a single subpartition.

A workaround to arrive at better estimates would be to generate subpartition statistics so that Oracle can simply pick the actual blocks number from the corresponding subpartition statistics, but that will take much longer than generating just partition level stats, and might have side effects on estimates for other queries that don't prune to a single hash subpartition.

My tablespace is a BIGFILE tablespace with AUTOALLOCATE DEFAULT and SEGMENT SPACE MANAGEMENT AUTO with a block size of 16384.

A couple more observations:

1) Even if I gather subpartition stats, the optimizer still continues to use partition stats. It does not use the subpartition stats.

2) I was referring to the 11.1.0.6 optimizer settings as correct because, as indicated in the "BASE STATISTICAL INFORMATION" section of the optimizer trace, it is doing PARTITION PRUNING and getting the partition stats correctly (216). I would have expected to see "SUBPARTITIONS" PRUNED in the optimizer trace if it was trying to use subpartition stats. Please correct me if I am wrong.

Notice how it says "SUBPARTITIONS" and uses different rows / blocks in the second case, and shows the partition level cardinality and blocks / 3 like in your example when having just partition level stats.

Not sure why you don't see the subpartition statistics used in your case; for me it works as described.

Many thanks Randolf, I was able to get subpartition stats to work. However this is a bigger change since it involves a change in our stats gathering strategy and might have side effects on estimates for other queries that don't prune to a single hash subpartition, as you have rightly pointed out.

I suggest that you open an SR with Oracle regarding the incorrect division of the partition level stats when pruning to a single hash subpartition, that clearly looks like a bug.

Instead of gathering subpartition statistics you could evaluate whether using dynamic sampling together with the DYNAMIC_SAMPLING_EST_CDN hint is applicable in your case - it should give you better cardinality estimates, but might have other side effects on the execution plans, which you need to test.
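A sketch of the suggested hint combination; the sampling level, table name, alias and predicates are assumptions to be adapted and tested:

```sql
-- Sample the table at runtime (level 4 here is an arbitrary choice)
-- and base the cardinality estimate on the sample rather than on
-- the stored partition-level statistics.
select /*+ dynamic_sampling(t 4) dynamic_sampling_est_cdn(t) */
       count(*)
from   t                             -- placeholder table, alias t
where  pkey = 1 and x_slice = 'A';   -- placeholder predicates
```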