Monday, January 9, 2012

Dynamic Sampling On Multiple Partitions - Bugs

In a recent OTN thread I've been reminded of two facts about Dynamic Sampling that I already knew but had forgotten in the meantime:

1. The table-level dynamic sampling hint uses a different number of blocks for sampling than session / cursor level dynamic sampling. So even if, for example, level 5 is used in both cases, the number of sampled blocks will be different for most of the 10 available levels (levels 0 and 10 are the obvious exceptions)

2. The Dynamic Sampling code uses a different approach for partitioned objects when it is faced with more partitions than blocks to sample according to the level (and type: table/cursor/session) of Dynamic Sampling

Note that all this here applies to the case where no statistics have been gathered for the table - I don't cover the case when Dynamic Sampling gets used on top of existing statistics.

Dynamic Sampling Number Of Sample Blocks

Jonathan Lewis has a short post describing point 1 above, although I believe his post contains a minor inaccuracy: the number of blocks sampled for table-level dynamic sampling is 32 * 2^(level - 1), not 32 * 2^level.

Note that the constant 32 is defined by the internal parameter "_optimizer_dyn_smp_blks" and is independent of the block size. So this is one of the cases where a larger block size potentially gives better results because more data might be sampled - of course it also means performing more work for the sampling.

Here are two excerpts from optimizer trace files that show both the difference between the table and cursor/session level sample sizes and the 2^(level - 1) formula for the table level:

So both cases use level 5, but the number of sample blocks is different; for table level 5 it is 32 * 2^4 = 32 * 16 = 512 blocks.
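The table-level formula can be sketched as follows - a minimal illustration of the 32 * 2^(level - 1) arithmetic described above, assuming the constant 32 from "_optimizer_dyn_smp_blks" and treating levels 0 and 10 as the special cases they are (the helper function name is mine, not Oracle's):

```python
# Table-level dynamic sampling block counts, per the formula above:
# 32 * 2^(level - 1), with 32 taken from "_optimizer_dyn_smp_blks".
def table_level_sample_blocks(level: int) -> int:
    if level == 0:
        return 0            # level 0: dynamic sampling disabled
    if level == 10:
        raise ValueError("level 10 samples all blocks - no fixed count")
    return 32 * 2 ** (level - 1)

for level in range(1, 10):
    print(f"table level {level}: {table_level_sample_blocks(level)} blocks")
# table level 5: 32 * 2^4 = 512 blocks, matching the trace excerpt
```

Note that the cursor/session levels map to different (smaller) block counts, which is exactly the discrepancy visible in the two trace excerpts.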

Dynamic Sampling On Multiple Partitions

Point 2. above is also described in one of the comments to the post mentioned. In principle the Dynamic Sampling code seems to assume an overhead of one sample block per (sub)segment, so the effective number of blocks to sample will fall short by the number of (sub)segments to sample.

Probably this is based on the assumption that the segment header block needs to be accessed anyway when reading a segment.

If the code didn't cater for this fact it could potentially end up with an effective number of blocks sampled that is far greater than defined by the sample size when dealing with partitioned objects.

For non-partitioned objects this is not a big deal because it means exactly one block less than defined by the sample size.

But if Dynamic Sampling needs to sample multiple partitions this has several consequences:

a. The number of blocks that are effectively sampled for data can be far less than expected, because the code reduces the number of blocks to sample by the number of partitions to be sampled

b. The point above poses a special challenge if there are actually more partitions to sample than blocks

Note that Dynamic Sampling uses static / compile time partition pruning information to determine the number of partitions that need to be sampled.

The upshot of this is that when sampling multiple partitions the sample sizes of the lower cursor/session Dynamic Sampling levels can be far too small for reasonable sample results.

If the Dynamic Sampling code faces the situation where more partitions need to be sampled than blocks, it uses a different approach.

Rather than sampling the whole table - and therefore potentially accessing more partitions than the number of blocks defined by the sample size - it will randomly select (sample blocks / 2) subsegments.

Based on the number of blocks determined per subsegment, it will then choose a sample size such that in total (sample blocks / 2) blocks are sampled for data.

Of course you'll appreciate that this means that on average exactly one data block will be sampled for data per subsegment.

The sample query looks different in such a case because the sampled subsegments are explicitly listed and combined via UNION ALL, resulting in quite a lengthy statement - even with a small sample size like 32 blocks, 16 queries on subsegments will be UNIONed together.
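The arithmetic of this alternative approach can be summarised in a short sketch, based purely on the description above (choose sample_blocks / 2 subsegments, sample sample_blocks / 2 blocks in total):

```python
# With more partitions than sample blocks, (sample_blocks / 2)
# subsegments are picked at random and (sample_blocks / 2) blocks in
# total are sampled for data.
def subsegment_sampling(sample_blocks: int):
    chosen_subsegments = sample_blocks // 2
    total_data_blocks = sample_blocks // 2
    blocks_per_subsegment = total_data_blocks / chosen_subsegments
    return chosen_subsegments, total_data_blocks, blocks_per_subsegment

# A 32-block sample size: 16 subsegments UNIONed together, and on
# average exactly one data block sampled per subsegment.
print(subsegment_sampling(32))  # (16, 16, 1.0)
```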

Here are again two excerpts from optimizer trace files that show the two different approaches in action:

You can clearly see that the query looks quite different by listing a number of subpartitions explicitly. Also the text dumped to the trace file is different and says that it will restrict the sampling to 32 partitions.

And it is this special case where, in versions below 11.2.0.3, a silly bug in the code leads to incorrect cost estimates: when putting together the number of blocks that should be used for sampling and the number extrapolated for the whole table, the code copies the wrong number into the table stats - it uses the number of blocks to sample instead of the assumed table size. This can lead to a dramatic cost underestimate for a corresponding full table scan operation.

The issue seems to be fixed in 11.2.0.3, but in the above excerpt from 11.2.0.1 you can see the problem by checking these lines carefully:

...
** Dynamic sampling updated table stats.: blocks=1496 <=== wrong number copied from below
...
block cnt. table stat. : 1496 <=== this should be on the next line
block cnt. for sampling: 17952 <=== this should be on the previous line
partition subset block cnt. : 1496

The two figures "block cnt. for sampling" and "block cnt. table stat." are swapped - and the wrong number is copied to the table stats line.

This will result in a potential underestimate of the table blocks. The first plan is generated with the session level 5 sample size where the bug copies the wrong number of blocks:

Note that although a minor discrepancy might be explained by the different sample sizes a cost estimate difference by an order of magnitude is clearly questionable.
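The magnitude of the error follows directly from the swapped trace figures above - a back-of-the-envelope check, assuming a full table scan cost that scales roughly linearly with the block count:

```python
# The bug copies the number of blocks to sample into the table stats
# instead of the extrapolated table size (figures from the 11.2.0.1
# trace excerpt above).
copied_blocks = 1496        # wrongly used as the table block count
extrapolated_blocks = 17952 # the assumed table size that should be used

# A full table scan cost roughly proportional to the block count is
# therefore underestimated by about an order of magnitude:
print(extrapolated_blocks / copied_blocks)  # 12.0
```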

Nasty Bug When Using Indexes

Finally there is another nasty bug waiting for you in the case of partitioned objects - and this time it doesn't matter if the number of partitions is more or less than the number of blocks to be sampled:

Dynamic Sampling will also make use of eligible indexes if a filter predicate is applied to a table and a suitable index exists (which probably means an index that starts with the columns of the applied predicates, but I haven't investigated that in full detail).

The idea behind this is probably that by using the index a very cheap operation can provide a very precise selectivity estimate for highly selective predicates. Dynamic Sampling has some built-in sanity checks that reject the Dynamic Sampling result if not enough rows pass the applied filter predicates - similar to saying "not enough data found to provide a reasonable estimate". So if the filter predicates identify only a few rows out of many, a pretty high sample level is required for the Dynamic Sampling results not to be rejected by these sanity checks.

Things look different, however, if there is a suitable index available: Dynamic Sampling will run an additional index-only query that is limited to a small number of rows (2,500 rows seems to be a common number) and a where clause corresponding to the filter predicates. If the number of rows returned by this query is less than 2,500, Dynamic Sampling knows that this corresponds exactly to the cardinality / selectivity of the filter predicates.
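The intended probe logic can be sketched as follows - a hedged model of the behavior just described (the function name and the simulation of the probe are mine; only the 2,500-row cap and the "exact if below the cap" rule come from the observations above):

```python
# Model of the index-only probe: count matching rows, capped at a limit.
ROW_CAP = 2500

def index_probe(matching_rows_in_index: int, cap: int = ROW_CAP):
    returned = min(matching_rows_in_index, cap)
    if returned < cap:
        # Fewer rows than the cap: the count IS the exact cardinality.
        return ("exact", returned)
    # Cap reached: only a lower bound is known from the probe.
    return ("at_least", cap)

print(index_probe(100))   # ('exact', 100)
print(index_probe(5000))  # ('at_least', 2500)
```

It is the second branch - predicates matching 2,500 rows or more - where the partitioned-object bug described next comes into play.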

In the case of partitioned objects, though, there is again a silly bug where the case of 100% matching rows is not handled correctly - so for any filter predicate that matches more than 2,500 rows the cardinality / selectivity estimate will potentially be incorrect.

Here are again two optimizer trace excerpts that show the bug in action:

Without a suitable index the cardinality estimate for a not really selective predicate (90%) is in the right ballpark:

With a suitable index in place, the trace shows that in principle the selectivity estimate from the table level operation is supposed to be used, but finally the wrong selectivity gets copied over, which is then echoed by the final execution plan.

This bug is tracked as bug "6408301: Bad cardinality estimate from dynamic sampling for indexes on partitioned table" and patches are available. The issue is fixed in 11.2.0.2, but the "wrong number of table blocks" issue is only fixed in 11.2.0.3. I don't have a bug number at hand for that one, though.

Summary

If you plan to use Dynamic Sampling on partitioned objects with many partitions, and the number of partitions to sample cannot be significantly limited by partition pruning, the results of Dynamic Sampling might be questionable at lower levels.

In addition there is a bug that leads to wrong cost estimates for a full segment scan operation that is only fixed in the most recent releases.

It probably makes sense to use higher Dynamic Sampling levels in such cases - the side effect is not only more reasonable sampling results, but it might also allow you to avoid the mentioned bug if the number of blocks sampled is greater than the number of partitions to sample.

Be aware of the case where an index can be used by Dynamic Sampling in addition - for partitioned objects a bug might lead to dramatic underestimates of the cardinality.

Testcase Script

The issues described here can easily be reproduced using the following simple test case: