November 29, 2009

Cardinality

10g Update: As with equalities, 10.1.0.4 suddenly changes what happens when you fall outside the low/high limits. The arithmetic, or picture, used for ranges outside the limits is just the same as the new mechanism we saw for equalities.

I can’t work out how the formula produces a cardinality of 82 for “month_no between 14 and 17”. Can you please elaborate on the out-of-bounds formula?

The example in the book was only intended to show the general pattern of behaviour, and I didn’t attempt to explain what I thought the optimizer was doing – which is an odd little oversight.

month_no between 14 and 16
month_no between 14 and 17
month_no between 14 and 18
month_no between 14 and 21
month_no between 14 and 27
month_no = 14

You will see that the cardinality doesn’t change. It is the cardinality of month_no = 14 … the value at the lower end of the range.

To see why this cardinality is 82: it takes 11 steps to get from the known low value (1) to the known high value (12), so to extrapolate past the high value the optimizer steps from 100 down to zero in 11 more steps – which means 9.0909 per step.

Since 14 is two steps from 12, the cardinality is (100 – 2 * 9.0909) which rounds to 82.
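That arithmetic can be sketched in a few lines of Python (an illustrative model, not Oracle’s code; it assumes the book’s audience table of 1,200 rows with month_no spread uniformly over the values 1 to 12, i.e. 100 rows per value):

```python
# Sketch of the out-of-range linear decay described above.
# Assumes 1200 rows, month_no from 1 to 12, uniform 100 rows per value.
# Function name and defaults are illustrative, not Oracle internals.
def out_of_range_cardinality(value, low=1, high=12, rows_per_value=100):
    """Estimate cardinality for a value above the known high value."""
    steps = high - low                        # 11 steps across the known range
    decay_per_step = rows_per_value / steps   # 100 / 11 = 9.0909... per step
    distance = value - high                   # how far past the high value
    return round(rows_per_value - distance * decay_per_step)

print(out_of_range_cardinality(14))  # round(100 - 2 * 9.0909) = 82
```

Running it for 13, 14, 15 … shows the estimate stepping down by roughly 9 per value until it reaches zero eleven steps past the high value.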


Jonathan,

I did not get that (at all). After collecting statistics for the columns (which do contain the low and high values), why does the CBO still try to “guess” the cardinality for values that are out of range? The EXPLAIN PLAN shows the number of rows produced by each step, right?

Don’t ask me WHY Oracle uses a given strategy – I didn’t write the specification or code; however I think that Timur’s suggestion is probably the right one.

The number of rows reported after using explain plan is the optimizer’s estimate of the number that will be produced (for each call to that line of the plan). It’s a figure that may be completely wrong.

explain plan set statement_id='s14' for
select count(*) from audience where month_no = 14;
explain plan set statement_id='s15' for
select count(*) from audience where month_no = 15;
explain plan set statement_id='s14-17' for
select count(*) from audience where month_no between 14 and 17;
explain plan set statement_id='s14-16' for
select count(*) from audience where month_no between 14 and 16;
explain plan set statement_id='s14-18' for
select count(*) from audience where month_no between 14 and 18;
explain plan set statement_id='s14-21' for
select count(*) from audience where month_no between 14 and 21;
explain plan set statement_id='s14-27' for
select count(*) from audience where month_no between 14 and 27;
select statement_id, cardinality
from plan_table
where statement_id like 's1%' and id = 2;

Thanks for that tip. I still haven’t got around to installing 11.2 so can’t investigate at present.

It’s possible that it’s deliberate – for example the out-of-range queries may be limited to the maximum single value in range.

Do you get 100 for a range of (say) 21 – 22, then 21 – 23, then 21 – 24, or do you get an increasing set of values? Can you pick another set of ranges and see a set of results that increase until they hit 100?

The cardinality of an index is the number of unique values in the indexed field. For example, if you were a car manufacturer and you had a database containing one row for every car that rolls off the assembly line, the primary key might be the VIN, and a secondary key might be the color. If you make cars that are red, blue, or green, then the cardinality of the color index would be three. If you are Henry Ford building model Ts, then the cardinality would be one, since any customer can have a car painted any colour that he wants so long as it is black.
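To make that definition concrete, here is a trivial Python illustration (the sample data is made up):

```python
# Cardinality of the color column = number of distinct values stored in it.
rows = ["red", "blue", "green", "red", "blue", "red"]  # hypothetical car rows
print(len(set(rows)))  # 3 distinct colors, so the index cardinality is 3
```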

I’ve checked the example, and the change from 60 to 101 for the cardinality estimate when using explain plan is just what happens – somewhere around the page the example comes from I did make the comment that “things change”, and this seems to be one of them. (In fact the change to 101 also shows up in 9.2.0.8). With rounding errors it looks like the selectivity has turned into the selectivity of “column = {in range constant}”, just as it is for “column > {out of range constant}”.

As for the anomaly in the second part of your question, without knowing exactly what steps you took to generate the trace file I can’t work out why you’re seeing 60 as the estimate. Remember that autotrace (which uses explain plan) doesn’t look at the bind variables and assumes that they’re character strings of unknown value. In the versions I’ve just tested I don’t get a “Peeked Binds” section in the 10053 when I use “autotrace traceonly explain”, and I get an actual value displayed (which hasn’t appeared in your output) when I run the query.

Hi Jonathan,
The value of 101 (setting aside round-off errors) seems to be close to following the {column = constant} formula, which gives an estimate of 100 for a table of 1200 rows with 12 distinct column values.
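A quick sanity check of that formula (a sketch assuming the 1,200-row table with 12 distinct values of month_no):

```python
# "column = constant" estimate: selectivity = 1 / num_distinct,
# cardinality = num_rows * selectivity. The figures match the book's table.
num_rows = 1200
num_distinct = 12
selectivity = 1 / num_distinct            # 0.0833...
cardinality = round(num_rows * selectivity)
print(cardinality)  # 100 -- in line with the observed 101, allowing for rounding
```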

So I was just curious to know whether you have come across the behaviour I mentioned in the first part (getting a cardinality of 101 instead of 60). Does that mean that in release 11g and above the 5% selectivity no longer holds for unbounded range predicates, and we should substitute the formula we use for calculating {column = constant}?

Your first paragraph is simply repeating my first paragraph – and clearly I have seen the behaviour you’ve described … even in 9.2.0.8

My second paragraph emphasises the word EXACTLY – I can’t reproduce your result because you haven’t described your experiment; I’ve also warned that autotrace and explain plan don’t behave like runtime. Obviously I’m not going to give you a wide-ranging strategy in response to a single example which hasn’t been described properly and which I can’t reproduce.