December 11, 2006

Bind Peeking

I’ve just received an email from someone who had been on my “masterclass”, and he had a question about one of the comments I had made, namely: “bind variable peeking is always done, even if histograms are not generated”.

In contrast, a condition like order_id = :1 will not trigger bind peeking assuming that order_id does not have histograms. (It may, for instance, be a primary key column, in which case histograms are not beneficial.)

Now, the argument in the white paper is basically sound – for a primary key or, in the absence of histograms, for a non-unique column the selectivity of column = constant (whether literal or bind) is set by the value of user_tab_columns.density. So the question arose: could I actually prove that I was right.

To my embarassement, I realised that I had never actually triedto prove my claim – Oracle’s original introduction to bind variable peeking had said that “we peek at bind variables” so I had assumed that this statement was true.

Fortunately, 10g makes it very easy to prove the point. Here are a couple of extracts from the new, improved, 10053 (cost based optimizer) trace file you get from 10gR2.

You will have to trust me, of course, that this query was for primary_key_column = constant. But quite clearly, Oracle has peeked at the value.

Footnote: the argument in the white paper does have a couple of holes. What if the referenced column also appears in a join predicate; what if the bound value falls outside the low/high range ? There are still (notionally) cases where peeking is necessary.

On a more pragmatic note, however, I suspect that peeking takes place because (a) it doesn’t cost much and (b) it would be more complicated to write code that had to decide not to peek.

Related

Jonathan,
I’ve posted a bit longer example about it a while ago – http://blog.oracloid.com/2006/07/bind-variable-peeking-with-no-histograms/. Originally, I came across it in 9i but example is from 10g because of improved 10053 trace. I have there a full demo and it should run in 9i as well. This is based on real life issue when BVP caused problems without histograms. Hope it helps.

Alex, Lovely – a nice simple example to show the “unexpected”. You can cruise along thinking you understand something perfectly, and the moment you add in a new feature (partitioning, IOTs, deferrable constraints, or whatever) it all changes and you realise how much more you have to know.

I’ve often said that partitioning is a bit like “free indexing” – you can get to the right bit of data more precisely without having to store an entry for every row in an index structure – I’ll have to remember that it’s also a bit like “free histograms” – especially if you use list partitioning.

Jonathan,
I came to the same idea of examining plan – diff’s ( from that oracle white paper you are here refering to )
But it seems they have never used their approach on a real system ( e.g. migration 9i-10g ). Oracle 10g has changed the PLAN notation of operations for PQ … e.g. PX SEND / RECEIVE which in 9i could be represented as PARTITION HASH/RANGE etc.
Anyway I still like the approach and like to know if you think that it would produce reasonable good results by filtering out all the PQ-nodes of the plan and only focussing on (operation,options,object_name) in the diff ?
Kind regards
Marc.

Marc, I’m always a little dubious about setting up a system to automate things like this. Many systems are likely to have a few thousand SQL statements with SQL plans that need to be trapped. Writing code that gets “before” and “after” plans into a structured form that allows them to be cross-checked seems looks like a task that could take a lot of effort and still not work.
I have to say that my strategy would be to capture the “before” plans (possibly as simple flat text using the 10132 event I have mentioned previously) and then simply wait for the (few ?)problem statements to appear after the upgrade and check them back manually.

I was wondering how CBO can choose different plans when there no histograms (assumming statistics is unchanged)? How can he know the data distribution and choose full or index scan?

Alex, your example uses partitions, does this only happens with partitioned tables?

I have seen bind peeking happens with no histograms but I cannot find a explanation (9.2.0.7). The queries which I have seen this happen are usually not a plain query involving tables, they always involves JOIN.

David,
Oracle always peeks at binds when it optimizes a statement. (Bugs and hidden parameters excepted).

The obvious examples of getting multiple plans for the same text come from range-based predicates:
select from orders_table where order_date > :b1;

If your call to this query uses yesterday’s data and my call uses a date from 3 months ago, we probably ought to get different plans. In most cases, though, the first one to execute sets the plan for everyone else. There are cases though – different optimizer environments, for example – where different sessions force extra child cursors (i.e. separate optimizations) to appear.

I believe David wanted to ask not “why” Oracle is creating two different plans, but “how” does he do it?

Say, table t1 has no histograms collected and hence no knowledge of data distribution. And yet for “select * from t1 where c1=:a1″ the optimizer comes up with two different plans, say index access for :a1=5 and fts for :a1=15, and rightly so as we have only 1% of data in t1 table where c1=5 and 99% of data where c1=15

Yet the question remains: How does Oracle know it? Again, there are no histograms collected, but bind variable peeking comes into play and somehow manages to realize that we have unevenly distributed data and produce different plans for different values of bind variables.

Yuriy – 10g and dbms_stats with estimate_percent = AUTO or much less than 100% ?
I remember a test case – column GENDER, 1 ‘M’, 999,000 ‘F’. dbms_stats missed the only M, so it was outside the min/max range, so (check Jonathan’s “Cost Based Oracle”, page 49) it estimated a cardinality of 1 for “where gender = :b” when :b = ‘M’, and a cardinality close to 1 million when :b = ‘F’. Index scan for the former, full table scan for the last; it is the same plan you would get with an histogram.

Alex, your example uses partitions, does this only happens with partitioned tables?
A bit late but still I should mention that the answer is no as far as I could see. Bind variable peeking happens all the time and presence of histograms is the most common for varying execution plans. Partition is another one. Ranges might be the third one and probably there are other cases as well.

Optimizer used density to calculate the selectivity of the form column_name = constant or column_name = bind_variable. So, with histogram in place, the selecivity of a simple equality predicate can change even for queries involving bind variables.

I have tested this with Oracle 8i / 9i and 10g release 2, but have not found this. For a equality predicate with constant and histogram in place, optimizer is able get a correct cardinality but with bind variables in place, the cardinality comes out as 1/num_distinct * num_rows.

[…] have been plenty of posts about bind peeking. Alex Gorbachev wrote about it last year, and so did Jonathan Lewis. It’s a well known issue. However what hasn’t been written about is when it is expected […]