Hi, I have a problem I will demonstrate. I read some work by Ross in an old thread but it did not help me (too dense I am).

The following sequence of simple statements shows me how Analytics and Predicate Pushing appear to work. But when I go from constant tests to a join with the same data as a row instead of a constant value, the Predicate Pushing stops.

I have a table with 9 million rows in it. It takes about 90 seconds to scan.

If I run an analytic that counts all rows in the table, I can see that easily enough. This example, as I understand it, scans the table (I know this because, if for no other reason, it takes 90 seconds to get an answer), then after scanning all rows, counts them and adds the count to each row. There are in fact 9,500 or so rows with the value of lv_rqst shown. Instead of seeing all 9,500 rows, I use DISTINCT to get only one. See how the count shows all rows in the table.
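The query looked roughly like this (a reconstruction using the table and column names from my schema; the exact select list may have differed, but the shape is what matters):

```sql
SELECT DISTINCT lv_rqst, cnt
FROM (
        SELECT lv_rqst,
               COUNT(*) OVER () AS cnt   -- counts every row the inner query returns
        FROM   lv_pln_usge_fact
     )
WHERE  lv_rqst = '746780192';
-- cnt comes back as roughly 9 million, and the query takes ~90 seconds
```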

From this we see that the predicate is not pushed into the inner query to filter rows. If it had been, the analytic would have produced a number like 9,500, not 9 million. I have no issue with this, as this is how Analytics are documented to work.

We can see how filtering does happen with an analytic. We push the predicate into the inner query ourselves, and all of a sudden we get a count of only those rows for the specific request. Thus we see the basics of how analytics work, particularly with respect to predicate pushing. There is no real rocket science here. The filter removed rows before the analytic counted them. This too is how Analytics are documented to work.
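Roughly this (again a reconstruction; only the position of the WHERE clause changes):

```sql
SELECT DISTINCT lv_rqst, cnt
FROM (
        SELECT lv_rqst,
               COUNT(*) OVER () AS cnt
        FROM   lv_pln_usge_fact
        WHERE  lv_rqst = '746780192'     -- predicate now inside the inline view
     );
-- cnt is now about 9,539: the filter ran before the analytic counted
```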

So far we have been doing "all rows" analytics. Now we use a PARTITION BY clause to group the data. Notice the rowcount: it is the count for just this lv_rqst. Do not be fooled. This is because the PARTITION BY column says to synchronize the analytic count with the data for its associated row, so the counts are grouped by LV_RQST. Whether we had the predicate on the outside of the query or not, for this specific lv_rqst we would still get this count. This is not proof that we did any filtering with the predicate lv_rqst = '746780192'.
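That query was shaped like this (reconstructed):

```sql
SELECT DISTINCT lv_rqst, cnt
FROM (
        SELECT lv_rqst,
               COUNT(*) OVER (PARTITION BY lv_rqst) AS cnt  -- count per lv_rqst
        FROM   lv_pln_usge_fact
     )
WHERE  lv_rqst = '746780192';
-- cnt is again about 9,539, but for a different reason: the PARTITION BY
-- groups the counts, so this row's count is its partition's count
```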

However, notice the speed of the query. It gets done so fast that there is no way it is scanning the table. I know there is an index that starts with LV_RQST, so I conclude that the index was used, which I believe means the predicate was in fact PUSHED into the inner query. Thus I think we did in fact filter the rows down to this single LV_RQST value, and we also accessed the table using the predicate, which became an index range scan instead of a full table scan. My biggest point is that we were able to use the index to get the data we want rather than scan the table, and this was made possible because we added the LV_RQST column to the OVER clause as part of the PARTITION BY expression.

Now I add an additional column to the inner query, and an additional predicate against that column to the outer query. We know that 9539 is the count of all rows where LV_RQST='746780192', as we have seen above. I also know the data, and know that about half say Y and half say N for this indicator column. This query gives the right answer. I just put it here as an additional demonstration of the way analytics work. It demonstrates that the new predicate is not pushed into the inner query to filter rows. Again, this is how Analytics are documented to work.
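Shaped roughly like this (reconstructed):

```sql
SELECT DISTINCT lv_rqst, lv_pln_usge_del_ind, cnt
FROM (
        SELECT lv_rqst, lv_pln_usge_del_ind,
               COUNT(*) OVER (PARTITION BY lv_rqst) AS cnt
        FROM   lv_pln_usge_fact
     )
WHERE  lv_rqst = '746780192'
AND    lv_pln_usge_del_ind = 'N';
-- cnt is still 9,539: the indicator predicate filtered the output rows,
-- but was not pushed inside, so the analytic counted both 'Y' and 'N' rows
```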

Again, also please note the speed of the query. It used the index to range scan only the rows where LV_RQST='746780192'. So only those predicates whose columns appear in the analytic's OVER clause are allowed to be pushed into the query for filtering and access purposes.

This next query shows things a little more clearly. If we add the indicator column to our OVER clause, then the rowcount changes to be the number of rows where LV_RQST='746780192' AND lv_pln_usge_del_ind = 'N'. So by putting the column into the OVER expression, Oracle decides to push the predicate down into the inner query and filter the data before the analytic count is done. Again, this just demonstrates for clarity how it works. I think I am describing it right, anyway.
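In other words, roughly this (reconstructed; the only change from before is the second column in the PARTITION BY):

```sql
SELECT DISTINCT lv_rqst, lv_pln_usge_del_ind, cnt
FROM (
        SELECT lv_rqst, lv_pln_usge_del_ind,
               COUNT(*) OVER (PARTITION BY lv_rqst, lv_pln_usge_del_ind) AS cnt
        FROM   lv_pln_usge_fact
     )
WHERE  lv_rqst = '746780192'
AND    lv_pln_usge_del_ind = 'N';
-- cnt now reflects only the 'N' rows for this request
-- (roughly half of 9,539, given the Y/N split in the data)
```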

And once again the speed clearly indicates that an index range scan was done using LV_RQST='746780192'. Recall I said there is an index that starts with this column.

NOW WE COME TO MY PROBLEM.
Instead of using the constant value '746780192', we are going to create a one-column, one-row table that has this value in it. We are then going to join to the analytic subquery instead of doing a constant test against it.
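The setup and join looked roughly like this (the driving-table name rqst_list and its datatype are made up for the illustration):

```sql
CREATE TABLE rqst_list (lv_rqst VARCHAR2(10));
INSERT INTO rqst_list VALUES ('746780192');
COMMIT;

SELECT DISTINCT f.lv_rqst, f.cnt
FROM   rqst_list r,
       (
         SELECT lv_rqst,
                COUNT(*) OVER (PARTITION BY lv_rqst) AS cnt
         FROM   lv_pln_usge_fact
       ) f
WHERE  f.lv_rqst = r.lv_rqst;
-- same single-row answer as before, but now it takes ~90 seconds:
-- the join predicate is not pushed, so we get a full table scan
```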

For a little bit more clarity, two more queries. Pay attention to how long it takes, and to how the additional joins affect things. Notice, particularly with the last statement, that the join criteria is being pushed into the inner query with the analytics. Otherwise how did it get that count?

So after looking at all this, here is my question: how do I get Oracle to use the index and a nested loop join to the table lv_pln_usge_fact? We know Oracle pushes the predicates down when the columns are referenced in the OVER expression, because we see that in several places. We also know the CBO can do a nested loop join with index access on LV_RQST, because it does it when we use a constant test. But it won't use the index and nested loop when we do a join to a table with the same data, no matter how much rewriting or hinting I do.
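For the record, the kind of hinting I have been trying looks like this (these are real Oracle hints, but the exact combinations varied from run to run; rqst_list stands in for my one-row driving table):

```sql
SELECT /*+ LEADING(r) USE_NL(f) PUSH_PRED(f) */
       DISTINCT f.lv_rqst, f.cnt
FROM   rqst_list r,
       (
         SELECT /*+ INDEX(lpuf) */          -- try to force index access inside
                lv_rqst,
                COUNT(*) OVER (PARTITION BY lv_rqst) AS cnt
         FROM   lv_pln_usge_fact lpuf
       ) f
WHERE  f.lv_rqst = r.lv_rqst;
-- 10g still produces a FULL scan of lv_pln_usge_fact and a hash join
```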

I tested this on 9i/10g/11g and got the same behavior in all three places.

Can someone make it do what I want, or is this just the way analytics behave with a join?

Wow! Marathon post. I would have liked some AUTOTRACE, but what you are describing with FTS vs Index scan is fairly clear.

Unrelated to your problem but: when you add in the ORDER BY clause to the analytic function, are you expecting a (rolling) count of all values <= N, or all values = N? Because you are getting <= N. If you change the query to look at 'Y' you will get a count of N+Y.
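To illustrate what I mean (a sketch; 'N' sorts before 'Y', and the window below is what an ORDER BY gives you by default):

```sql
-- With ORDER BY in the window, the implicit windowing clause is
-- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and RANGE
-- includes all peers of the current row's sort value:
COUNT(*) OVER (PARTITION BY lv_rqst
               ORDER BY lv_pln_usge_del_ind)
-- on an 'N' row: counts all the 'N' rows in the partition
-- on a 'Y' row: counts all the 'N' rows plus all the 'Y' rows
```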

If you have the column in the PARTITION BY clause, then it should be able to push the predicate inside the inline view, but clearly it chooses not to. Note that it will NOT push the predicate on lv_pln_usge_del_ind - that would change the result (if there were values < N).

Try a hint to see if you can *prove* that it can be done. Reverse the order of the tables in the FROM so you can use ORDERED.

Thanks for the advice, Ross. I am trying your suggestion now, plus all manner of variations. So far no joy. I'll keep trying.

Also, thanks for the clarification. I was not aware the ORDER BY worked that way. May I ask, how do you know this? Do you have some special work background that introduced you to this? Did you run a test? Or do you have some fountain of knowledge somewhere that you have not told the rest of us about?

Quote:

Unrelated to your problem but: when you add in the ORDER BY clause to the analytic function, are you expecting a (rolling) count of all values <= N, or all values = N? Because you are getting <= N. If you change the query to look at 'Y' you will get a count of N+Y.

If you have the column in the PARTITION BY clause, then it should be able to push the predicate inside the inline view, but clearly it chooses not to. Note that it will NOT push the predicate on lv_pln_usge_del_ind - that would change the result (if there were values < N).

I've already made the same mistake before with analytic functions. My secret is I have a good memory for my own screw-ups. I suspect it's a survival trait of married men.

Keep at it, because it does work. Turn autotrace on. It may be pushing the predicate and then performing a FTS anyway. You may need an INDEX hint in the sub-query. You will be able to tell from the Predicate listing in the autotrace. Also make sure you have stats on both tables and indexes.
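In SQL*Plus that would look something like this (table name taken from your posts):

```sql
SET AUTOTRACE TRACEONLY EXPLAIN

-- make sure table and index statistics are current first:
EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'LV_PLN_USGE_FACT', cascade => TRUE)

-- then re-run the query and check the plan output for something like:
--   Predicate Information (identified by operation id):
--      2 - access("LV_RQST"='746780192')
-- an "access" line on an index step shows the predicate really was pushed;
-- a "filter" line on a TABLE ACCESS FULL step means it was not
```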

John Watson (Messages: 5095, Registered: January 2010, Location: Global Village)


OK, I am going to wait for Oracle to get back to me on the TAR. It seems 10.2.0.3 just won't do the index access. Here are the 10g and 11g outline info, and the query that I believe has the necessary hints to give me the plan I seek. But 10g just won't do it. Notice the very last line in each outline list: 10g says FULL, 11g says INDEX. Maybe it is an optimizer enhancement in 11g.

I think I am done for now. Thanks again to both of you. I know this took you each some time to do and I appreciate it. I feel I have made significant progress to either a solution or a formal "can't do it on 10g" from Oracle.

OK, one more post (maybe). I found a document that describes changes in the optimizer between 10g and 11g. Wouldn't you know it, but there is a section about EXTENDED JOIN PREDICATE PUSH DOWN. If I read it right, it is telling me that the feature I need is not in 10g.

Quote:

Extended Join Predicate Push Down

In previous releases when you had a SQL statement where a view V and a table T were joined by
a join predicate T.x = V.y, the Optimizer had only two possible join methods, a hash join or a
sort merge join to join T and V.

In Oracle Database 10g, we introduced the join predicate push down transformation, which
enabled the optimizer to push the join predicate into the view. So the join T.x = V.y becomes
T.x = T2.y (where T2 is the table inside view V, which has the column y in it) thereby opening
up the possibility of using a nested-loops join if an index is present on T2.y.

In Oracle Database 11g the join predicate push down capabilities have been extended to include
group by, distinct, anti-join, and semi-joins. For the following query;

Although join predicate push down exists in Oracle Database 10g, we cannot use it due to the
group by. In 10g we see the view v1 being evaluated, followed by a hash join to the
PRODUCTS table.

However, in 11g join predicate pushdown has become possible and we are now taking advantage
of the index on the sales table to do a nested-loops join instead of a hash join. The cost of the
new plan has come down from 79 to 28 because of join predicate pushdown.

If I am correct, then my problem is a true limitation of the 10g optimizer, and I am guessing that Oracle will tell me the same and that my only solution is to upgrade if I really want the feature.

Once again, thanks for all your help you two. I believe I have what I need to discuss this with my team and our customer.

I don't think so. I think the DISTINCT is meaningless with respect to the problem. I put it there so that when I was running tests against real data, I only got one row returned instead of 9,600 rows in my SQL*Plus output.

The issue is the analytic COUNT. It is an aggregate, and as such I guess the 10g optimizer cannot handle it. At least that is how I see it. Do you see it a different way?

Just tried, but no joy. You are correct, the manuals don't explicitly mention everything. I have never found a complete list anywhere, but I am pretty sure analytics count as an aggregation (e.g. GROUP BY), so they keep things from working.