Please let me know why for the first server it is using full table scan?

Thanks,

Subhajit

Likely because, from the predicate information you supplied, it looks like you have a poorly written SQL statement.

5 - filter((:P_LOCCOD IS NULL OR :P_LOCCOD IS NOT NULL AND "AWB"."LOCCOD"=:P_LOCCOD) AND
    (:P_SALCURCOD IS NULL OR :P_SALCURCOD IS NOT NULL AND "AWB"."CURCOD"=:P_SALCURCOD) AND
    "AWB"."RATAUDIND"='Y')

So the query appears to be attempting to be a jack of all trades: either give me a P_LOCCOD or give me a P_SALCURCOD. How would you optimize that if I asked you to? It's like me asking you to get me a book from the library: one time I'll ask for something starting with A, and the next time I'll ask for something ending in Z. How would you optimize that before knowing which I'll ask for?
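One common fix, as others note later in this thread, is to make those decisions in the front-end code rather than in the SQL. Here is a hedged sketch: the column and bind names come from the filter predicate shown above, but the table name and select list are assumptions for illustration only.

```python
# Sketch: build the WHERE clause in the front end so that a predicate
# only appears when its bind variable is actually supplied. The optimizer
# then sees plain equality predicates it can cost (and index) properly.
def build_query(p_loccod=None, p_salcurcod=None):
    predicates = ["AWB.RATAUDIND = 'Y'"]   # always applied, per the plan
    binds = {}
    if p_loccod is not None:
        predicates.append("AWB.LOCCOD = :p_loccod")
        binds["p_loccod"] = p_loccod
    if p_salcurcod is not None:
        predicates.append("AWB.CURCOD = :p_salcurcod")
        binds["p_salcurcod"] = p_salcurcod
    # "awb" as the table name is an assumption; use your real FROM clause.
    sql = "SELECT * FROM awb AWB WHERE " + " AND ".join(predicates)
    return sql, binds

sql, binds = build_query(p_loccod="LHR")
# Only the LOCCOD predicate is generated; no ":x IS NULL OR ..." gymnastics.
```

Each distinct combination of inputs produces its own statement, so the optimizer can pick the right plan for each instead of one compromise plan for all.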

The table structure is same for both servers with all the indexes same.

The only difference is that for the first server the table SLCCCAISS has 55024 rows and for the second server 16197 rows. I don't think that matters however.

Think again, it certainly DOES matter. If a query returns more than, say, 30% of the table data then the optimizer will likely choose a full table scan as it's less 'work' than an index scan. You have to have the same data in both servers before you can actually compare why plans are different. You've answered your own question with this response.

You are returning a higher percentage of the total data in the table on the first server, therefore Oracle is choosing the full table scan as it's 'cheaper' than the double work of an index scan plus a table fetch by rowid. It's not the volume, it's the nature of the data. As an example:

As the number of rows that meet the condition increase the likelihood of using an index goes down. The table didn't change, the index didn't change, the data did. That's what's happening in your case, I believe.
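That trade-off can be sketched with a toy cost model. This is a deliberate textbook simplification, not Oracle's actual costing arithmetic, and every statistic below is an invented number purely for illustration: an index range scan pays for the index walk plus one table visit per clustered run of rows, while a full scan reads every block with cheaper multiblock I/O.

```python
import math

def index_cost(selectivity, blevel, leaf_blocks, clustering_factor):
    # Simplified range-scan cost: branch blocks + leaf blocks visited
    # + table blocks visited (clustering factor scaled by selectivity).
    return (blevel
            + math.ceil(selectivity * leaf_blocks)
            + math.ceil(selectivity * clustering_factor))

def full_scan_cost(table_blocks, mbrc=8):
    # Full scan reads all blocks, but several per I/O (multiblock reads).
    return math.ceil(table_blocks / mbrc)

# Same table, same index -- only the fraction of rows returned changes.
low  = index_cost(0.01, blevel=2, leaf_blocks=1000, clustering_factor=50000)
high = index_cost(0.30, blevel=2, leaf_blocks=1000, clustering_factor=50000)
fts  = full_scan_cost(table_blocks=5000)
# At 1% selectivity the index wins; at 30% the full scan is far cheaper.
```

The table didn't change and the index didn't change; only the fraction of rows satisfying the predicate did, and that alone flips which plan is cheaper.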


David Fitzjarrell

The cutover point from a full table scan to using an index is much, much lower than 30%. It will be about 0.5% or lower of the data in the table being retrieved depending on the row size and the number of rows per block. I've blogged about this before:

If you've gathered representative statistics then Oracle has used them to determine it is faster to execute the statement using a full table scan; this is entirely plausible. Have you proven that the other plan is better in this situation? Is there something you're not telling the optimizer? Do you need to gather statistics?

In the second case Oracle has split your query into two pieces - one that assumes that :P_CARNUMCOD is null, and one that assumes that :P_CARNUMCOD is not null (see the filter predicate at operation 11). I am a little surprised that I don't see a filter operation between operation 1 and 2 with the filter predicate (:p_carnumcod is null), but I assume that the optimizer has done something clever to recognise that it's not needed.

By making this assumption each of the two plans can operate more efficiently than a single plan with a driving tablescan. As David points out - when you change the data (and the relevant stats) paths can change, and it doesn't necessarily need a large change in the content, pattern, or input parameters for a dramatic change in the plan to appear. Ideally, of course, the performance should be appropriate for the amount of data actually requested. You have, however, made it impossible for Oracle to find a perfect plan for every combination of inputs by writing SQL with so many "if then else" decisions that should have been handled by the front-end code.

If you have a moment - was your result using a well clustered index? What about your index, John Brady, that led you to 0.5% as a rule of thumb? The reason I ask is that I have been doing some work related to this: trying to see if use of attribute clustering with conventional storage can result in performance similar to that achievable with Exadata smart scan, or with the in-memory column store.

Using this test case, I get a switch from index range scan to full table scan at 14% for a perfectly clustered index and at 0.06% for an unclustered index:

I am not suggesting that this test is in any way scientific. The fact that it is on my laptop with SSD is going to make it irrelevant to the real world, never mind system stats and serial or parallel, direct or indirect reads. But it does show that making assumptions about a percentage crossover point is problematic.
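The sensitivity to clustering can be shown with a deliberately simplified cost model (not Oracle's real arithmetic, and all the statistics below are invented): solving for the selectivity at which the index range scan cost equals the full scan cost shows the crossover point collapsing as the clustering factor grows from "number of table blocks" (perfectly clustered) toward "number of rows" (unclustered).

```python
def crossover_selectivity(blevel, leaf_blocks, clustering_factor,
                          table_blocks, mbrc=8):
    # Selectivity at which a simplified index range scan cost
    # (blevel + sel*leaf_blocks + sel*clustering_factor) equals a
    # simplified full scan cost (table_blocks / mbrc).
    fts = table_blocks / mbrc
    return (fts - blevel) / (leaf_blocks + clustering_factor)

# Perfectly clustered index: clustering factor ~= number of table blocks.
well_clustered = crossover_selectivity(2, 1000, 5000, table_blocks=5000)

# Unclustered index: clustering factor ~= number of rows (100 rows/block).
unclustered = crossover_selectivity(2, 1000, 500000, table_blocks=5000)

# well_clustered comes out around 10%, unclustered around 0.1% -- roughly
# two orders of magnitude apart, the same shape as the 14% vs 0.06% result.
```

The point stands even in this crude model: there is no single percentage, only a function of the data's physical pattern.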

I would be interested in knowing if anyone else is working with attribute clustering. It seems to me that it might be an alternative to some expensive options.

I can't remember how many times over the last 15 or 20 years I've explained in articles, newsgroups, forums, listservers, user group conferences, and even a BOOK that it's not just the volume of data that matters, the pattern is important too; so it's pretty depressing that someone can still come out with a silly idea like "the breakpoint is X percent".

That's 24.5% of the table - and the optimizer knows it, and it's still an index range scan.

Regards

Jonathan Lewis

Jonathan,

I did not see your response to the other question in the earlier post. There was one question where a lot of replies seemed to disappear; maybe the OP deleted some, or maybe OTN had a problem. Otherwise I would have responded to that earlier post if I had seen it.

I agree that my model is simplistic, but I don't think it is wrong for the majority of cases. I made it clear in my blog post what my assumptions were, and I was clear about Clustering Factor being part of the cost model for index access. I also posted all the SQL statements used in my testing so that other people could replicate them or do variations if desired.

That said, it cuts both ways. While my model may well work in many cases, it will not work in all cases, as in your example and also the one from John Watson. That is always the danger of simplifications and rules of thumb: they work in many cases but fail in others. Unfortunately not everybody who posts a question on OTN is willing to put in the effort to read up on and understand how Oracle and the Optimizer actually work, so sometimes simplified explanations are the only way to get across to these people what is happening to their query execution.

The main emphasis of the point I was trying to make was that when the Optimizer is seen to be choosing a full table scan it is probably the right thing and not the wrong thing. There is often a kind of shock from people when the Optimizer chooses a full table scan, as if it is the last thing it should be doing. I was trying to point out that if the volume of data being requested is high enough, subject to various factors, then a full table scan will be cheaper than other access methods such as using an index, making a FTS the right thing. And if someone is seeing a FTS in an execution plan then they should be checking the volume of data the query is requesting before making claims that the Optimizer is doing the wrong thing. I was not trying to give an absolute cutover point, but to show how low it can be, in what I consider to be normal circumstances.

I accept that "as low as 0.5%" is not always true, and I'll stop using that phrase in future. What about "as low as 0.5% if the Clustering Factor value is nearly the same as the number of rows in the table"? Either way, I'll refrain from using phrases like that. I feel another blog post coming on to point out how the assumptions used in my model can sometimes not apply, and the results can be very different. It is never simple with Oracle, is it?