"henk de wit" <henk53602(at)hotmail(dot)com> writes:
> In that case the prediction is 2 rows, which is only 1 row more than in the
> previous case. Yet the plan is much better and performance improved
> dramatically. Is there a reason/explanation for that?

Well, it's just an estimated-cost comparison.  If there's only one row
then a nestloop join looks like the best way since it requires no extra
overhead.  But as soon as you get to two rows, the other side of the
join would have to be executed twice, and that's more expensive than
doing it once and setting up a hash table.  In the actual event, with
359 rows out of the scan, the nestloop way is just horrid because it
repeats the other side 359 times :-(
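[Editor's note: the cost tradeoff above can be sketched with a toy model. The constants below are made-up illustrative numbers, not PostgreSQL's actual cost parameters, and the functions are hypothetical simplifications of the planner's real cost formulas.]

```python
# Toy cost model for the nestloop-vs-hash-join choice described above.
# All constants are invented for illustration only.

INNER_SCAN_COST = 100.0   # cost of one execution of the join's inner side
HASH_BUILD_COST = 120.0   # one inner scan plus building the hash table
PROBE_COST = 0.01         # cost of one hash-table probe per outer row

def nestloop_cost(outer_rows):
    # The inner side is re-executed once per outer row.
    return outer_rows * INNER_SCAN_COST

def hash_join_cost(outer_rows):
    # The inner side is scanned once to build the table, then probed cheaply.
    return HASH_BUILD_COST + outer_rows * PROBE_COST

for rows in (1, 2, 359):
    nl, hj = nestloop_cost(rows), hash_join_cost(rows)
    winner = "nestloop" if nl < hj else "hash join"
    print(f"{rows:>4} outer rows: nestloop={nl:.2f}  hash={hj:.2f}  -> {winner}")
```

With one estimated outer row the nestloop avoids the hash-table setup overhead and wins; at two rows the repeated inner scan already costs more than building the table once, and at the actual 359 rows the nestloop is far worse.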

It strikes me that it might be interesting to use a minimum rowcount
estimate of two rows, not one, for any case where we can't actually
prove there is at most one row (ie, the query conditions match a unique
index).  That is probably enough to discourage this sort of brittle
behavior ... though no doubt there'd still be cases where it's the
wrong thing.  We do not actually have any code right now to make such
proofs, but there's been some discussion recently about adding such
logic in support of removing useless outer joins.

>> FWIW, CVS HEAD does get rid of the duplicate conditions for the common
>> case of mergejoinable equality operators --- but it's not explicitly
>> looking for duplicate conditions, rather this is falling out of a new
>> method for making transitive equality deductions.

> This sounds very interesting Tom.  Is there some documentation somewhere
> where I can read about this new method?