Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I can understand that if I join individual queries that are "individually fast" the combination may become slow because the default execution plan may be non-optimal. However when I know the number of rows for one query is very small I think I should be able to use hints to control the joins.

If cj is <1s and cv <1s expect this to be <~2s but using any hint (merge/hash/loop) it is often > 1min.

I also tried to use CROSS APPLY because the docs claimed that the inner select is executed exactly once for each outer row. The query takes ~100x longer than running the inner query twice manually so perhaps I don't understand the docs.

If I populate a temp table with results from "cj" and then join _with_no_hint_ or use the cross apply it is fast but do I really have to resort to that? If I use the temp table and attempt "any" of the join hints (loop/merge/hash) it is slow so maybe that is a key point.

I don't believe diving into the depths of the query plan (both are complex to begin with) are required to solve this type of problem in general: I just want guaranteed isolation without resorting to a temp table--is that really not possible?

I know your example is trivialised but in both cases an Exists could perform better. You aren't using any data from cj that couldn't be derived from cv. Just my 2 cents.
–
Mike MillerJun 15 '12 at 12:31

The following is only true for multi-statement UDFs: "use CROSS APPLY because the docs claimed that the inner select is executed exactly once for each outer row". You are using an inline one, and there is no such guarantee for such ones.
–
A-KJul 15 '12 at 16:58

2 Answers
2

If you use a multi-statement UDF, then your inner select is executed exactly once for each outer row. The multi-statement UDF is treated as a black box: the execution plan will now show access to the objects used in your complex view.

On the other hand, a subquery and/or an inline UDF is flattened out by the optimizer. When this is the case, the execution plan will include access to the objects used in your complex view.

The optimizer bases much of it's decisions on the statistics for the tables. If they are not up to date then SQL can't make a good choice. Whenever you see poor behavior from queries, when you know that the number of rows involved is very small compared to the overall size of the tables, the first thing you should do is make sure that the statistics are up to date. This is quickly done. The usual method is reindexing the tables, but you can get similar results with commands that specifically update the statistics. The number of times that specifying hints has actually helped queries I've worked with has been extremely small.

Your answer states that you disagree with the second sentence of the question. It is after all a "hint" not a "use" so you have a point. But the docs on cross apply seem more like a "use/force" though. From my experience the optimizer gets pretty dumb pretty quick for entity attribute value models (where an index may store more than one 'field') which happens to be the case. I've had great experiences with hints but do avoid them as a preferred practice.
–
crokusekJun 15 '12 at 17:53

The optimizer seems to work best with tables that are either star schema or more normalized than a KVP table, especially when the value has a varying data type. I think that there is enough demand for a KVP object of some kind that MS should add something to support it.
–
darin straitJun 17 '12 at 11:21