Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

The CTE in the query gives us the full tree of practitioners, which is then joined via the authority table to the clients under them. The CTE portion of the query returns almost instantly, only when we start joining to the clients does the performance issue appear.

I've been running the query for the full set of practitioners (3 mins) @PractitionerAccountId = null Running it for @PractitionerAccountId=71 (one of the 20) results in an 18 second execution time (878 rows)
–
agrathOct 18 '12 at 4:59

Historically some databases (eg around the time of Oracle 7.3) have had problems with the "not in" construct; if you rewrite it as a "not exists" construct does this help?
–
Colin 't HartOct 18 '12 at 7:37

Hi Colin, Changing the not in to a not exists (with a where clause inside the query) seemed to result in a 17 second (filtered) execution time, so comparable. On the large set, it performed much worse, around 8 mins. This is SQL Server 2008 btw.
–
agrathOct 18 '12 at 9:00

Throw the result of the the CTE into a temp table and then join it. Let us know how that goes.
–
wBobOct 18 '12 at 10:55

2 Answers
2

Here is your description of the problem: "The CTE portion of the query returns almost instantly, only when we start joining to the clients does the performance issue appear".

One possible, and likely, explanation is as follows: the optimizer fails to estimate the cardinality of your subtree, and chooses an inefficient plan. With your way of storing hierarchies, this is no surprise. How would you yourself estimate the size of a subtree without actually retrieving it?

Can you use materialized path? Getting a subtree using materialized path is essentially one range scan, fast and simple, and the optimizer can have a good cardinality estimate off the statistics on one index.

In my experience, your way of storing/reading hierarchies does not scale up. I have never been able to make it work fast and use resources efficiently.

Hi Alex, that's a valid point. The optimiser would not know how many rows the CTE would return until executing it, which is after the plan is generated. Would moving it to a temp table as suggested above help? If I used a real temp table (as opposed to a table variable), maybe even adding an index to it would be possible? I'm invoking this code from a tablar UDF, I am not sure I can create tables/indexes in one but I'm about to find out..
–
agrathOct 18 '12 at 19:57