The second weird thing: inconsistent query duration

I was getting consistent results for query duration when I ran the query in the script above, which started at a high degree of parallelism, and then went to low.

But I found that for one of my test queries, when I took the query and ran it in my own session to look at Query Time Stats in the execution plan, it took much longer.

For example, at MAXDOP 13:

Duration was consistently 55-65 seconds in the looping script

Running it in a session by itself, I was seeing durations of 2 minutes to 2 minutes 30 seconds

More than twice as long.

Why was it so much slower when I ran it in another session?

It wasn’t about cache/buffer pool reuse

I had plenty of memory on this instance, and I’d made sure to set ‘max server memory (mb)’ plenty high, but could something be causing me to do physical reads when I ran the query in my own session?

This made me realize that my test script wasn’t quite fair to the DOP that ran first — it might have to do physical reads where following queries made use of data in memory. This is a private test instance, so I changed the script to run everything with cold cache, by dropping clean buffers between runs.

I wasn’t getting a different query execution plan

I was getting the same plan with the same cost both in my “slow session” and in the script.

It wasn’t ASYNC_NETWORK_IO_WAITS

I was running everything from an SSMS instance on my laptop against a SQL Server on a VM in the Azure Cloud. Could I be getting some weird kind of latency on one session?

Nope. And my query wasn’t even returning a result set to my SSMS window — it was putting the results into a table in the cloud.

Was it the Dynamic SQL?

Could something about the Dynamic SQL be making it faster? Or had I made some sort of typo and the Dynamic SQL query was a little bit different in a way that made it faster?

Nope, it really was the same query, and the same plan. When I copied the longer script and adjusted the variables to only run for DOP 13 in my “slow” session, it was also slow.

Was it using STATISTICS TIME or STATISTICS IO?

I sometimes use these to measure queries, but in this case I wasn’t using them in either session.

Was it having Actual Execution Plans on?

Yes, it was.

This query was a reminder of the observer effect: watching something can change its behavior. In this case, “watching” the query by enabling actual execution plans made it take twice the duration.

I had actual plans enabled in my “slow” session, and I didn’t have them on in the session where I was measuring duration and wait statistics in the loop. (If you’ve ever accidentally left actual plans enabled when you’re running TSQL in a loop, you know how much SSMS doesn’t love that.)

I tested this a few different ways, including in another instance of SSMS local to the VM itself, and sure enough – actual plan drags this puppy down, although it’s a relatively simple query.

Was it only graphical plans?

I disabled graphical execution plans in Management Studio, then tested a few varieties of outputting actual plan info: