The Situation

And sp_whoisactive (or other monitoring software) tells you that there’s a particular query that is running frequently. In this example, the query starts with “SELECT TOP (@ResultLimit) M.Name FROM MyTable M JOIN … etc“.

The query is called in exactly one place in your application (Oh man! A fix will take time to recompile, QA and deploy).

The query usually runs for a few minutes and using up almost all the CPU.

You’ve collected the query plan and the query text.

But, if you run the query in SQL Server Management Studio, the query completes quickly.

Some Analysis

The query plan looks different when run in Management Studio

The production database must have picked a plan based on unlucky compile-time parameters.

Confirmed! When running the query in Management Studio using the OPTIMIZE FOR clause with similar parameters, you can reproduce the bad plan.

To buy some time, it would be good to have SQL Server recompile that plan.

This is what I want to explore in this post. What is the best way to get SQL Server to recompile that plan?

Before SQL Server 2008

How do we get SQL Server to recompile this plan? Before SQL Server 2008, there were a couple things I could do:

UPDATE STATISTICS [tablename]: Find a table that is used in the query and have SQL Server update statistics on it. All plans using the table will be recompiled afterwards (including our troubled plan!) We might not actually need updated statistics; it’s the side effect of recompiled plans that we want here. I talked a bit about that in Updating Statistics Helps, But Not For The Reasons You Think.

EXEC sp_recompile [tablename]: This technique is similar to UPDATE STATISTICS in that plans are dropped. The benefit is that we don’t have all the overhead and time spent creating the statistics. It’s still not ideal though. It takes a schema modification lock and based on the server’s current busy state, that will mean at least a couple minutes of blocked processes.

Targeting a Single Query Plan

SQL Server version 2008 and later allows you to take out a single query plan from cache using the FREEPROCCACHE command with a plan handle. This is a sharp-shooter technique compared to the other techniques.

I like this technique because it takes no important locks. Existing calls to this query will continue to execute and complete (using the old bad plan). But in the meantime, new calls to this query will use a recompiled plan based on the different (and hopefully better) parameters. If so you’ve just bought yourself time to find and fix this query correctly.

But you can’t just use FREEPROCCACHE without some preparation. Finding the plan handle is not straightforward. So I’ve got a script that only asks that you identify a query based on its text.

The Script

First find out how bad your query is. I’m arbitrarily defining more than 10 active queries as bad. Change the query text here. Use enough text to uniquely identify the problem query. In my case I used “TOP (@ResultLimit)”.

-- Part 1: -- Use this query to look for bad performing queries using "TOP (@ResultLimit)"selectcount(1)as[countof running queries, should be lessthan10],
max(datediff(second, start_time, getdate()))as[longest running such query in seconds],
min(datediff(second, start_time, getdate()))as[shortest running such query in seconds]from sys.dm_exec_requests er
cross apply sys.dm_exec_sql_text(er.sql_handle) st
where st.text like '%TOP (@ResultLimit)%'
and st.text not like '%80B82FDD-0297-4057-9AEB-D811F3B5C7DB%'-- filter self
GO

Then run this part to knock out that one query (remembering again to adjust the query text).

How Risky Is This Script?

To be honest, I can’t think of much that could go wrong other than it doesn’t solve your problem. In the worst case scenario:

your analysis was wrong, the plan is recompiled and still performs poorly.

Or perhaps somehow the script picks the wrong plan to recompile, that’s not bad news either, it’s just news. I certainly can’t say the same for the other techniques that were available before version 2008.

Maybe expectations were set incorrectly. You propose this change, It doesn’t fix anything and now it may look like your guessing (This isn’t a technical issue, but it’s something to keep in mind).

By the way, don’t take my word for it. You need to tell others that the risk is small and you have to back this claim up without my help. So you should understand what’s going on here.

Plan guides. I had forgotten about those. Plan guides are a way of “hardening” SQL Server (By hardening I mean avoiding changing plans, changing statistics, etc… even in the face of changing data). Plan guides could be a good way to “pin” a query plan to buy time. Personally I don’t have much experience with it…

Hmmm…. Maybe it’s something for me to practice until either I know it well enough to recommend as a solution, or know it well enough to know not to recommend it (for some reason :-).

In addition to “hardening” the plan, you can also use plan guides in SQL 2005 to remove a single plan from the cache, or prevent it from ever being cached in the first place by setting @hints = N’OPTION(RECOMPILE)’ when calling sp_create_plan_guide. If you just want to remove the cached plan, you can then immediately drop the plan guide with sp_control_plan_guide.

That would certainly do it. The procedure is recompiled and any processes that happen to be running the procedure continue to run without blocking. I really wish I mentioned that in my article.

In my case specifically, I’m not dealing with procedures but with SQL coming directly from the application and so that option isn’t open to me. That’s why I mentioned that “a fix will take time to recompile, QA and deploy”.

It’s interesting that the command is called FREEPROCCACHE when it should be called FREEPLANCACHE.

Say the scenario I described was different in that the query wasn’t an ad hoc query coming from an application, but was a query coming from a stored procedure. Well in that scenario, it’s dead simple to sharpshoot that plan out of the cache: Execute