Building SQL ConstantCare®: Why People Aren’t Using Query Store

When Microsoft first talked about Query Store, I was wildly excited. It’s basically a black box flight data recorder for SQL Server, tracking which queries run over time, how much work they do, and how their execution plans change.

And then it came out, and…hardly anybody used it.

We thought it was a good idea, so we took a few measures to encourage adoption:

SQL ConstantCare encouraged folks to turn it on where it made sense (where they had Cumulative Updates applied to fix QS issues)
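For anyone curious what "turning it on" actually involves, it's a one-statement switch per database. This is a minimal sketch – the database name and every setting here are placeholders for illustration, not recommendations:

```sql
/* Hypothetical example: turning Query Store on for one database.
   [YourDatabase] and all settings are placeholders, not recommendations. */
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;

ALTER DATABASE [YourDatabase] SET QUERY_STORE
(
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 1024,    -- cap how much space QS uses in the database
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    QUERY_CAPTURE_MODE = AUTO      -- skip insignificant ad-hoc queries
);
```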

And even amongst the population of SQL ConstantCare users, where we actively pester you to turn it on, adoption is still low. Amongst eligible databases, for customers willing to share their data for collective analysis:

1,434 databases have it turned on (18%)

6,635 databases don’t (82%)

So the question became – why?

We asked, and here’s what customers told us.

We emailed folks who hadn’t turned Query Store on for any of their databases, and here are some of the answers:

I just haven’t had the time to understand what it does, and best practice for implementation, any considerations for this specific server, etc. I fear things I don’t completely understand.

Companies want paranoid people in charge of the databases, so this makes perfect sense. You don’t wanna go flipping switches unless you’re confident they’re going to make things better, not worse.

With each of these answers, I wanted to think about what would change that answer. Here, Microsoft needs to figure out the best practices for implementation based on their experience hosting Azure SQL DB across a lot of customer workloads, and then build those into the SQL Server product as the default settings for Query Store. (Right now, you can kinda feel during QS setup that the defaults are a shot in the dark.)

Basically, just too busy. I’m excited about it, but haven’t had a chance to really look into it too much, and I thought I saw something early on about a possible performance hit or memory usage with it turned on.

Several answers came in with that theme. There have been a few Query Store sessions at GroupBy that mentioned the overhead, and it’s measurable. Folks need more confidence in the expected overhead, plus guidance on whether their own system can safely handle that additional overhead. (This is where our work on telling you if your server is bored comes in handy.)

We have a monitoring tool (SQL Sentry/SentryOne) which allows us to see some of the plans, the database/application is not a critical one (not a high number of users/load), and we have not had a chance to really look into what’s needed from a setup and maintenance perspective. When we migrate our critical application DBs to newer versions, we will most likely look at turning it on for them.

Makes total sense. Query performance tracking isn’t a new need – it’s been around forever – and a lot of folks are using tools that already relieve that pain point.

To change this answer, Query Store would need to be good enough at the “free” price point to beat SentryOne at its paid price point, and that definitely isn’t the case today.

We’re using SQL Server to host our ERP software from a third party vendor. I wasn’t sure what kind of benefit Query Store would give us.

I never thought about this before, but Query Store could be an excellent black box for ISVs to use with their customers. However, I don’t see that happening anytime soon – most ISVs don’t have T-SQL tuning staff on hand to leverage the plan cache, let alone build new tooling to analyze Query Store.

How that influences our development

We’re starting to work on query analysis in SQL ConstantCare®: telling you which queries are causing your performance issues, and how to go about fixing those queries. There are a few different ways we could collect the data:

Daily sample of the plan cache – which is really easy to do with low overhead, but it has huge blind spots, especially around servers under heavy memory pressure or building dynamic strings

Frequent samples of more sources – get running queries and the plan cache on a regular basis, like every X minutes – which gets much harder to do, but has fewer blind spots

Get it from Query Store – have customers turn on QS, leverage the code Microsoft has already built – which is easy, but only works on 2016+, and even then, people aren’t using it. Could overcome some of that with the open source OpenQueryStore, which works on 2008+, but there are costs and risks involved with building our products on top of that.
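For reference, the first option boils down to a query like this against the plan cache – a simplified sketch (a real collector would capture more columns and handle edge cases like NULL handles):

```sql
/* Simplified sketch of a daily plan cache sample:
   the top queries by total CPU since they entered the cache. */
SELECT TOP (50)
    qs.execution_count,
    qs.total_worker_time,     -- total CPU, in microseconds
    qs.total_elapsed_time,
    st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```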

But these numbers make the choice clear:

72% of customer databases can’t use Query Store (2014 & prior)

23% are eligible, but would take training/convincing/testing to use it

5% have Query Store enabled

As a tiny company, we gotta pick our battles. Right now, it makes more sense to focus on traditional plan cache analysis – something that will pay off instantly for every single customer we have – rather than try to roll the Query Store evangelism boulder uphill. (I tell you what, though: if I was a DBA again tomorrow, I’d have Query Store on for my databases.)

Pedro – yep, tried that and the GroupBy videos and sp_BlitzQueryStore and all the stuff I mentioned in the post. I just don’t have unlimited free time to evangelize it when people aren’t adopting it. I really wish I could, but like I explain in the post, I have to pick my battles.

I suggested using Query Store to a number of my clients in the last year. I got replies varying from "Well, who will use it once your contract ends?" (accompanied by harrumphing from the usual background sand people, who only seem to show up to harrumph, scatter, and return later!!) to feeling like someone who introduced the round(ish) world concept to the Flat Earth Society…

Russ – yeah, it’s funny, as I was emailing clients asking them personally about why they weren’t using it, several of them had comments along that line – basically saying that my personal emails had guilted them into it. They were willing to turn it on with the defaults, not knowing how it worked, just by me guilting them into it. While that works in small quantities, it doesn’t exactly scale – and then if they run into problems (as some early 2016 adopters did), then I gotta be their first line of support. No bueno.

Brent, brilliant!!
I would like to say that most people do not know how to look in the plan cache, or at sys.dm_exec_query_stats and the query plan DMFs, to see this. Microsoft has built a huge amount on top of Query Store that can be replicated, but it is hard! So unless you want to spend a lot of time learning the internals of SQL Server, SentryOne does the same thing!!!

Shortly after we first upgraded to SQL 2017, I turned QS on. We have a heavy ad-hoc environment, and it filled up rather quickly. I tweaked a few settings, including lowering the number of days to keep the data down to 5. I decided to turn it back off for now while I address some other low hanging fruit, but I will turn it back on again later when I have more time to work with it.

Del – yeah, the original configuration advice was a little rough. I know these days, Microsoft is all about “the robots will figure everything out,” but we’re a long, long way from that.

If we’re really serious about automating DBAs out of a job, Microsoft needs to be able to figure out Query Store settings automatically for starters – that’s just plain table stakes. You shouldn’t need a human being to tell you how much history to keep, and at what granularity, and when you’re going to run into performance problems with it. Who should be setting those things – a DBA who’s never worked with it before, or Microsoft? Seems like the answer is obvious.

I wonder how many of the 5% who have turned it on actually actively look at the data? I must confess that I don’t but the article has reminded me that I really must get round to having a closer look! I had a play when I installed our first SQL 2016 systems, put together some useful scripts then moved on.

Richard – that’s my guess as well, it’s probably a low number. I’m totally okay with that though – for me, the huge value in Query Store is having that history of “good enough” plan data available so that when your plans go to hell in a handbasket, you’re able to jump back in time and see what “good enough” plans looked like.

On a day to day basis, that’s not really worth anything – but on the day that you need it, hoowee, it’s priceless.
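That "jump back in time" lookup is exactly what Query Store's catalog views make possible. A sketch – @QueryId is a placeholder for whatever query you're investigating:

```sql
/* Sketch: runtime history for one query's plans over time,
   so you can see what the "good enough" plan used to look like.
   @QueryId = 42 is a placeholder value. */
DECLARE @QueryId bigint = 42;

SELECT p.plan_id,
       rsi.start_time,
       rs.count_executions,
       rs.avg_duration          -- microseconds
FROM sys.query_store_plan AS p
JOIN sys.query_store_runtime_stats AS rs
     ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi
     ON rsi.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE p.query_id = @QueryId
ORDER BY rsi.start_time;
```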

Brent,
Funny you should send this today. I was watching the video hosted by you and Mr. van de Laar yesterday (rainy Sunday – why not train?). Anyway, I was convinced to turn on QS for 3 of our most problematic databases. It's only been running a day now, but I look forward to matching these results against the Blitz BI (THANKS for that!) and Spotlight recommendations. It should be interesting. BTW, the reason I didn't turn this on months ago was that we didn't have the horsepower to spare. We relicensed, and now we're firing on all 72 cores, so I thought I'd give it a shot. I'll let you know what happens.

For Enterprise Edition users, you get automatic correction of plan regressions. If Query Store has a good plan for that exact query (not a similar one) in its recent history, Enterprise Edition can try forcing that prior plan. If it's faster, EE will keep it – but it may not be, given the data changes that your database has had over time.

I have a tough time using the term “automatic tuning” for a feature that narrow. It’s a good start in SQL 2017, but it ain’t no cure-all.
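For what it's worth, the feature itself is a one-liner to enable, and you can watch what it's deciding in a DMV. A sketch – the database name is a placeholder:

```sql
/* Sketch: enabling automatic plan correction
   (SQL Server 2017+, Enterprise Edition). */
ALTER DATABASE [YourDatabase]
    SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

/* See which plan regressions it has detected or acted on. */
SELECT reason,
       score,
       JSON_VALUE(state, '$.currentValue') AS current_state
FROM sys.dm_db_tuning_recommendations;
```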

Tesla calls their cruise control “AutoPilot” when it drives people into barriers at highway speeds and kills them. Even still, I’m more comfortable calling that “AutoPilot” than I am calling 2017’s plan regressions “Automatic Tuning.”

I love SentryOne, but I love the Query Store more 🙂 It allowed us to respond much more quickly to production performance issues, and stabilize them easily.
BTW I’ve asked SentryOne to incorporate the Query Store data in their product, but who knows if and when that can happen. For now we’re cross-referencing the SentryOne data with our own aggregations of the Query Store data.

I love Query Store and I use it every day. I've created a stored procedure which takes the Query Id from Query Store and makes an Extended Events script. My only problem seems to be that I cannot get the parameter values when looking at Extended Events?

The combination of Query Store and Extended Events seems like a winner, but I cannot do the last bit of the puzzle. I get the parameters from the Query Plan, but not from Extended Events.
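For anyone wanting to try the same workflow, the query_id-to-text lookup is a join across Query Store's catalog views. A sketch – @QueryId is a placeholder:

```sql
/* Sketch: fetch the text and stored plan for one Query Store query_id,
   e.g. as the starting point for building an Extended Events session filter.
   @QueryId = 42 is a placeholder value. */
DECLARE @QueryId bigint = 42;

SELECT q.query_id,
       qt.query_sql_text,
       TRY_CONVERT(xml, p.query_plan) AS query_plan
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
     ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
     ON p.query_id = q.query_id
WHERE q.query_id = @QueryId;
```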

I’m trying to join the 5%, but it appears that Query Store has issues with being implemented on a database with mirroring enabled? I cannot find any mention of it online as a known issue, besides the expectation that it doesn’t work on a readable secondary replica.
It allows me to enable it, and everything in the settings appears to show it enabled and running as expected – until I try to turn on automatic tuning, when it returns “Automatic Tuning option FORCE_LAST_GOOD_PLAN cannot be enabled, because Query Store is not turned on.” In all my searches I’ve found all the system tables reporting back the status of “ON/Read-Write”. Some things mention doing a consistency check, which resulted in this error: “Query Store stored procedure is not available on readable secondary replica.”
The database I’m running this on is not a secondary replica; it is a Principal, Synchronized database in a synchronous mirroring setup.
Is it a known issue with mirroring, and I’m just failing to find it documented anywhere? Has anyone else successfully implemented Query Store on mirrored databases?
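One way to debug "it says on, but acts off" situations like this is to compare the requested and actual state directly – the readonly_reason column in particular often explains the discrepancy:

```sql
/* Sketch: compare what you asked Query Store for
   vs. what it is actually doing right now. */
SELECT desired_state_desc,
       actual_state_desc,
       readonly_reason,           -- nonzero = why it fell back to read-only
       current_storage_size_mb,
       max_storage_size_mb
FROM sys.database_query_store_options;
```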

After updating from CU11 to CU12 – and in that process failing over the mirror – the Query Store started working appropriately.
Will the live classes on performance tuning be showing us the best parts of the query store we would want to be using?

Jeremiah – no, like I write in the post, not enough people are using it for us to justify investment in it yet. (We regularly run polls in the live classes too, and folks say no, they’re not using Query Store yet. A lot of it is around version adoption rates and patching.)

Wow, I really did read this article, but it’s been a crazy few days and I managed to completely blow off the main point. I was hooked by the Black Friday deal on the classes and was hoping for an easy selling point for the boss!