Tony Davis is an Editor with Red Gate Software, based in Cambridge (UK), specializing in databases, and especially SQL Server. He edits articles and writes editorials for both the Simple-talk.com and SQLServerCentral.com websites and newsletters, with a combined audience of over 1.5 million subscribers. You can sample his short-form writing at either his Simple-Talk.com blog or his SQLServerCentral.com author page.
As the editor behind most of the SQL Server books published by Red Gate, he spends much of his time helping others express what they know about SQL Server. He is also the lead author of the book, SQL Server Transaction Log Management.
In his spare time, he enjoys running, football, contemporary fiction and real ale.

When you really need to know why your SQL query’s slow

Published 28 August 2014 5:17 pm

It is useful to know that a SQL query is slow, but it is even better to know why. To do this, we need to see how SQL Server’s Query Optimizer has translated the query into something it can execute, and what happened at each stage of execution. For this, we need to see the execution plans.
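One simple way to see those plans, without leaving a query window, is to ask SQL Server to return the actual plan alongside the results. A minimal sketch (the table and query are illustrative only):

```sql
-- Return the actual execution plan as ShowPlan XML, in addition to the
-- query results. This is the scripted equivalent of SSMS's
-- "Include Actual Execution Plan" button.
SET STATISTICS XML ON;
GO

SELECT OrderID, OrderDate
FROM   dbo.Orders                 -- hypothetical table
WHERE  OrderDate >= '2014-01-01';
GO

SET STATISTICS XML OFF;
GO
```

The query results are followed by an extra result set containing the plan XML, which SSMS will open graphically, or which can be saved as a .sqlplan file for later analysis.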

To some developers, the database is just a black box at which they throw queries. The majority, however, care about database performance, and about the performance of the SQL code they produce. They are intrigued to know exactly how SQL Server accesses the tables and gathers the data, compared with what they think is the logical access path.

Does this mean, however, that they take the time to examine and understand the SQL execution plans, with the same zeal as a DBA? Maybe. A very experienced SQL developer recently confessed to me that he only rarely looked at execution plans. To do so was to get too deep into “implementation” rather than the code. He wrote the algorithm, tested that it was functionally correct, got some timings, and then tweaked it until it was within the performance specification. If that didn’t work, it rather meant he had simply come up with the wrong approach, and needed to step back, rethink, and start from scratch with an approach which fitted better with the data model, and so was more likely to scale. To waste time refining a faulty algorithm was like putting lipstick on a pig.

I was intrigued by the idea that in seeking a tool that could, potentially, help to improve faulty code, the developer was ‘wasting time’. There is, of course, the hassle of cranking up another GUI, if a developer has crept outside the comfort zone of Visual Studio to access SQL Server from Python, R, the PowerShell GUI, or a performance profiler. Until relatively recently, it wasn’t even easy to view the execution plan from within Visual Studio. The “include actual execution plan” button, familiar to all who use SQL Server Management Studio (SSMS), is there in SQL Server Data Tools (SSDT), but SSDT was only fully-integrated in VS2013.

Another problem is that, for some developers, execution plans are as easy to read in SSMS as ancient Babylonian. The information is there, but not in an intuitive way geared for the occasional user. Regardless of how or where developers write or generate their SQL code, their tooling needs to support not just an easy way to return the execution plan, but also a way to present it in a digestible format that makes for easier interpretation and quicker troubleshooting.

It was with these thoughts in mind that we published SQL Server Central’s browser-based tool, http://sqltuneup.sqlservercentral.com/. Just paste in the XML plan text to see a nice visual representation of the plan, complete with clear indications of the expensive and “data heavy” operations that present likely hotspots in the plan. The same tool is now also built into the latest version of Redgate’s ANTS Performance Profiler, meaning that developers can investigate side-by-side any bottlenecks in both their application code and SQL code, and view immediately the execution plans for any seemingly-problematic queries.

Increasingly, there are good tools and educational content to help developers use and understand SQL Server query execution plans. It will be interesting to see if developers really do use them, or if they will remain largely a DBA-only tool. Are there any other down-sides to encouraging developers to delve into execution plans? I’d love to hear your views.

“Premature optimization” is indeed a waste of time, and some (like Donald E. Knuth) claim that it is the “root of all evil”.
I have to confess that “performance” used to be my big argument not to use a database, and to use hand-crafted binary serialization instead! After several years of being beaten into a more commercially minded developer, I encountered a team that still tightly clings to their trusty client-side Btrieve technology (on new projects). “Performance” is their main excuse. Suddenly I was shocked to find myself fully convinced in the opposite camp.
As a developer I sporadically inspect execution plans, usually only when some reporting query causes time-outs in production. I don’t see any down-sides to acquiring such knowledge, except for young enthusiasts who get hooked on performance gamification, which actually is not really that bad either (as long as deadlines are met). It will rub off over time, leaving the developer with more awareness and knowledge.
I took a look at the SqlTuneup tool you mentioned and think it is quite neat. I had to figure out how to extract the XML from the graphical execution plan in SSMS (right-click and select “Show Execution Plan XML…” from the context menu). I admire the HTML-based output and the positioning of the divs to form the tree. On the other hand, I don’t think it adds very much to what SSMS shows (though the tooltip-based GUI can be annoying at times). The sorted tables of expensive steps on the right are quite a nice addition that might save some seconds of a developer’s time, but probably not enough to justify opening a browser and copying XML. Probably I’m missing something big, or you guys wouldn’t have gone through all that trouble to create this? It would make a great component for a web-based monitoring tool though. Please enlighten me, what am I overlooking?

A few years ago, most developers did not test code. Many developers considered spending time testing beneath them and managers considered developers too expensive to devote to testing. The mindset has now changed with the advent of unit testing and test driven development. Even if you do not work in a TDD shop, the expectations around writing unit tests, test automation and overall quality delivered by the development team are much higher than what they were 5-7 years ago. And I think most people view this as a positive change in how we develop software.

We’ve seen other changes too. Usability of most systems used to be an afterthought. But thanks to Apple, more developers than ever are interested in building systems that are easy to learn, easy to use and aesthetically pleasing.

We need to continue this trend, so that software developers (and really all technology professionals) take responsibility for the entire system that they build, not just the code that they write. Being responsible for the entire system that you build means taking responsibility for the performance of the system, meaning you are thinking about things like execution plans, indexing strategies and when to cache data. Knowing how to generate and interpret an execution plan isn’t the goal. Getting all developers to take responsibility for the performance of their system is the goal. Knowing how to interpret an execution plan comes naturally once you take that responsibility.

There are many challenges, to be sure. Too many companies I have worked for didn’t even want to invest in a reasonable performance testing environment that could handle production-size data sets. Managers want to finish “on time and on budget”, and anything that is optional like “this performance stuff” gets thrown overboard at the first sign of schedule slippage. And with all the gee-whiz JavaScript libraries, who wants to learn to work with a relational database that is so last century?

Those of us who have been around for a few years know, though, that you build a quality system from the foundation up. And we can take guidance from lean methodologies, which are used in manufacturing and are now being applied to software engineering. Getting quality right the first time, when you construct a component, is ultimately the most cost-effective and sustainable way to build that component. To get quality right, you have to know what quality looks like and how to test for quality. The software development community has made this transition in terms of unit testing. It can happen for performance too.

Performance is just another quality attribute. We can perform a simple back-of-the-envelope calculation to tell us our approximate data sizes, and we should know what our users’ expectations are about response times. Validating that an execution plan is sensible is how we build quality in during construction. We do it for correctness with unit tests, and we do it for maintainability with cyclomatic complexity calculations; there is no reason we can’t do it with execution plans as well.

Much of the challenge comes down to education and awareness. Most people don’t know what they don’t know. Do developers know how to generate an execution plan? Do they know what a Clustered Index Scan means when they read a plan? Do they know what a logical IO is and why it matters? Do they know that they need a data set that represents production in both size and distribution? Unfortunately, little of this is taught in university courses, and it often never gets introduced on the job either.

It becomes the responsibility of those of us who read the articles on this site to reach out to others, in the companies we work for and at our local user groups, to help our fellow software engineering professionals understand why this is important and why these are practices they should follow. I would challenge all of us who have read Tony’s post above to put on, sometime in the next six months, an hour-long lunch-and-learn at our companies on how to read and interpret an execution plan. Some of your colleagues will come, listen and learn. Some will not. But those who do will help to create the change we need as a software community.

Last year I set up Red Gate SQL Monitor on a system that had embedded SQL within the application. So I came across a query (that was obviously running slowly) that passed over 2000 parameters! Initially I thought I was seeing things, but the explanation from the developer was that someone had copied and pasted a big comma-separated list into one of the search boxes within the application. The code generated by the application used all 2000+ parameters in an IN list in the WHERE clause. I then tried to explain to the developer that this would cause havoc for the query optimiser and suggested a rewrite as a stored procedure taking one VARCHAR(MAX) parameter, which was split internally within the stored procedure. The developer maintained that his method effectively amounted to the same thing. I even lobbied the views of a world expert on SQL Server optimisation. He confirmed that the current approach was disastrous. In the end I realised I was wasting my breath, and the arrogance and ignorance displayed by the developer was insurmountable.
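For anyone curious what that rewrite might look like, here is a minimal sketch. The table, column, and procedure names are illustrative only; note that STRING_SPLIT requires SQL Server 2016 or later (on older versions, a user-defined splitter function would be needed instead):

```sql
-- One VARCHAR(MAX) parameter split into a set server-side, so the
-- optimiser sees a single, stable plan shape instead of a query whose
-- text (and plan) changes with every different IN-list length.
CREATE PROCEDURE dbo.SearchByIdList
    @IdList VARCHAR(MAX)          -- e.g. '17,42,105'
AS
BEGIN
    SET NOCOUNT ON;

    SELECT t.Id, t.Name
    FROM   dbo.SomeTable AS t     -- hypothetical table
    JOIN   STRING_SPLIT(@IdList, ',') AS s
           ON t.Id = CAST(s.value AS INT);
END;
```

A single parameterised call like `EXEC dbo.SearchByIdList @IdList = '17,42,105';` then works the same whether the list holds three values or three thousand, which is precisely what a 2000-parameter IN list cannot offer.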

I’d say this is still primarily a DBA-only tool. A lot of developers use an ORM framework to map their code and expect it to do a decent job. It often does a pretty good job, but the tricky part comes when you’re returning lots of rows and need to tune that, rather than fetching one thing at a time over and over again. We had a mini-profiler that could be turned on locally for debugging to show how each page behaved and where it spent its time. That was useful, but expensive to use all of the time. We later subscribed to services such as New Relic to profile our apps and help us get some ideas about performance. It’s after the fact, but still helpful.

For query plan analysis, I’ve used SQL Sentry Plan Explorer for a lot of the details in recent years. It’s been a great way to see the whole plan, or pieces of the plan, easily. That’s really helped us find and tweak areas of the code that need some help.

I can definitely see the point of view that looking at the query plan isn’t always necessary. Sometimes the queries are just basic and written off the primary key, so the query plans are pretty simple. Adding them up is another story, as the ORM decides to pull in thousands of rows one at a time, then loop through each of those rows multiple times to get additional data points. Each query in itself is really simple, but the overall picture shows a lot of wasted time and cycles that aren’t immediately visible in the smaller test sets often used for development. This is when tools such as New Relic or AppDynamics can help you see the load of a page. The mini-profilers we’ve used help as well when developing something new.

Looking at the SQLTuneup site, I agree w/ Louis – not sure about the extent of its usefulness. I can see it being useful when I can’t install something on the machine, but if I have the option and will be doing a bit of development, I’ll install SSMS Tools Pack or SQL Sentry’s Plan Explorer. (Also, the site has some formatting issues if the text overruns certain parts of the screen.)