Boost Performance with Parallel Processing

People often ask me what the Max Degree of Parallelism (MAXDOP) setting should be in their SQL Server environment. Parallelism is one aspect of SQL Server that many people don't understand well enough to make an informed configuration decision. Now's the time to take away some of that mystery.

Most people understand the basics of parallelism: the notion that SQL Server can utilize more than one processor at a time when processing a user request. By default, to process any part of a user request, SQL Server can use up to all the system’s processors or the number of processors that the edition supports—assuming the workload can benefit from parallel processing in the first place. If someone uses a plain-vanilla INSERT statement to insert a single row, the SQL Server engine’s only option is to use one thread or processor, regardless of the number of available processors. But if the request requires large scans of an index, SQL Server might decide to spawn many threads and use all the available processors. So, it’s important to understand that although SQL Server can use all available processors, several factors determine how many it will actually use.

What Is MAXDOP?

One of those factors is the MAXDOP setting, which sets the maximum number of processors that a single user or process can use in parallel. By default, this setting has a value of zero, which really means all available processors under the conditions described above. You can influence this setting in several ways, as you’ll see in a moment.

But it's the internal algorithm that the optimizer uses when it prepares to execute the request that is the real determining factor for how many processors the system uses at any given time for a single request. This algorithm considers many factors about the current state of the server, such as how many threads are available, how busy the processors are, how many rows will be affected, how many total processors are available to the server, and what limit might be imposed by one of the MAXDOP settings. The algorithm does a pretty good job of limiting the possibility that one or a few users might monopolize the processors. For example, if a user attempts to run a poorly optimized report that would normally use all the system's processors—but there are four other users running queries already—the optimizer would most likely reduce the degree of parallelism or choose a single-threaded plan and not use parallelism at all. The result is greater concurrency, as well as much better utilization of the server in a multiuser environment.

You’re probably wondering, “Why do we have a MAXDOP setting if the server can figure everything out?” Consider that same example, in which a user runs a poorly optimized query, but this time let’s say no one else is using the system when the user starts the report. Let’s also assume that the report will take three minutes to run. If the report uses all the available processors, what happens to the other users if they want to run queries during the three minutes that the report takes to run? Obviously, they’ll be affected because the processors are monopolized by the other user’s report. In this case, if you had set a server-level MAXDOP setting of 2, and there were eight processors in total, the user running the report would be able to utilize only two processors at a time for any given part of the report. That would leave six processors available to the other users. Often, there are trade-offs between high degrees of parallelism and concurrency.

Limiting Parallelism

Here’s where it gets tricky. In reality, everybody’s system is unique. Even if everyone had the same schema, system access, and number of users, the hardware and even the data itself can dictate the appropriate level of parallelism. Typically, online transaction processing (OLTP) systems benefit more from a lower degree of parallelism, and reporting systems benefit more from higher degrees of parallelism. OLTP systems generally have many concurrent users that process small amounts of data simultaneously, whereas reporting systems process larger quantities of data and aren’t concerned as much with concurrency.

However, even some queries that the optimizer judges will benefit from parallelism might in fact be hindered by it. Often, when many threads are spawned in a parallel operation, they’re dependent on each other at some point to complete their tasks. If one thread finishes before another upon which it’s dependent, it will go into a wait state and risk getting pulled off the processor in favor of another thread. The result can be an inefficient process that potentially takes longer than if a single processor had been used from the start. For a good indicator, watch for high CXPACKET waits when you’re looking at system wait stats. (For details, refer to my article “Getting to Know Wait Stats,” InstantDoc ID 96746.)
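As a quick check on whether parallelism is hurting you, you can query the sys.dm_os_wait_stats dynamic management view (available in SQL Server 2005 and later) for the CXPACKET wait type; this is a sketch, and healthy thresholds vary by workload:

```sql
-- Inspect accumulated wait statistics for the CXPACKET wait type.
-- A large wait_time_ms relative to other top waits suggests parallel
-- queries are spending significant time coordinating their threads.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET';
```

Note that these counters accumulate from the last server restart (or from the last time the stats were cleared), so compare snapshots taken over time rather than reading a single absolute number.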

Another aspect to consider when determining the number of processors to use is maintenance operations such as creating indexes, backups, and integrity checks. These operations are processor-intensive, so you should limit the number of processors they use while users are on the system—a clear trade-off between concurrency and speed. Because every environment is unique, a system evaluation is necessary to determine the proper setting to achieve this balance for you. However, with typical, non-maintenance-related OLTP database activities, the optimal degree of parallelism is often much less than what you might expect. If your database and code are properly tuned, the majority of the requests should be single threaded. Again, reporting or BI operations might be different.

Controlling MAXDOP

The primary location for controlling MAXDOP is in SQL Server Management Studio: Access the SQL Server Properties dialog box, go to the Advanced section, and choose the Server Level setting. (You can also use sp_configure to change server-level settings.) This setting is at the server level, so changing its value can affect everything in the server from that point forward. If your SQL Server system has multiple processors, you need to give a lot of thought to this setting, which determines how many processors in parallel a request can use.
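If you prefer a script over the dialog box, the equivalent sp_configure change looks like this (max degree of parallelism is an advanced option, so it must be exposed first):

```sql
-- Expose advanced options so 'max degree of parallelism' is visible.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Limit any single request to 2 processors server-wide.
-- The change takes effect immediately; no restart is required.
EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;
```

Setting the value back to 0 restores the default behavior of allowing all available processors.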

Another option for controlling parallelism is to use a hint at the query or statement level. Doing so lets you fine-tune individual statements in terms of parallelism. You can temporarily override the Server Level setting, either up or down, for the execution of the statement.
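For example, the MAXDOP query hint caps the degree of parallelism for a single statement; the table here is from the AdventureWorks sample database, purely for illustration:

```sql
-- Allow this statement to use at most 4 processors, regardless of
-- the server-level MAXDOP setting.
SELECT ProductID, COUNT(*) AS OrderCount
FROM Sales.SalesOrderDetail   -- AdventureWorks sample table
GROUP BY ProductID
OPTION (MAXDOP 4);
```

A hint of MAXDOP 1 forces a serial plan for that statement, which can be a handy diagnostic when you suspect parallelism is slowing a particular query down.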

One of the new SQL Server 2005 features is the ability to specify the MAXDOP setting when using the alter index command to rebuild indexes. This capability can come in handy if you usually want the MAXDOP setting at the server level to be configured for a low number of processors for concurrency during the day, but at night, when you rebuild the indexes, you want to utilize more of them to accomplish the task more quickly.
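A nightly rebuild can therefore raise the limit just for that operation; the index and table names below are hypothetical:

```sql
-- Rebuild an index using up to 8 processors for this operation only,
-- overriding the lower server-level MAXDOP used during the day.
ALTER INDEX IX_SalesOrderDetail_ProductID   -- hypothetical index name
ON Sales.SalesOrderDetail
REBUILD WITH (MAXDOP = 8);
```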

What’s Best for You?

Most OLTP systems will generally benefit from a lower degree of parallelism than the default, which allows a request to use all available processors. As the number of concurrent users and transactions per second increases, the setting becomes more important. One benefit of the MAXDOP settings is that they take effect immediately, letting you experiment with settings and determine which are best for your system without requiring a SQL Server restart.
