When you install SQL Server Standard or Web Edition in a virtual machine with more than 4 cores, things can look like they’re just fine – when in reality, you’re not using anywhere near your server’s potential. See, these editions have a tricky limitation – let’s look at the documentation:

Standard and Web Edition Limitations

Here’s the catch: the lesser of 4 sockets or 24 cores (16 cores for Web Edition). Not the greater – the lesser. So if you configure a virtual machine with, say, 8 virtual processors, you have to be really sure that you don’t accidentally end up with 8 sockets. Here’s a screenshot of Task Manager in a poorly configured VM:

8 sockets, each with 1 core

At the right, see how it says “Sockets: 8”? That means we set the VM up with 8 CPUs, each of which has a single core. Standard Edition will only use the first 4 cores here, and that’s it. No matter how much load our SQL Server gets, it’ll only use 4 cores – meaning CPU will only hit 50% busy. Our wait stats will show SOS_SCHEDULER_YIELD out the wazoo, but our sysadmins will say, “There’s plenty of available CPU power – you’re just not using it.”

Here’s an example of a reader who ran into the problem after reading this post – note how their first four cores are getting hammered, and the rest are sitting idle:

This is notoriously tricky because if you open SQL Server Management Studio, right-click on the server, and click Properties, it looks like everything’s okay. You see all 8 cores – you just can’t use ’em because they’re offline, as shown in sys.dm_os_schedulers:
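If you want to eyeball this yourself, a quick query against sys.dm_os_schedulers shows which CPU schedulers SQL Server has taken offline – the VISIBLE OFFLINE status is the giveaway. A sketch (the column names are the real DMV columns; the expected pattern is what you’d see on the misconfigured 8-socket VM described above):

```sql
/* One row per scheduler. On a misconfigured 8-socket Standard Edition VM,
   expect the first 4 to be VISIBLE ONLINE and the rest VISIBLE OFFLINE. */
SELECT parent_node_id,
       scheduler_id,
       cpu_id,
       status,
       is_online
FROM sys.dm_os_schedulers
WHERE status LIKE N'VISIBLE%'   -- ignore hidden/internal schedulers
ORDER BY scheduler_id;
```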

There’s another clue buried in the SQL Server startup log:

SQL Server detected 8 sockets with 1 cores per socket and 1 logical processors per socket, 8 total logical processors; using 4 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

Just informational. No big deal. Carry on.
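If you’d rather not open the errorlog file by hand, you can search for that message from T-SQL with xp_readerrorlog – an undocumented but widely used procedure, so treat this as a sketch:

```sql
/* Search the current errorlog (file 0) for the licensing message.
   Parameters: log file number, log type (1 = SQL Server),
   and up to two search strings (results must match both). */
EXEC sys.xp_readerrorlog 0, 1, N'logical processors', N'licensing';
```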

How to tell if you’re having this problem – and fix it

Just run sp_Blitz, and look for the warning of “CPU Cores Offline” or “Memory Nodes Offline.” This is one of those things I found once in the field, figured I’d better check for it, and now I find it all over the place. I totally understand why – from the SQL Server side, nothing looks out of the ordinary. (To date, SQL ConstantCare® has found this problem on 27 servers!)
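Once sp_Blitz (a free, open source health-check stored procedure) is installed, running it is a one-liner – something like this, with the parameter shown being optional:

```sql
/* Run the health check; look for "CPU Cores Offline" or
   "Memory Nodes Offline" in the Finding column of the output. */
EXEC sp_Blitz @CheckServerInfo = 1;
```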

The solution is easy:

Shut down the VM

Change the number of cores per socket (VMware instructions) – in my case, I could set it to a single-socket, 8-cores-per-socket VM, or a 2-socket quad-core, or a 4-socket dual-core

Turn the VM back on

And enjoy your newfound power. Be aware, of course, that your CPU percent-busy may suddenly get higher (worse) – but that’s a good thing, because it means you’re using all that horsepower you paid for.

Bonus track: demoing this in VMware Fusion

VMware Fusion doesn’t let you build really dumb VM configurations through its regular UI. You have to go into your VM library, hold down the Option key on your keyboard, right-click on the VM, and continue holding down the Option key as you click “Open Config File in Editor.” The server-grade vSphere documentation applies to Fusion, too, so here’s how I built a 12-core guest with offline CPU cores and memory nodes:
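For reference, the relevant lines in the .vmx file look something like this. This is an illustrative fragment, not my exact config file – the key names (numvcpus, cpuid.coresPerSocket) are standard VMware ones, but double-check them against the vSphere docs for your version. With 12 virtual CPUs at 1 core per socket, the guest sees 12 sockets, and Standard Edition caps you at 4:

```
numvcpus = "12"
cpuid.coresPerSocket = "1"
```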

I’m curious if there are any pros/cons for the different ways a VM could be set up with regard to CPUs, cores, and sockets. I.e., does it make a difference if SQL Server sees 1 socket with 4 cores vs. 4 sockets with 1 core? Unfortunately I don’t really have a setup to test this with at the moment.

The biggest differences come down to the underlying physical architecture of the host, and Non-Uniform Memory Access (NUMA).

For physical servers with multiple sockets, the memory banks are separated per socket, so that each socket has a “preferred” memory node – all sockets can access memory in all nodes, but there’s higher latency to access memory from a non-preferred node. SQL Server is a NUMA-aware application, so if you’re running SQL on a physical machine, it will keep memory allocations for a given query in the same memory node as the CPU core(s) the query is running on.

Your hypervisor is also NUMA-aware, but it’s designed to make configuration as simple as possible. When it gets a request to start up a new virtual machine, it looks at the current load on each CPU socket and each memory bank, and tries to find the best fit for the virtual machine (possibly rebalancing running VMs onto other sockets & NUMA nodes if necessary).

However, the default settings implicitly assume the CPU vs memory requirements of a standard web/application server: a relatively small number of CPU cores and a small RAM allocation relative to the total power of the machine, so that the guest easily fits within a single NUMA node. For SQL Server guests (or other high-performance servers, like an Exchange back end), it’s much more likely that the guest is too large to comfortably fit on a single NUMA node. In that case, VMware and Hyper-V both provide a way to specify the socket layout for the guest, and they’ll allocate the physical resources accordingly – so your SQL Server can spread across 2 sockets & 2 NUMA nodes, taking 4 cores and 128GB of RAM from each node out of the 20 cores & 256GB per node your server has, instead of trying to max out the RAM allocation on a single node and being unable to, since the hypervisor OS needs a little (plus any other VM guests).
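You can see how SQL Server itself perceives the resulting NUMA layout (real or virtual) with the sys.dm_os_nodes DMV – a quick sketch:

```sql
/* One row per NUMA node as SQL Server sees it. On a VM whose socket
   layout pushed cores out of licensing range, you'll see OFFLINE
   nodes and/or nodes with zero online schedulers. */
SELECT node_id,
       memory_node_id,
       node_state_desc,
       online_scheduler_count,
       active_worker_count
FROM sys.dm_os_nodes
WHERE node_state_desc <> N'ONLINE DAC'  -- exclude the dedicated admin connection node
ORDER BY node_id;
```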

Basically… if you have a relatively small guest (less than 25% of the resources of a single NUMA node on the underlying physical hardware), there’s little to no performance benefit to overriding the hypervisor’s default behavior. If you have a larger guest – especially multiple larger guests on the same host – splitting them up so they balance across multiple nodes reduces resource contention (the hypervisor can’t arrange 3 guests that each need 50% of a single NUMA node’s RAM capacity across 2 NUMA nodes without one of them getting “non-preferred,” higher-latency memory), and that helps guest performance.

Hello 🙂
Thank you Brent for the original post, and thank you Noah for all that information on NUMA.
A while ago I heard a rumour (myth?) that if you assign a large number of cores (or it could have been vCPUs) to a virtual machine, it can slow the machine down, because the hypervisor waits for that number of cores to become available on the host before granting the guest any CPU cycles. Just wondering if you had heard of this or knew of any truth in it? 🙂 (It was in relation to VMware.)

If you have the SQL Server DB Engine on an 8 socket machine as well as SSRS, are both the DB Engine and SSRS locked down to the first 4 sockets? Could they be configured such that DB uses the first 4 and SSRS uses the other 4?

Eric – I don’t do SSRS, sorry. It’s SQL Server in name only – Microsoft just throws it in the box as a free prize, but it’s not like it shares the same code base or admin tools. (Same thing with SSIS, SSAS, etc.)

Thanks Brent! After reading this I checked our SQL Server VM configurations and contacted our hosting company to get them to reconfigure any offending VMs. Result, I improved performance on the offending machines for free.
I will remember this, and add it to the specifications for any future SQL Server VMs we order.

Excellent article, Brent. I ran into it by chance w/blitz, and have since learned it is a factor on a LOT of my customers’ VMs… and it even explains some of the performance troubles they’ve been seeing!

Dude…this stupid thing just got us yesterday with a virtual prod server with 8 sockets and 8 cores. Why is the default 8:8? We changed that setting and queries that were taking over 2 hours (because of all the freakin’ paging) took 10 seconds…

This article is what was referenced and used to address the configuration. As always, thanks for sharing your knowledge!