Configuring SQL Server 2005 for Soft NUMA

Sometimes I am surprised how issues come up the moment you mention them J. Recently, I have discussed Soft NUMA configurations in the article http://blogs.msdn.com/slavao/articles/441058.aspx. This week one of our customers had an interesting problem. The customer wanted to partition single SQL server instance based on the load. Customer’s application is heterogynous. It consists of TPCH type queries and data loading applications. The customer has a system, which is NUMA, with 2 nodes and 4 CPU per Node. The customer wanted to give the loading application two CPUs and the rest of CPUs to the queries. Is it possible to achieve it?

As you might guess the answer is SQL2005’s Soft NUMA support. We advised them to configure SQL Server and clients to treat system as three node NUMA system. (Surprised? Yes it is possible with SQL 2005 Soft NUMA support). The configuration looks like following: zero node has 4 CPUs, first node has 2 CPUs and last node has 2 CPUs. Keep in mind, when you configure SQL Server for Soft NUMA, soft nodes should fully be contained in the real nodes, i.e. a soft node can not span several real NUMA nodes. Customer’s TPCH queries were configured to utilize zero and first nodes and the load application was configured to utilize last node. Once configured and started this configuration worked as expected – load was fully partitioned across CPUs. It is important to notice that the customer was very delighted with the experience. Below is the example of node & network configuration we provided customer with:

– This configuration makes SQL Server to listen on two ports. One port serves two nodes. It means that port node relations ship could be one two many.

– Keep in mind that in the square brackets you specify node affinity not the CPU affinity, i.e. 3 in this case means that port 1433 will processes requests on node 0 and 1

– When port is assigned to multiple nodes it will assign connections to the nodes in a round robin fashion

Once registry is configured server have to be restarted for changes to take affect. In addition clients should be configured to connect to specific ports. In the case of our customer TPCH clients were directed to port 1433 and the loading application was assigned to port 2000.

SQL Server 2005 supports more complex configuration for Soft NUMA. Below is information along with an example provided to me by our performance team; describing how to configure SQL Server 2005 for Soft NUMA in general. Please notice that you can actually bind a specific NIC to the node.

Thus if you need 3 ports (2000,2001,1433) on a four node machine such that 2000 controls connections such that they go nodes 0 and 1 ,2001 causes them to go to nodes 2 and 3 and 1433 to go to all nodes You use:

Thus the values in the square brackets are node masks and NOT processor masks. The above key makes all NICs on the system listen to all the above ports. Now instead, if we want to control things on a NIC by NIC basis, we need to flip this switch

Thanks for this article… never knew we could do that with CPUs till now… I have a couple of customers that can benefit from this, however I think they might be curious and me too to know if we can make this system dynamic? In the above example assuming that the application is intelligent enough to redirect the queries to the appropriate NIC, when the bulk load is not running and that there is a heavy load of TPCH queries, in that case the CPU resources would be poorly utilized? I would like the SoftNUMA to increase and decrease it’s CPUs within the node, is that possible without restarting?

Secondly most of my customer servers are on a cluster in that case how to make the application simple? Application is assumed to be intelligent enough to redirect queries to appropriate nodes, now in case of failover onto a mirror/passive server (where the nodes are not configured or configured differently) the application will have to revert to the non-SoftNUMA config or new-SoftNUMA config?

Currently SoftNUMA configuration is static. However you can overcome the issue by pushing actual decisions which node to use to middle tier/client side. When bulk loads not running you could round robin connection across all the nodes, when bulk load is about to run you can remove TPCH load from the bulk load node.

For the second question the answer is yes if you decide to manage connections on the client/application side you will have to know about nodes availability hence change the configuration after failover.