Cluster-Aware Updating in Windows Server 2012

Driving down the management overhead of IT systems is a top priority for most organizations today. One way to achieve this goal is to minimize the work associated with patching OSs, including minimizing the number of patches that are needed. I'll first discuss how to reduce the number of patches, then show you how to automate the patching process for the patches that remain.

Reducing the Number of Patches

Since Windows Server 2008, the Server Core configuration level has been available. In a Server Core environment, the graphical interface, management tools, and management infrastructure are removed from the Windows deployment. This typically means about 50 percent fewer patches and, more important, longer times between reboots, because the patches that are no longer required are typically those that require reboots.

The main challenge with Server Core in Windows Server 2008 R2 and Server 2008 is that Server Core has to be set at installation time and can't be changed without reinstallation. This is a big risk for organizations not used to managing Windows Server from a command prompt or remotely. In addition, in Server 2008 R2 and Server 2008, Server Core is supported only for infrastructure roles (i.e., roles that are part of Windows Server itself) and not for other applications.

Server Core in Windows Server 2012 has completely changed. Server Core is now the default installation option for Server 2012. The graphical interface, management tools, and management infrastructure can now be added and removed at any point in a server's life cycle, with only a reboot required. This gives you a lot more flexibility and granularity, because you can choose to have the management infrastructure but not the graphical interface, for example.
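As a minimal sketch of moving between configuration levels, the following commands use the Server 2012 feature names Server-Gui-Shell (the graphical interface) and Server-Gui-Mgmt-Infra (the graphical management tools and infrastructure). This assumes an elevated PowerShell session on the server itself. If the server was originally installed as Server Core, adding the features back might also require the -Source parameter pointing at installation media.

```powershell
# Drop to Server Core: remove the graphical shell and the
# graphical management tools and infrastructure, then reboot.
Uninstall-WindowsFeature -Name Server-Gui-Shell, Server-Gui-Mgmt-Infra -Restart

# Alternative: keep the management infrastructure and remove only
# the graphical shell.
Uninstall-WindowsFeature -Name Server-Gui-Shell -Restart

# Return to the full graphical installation later in the server's life.
Install-WindowsFeature -Name Server-Gui-Mgmt-Infra, Server-Gui-Shell -Restart
```

Only one of these commands would be run at a time, because each one triggers a reboot.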

In addition, Server Core is now an application platform, so applications (e.g., SQL Server 2012) can run on it. This means that in Server 2012 the patching overhead can be greatly reduced without losing capabilities. You'll still need to patch and reboot, but you won't have to patch and reboot as often.

Rebooting a server typically means the services running on that server are unavailable during that time. In the case of a Hyper-V host, this means all the virtual machines (VMs) and the services running on them will be unavailable. However, this doesn't have to be the case. This is where the Failover Clustering feature comes into play. When hosts are in a failover cluster, you can move a service (including a VM) between the nodes without downtime, provided that the service supports a zero-downtime migration technology, such as live migration of VMs or Server Message Block (SMB) 3.0 transparent failover. This means that patching and the associated reboots no longer have to affect service availability.

However, the patching process can be time-consuming. Consider that to manually patch a cluster in Server 2008 and later, you need to perform the following steps:

Pick a node in the cluster and migrate all services from that node to other nodes in the cluster using a zero-downtime migration technology (e.g., live migration for VMs).

Place that node in maintenance mode, which will drain it of all its resources and move them to other nodes in the cluster. For this step, you can use Windows PowerShell's Suspend-ClusterNode cmdlet with the -Drain parameter. Suspend-ClusterNode is one of many cmdlets in the Failover Cluster Module for PowerShell.

Download and apply the patches and reboot the node. Once rebooted, check to see whether there are any new patches that apply. If so, apply them and reboot again.

Bring the node out of maintenance mode. For this step, you can use Windows PowerShell's Resume-ClusterNode cmdlet with the -Failback parameter.

Migrate the services back to the patched node using a zero-downtime migration technology.

Repeat the previous steps for the next node in the cluster and so on until the whole cluster is patched.
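The manual steps above can be sketched as a simple loop. This is an illustration only, not a supported patching script: the cluster name CLUSTER1 is hypothetical, the patch-and-reboot step is left as a manual pause because the article doesn't prescribe a tool for it, and the loop is assumed to run from a management computer rather than from one of the nodes being rebooted.

```powershell
Import-Module FailoverClusters

$clusterName = 'CLUSTER1'   # hypothetical cluster name

foreach ($node in Get-ClusterNode -Cluster $clusterName) {
    # Put the node in maintenance mode and drain its roles to other
    # nodes; -Wait blocks until the drain has finished.
    Suspend-ClusterNode -Name $node.Name -Cluster $clusterName -Drain -Wait

    # Placeholder for the patch step: apply updates and reboot with your
    # existing tooling, repeating until no new updates are offered.
    Read-Host "Patch and reboot $($node.Name), then press Enter to continue"

    # Bring the node out of maintenance mode and move its roles back.
    Resume-ClusterNode -Name $node.Name -Cluster $clusterName -Failback Immediate
}
```

Even in this reduced form, someone has to attend every node, which is the overhead that the rest of this article sets out to remove.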

This update process sounds simple enough, but if you have a 64-node cluster, it's a lot of work. You can purchase products that can reduce the amount of work. For example, Microsoft System Center Virtual Machine Manager (VMM) 2012 provides one-click patching of Hyper-V hosts in a cluster, and there are System Center Orchestrator 2012 runbooks to automate the patching of a cluster. However, if you're running Server 2012, you can take advantage of a new built-in capability named Cluster-Aware Updating (CAU).

Automating Patching with CAU

CAU is part of Server 2012's Failover Clustering feature. With CAU, the update process for an entire Server 2012 failover cluster can be performed automatically. It's important to note this is only for Server 2012 failover clusters. It's not backward compatible with Server 2008 R2 (or earlier) clusters. There's no limitation on the applications that will work with CAU. If it's a cluster application, it should be supported. For example, VMs, file shares, and SQL Server will work with CAU. Each application's native migration technology is used as part of the CAU process.

The source for the patches can be either Windows Update or your on-premises Windows Server Update Services (WSUS) implementation. If using WSUS with CAU, you need to make sure you're using WSUS 4.0 (which is part of Server 2012) or WSUS 3.0 SP2 with the KB2734608 update applied. In addition, CAU supports a plug-in model, which allows patches from other sources. For example, Microsoft.HotfixPlugin is a nondefault plug-in supplied with CAU. It lets you select hotfixes that aren't deployed through Windows Update. You can also use it for non-Microsoft updates, such as driver and firmware updates from your hardware vendors.
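To see which plug-ins are registered on a computer that has the CAU tools installed, the Get-CauPlugin cmdlet can be used. The second command is a hedged sketch of previewing hotfixes with the nondefault plug-in. The cluster name and the share path are hypothetical, and the HotfixRootFolderPath argument assumes a hotfix folder structure has already been prepared as that plug-in expects.

```powershell
Import-Module ClusterAwareUpdating

# List the plug-ins registered on this computer.
Get-CauPlugin

# Preview the hotfixes that Microsoft.HotfixPlugin would apply.
# 'CLUSTER1' and the UNC path are placeholders for illustration.
Invoke-CauScan -ClusterName 'CLUSTER1' `
    -CauPluginName 'Microsoft.HotfixPlugin' `
    -CauPluginArguments @{ HotfixRootFolderPath = '\\FS1\Hotfixes\Root' }
```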

With this plug-in model, CAU might eventually be able to use Microsoft System Center Configuration Manager (SCCM) as the source for the patches. However, at the time of this writing, this plug-in hasn't been written, which means you can't use SCCM as the patch source when using CAU.

After CAU is enabled in your cluster, you have a flexible yet simple patch capability for your clusters. You can trigger it manually or schedule it to run. You can even run pre-update and post-update PowerShell scripts on each node as part of the patching process.

Understanding the CAU Modes and Requirements

CAU works on physical clusters and clusters configured inside VMs. There are two modes: self-updating and remote-updating.

With the self-updating mode, the Failover Clustering management tools are installed on each node in the cluster so there are no external dependencies. Plus, a CAU clustered role is installed.

With the remote-updating mode, the Failover Clustering management tools are installed on a remote Server 2012 or Windows 8 computer, which controls the patching application in the remote clusters. If the Failover Clustering management tools aren't installed on the cluster nodes when using remote-updating, the only restriction on functionality relates to the running of debug-type information, which isn't generally used by administrators anyway.

The advantage of using the self-updating mode is that the cluster is completely self-managed and can effectively patch itself on autopilot. The advantage of using the remote-updating mode is that many different clusters can be patched from the box configured as the remote-updating coordinator. Because the coordination is being done remotely, there's no need for the Failover Clustering management tools to be installed on the cluster nodes (or even PowerShell and the Microsoft .NET Framework if you aren't using pre-update and post-update scripts), which means the cluster nodes can be running the Server Core configuration level. In addition, remote-updating gives more verbose feedback, which is ideal when close attention is needed (but there's still no manual administrator action required).

To use CAU, the cluster nodes need to meet a few requirements:

Remote Windows Management Instrumentation (WMI) must be enabled (which is the default). If you need to enable it, you can use PowerShell's Set-WSManQuickConfig cmdlet.

The .NET Framework 4.5 must be installed (which is the default) if you're using the self-updating mode or using pre-update and post-update PowerShell scripts.

PowerShell 3.0 must be installed and PowerShell remoting must be enabled if you're using the self-updating mode or using pre-update and post-update PowerShell scripts. You can use the Enable-PSRemoting cmdlet or Group Policy to enable PowerShell remoting.

There must be a firewall exception for remote restart, which is accomplished by enabling the built-in Remote Shutdown exception. This is done automatically as part of the CAU configuration when using the GUI. Action is required only if there are any Group Policies that might disable this exception.

The nodes must be part of a cluster and therefore have the Failover Clustering feature installed.

If you use a proxy to access Windows Update, you need to set this proxy for the computer account because CAU runs under the System account and not a user account. You can use the Netsh command-line utility to set up a proxy by customizing and running the following command in Cmd.exe:

Netsh winhttp set proxy <proxy IP>:<proxy port> "<local>"
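The requirements above can be checked or enabled from an elevated PowerShell session on each node. The following is a sketch rather than a complete readiness script. The firewall line assumes an English-language system, where the built-in rule group is displayed as "Remote Shutdown", and none of these commands is needed if the defaults are still in place.

```powershell
# Enable remote management (the article's first requirement).
Set-WSManQuickConfig -Force

# Enable PowerShell remoting (needed for self-updating mode or for
# pre-update and post-update scripts).
Enable-PSRemoting -Force

# Enable the built-in firewall rule group that allows remote restart.
# Assumption: the group's display name is 'Remote Shutdown'.
Enable-NetFirewallRule -DisplayGroup 'Remote Shutdown'

# Show the machine-wide WinHTTP proxy that the System account will use.
netsh winhttp show proxy
```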

Installing and Configuring CAU

After the requirements are met, you can install and configure CAU. I'll be walking you through setting up CAU in the self-updating mode. If you want to use the remote-updating mode, you simply install the Failover Clustering management tools, then use the GUI or the PowerShell cmdlets in the Failover Cluster Module to start the update process.
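For the remote-updating mode, a minimal sketch on a Server 2012 coordinator might look like the following. The cluster name CLUSTER1 is hypothetical. Note that the CAU cmdlets themselves live in the ClusterAwareUpdating module, which is installed alongside the Failover Clustering tools.

```powershell
# Install the Failover Clustering management tools on the coordinator.
Install-WindowsFeature -Name RSAT-Clustering -IncludeAllSubFeature

Import-Module ClusterAwareUpdating

# Start an Updating Run against the remote cluster.
# -Force suppresses the confirmation prompt.
Invoke-CauRun -ClusterName 'CLUSTER1' `
    -CauPluginName 'Microsoft.WindowsUpdatePlugin' `
    -MaxRetriesPerNode 3 `
    -RequireAllNodesOnline `
    -Force
```

The same coordinator can run this command against each cluster it is responsible for.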

To install and configure CAU in the self-updating mode, follow these steps:

Open Failover Cluster Manager. Using the Connect to Cluster action, connect to the cluster in which you want to use CAU.

Click the Cluster-Aware Updating link, which will open the Cluster-Aware Updating screen and initiate a scan of the environment.

Click the Configure cluster self-updating options action, which will launch the Configure Self-Updating Options Wizard. Click Next on the Getting Started page of the wizard.

On the Add CAU Clustered Role with Self-Updating Enabled page, select the check box labeled Add the CAU clustered role, with self-updating mode enabled, to this cluster. You'll also see the check box labeled I have a prestaged computer object for the CAU clustered role. In the self-updating mode, a CAU role service is added to the cluster. It uses its own virtual computer object, which needs to be created in Active Directory (AD). If you leave this check box clear, this virtual computer object will be automatically created for you. However, for that to occur, your cluster computer object must have permission to create computer objects in the default Computers container (or the container in which your cluster computer object is located), as shown in Figure 1. If your computer object for the cluster isn't able to have this permission because of corporate policies, you'll need to prestage the virtual computer object for CAU, select the I have a prestaged computer object for the CAU clustered role check box, and provide the name of the prestaged virtual computer object. (For information on how to prestage objects, see "Failover Cluster Step-by-Step Guide: Configuring Accounts in Active Directory.") Click Next.

On the Specify self-updating schedule page, configure the update schedule. As Figure 2 shows, you can have the updates occur daily, weekly, or monthly. When planning your schedule, keep in mind that Microsoft releases new patches on the second Tuesday of each month. Click Next.

On the Advanced Options page, you can change the number of retry attempts (the default is 3), specify pre-update and post-update PowerShell scripts if you want to use them, and configure other options. You can also change the CAU plug-in. By default, the Microsoft.WindowsUpdatePlugin plug-in is selected. It installs cluster updates directly from Windows Update or an on-premises WSUS server. Note that you don't configure the cluster to use Windows Update or WSUS as part of the CAU configuration. The cluster will use whatever update method it has already been configured to use. You're just specifying the plug-in to use to get the updates from that source. Make any changes you need to on the Advanced Options page and click Next.

If the Microsoft.WindowsUpdatePlugin plug-in is specified on the previous page, the Additional Update Options page will appear. On this page, you'll find the option Give me recommended updates the same way that I receive important updates. Select this option if desired and click Next.

On the Confirmation page, a summary of the options you selected will be shown. After verifying your options, I recommend that you scroll down to the bottom of the page, where you'll find the PowerShell command the wizard is about to run. The command will look something like:
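The exact command depends on the options chosen in the wizard. As a rough sketch only, assuming a hypothetical cluster named CLUSTER1, the default plug-in, recommended updates included, and a schedule of the third Tuesday of each month, an equivalent call to Add-CauClusterRole could be written as follows. Splatting is used here for readability, whereas the wizard shows a single command line.

```powershell
Import-Module ClusterAwareUpdating

# All values below are illustrative assumptions, not wizard output.
$cauRole = @{
    ClusterName         = 'CLUSTER1'
    CauPluginName       = 'Microsoft.WindowsUpdatePlugin'
    CauPluginArguments  = @{ IncludeRecommendedUpdates = 'True' }
    MaxRetriesPerNode   = 3
    DaysOfWeek          = 'Tuesday'
    WeeksOfMonth        = 3
    EnableFirewallRules = $true
    Force               = $true
}

# Add the CAU clustered role with self-updating enabled.
Add-CauClusterRole @cauRole
```

Scheduling a week after the second-Tuesday release leaves some time to validate the new patches before CAU deploys them.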

On the Confirmation page, click Apply. The wizard will then add the CAU clustered role and create the virtual computer object in AD.

After the configuration is complete, click the Cluster-Aware Updating link for your cluster to bring up the various CAU actions. I recommend that you run the Analyze cluster updating readiness action. This will check the cluster nodes to ensure that CAU will operate correctly. CAU is now ready to apply updates following the schedule you specified.
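The readiness analysis and the resulting role configuration can also be checked from PowerShell. This sketch assumes the hypothetical cluster name CLUSTER1 and a computer with the CAU tools installed.

```powershell
Import-Module ClusterAwareUpdating

# Run the same kind of readiness checks as the
# "Analyze cluster updating readiness" action.
Test-CauSetup -ClusterName 'CLUSTER1'

# Show the self-updating configuration (plug-in, schedule, and so on).
Get-CauClusterRole -ClusterName 'CLUSTER1'
```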

Performing CAU Updates Manually

At any time, you can check the updates that need to be applied and manually run those updates. To do so, click the Cluster-Aware Updating link for your cluster to bring up the various CAU actions. Next, click the Preview updates for this cluster action to bring up the Preview Updates dialog box. By selecting your plug-in and clicking the Generate Update Preview List button, you can generate a list of all the updates that need to be applied to all the hosts in the cluster, as shown in Figure 3.
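The same preview can be generated from PowerShell with Invoke-CauScan, which lists applicable updates without installing anything. The cluster name is a placeholder.

```powershell
Import-Module ClusterAwareUpdating

# Preview the updates each node needs, using the default plug-in.
Invoke-CauScan -ClusterName 'CLUSTER1' `
    -CauPluginName 'Microsoft.WindowsUpdatePlugin' |
    Format-Table -AutoSize
```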

To manually install those updates, close the Preview Updates dialog box and click the Apply updates to this cluster action. This will bring up the Cluster-Aware Updating Wizard. You just need to click Next on the Getting Started page, then click Update on the Confirmation page. CAU will then update the cluster using the existing settings.

The order in which hosts are updated is based on the number of resources currently hosted on each node. CAU will first update the node with the fewest resources, then the node with the second-fewest resources, and so on, until all the nodes are patched.

You can monitor the progress of the updates. As Figure 4 shows, detailed information is given, including the updates being downloaded, the nodes being placed into maintenance mode, the updates being applied, the nodes being rebooted, and so on.

When you're using the self-updating mode, you'll see that the Update Coordinator will move between nodes as boxes are rebooted. (If you're doing remote updating, the Update Coordinator will remain on the remote box.) The Update Coordinator is the brains of CAU. It scans, downloads, and installs the patches on each node, controls scripts, and so on.

After all the updates are applied to a node, CAU will reboot the node and check again for any new updates that might need to be applied. If there are new updates, CAU will apply them, reboot the node, and check again for updates. This will continue until no new updates are found. After CAU has finished patching that node, it will then move to the next node. The video "Patching a Cluster with Cluster-Aware Updating (CAU)" shows this update process as well as demonstrates some other CAU-related activities. After CAU completes the update process, you can click the Generate report on past Updating Runs action to obtain a report that shows the details of the CAU execution.
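Progress and reports are also available from PowerShell. In this sketch, CLUSTER1 is again a hypothetical cluster name. Get-CauRun reports on an Updating Run that is in progress, and Get-CauReport returns results from completed runs.

```powershell
Import-Module ClusterAwareUpdating

# Check the status of an Updating Run that is currently in progress.
Get-CauRun -ClusterName 'CLUSTER1'

# Retrieve the detailed report for the most recent completed run.
Get-CauReport -ClusterName 'CLUSTER1' -Last -Detailed
```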

Note that CAU supports a "configured but on hold" setup in which the update cycle is always manually forced and never scheduled. With this setup, you need to trigger the update application manually or use some other process to trigger it. For more information about this setup, see "Advanced Options and Updating Run Profiles for CAU."

What You Need to Keep in Mind

CAU is a fantastic feature, but it's still very important that you test the updates using the standard update validation processes before allowing CAU to deploy them. So, if you plan to let CAU run on autopilot, you need to make sure that you allow enough time for this testing when scheduling the updates.

In addition, you need to make sure you don't have other update processes going on, such as automatic updates applied outside of CAU, because this could cause downtime to your cluster. Keep in mind that CAU isn't a new patching technology. It's an orchestration technology that leverages your existing patching technologies.

Finally, CAU is one technology you don't want to tell your boss about. As far as he or she is concerned, you're still working all weekend patching your 14 separate clusters!
