Maintenance of One ESXi Host in Nutanix Cluster

I have a Nutanix cluster with 4 ESXi hosts. I would like to do maintenance on all the hosts one by one so that I can complete the maintenance without downtime for the VMs. Could you please share the process for doing maintenance on one ESXi host?

End-to-end maintenance of one ESXi host in a Nutanix cluster.

Best answer by GPVenkatesh13 April 2017, 16:17

Steps prior to bringing down the host:
1) Ensure that the "Data Resiliency – Status" is Normal in the Prism portal for the target cluster.
2) Migrate all user VMs (except the CVM) residing on the target ESXi host to other healthy nodes in the cluster.
3) Connect to the CVM via SSH and find its host UUID using the following command:
ncli host ls | grep -C7 [IP address of CVM]
4) Place the CVM in maintenance mode using the UUID found in the previous step:
ncli host edit id=[uuid] enable-maintenance-mode="true"
5) Verify that the CVM has been placed in maintenance mode:
cluster status | grep CVM
6) Shut down the CVM:
cvm_shutdown -h now
7) Wait for the CVM to shut down completely.
8) Place the ESXi host in maintenance mode and perform your maintenance activity.
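Steps 3 through 6 can be sketched as a single script run from an SSH session on the target CVM. Note that `ncli`, `cluster`, and `cvm_shutdown` exist only on a CVM, so this sketch defaults to a dry run that just prints each command; the IP address and UUID are placeholders you would replace with the values from step 3.

```shell
#!/usr/bin/env bash
# Hedged sketch of the pre-maintenance CVM steps; not an official Nutanix script.
# DRY_RUN=1 (the default) prints each command instead of executing it, so the
# sketch is safe to run anywhere; set DRY_RUN=0 only on the target CVM itself.
set -euo pipefail

CVM_IP="${CVM_IP:-10.1.1.11}"                                   # placeholder CVM address
HOST_UUID="${HOST_UUID:-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}"  # placeholder UUID from step 3
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else eval "$*"; fi
}

# 3) locate the host entry (and its UUID) by the CVM's IP address
run "ncli host ls | grep -C7 $CVM_IP"

# 4) place the CVM in maintenance mode using the UUID found above
run "ncli host edit id=$HOST_UUID enable-maintenance-mode=true"

# 5) confirm the CVM is reported as being in maintenance mode
run "cluster status | grep CVM"

# 6) shut the CVM down cleanly (never power it off from vCenter)
run "cvm_shutdown -h now"
```

Running with the default `DRY_RUN=1` simply prints the four commands, which is a convenient way to review the sequence before executing it for real.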

Steps after bringing the host back online:
1) Exit the host from maintenance mode and power on the CVM.
2) Connect to any neighbouring CVM in the cluster via SSH.
3) Check the status of the CVM that was powered on; at this stage the CVM should still be reported as being in maintenance mode:
ncli host ls | grep -C7 [IP address of CVM]
4) Exit the CVM from maintenance mode:
ncli host edit id=[uuid] enable-maintenance-mode="false"
5) Verify that the CVM has been removed from maintenance mode:
cluster status | grep CVM
6) Ensure that the "Data Resiliency" and metadata sync status return to Normal in the Prism portal after completing the maintenance activity. It may take 5 to 10 minutes for this to be reflected.
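The post-maintenance steps 3 through 5 can be sketched the same way, this time run from a neighbouring CVM rather than the one that was restarted. As before, this is a dry-run sketch with placeholder values, not an official Nutanix script.

```shell
#!/usr/bin/env bash
# Hedged sketch of post-maintenance steps 3-5, run from a neighbouring CVM.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 on a real CVM.
set -euo pipefail

CVM_IP="${CVM_IP:-10.1.1.11}"                                   # placeholder: address of the restarted CVM
HOST_UUID="${HOST_UUID:-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}"  # placeholder host UUID
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else eval "$*"; fi
}

# 3) the restarted CVM should still show as being in maintenance mode here
run "ncli host ls | grep -C7 $CVM_IP"

# 4) take the CVM out of maintenance mode
run "ncli host edit id=$HOST_UUID enable-maintenance-mode=false"

# 5) confirm the CVM is reported normally in the cluster again
run "cluster status | grep CVM"
```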

9 replies

Shutting Down a Node in a Cluster (vSphere Web Client)
Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time per cluster. If more than one node of an RF2 cluster must be shut down, shut down the entire cluster instead.

Log on to vCenter Server by using vSphere Web Client.

If DRS is not enabled, manually migrate all the VMs except the Controller VM to another host in the cluster, or shut down any VMs other than the Controller VM that you do not want to migrate. If DRS is enabled on the cluster, you can skip this step.

Right-click the host and select Enter Maintenance Mode. In the Confirm Maintenance Mode dialog box, click OK. The host gets ready to go into maintenance mode, which prevents VMs from running on it. DRS automatically attempts to migrate all the VMs to other hosts in the cluster.
Note: If DRS is not enabled, you need to manually migrate or shut down all the VMs excluding the Controller VM. Some VMs may not be migrated automatically even when DRS is enabled, because of a configuration option in the VM that is not present on the target host.

Log on to the Controller VM with SSH and shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now
Note: Do not reset or shut down the Controller VM in any way other than with the cvm_shutdown command, to ensure that the cluster is aware that the Controller VM is unavailable.

After the Controller VM shuts down, wait for the host to go into maintenance mode.

Right-click the host and select Shut Down. Wait until vCenter Server displays that the host is not responding, which may take several minutes. If you are logged on to the ESXi host rather than to vCenter Server, the vSphere Web Client disconnects when the host shuts down.

Shutting Down a Node in a Cluster (vSphere command line)
Before you begin: If DRS is not enabled, manually migrate all the VMs except the Controller VM to another host in the cluster, or shut down any VMs other than the Controller VM that you do not want to migrate. If DRS is enabled on the cluster, you can skip this prerequisite.
Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time per cluster. If more than one node of an RF2 cluster must be shut down, shut down the entire cluster instead.
You can put the ESXi host into maintenance mode and shut it down from the command line or by using the vSphere Web Client.

Log on to the Controller VM with SSH and shut down the Controller VM.
nutanix@cvm$ cvm_shutdown -P now

Log on to another Controller VM in the cluster with SSH.

Put the ESXi host into maintenance mode.
nutanix@cvm$ ~/serviceability/bin/esx-enter-maintenance-mode -s cvm_ip_addr
Replace cvm_ip_addr with the IP address of the Controller VM on the ESXi host.
If successful, this command returns no output. If it fails with a message like the following, VMs are probably still running on the host.
CRITICAL esx-enter-maintenance-mode:42 Command vim-cmd hostsvc/maintenance_mode_enter failed with ret=-1
Ensure that all VMs are shut down or moved to another host and try again before proceeding.

Shut down the host.
nutanix@cvm$ ~/serviceability/bin/esx-shutdown -s cvm_ip_addr
Replace cvm_ip_addr with the IP address of the Controller VM on the ESXi host.
Alternatively, you can put the ESXi host into maintenance mode and shut it down using the vSphere Web Client.
If the host shuts down, a message like the following is displayed.
INFO esx-shutdown:67 Please verify if ESX was successfully shut down using ping hypervisor_ip_addr

Confirm that the ESXi host has shut down.
nutanix@cvm$ ping hypervisor_ip_addr
Replace hypervisor_ip_addr with the IP address of the ESXi host.
If no ping packets are answered, the ESXi host is shut down.
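The command-line sequence above can be sketched end to end as follows, run from a healthy neighbouring CVM after the target CVM has been shut down. As with the earlier sketches, this is not an official Nutanix script: it defaults to a dry run that only prints the commands, and both IP addresses are placeholders.

```shell
#!/usr/bin/env bash
# Hedged sketch of the command-line node shutdown; not an official Nutanix script.
# DRY_RUN=1 (the default) prints each command instead of executing it.
set -euo pipefail

CVM_IP="${CVM_IP:-10.1.1.11}"               # placeholder: Controller VM IP on the target host
HYPERVISOR_IP="${HYPERVISOR_IP:-10.1.1.1}"  # placeholder: ESXi host IP
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else eval "$*"; fi
}

# put the ESXi host into maintenance mode (fails if guest VMs are still running)
run "~/serviceability/bin/esx-enter-maintenance-mode -s $CVM_IP"

# shut the host down
run "~/serviceability/bin/esx-shutdown -s $CVM_IP"

# verify: once ping packets stop being answered, the host is down
run "ping -c 4 $HYPERVISOR_IP"
```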

Hello, thank you for this interesting post.
Do we know what the Nutanix position is on this? Many companies' VMware admins may not be Nutanix admins able to perform SSH commands on a CVM. Is there an official best-practice guide for common tasks on ESXi + Nutanix?

Hi, no, the CVM mustn't be migrated when putting the ESXi host into maintenance mode. However, I would like to hear a word from Nutanix about just shutting down the CVM via vCenter, without also needing to put the CVM into maintenance mode through an SSH session.

When you have an outage, the CVM doesn't switch to maintenance mode, I guess; you just lose the service and get it back when the host comes back online.

Every time we do maintenance on an ESXi host, we put the host into maintenance mode and shut down the CVM. I'm curious, what is the purpose of putting the CVM into maintenance mode as opposed to just shutting it down?
