Nutanix 4.6 Features Overview – Part 2 (Beyond Marketing)

This is the second part of the series. The first part can be found here and contains the overview for the following new features:

Native File Services

Performance Improvements

Volume Groups in PRISM

Self Service Restore (SSR)

1-Click Upgrades of Foundation, BIOS, BMC

Metro Availability Non-Disruptive vMotion and Site Migrations

Cross Hypervisor Disaster Recovery

Project Dial: In-Place Hypervisor Conversion

OpenStack Integration

Citrix Power Management Plugin

AHV Guest Customization

AHV VLAN Trunking

Prism Actionable Capacity Forecasting

Prism Customizable Dashboards

Prism Entity Explorer

Here are the additional new features not covered in the first article:

SaltStack self-healing baseline / STIG AHV

Nutanix produces Security Technical Implementation Guides, also known as STIGs, for high-security governance. The system tracks over 1,700 security entities across storage and the Acropolis Hypervisor (AHV) components. These guides are implemented at the factory to create standard security baselines that allow Nutanix to be a high-security solution. To protect baselines from inadvertent or malicious modification throughout the usage life cycle, Nutanix engineering has added a self-healing framework to AHV.

This framework uses SaltStack to self-heal all STIGs (CVM OS, AHV, JRE, Tomcat). It is enabled by default, with an admin-configurable schedule interval for running the baseline check. During this process all security deviations are logged and auto-corrected without human intervention. The self-healing process also enables seamless upgrades for security patching without the need for custom scripts.
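To make the idea of a self-healing baseline concrete, here is a minimal Python sketch of the log-and-correct loop described above. It is not the Nutanix or SaltStack implementation; the setting names, values and the heal function are all hypothetical and only illustrate the pattern of comparing the current state against a baseline, logging each deviation and restoring the required value.

```python
from typing import Dict, List, Tuple

# Hypothetical baseline: setting name -> required value (not real STIG entries).
BASELINE: Dict[str, str] = {
    "ssh_permit_root_login": "no",
    "password_min_length": "15",
}


def heal(current: Dict[str, str], baseline: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Reset drifted settings in place and return a log of (name, found, restored)."""
    log = []
    for name, wanted in baseline.items():
        found = current.get(name, "<missing>")
        if found != wanted:
            log.append((name, found, wanted))  # record the deviation
            current[name] = wanted             # auto-correct it
    return log


# One scheduled run against a system where a setting has drifted.
state = {"ssh_permit_root_login": "yes", "password_min_length": "15"}
deviations = heal(state, BASELINE)
print(deviations)
```

Running the check a second time returns an empty log, which is the property that makes it safe to run on a schedule.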

Single CG Restriction Removal for Linked Clone Deployments

Disaster Recovery for Horizon View has always been a hot topic. To date, VMware still doesn’t provide an official methodology to protect virtual desktops. Equally, VMware Site Recovery Manager does not support Linked Clone desktops created by Horizon View Composer. Conversely, Full Clone desktops can be protected using native storage replication.

Most Horizon View deployments use Linked Clone desktops provisioned with View Composer. Linked Clone desktops can be non-persistent (aka floating) or persistent. For the non-persistent type the desktops themselves need no protection, and there are multiple ways to protect and replicate user data and profiles.

A Nutanix cluster has a complete understanding of Horizon View Composer intricacies and is able to back up, restore and replicate Linked Clone desktops to a recovery site. Additionally, when in recovery mode, it is possible to power on desktops and Nutanix will automatically register the desktops with vCenter in the recovery site. When the recovery event is over, all changes are replicated back to the primary site and life returns to normal.

Desktops are not the only resources needed when in recovery mode; you will also need Connection and Security Servers, Active Directory, and SQL or Oracle databases. All components, if not already available in the recovery site, can be replicated and made available for use. Please note that DNS name resolution and IP translations for Connection and Security Servers must remain the same to allow desktop agents to communicate properly.

However, there was a limitation that is now gone – “Limit Linked Clone desktop pools to a maximum of 50 desktops and ensure desktops are member of a unique Nutanix Consistency Group and Protection Domain.” Administrators may now create multiple protection domains with different schedules for the same large desktop pool, with up to 250 desktops per protection domain.
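As a quick illustration of what the new limit means for planning, the sketch below splits a large pool into protection domains of at most 250 desktops each. The pool size, the desktop names and the pd-N naming are made up for the example; only the 250 figure comes from the text above.

```python
from typing import Dict, List

MAX_PER_PD = 250  # per-protection-domain limit quoted in the article


def split_pool(desktops: List[str], limit: int = MAX_PER_PD) -> Dict[str, List[str]]:
    """Group desktops into consecutive protection domains of at most `limit` each."""
    return {
        f"pd-{i // limit + 1}": desktops[i:i + limit]
        for i in range(0, len(desktops), limit)
    }


# Hypothetical 600-desktop pool.
pool = [f"desktop-{n:04d}" for n in range(1, 601)]
plan = split_pool(pool)
for name, members in plan.items():
    print(name, len(members))
```

A 600-desktop pool ends up in three protection domains (250, 250 and 100 desktops), each of which could then be given its own schedule.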

In the past I produced this video with the full disaster recovery of Linked Clone desktops.

Foundation 3.1 Updates

Nutanix Foundation is the tool that allows administrators to bootstrap, deploy and configure a bare-metal Nutanix cluster from start to end with minimal interaction in a matter of minutes. The previous release of Foundation made the tool an integral part of PRISM, allowing administrators to drive operations like add-node from within PRISM. This new release adds:

Support for the Lenovo HX platforms

1-Click Upgrades for Foundation, BIOS and BMC

Easier replacement of SSD in single SSD Nodes

Easier setup of AHV based clusters

VLAN tagging support for initial imaging

Better Logging and Progress Monitoring

New Hyper-V GUI SKU

Many improvements to the imaging process

Network Integration – Phase 2

Nutanix is working on a multi-release networking strategy to enable automated root-cause analysis of networking issues like MTU mismatch, packet drops and generic network utilization issues. Given that virtual infrastructure admins need to rely on network administrators for basic networking information, it is generally difficult to correlate VM network traffic with host traffic and the actual network switch traffic.

By collecting the first-hop network switch stats, Nutanix is able to debug basic network issues without interaction with network administrators and still provide a VM-centric view of the network that helps administrators understand and track network utilization, stats and issues.

Nutanix can discover the first-hop network switches based on the switch configuration and start collecting the stats to provide a VM-centric network view by correlating the virtual NIC, host NIC and network switch information.
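The correlation described above is essentially a join across three layers. The following sketch shows the idea with plain dictionaries; the inventory, port names and drop counters are invented for illustration and do not reflect how Prism stores or collects this data.

```python
from typing import List, NamedTuple


class Path(NamedTuple):
    vm: str
    vnic: str
    host_nic: str
    switch_port: str


# Hypothetical sample inventory; every name below is invented.
VNIC_TO_HOST_NIC = {
    "vm1-eth0": "hostA-eth2",
    "vm2-eth0": "hostA-eth2",
    "vm3-eth0": "hostB-eth2",
}
HOST_NIC_TO_SWITCH_PORT = {"hostA-eth2": "sw1:Eth1/1", "hostB-eth2": "sw1:Eth1/2"}
SWITCH_PORT_DROPS = {"sw1:Eth1/1": 0, "sw1:Eth1/2": 1432}


def vm_paths() -> List[Path]:
    """Join virtual NIC -> host NIC -> first-hop switch port for every VM."""
    paths = []
    for vnic, host_nic in VNIC_TO_HOST_NIC.items():
        vm = vnic.split("-")[0]
        paths.append(Path(vm, vnic, host_nic, HOST_NIC_TO_SWITCH_PORT[host_nic]))
    return paths


def vms_behind_dropping_ports() -> List[str]:
    """Return the VMs whose first-hop switch port reports packet drops."""
    return [p.vm for p in vm_paths() if SWITCH_PORT_DROPS[p.switch_port] > 0]


print(vms_behind_dropping_ports())
```

With the switch counters joined in, a drop on a physical port can be traced back to the affected VMs without asking the network team which port a host is plugged into.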

Nutanix Guest Tools (NGT)

NGT is a software bundle that enables advanced user VM (UVM) functionality. At this point in time NGT provides the Nutanix Guest Agent service for communicating with the Nutanix CVM, Nutanix Volume Shadow Copy (VSS) software for Nutanix snapshots, Cross-Hypervisor VM Mobility Drivers and File Level Restores.

Guided Alert Root Cause Analysis

Nutanix 4.6 has some big improvements to alerting usability and configurability. These include user-friendly titles, alert threshold and exception configuration for a subset of Prism elements, and Prism Central widgets with emphasis on alert categories such as performance, availability, capacity, configuration and system indicator. Additionally, alert notifications can now follow special email rules based on a combination of Severity, Category and Cluster selection.

For Root Cause Analysis, Prism Central now offers a guided alert drill-down for the top 15 most frequently occurring alerts, with relevant metrics for possible causes.
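To picture how an email rule built from Severity, Category and Cluster might behave, here is a small sketch. The EmailRule class, the field names and the convention that an empty set matches anything are assumptions made for the example, not the Prism data model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

ANY: FrozenSet[str] = frozenset()  # assumption: an empty set means "match anything"


@dataclass(frozen=True)
class EmailRule:
    recipient: str
    severities: FrozenSet[str] = ANY
    categories: FrozenSet[str] = ANY
    clusters: FrozenSet[str] = ANY

    def matches(self, alert: Dict[str, str]) -> bool:
        """A rule matches when every non-empty filter contains the alert's value."""
        checks = (
            (self.severities, alert["severity"]),
            (self.categories, alert["category"]),
            (self.clusters, alert["cluster"]),
        )
        return all(not allowed or value in allowed for allowed, value in checks)


def recipients(alert: Dict[str, str], rules: List[EmailRule]) -> List[str]:
    """Return the recipients of every rule that matches the alert."""
    return [r.recipient for r in rules if r.matches(alert)]


# Hypothetical rules and alert.
RULES = [
    EmailRule("storage-team@example.com", categories=frozenset({"capacity"})),
    EmailRule("oncall@example.com", severities=frozenset({"critical"}),
              clusters=frozenset({"prod-01"})),
]

alert = {"severity": "critical", "category": "capacity", "cluster": "prod-01"}
print(recipients(alert, RULES))
```

In this sketch a critical capacity alert on prod-01 reaches both recipients, while a warning-level performance alert on the same cluster reaches neither.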

Replication Factor Migration/Conversion

Resiliency Factor 2 (FT1) ensures that two copies of all data are written to persistent media prior to being acknowledged to the guest operating system. This provides an N+1 level of redundancy, which translates to being able to tolerate a single failure.

Resiliency Factor 3 (FT2) ensures that three copies of all data are written to persistent media prior to being acknowledged to the guest operating system. This provides an N+2 level of redundancy, which translates to being able to tolerate two concurrent SSD/HDD or node failures.
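The relationship between the number of copies and the failures tolerated is simple arithmetic, sketched below. The capacity function is a deliberately naive assumption: it ignores metadata, reserved space and any data reduction, and the 60 TB figure is just an example.

```python
def tolerated_failures(copies: int) -> int:
    """With `copies` replicas of every write, copies - 1 can be lost."""
    return copies - 1


def usable_capacity_tb(raw_tb: float, copies: int) -> float:
    """Naive usable capacity: raw capacity divided by the number of copies."""
    return raw_tb / copies


for copies in (2, 3):
    print(
        f"{copies} copies: tolerates {tolerated_failures(copies)} failure(s), "
        f"{usable_capacity_tb(60, copies):.0f} TB usable of 60 TB raw"
    )
```

The trade-off is visible right away: moving from two copies to three buys tolerance for one more concurrent failure at the cost of a third of the naive usable capacity.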

Nutanix 4.6 lets the administrator change the metadata and data resiliency factors for a container online, without cluster disruption. Everything happens automatically in the background, allowing any business SLA changes to be quickly followed by IT without disruptions.

Application Consistent Snapshots for AHV

Nutanix 4.6 introduces the ability to take application-consistent snapshots with AHV. AHV works in tandem with NGT to send a quiesce request to the VM, and the agent running in the VM invokes the Microsoft VSS service to quiesce the workload. After the quiesce is done, Nutanix takes a snapshot of the VM.

There are other major and minor improvements in Nutanix 4.6, such as Zookeeper Auto Migration and Leader Only Paxos Reader, but these operate under the covers to improve the user and admin experience, performance and reliability.
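The ordering matters here: quiesce first, snapshot second. The sketch below models that sequence with a context manager. It only records the order of events; the function names are hypothetical, and the resume step after the snapshot is my assumption about what a quiesce workflow normally does rather than something stated above.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []  # ordered record of what happened


@contextmanager
def quiesced(vm: str) -> Iterator[None]:
    """Quiesce the workload, and always resume it afterwards (assumed step)."""
    events.append(f"quiesce {vm}")  # guest agent would invoke VSS here
    try:
        yield
    finally:
        events.append(f"resume {vm}")


def app_consistent_snapshot(vm: str) -> str:
    """Take the snapshot only while the workload is quiesced."""
    with quiesced(vm):
        events.append(f"snapshot {vm}")
    return f"{vm}-snap"


snap = app_consistent_snapshot("sql01")  # hypothetical VM name
print(snap, events)
```

Using try/finally means the workload is resumed even if the snapshot step raises, so a failed snapshot does not leave the guest frozen.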

This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.

What you show me are different things. The first is the cloud pod architecture, which allows you to manage multiple linked clone desktops across cloud pods via a single UI. The desktops themselves are not replicated to a DR site. The assumption of the paper is that the linked clones in use are completely stateless and only user data and profiles are being replicated to the DR site.

The second paper talks about stretched cluster, where for full clone VMs you can move them between sites – nothing new here. Linked Clones however are not supported as far as I know.

Nutanix will replicate and work with both Full and Linked Clones. Many organizations still use Linked Clones just like persistent desktops and that is the DR use-case we are solving for them.

Nikolay, in retrospect I think you are correct… it seems that VMware does provide a methodology to protect full clone persistent desktops. Would be good to see orchestration and automation in place moving forward.

Nikolay

Hi Andre,
I think replicating Linked Clone desktops isn’t the right way, because, as you mentioned, they are stateless. You need to replicate pool settings, the users DB, their data, app stacks, etc. to the recovery site, but not the linked clone VMs themselves. It’s “trash” that loads storage and, most importantly, the cross-DC network. And CPA solves this (honestly, with some third-party tools, like MS DFS for replicating user data).
The only thing that I found is that Horizon 6.2 supports a stretched cluster on top of VSAN, without any remarks about linked clones, so I want to think “everything that is stated and is not separately prohibited – is allowed”.

Thanks Nikolay, I just want to make clear that the new solution Nutanix is providing is the replication of linked clones; full clone replication has always been done by Nutanix, as with most storage solutions. The use case here is companies that use linked clones but do not refresh upon logoff.