Category Archives: VMware

Nutanix customers love the fact that we give them their weekends back with 1-click upgrades for the Acropolis operating system, BIOS, BMC, firmware and the hypervisor. When speaking to customers, I find some still go through a multi-step process along these lines:

Download Updates in VUM
Create a new baseline
Attach Hosts to baseline and scan hosts to validate
Place DRS to manual and evacuate guests from the host
Issue shutdown command to CVM
Place host into maintenance mode
Proceed with remediation wizard
Complete upgrade
Reboot host
Power on CVM
Validate RF in Prism and move on
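For contrast with the 1-click approach, the manual per-host flow above can be modelled as a simple ordered checklist. This is purely an illustrative sketch; the step names are made up and there is no real VUM or Nutanix API behind them:

```python
# Illustrative sketch of the manual per-host remediation order described
# above. Step names are hypothetical labels, not real VUM/Nutanix calls.
MANUAL_STEPS = [
    "download_updates_in_vum",
    "create_baseline",
    "attach_and_scan_hosts",
    "set_drs_manual_and_evacuate",
    "shutdown_cvm",            # CVM must be down before maintenance mode
    "enter_maintenance_mode",
    "run_remediation_wizard",
    "complete_upgrade",
    "reboot_host",
    "power_on_cvm",            # CVM comes back after the host reboot
    "validate_rf_in_prism",
]

def remediate_host(host: str) -> list[str]:
    """Return the ordered log of steps performed for one host."""
    return [f"{host}: {step}" for step in MANUAL_STEPS]

log = remediate_host("esxi-01")
```

Even as a checklist it is eleven steps per host, which is the point of the post: Prism collapses all of this into one upload-and-go operation.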

Yes, a couple of these steps are specific to Nutanix and don’t apply to other environments, however even without them there are still a number of steps that need to be completed.

With Prism, as long as the cluster is managed by vCenter, we are able to manage the entire process for you: simply open the upgrade tab, upload the offline upgrade package with its JSON file from the Nutanix support portal, and off you go! It’s as simple as that, and here’s another video to show the process.

I’ve never had much to do with Hyper-V and my knowledge of it is nowhere near as strong as VMware, however since joining Nutanix I find the topic comes up more and more in conversations with clients. It’s great that Nutanix is hypervisor agnostic and supports multiple platforms, but this means I need to get up to speed with the Hyper-V lingo!

I’ve come up with the below table that I can use as my mini translator, so when in conversation I have a quick reference point.

DISCLAIMER: This ‘hack’ is UNSUPPORTED and should never be used in a Production Environment – for home labs though may be useful!

Whilst experiencing an issue with a corrupt SSO installation in vSphere 5.5, I discovered a reg hack that allowed me to continue to login to a vSphere environment with domain credentials until I could resolve the existing issue with SSO. This is a simple change and involves editing the vpxd.cfg file.

In the second of my series of Quick Tips with Nutanix, I wanted to cover off VMware APIs for Array Integration (VAAI).

The Nutanix platform supports VAAI, which allows the hypervisor to offload certain tasks to the array. This vSphere feature has been around for a while now and is much more efficient, as the hypervisor doesn’t need to be the “man in the middle” slowing down certain storage-related tasks.

For both full and fast file clones, an NDFS “fast clone” is performed, meaning a writable snapshot (using redirect-on-write) is created for each clone. Each of these clones has its own block map, meaning that chain depth isn’t anything to worry about.

The following will determine whether or not VAAI will be used for specific scenarios:

Clone VM with Snapshot –> VAAI will NOT be used

Clone VM without Snapshot which is Powered Off –> VAAI WILL be used

Clone VM to a different Datastore/Container –> VAAI will NOT be used

Clone VM which is Powered On –> VAAI will NOT be used

These scenarios apply to VMware View:

View Full Clone (Template with Snapshot) –> VAAI will NOT be used

View Full Clone (Template w/o Snapshot) –> VAAI WILL be used

View Linked Clone (VCAI) –> VAAI WILL be used

What I haven’t seen made clear in any documentation thus far (and I’m not saying it isn’t there, I’m simply saying I haven’t seen it!) is that VAAI will only work when the source and destination reside in the same container. This means consideration needs to be given to the placement of ‘Master’ VDI images, or to automated workloads from vCD or vCAC.

For example, if I have two containers on my Nutanix cluster (Master Images and Desktops), with my master image residing in the Master Images container, yet I want to deploy desktops to the Desktops container, VAAI will NOT be used.
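Pulling the scenarios above together, the offload decision reduces to a single predicate. This helper is purely illustrative, not a real vSphere or Nutanix API:

```python
def vaai_offload_used(powered_on: bool, has_snapshot: bool,
                      same_container: bool) -> bool:
    """Sketch of the VAAI clone-offload rules described in the post.

    Offload only happens for a powered-off, snapshot-free clone whose
    source and destination reside in the same Nutanix container.
    """
    return (not powered_on) and (not has_snapshot) and same_container

# The scenarios from the post:
assert vaai_offload_used(False, False, True)       # powered off, no snapshot
assert not vaai_offload_used(False, True, True)    # clone from a snapshot
assert not vaai_offload_used(True, False, True)    # powered-on VM
assert not vaai_offload_used(False, False, False)  # cross-container clone
```

The last case is the ‘Gotcha’: everything else about the clone can be right, but a cross-container destination still falls back to a hypervisor-driven copy.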

I don’t see this as an issue, however more of a ‘Gotcha’ which needs to be considered at the design stage.

Learn important aspects of vSphere including administration, security, performance, and configuring vSphere Management Assistant (VMA) to run commands and scripts without the need to authenticate every attempt

VMware ESXi 5.1 Cookbook is a recipe-based guide to the administration of VMware vSphere

I’ve been working with VMware products for a number of years now, and this book looked like a beginner’s guide. I was also a little disappointed that the book was based on vSphere 5.1 and not the most current release, vSphere 5.5, even though that release had been out for six months before the book.

Who is the book for?

The book is primarily written for technical professionals with system administration skills and basic knowledge of virtualization who wish to learn installation, configuration, and administration of vSphere 5.1. Essential virtualization and ESX or ESXi knowledge is advantageous.

I personally would say it is for people who are new to virtualization or deploying VMware vSphere products for the first time. It is perhaps even a useful resource for managers or project managers who want to delve a little deeper into the technology. Knowledge of virtualization concepts would be advantageous; however, the book covers each step of a basic installation in good detail.

Areas Covered

The book is split into 9 chapters, aimed at covering a cradle-to-grave ‘basic’ vSphere installation.

Installing and Configuring ESXi

Installing and Using vCenter

Networking

Storage

Resource Management and High Availability

Managing Virtual Machines

Securing the ESXi Server and Virtual Machines

Performance Monitoring and Alerts

vSphere Update Manager

The book reads and flows well, with the explanations clear and concise. The author does a good job explaining all concepts covered in the book.

Final Thoughts

If you are a seasoned vSphere administrator/architect, this book probably isn’t for you. That said, it does act as a handy reference for any areas of vSphere that you aren’t familiar with and need to review. One thing I do like about this book is that all screenshots (where possible) are taken from the vSphere Web Client. As many of us know, the Web Client will be the only way to manage VMware infrastructure in the not too distant future, so for old skool folk like myself it also acts as a handy reference for completing tasks in this manner.

Overall, I would say the author has done a great job in what they set out to do: create a quick-fire reference for vSphere administration tasks.

I’ve spent the past 4 months on a fast paced VDI project built upon Nutanix infrastructure, hence the number of posts on this technology recently. The project is now drawing to a close and moving from ‘Project’ status to ‘BAU’. As this transition takes place, I’m tidying up notes and updating documentation. From this, you may see a few blog posts with some quick tips around Nutanix specifically with VMware vSphere architecture.

As you may or may not know, a Nutanix block ships with up to 4 nodes. The nodes are standalone in terms of components and share only the dual power supplies in each block. Each node comes with a total of 5 network ports, as shown in the picture below.

Image courtesy of Nutanix

The IPMI port is a 10/100 Ethernet port for lights-out management.

There are 2 x 1GigE ports and 2 x 10GigE ports. Both the 1GigE and 10GigE ports can be added to Virtual Standard Switches or Virtual Distributed Switches in VMware. From what I have seen, people tend to add the 10GigE NICs to a vSwitch (of either flavour) and configure them in an Active/Active fashion, with the 2 x 1GigE ports remaining unused.

This seems resilient, however I discovered (whilst reading documentation, not through hardware failure) that the 2 x 10GigE ports actually reside on the same physical card, so this could be considered a single point of failure. To work around it, I would suggest incorporating the 2 x 1GigE ports into your vSwitch and leaving them in Standby.

With this configuration, if the 10GigE card were to fail, the 1GigE ports would become active and you would not be impacted by VMware HA restarting machines on the remaining nodes in the cluster (Admission Control dependent).

Yes, performance may well be impacted, however I’d strongly suggest alarms and monitoring be configured to scream if this were to happen. I would rather manually place a host into maintenance mode and evict my workloads in a controlled manner rather than have them restarted.
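The suggested teaming policy can be sketched as follows. The uplink names are hypothetical, and in reality failover is handled by the vSwitch itself, not by code like this:

```python
# Hedged sketch of the teaming policy suggested above: both 10GigE
# uplinks active, both 1GigE uplinks on standby. Uplink names are
# placeholders, not real vmnic assignments.
ACTIVE = ["vmnic2_10g", "vmnic3_10g"]
STANDBY = ["vmnic0_1g", "vmnic1_1g"]

def effective_uplinks(failed: set[str]) -> list[str]:
    """Return the uplinks carrying traffic after the given failures."""
    alive_active = [n for n in ACTIVE if n not in failed]
    if alive_active:
        return alive_active
    # Whole 10GigE card gone: promote the standby 1GigE ports.
    return [n for n in STANDBY if n not in failed]

# Single-point-of-failure scenario: the shared 10GigE card dies.
assert effective_uplinks({"vmnic2_10g", "vmnic3_10g"}) == ["vmnic0_1g", "vmnic1_1g"]
```

The point of the sketch: because both 10GigE ports live on one card, a standby tier of 1GigE ports is what keeps the host on the network rather than triggering an HA restart.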

You are working on a large virtual desktop deployment using Active/Active datacenters, with multiple use cases and multiple master images. With an Active/Active setup, your users could be in DC1 one day and DC2 the next.

So, what do you do when you have a requirement for the image to be available in case of a site failure? Nutanix make this easy for us, using protection domains and per-VM backups.

What is a protection domain?

A protection domain is a VM or group of VMs that can be backed up locally on a cluster or replicated on the same schedule to one or more clusters. Protection domains can then be associated with remote sites.

It is worth noting that protection domain names must be unique across sites and a VM can only reside in one protection domain.

A protection domain on a cluster will be in one of two modes:

Active – Manages live VMs; makes, replicates and expires snapshots

Inactive – Receives snapshots from a remote cluster

A Protection Domain manages replication via a Consistency Group.

What is a consistency group?

A Consistency Group is a subset of the VMs within the Protection Domain. All VMs within a Consistency Group will be snapshotted in a crash-consistent manner and have snapshots created at each replication interval.
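Putting the definitions above together, a toy data model looks like this. The class and field names are illustrative only, not the actual Nutanix API:

```python
from dataclasses import dataclass, field

# Toy model of the concepts above: a protection domain is active or
# inactive and contains consistency groups, each grouping VMs that are
# snapshotted together in a crash-consistent manner.

@dataclass
class ConsistencyGroup:
    name: str
    vms: list[str]

@dataclass
class ProtectionDomain:
    name: str                # must be unique across sites
    mode: str                # "active" or "inactive"
    groups: list[ConsistencyGroup] = field(default_factory=list)

    def vms(self) -> list[str]:
        """All VMs protected by this domain, across its groups."""
        return [vm for g in self.groups for vm in g.vms]

pd = ProtectionDomain(
    "vdi-masters", "active",
    [ConsistencyGroup("win7-masters", ["master-a", "master-b"])])
```

Remember the constraints from the post: domain names must be unique across sites, and a VM can live in only one protection domain at a time.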

What is a snapshot?

A snapshot is a read-only copy of the state and data of a VM at a point in time. Snapshots for a VM are crash consistent, meaning the on-disk VMDK images are consistent with a single point in time: the snapshot represents the on-disk data as if the VM had crashed. These snapshots are not, however, application consistent, meaning application data is not quiesced at the time of the snapshot. With some server workloads this could cause issues for recovery, however for our VDI master image it is not a problem – the master image is likely to be powered off the majority of the time. Snapshots are copied asynchronously from one cluster to another.

What are per VM Backups?

A per-VM backup gives you the ability to designate certain VMs for backup to a different site, such as a group of desktop master images. Not all legacy storage vendors offer the ability to replicate at a VM level; normally an entire LUN or volume is replicated at a time.

Where am I going with this?

There are many solutions for replicating data, however Nutanix provides this capability, albeit at a small cost, within its platform. No additional components are necessary, and it even has an SRM plugin. The key point is that Nutanix integrates with vSphere to make this a seamless process.

I’ve been working with VMware products for a number of years now, from fairly simple small environments to enterprise-level complex environments. The area that always crops up as my weakness is networking. It’s an area I never really had much involvement in when working my way through the ranks of helpdesk and Wintel server administration, however it is an extremely important factor in a successful VMware deployment. Finally, after years of waiting, along comes this little gem from VMware Press, authored by VCDX Chris Wahl and Steven Pantol. Being an avid reader of Chris’ blog, the Wahl Network, I had high hopes for this book and I’m pleased to say I was not let down.

The book is split into four sections:

Networking 101 – The very basics

Virtual Switching – The differences from physical

Storage Networking – A look at IP storage

Design Scenarios – A look at vSwitch example configurations

This to me was a great layout, starting at the very basics slowly easing you into the more technical matter.

Given the title of the book, prior to reading I would have put this in the ‘deep technical’ category, however both authors have a great writing style and their sense of humour really comes through making the book a pleasure to read.

This book is a must have for any VMware admin and one that I only wish was available a few years back!

Over the past few weeks, I’ve been involved in a fast paced VDI project that is planned to scale up to 10K seats.

Of late, I’ve not had much involvement with VDI projects and have been focussing more on Private Cloud projects, however this project quickly got my attention as Nutanix were the chosen vendor for the solution.

Given this constraint, the design process was easy, made even easier with the first phase use case.

For those not already aware of Nutanix, here is a great video explaining How Nutanix Works and gives a great insight into the offering.

I wanted to share some of the configuration changes we made during the build phase and to the vSphere 5.5 platform.

Nutanix Build

First off, Nutanix have recently changed the way they ship their blocks to site. They used to be shipped with a specific flavour of VMware ESXi, however with added support for KVM and Hyper-V, as well as the number of different ESXi versions used in the workplace, they found customers always wanted to change the out-of-the-box software. Nutanix now installs the Nutanix Operating System (NOS) controller virtual machine (CVM) and a KVM hypervisor at the factory before shipping. If you want to use a different hypervisor (such as VMware ESXi), nodes must be re-imaged on site and the NOS CVM reinstalled. Sound daunting? Well, it’s not really.

Nutanix will provide you with two tools, named Orchestrator (the node imaging tool) and Phoenix (the Nutanix installer ISO). Once you have these files and your chosen hypervisor ISO, you’ll need to download and install Oracle VM VirtualBox to get underway. This process is very well documented so I’m not going to replay it here, however I would suggest:

Ensure the laptop/desktop you are configuring the Nutanix block from has IPv6 enabled. IPv6 is used for the initial cluster initialization process.

If configuring multiple Nutanix blocks, only image 4 nodes at a time. We attempted 8 at a time, however imaging that many nodes at once proved troublesome for the installer.

Keep your ESXi VMkernels and CVMs on the same subnet. This is well documented, however for security we had to attempt to split these onto different VLANs, which caused some issues with the auto-pathing feature.

I’ll point out here that after we configured our first 2-block, 8-node cluster, we decided to manually install the second 2-block, 8-node cluster and skip the automated imaging process. This process again is very well documented, and it took less than an hour to have 8 VMware ESXi nodes up and running with storage presented to all nodes. Compared to the traditional way, this is still a very impressive setup time.

When you’ve built your nodes, boot them into the BIOS and change the Power Technology setting from Energy Efficient to Max Performance to ensure power savings don’t dampen performance.

vSphere Configuration

In this particular environment, we were making use of Nutanix de-duplication, which increases the overhead on each CVM. We therefore increased the RAM on each CVM from the default of 16GB to 32GB and set a vSphere reservation to ensure it always has this physical RAM available.
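As a sanity check, the sizing decision above can be expressed as a tiny helper. The 32GB threshold reflects this project’s choice for de-duplication workloads, not an official Nutanix requirement:

```python
# Sketch of the CVM sizing check described above. With de-duplication
# enabled we doubled CVM RAM from the 16GB default to 32GB and reserved
# all of it; these numbers are this project's choice, not a universal rule.

def cvm_config_ok(ram_gb: int, reservation_gb: int,
                  dedupe_enabled: bool) -> bool:
    required = 32 if dedupe_enabled else 16
    # The reservation must cover all CVM RAM so it is never ballooned
    # or swapped by the hypervisor.
    return ram_gb >= required and reservation_gb == ram_gb

assert cvm_config_ok(32, 32, dedupe_enabled=True)
assert not cvm_config_ok(16, 16, dedupe_enabled=True)  # default sizing, dedupe on
```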

Nutanix have done a very good job with their documentation, which details recommended configurations per hypervisor for things such as HA and DRS in vSphere. After reading an example architecture blog by Josh Odgers, I decided to add an advanced parameter to my HA configuration to change the default isolation address, setting “das.isolationaddress1” to the Nutanix cluster IP address. I chose the cluster IP address over a CVM IP address for a simple reason: if the CVM hosting the cluster IP fails, the cluster IP automatically moves to another CVM in the cluster. The cluster IP is a new configuration option that was released for Hyper-V support, but we can make good use of it in the VMware world.
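Built up in a script, the resulting HA advanced options look like this. The cluster IP is a placeholder, and das.usedefaultisolationaddress is a standard HA option often paired with a custom isolation address; whether you want it depends on your environment:

```python
# Sketch of the HA advanced options discussed above. In a pyVmomi-based
# script these would become OptionValue entries in the cluster's HA
# config; here we just build the key/value pairs to show the intent.

CLUSTER_VIP = "10.0.0.50"  # Nutanix cluster IP: a placeholder address

ha_advanced_options = {
    # Use the Nutanix cluster VIP as the first isolation address; it
    # floats to a surviving CVM if the CVM currently holding it fails.
    "das.isolationaddress1": CLUSTER_VIP,
    # Optionally stop HA pinging the default gateway as well; this is
    # environment-specific and not something the post mandates.
    "das.usedefaultisolationaddress": "false",
}
```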

Each CVM resides on local SSD storage in the form of a 60GB SATA DOM. When you log on to the vSphere client and try to deploy a new workload, you will have the option of deploying to this 60GB SATA DOM storage. This deployment was solely for a VDI project, so all workloads would be provisioned directly from a broker, meaning we control in the broker which datastores the workloads reside on. So, to avoid any confusion and to stop an over-eager admin deploying to the SATA DOM disk, I created a datastore cluster with all Storage DRS and I/O options disabled and named it along the lines of “Nutanix NOS Datastore – Do Not Use”.

Overall, the Nutanix devices are very easy to deploy and you can be up and running in next to no time. Now, I’ve managed to get all this down in a post, I can do some performance tests for my next post!

“Why do I need to bother backing up the config file of my vCNS Manager, can’t I just snapshot it?”

It’s a good question, and one that involved a little lab testing to play around with.

If you were to snapshot your vCNS Manager, which does work from testing in my lab (albeit limited functional testing), then you are able to restore the vCNS Manager from snapshot fairly quickly and efficiently.

The questions I then thought of were:

When is the backup window? (if there is one)

How often would a vCNS snapshot be taken?

How busy is the vCNS manager?

Does a backup restore involve change control or other teams?

The reason for these questions in my head were simple.

If a vCNS Manager in a relatively busy vCloud environment were deploying a number of Edge devices daily, then yes, those Edges would continue to run if the manager failed; but if the vCNS Manager were only snapshotted during a nightly backup window, then there could be an issue with unknown Edge devices after a restore from backup.

The officially supported method of backing up vCNS Manager is to schedule a backup from the manager itself, writing the configuration to an FTP/SFTP site.

If the vCNS Manager were to fail, you would simply deploy a new vCNS Manager (normally within minutes), then re-apply the last saved configuration and be back up and running fairly quickly. Yes, you could argue that if only a single backup were taken daily, then we would be in the same boat as with a snapshot; however, it’s much easier and more manageable, in my opinion, to set perhaps an hourly backup (in busy environments) and keep only a day’s worth of backup files.

After some debate with my client, my recommendation was to ‘keep it simple’. This meant staying within the realms of vendor recommendation and support: configure an hourly backup and keep a single day’s worth of backups. In the case of a failed and unrecoverable vCNS Manager, deploy a new appliance and restore the configuration.
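The ‘hourly backup, keep a day’s worth’ policy is easy to reason about with a small retention sketch; the timestamps and window here are illustrative:

```python
from datetime import datetime, timedelta

# Sketch of the retention policy recommended above: hourly config
# backups, keeping a single day's worth. Times are illustrative.

def prune(backups: list[datetime], now: datetime,
          keep: timedelta = timedelta(hours=24)) -> list[datetime]:
    """Return only the backups taken within the retention window."""
    return [t for t in backups if now - t <= keep]

now = datetime(2014, 6, 1, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(48)]  # two days of backups
kept = prune(hourly, now)  # backups from the last 24 hours survive
```

With hourly backups, the worst case after a manager failure is losing under an hour of configuration changes, rather than up to a day with a nightly snapshot.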

I’d be interested to hear any feedback from others as to what they do in their environments or in fact recommend to others.