Posts Tagged ‘DRS’

The authors of this new book really need no introduction: Duncan Epping and Frank Denneman. Both hail from the Netherlands and that company I talk about from time to time – VMware. The title of the book is, of course, VMware vSphere 5 Clustering Technical Deepdive, and it is available in three formats:

Kindle

Paperback (B&W)

Paperback (Color)

I’ve ordered the color paperback version, and I also picked up the Kindle version for my iPad and iPhone 4 the day the book was announced – Tuesday July 12th, 2011. It’s quite fitting that this vSphere 5 book debuted the same day VMware made its public announcement about vSphere 5, SRM 5, vCD 1.5, and the new VSA. I’m guessing VMware timed the release of its new cloud platform with Duncan and Frank’s new book. Steve Herrod didn’t get to where he is today without a solid background in strategy and tactics.

This is not a comprehensive book review. I’d be lying through my teeth if I said I had already finished this book. The fact is, having only the Kindle version right now, I’ve only glanced at it. I much prefer my books in hard copy format. I like to write a lot of notes and discussion points in the margins. However, the Kindle version makes a great searchable reference tool, and I’ll almost always have the electronic copy with me on one of my Apple products. Add to that, I’m currently a technical editor (TE) on another book project, which keeps me busy along with the blog, my day job, and my vSphere 5 lab. There are seriously not enough hours in the day for a VMware enthusiast.

Duncan and Frank’s previous collaboration was the authoritative source on HA and DRS (as well as DPM). As you might have guessed from the title, this book covers more than just HA and DRS. The authors have built on the success of the previous edition by refreshing the HA, DRS, and DPM sections. From there, they added content relevant to vSphere 5 clustering such as EVC, SIOC, and SDRS. At the moment, I don’t see much in the way of networking, but in fairness, I’ll save the final review until after I have finished the book. 348 pages of vSphere 5 clustering technical deepdive is going to be thoroughly enjoyable. I’m really looking forward to digging in!

At 9am PDT this morning, Paul Maritz and Steve Herrod take the stage to announce the next generation of the VMware virtualized datacenter. Each new product and set of features is impressive in its own right. Combine them, and what you have is a major upgrade of VMware’s entire cloud infrastructure stack. I’ll highlight the major announcements and some of the detail behind them. In addition, the embargo and NDA surrounding the vSphere 5 private beta expire today. If you’re a frequent reader of blogs or the Twitter stream, you’re going to be bombarded with information at fire-hose-to-the-face pace, starting now.

vSphere 5.0 (ESXi 5.0 and vCenter 5.0)

At the heart of it all is a major new release of VMware’s type 1 hypervisor and management platform. Increased scalability and new features make virtualizing those last remaining tier 1 applications attainable.

ESX and the Service Console are formally retired as of this release. Going forward, we have just a single hypervisor to maintain, and that is ESXi. Non-Windows shops should find some happiness in a Linux-based vCenter appliance and a sophisticated web client front end. While these components are not 100% fully featured in their debut, they come close.

Storage DRS is the long-awaited complement to the CPU and memory based DRS introduced in VMware Virtual Infrastructure 3. SDRS will coordinate initial placement of VM storage in addition to keeping datastore clusters balanced (space usage and latency thresholds, including SIOC integration), with or without the use of SDRS affinity rules. Similar to DRS clusters, SDRS-enabled datastore clusters offer maintenance mode functionality which evacuates (via Storage vMotion or cold migration) registered VMs and VMDKs (still no template migration support, c’mon VMware) off of a datastore which has been placed into maintenance mode. VMware engineers recognize the value of flexibility, particularly when it comes to SDRS operations, where thresholds can be altered and tuned on a scheduled basis. For instance, I/O patterns during the day, when normal or peak production occurs, may differ from nighttime I/O patterns, when guest-based backups and virus scans occur. Separate day and night thresholds would be preferred so that SDRS doesn’t trigger on inappropriate values.
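To make that scheduling idea concrete, here is a minimal Python sketch of schedule-based threshold switching. It is illustrative only – the threshold names, values, and time windows are made up, not actual SDRS settings:

```python
# Illustrative only: schedule-based SDRS-style threshold switching.
# Threshold names and values are hypothetical, not real SDRS parameters.
from datetime import time

day_thresholds   = {"space_used_pct": 80, "latency_ms": 15}
night_thresholds = {"space_used_pct": 90, "latency_ms": 40}  # tolerate backup/scan I/O

def active_thresholds(now: time) -> dict:
    """Return the threshold set in effect at the given time of day."""
    # Treat 07:00-19:00 as the production window; everything else is "night".
    return day_thresholds if time(7) <= now < time(19) else night_thresholds

print(active_thresholds(time(12)))  # daytime: tighter thresholds
print(active_thresholds(time(2)))   # overnight: looser thresholds
```

The point of the sketch is simply that a single static threshold can’t be right for both windows; a nightly backup spike that is normal at 2am would look like an imbalance at noon.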

Profile-Driven Storage couples storage capabilities (VASA automated or manually user-defined) to VM storage profile requirements in an effort to meet guest and application SLAs. The result is the classification of a datastore, from a guest VM viewpoint, of Compatible or Incompatible at the time of evaluating VM placement on storage. Subsequently, the location of a VM can be automatically monitored to ensure profile compliance.

I mentioned VASA previously, which is a new acronym for vSphere Storage APIs for Storage Awareness. This new API allows storage vendors to expose the topology, capabilities, and state of the physical device to vCenter Server management. As mentioned earlier, this information can be used to automatically populate the capabilities attribute in Profile-Driven Storage. It can also be leveraged by SDRS for optimized operations.

The optimal solution is to stack the functionality of SDRS and Profile-Driven Storage to reduce administrative burden while meeting application SLAs through automated efficiency and optimization.

If you look closely at all of the announcements being made, you’ll notice there is only one net-new release and that is the vSphere Storage Appliance (VSA). Small to medium business (SMB) customers are the target market for the VSA. These are customers who seek some of the enterprise features that vSphere offers, like HA, vMotion, or DRS, but lack the Fibre Channel SAN, iSCSI, or NFS shared storage those features require. A VSA is deployed to each ESXi host, which presents local RAID 1+0 host storage as NFS (no iSCSI or VAAI/SAAI support at GA release time). Each VSA is managed by one and only one vCenter Server. In addition, each VSA must reside on the same VLAN as the vCenter Server. VSAs are managed by the VSA Manager, a vCenter plugin available after the first VSA is installed. Its function is to assist in deploying VSAs, automatically mount NFS exports to each host in the cluster, and provide monitoring and troubleshooting of the VSA cluster.

You’re probably familiar with the concept of a VSA, but at this point you should start to notice the key difference in VMware’s VSA: integration. In addition, it’s a VMware supported configuration with “one throat to choke,” as they say. Another feature is resiliency. The VSAs on each cluster node replicate with each other and, if required, will provide seamless fault tolerance in the event of a host node failure. In such a case, a remaining node in the cluster will take over the role of presenting a replica of the datastore which went down. Again, this process is seamless and is accomplished without any change in the IP configuration of VMkernel ports or NFS exports. With this integration in place, it was a no-brainer for VMware to also implement maintenance mode for VSAs. MM comes in two flavors: whole VSA cluster MM or single VSA node MM.

VMware’s VSA isn’t a freebie. It will be licensed. The figure below sums up the VSA value proposition:

High Availability (HA) has been enhanced dramatically. Some may say the version shipping in vSphere 5 is a complete rewrite. What was once foundational Legato AAM (Automated Availability Manager) is finally evolving to scale further with vSphere 5. The new features include the elimination of common issues such as DNS resolution dependencies, node communication over both the management network and storage, enhanced failure detection, IPv6 support, consolidated logging into one file per host, an enhanced UI, and an enhanced deployment mechanism (as if deployment wasn’t already easy enough, albeit sometimes error prone).

From an architecture standpoint, HA has changed dramatically. HA has effectively gone from five (5) failover coordinator hosts to just one (1) in a Master/Slave model. There is no longer a concept of Primary/Secondary HA hosts; however, if you still want to think of it that way, there is now one (1) primary host (the master) and all remaining hosts are secondary (the slaves). That said, I would consider it a personal favor if everyone would use the correct version-specific terminology – less confusion when assumptions have to be made (not that I like assumptions either, but I digress).

The FDM (Fault Domain Manager) Master does what you might traditionally expect: it monitors and reacts to slave host and VM availability. It also updates its inventory of the hosts in the cluster and of the protected VMs each time a VM power operation occurs.

Slave hosts have responsibilities as well. They maintain a list of powered-on VMs. They monitor local VMs and forward significant state changes to the Master. They provide VM health monitoring and any other HA features which do not require central coordination. They monitor the health of the Master and participate in the election process should the Master fail (the host with access to the most datastores wins, with ties broken by the lexically highest moid – note that as strings, “99” > “100”).
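The election tiebreaker just described can be sketched in a few lines of Python. This is an illustration of the comparison logic only, with hypothetical host names and datastore counts – not VMware’s actual implementation:

```python
# Hypothetical sketch of the FDM master election tiebreaker described above.
# Host names, moids, and datastore counts are made up for illustration.

def election_key(host):
    """Higher key wins: most datastores first, then lexically highest
    managed object ID. Note the moid compares as a string, so "99" > "100"."""
    return (host["datastore_count"], host["moid"])

candidates = [
    {"name": "esx01", "datastore_count": 4, "moid": "host-99"},
    {"name": "esx02", "datastore_count": 4, "moid": "host-100"},
    {"name": "esx03", "datastore_count": 3, "moid": "host-200"},
]

master = max(candidates, key=election_key)
print(master["name"])  # esx01: ties esx02 on datastores, but "host-99" > "host-100"
```

The lexical comparison is the counterintuitive part: because moids compare character by character as strings, a numerically smaller moid can win the tie.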

Another new feature in HA is the ability to leverage storage to facilitate the sharing of stateful heartbeat information (known as Heartbeat Datastores) if and when management network connectivity is lost. By default, vCenter picks two datastores for backup HA communication. The choices are based on how many hosts have connectivity to each datastore and whether the storage is on different arrays. Of course, a vSphere administrator may manually choose the datastores to be used. Hosts manipulate HA information on a datastore based on the datastore type. On VMFS datastores, the Master reads the VMFS heartbeat region. On NFS datastores, the Master monitors a heartbeat file that is periodically touched by the Slaves. VM availability is reported via a file created by each Slave which lists the powered-on VMs. Multiple-Master coordination is performed by using file locks on the datastore.
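As a rough illustration of that selection preference (and emphatically not VMware’s actual algorithm), a sketch might rank datastores by host connectivity and then try to spread the two picks across different arrays:

```python
# Illustrative sketch only: rank datastores by how many hosts see them,
# then prefer the second pick to live on a different array.
# Datastore names, host counts, and array names are hypothetical.

def pick_heartbeat_datastores(datastores, count=2):
    # Sort by host connectivity, most-connected first.
    ranked = sorted(datastores, key=lambda d: d["hosts"], reverse=True)
    picks = [ranked[0]]
    for d in ranked[1:]:
        if len(picks) == count:
            break
        # Prefer a datastore on an array we haven't used yet.
        if d["array"] not in {p["array"] for p in picks}:
            picks.append(d)
    # Fall back to same-array datastores if array diversity isn't possible.
    for d in ranked:
        if len(picks) == count:
            break
        if d not in picks:
            picks.append(d)
    return picks

ds = [
    {"name": "ds1", "hosts": 8, "array": "arrayA"},
    {"name": "ds2", "hosts": 8, "array": "arrayA"},
    {"name": "ds3", "hosts": 7, "array": "arrayB"},
]
print([d["name"] for d in pick_heartbeat_datastores(ds)])  # ['ds1', 'ds3']
```

Note how ds3 beats ds2 despite lower connectivity, because it sits on a second array – exactly the kind of trade-off the feature description implies.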

As discussed earlier, there are a number of GUI enhancements which were put in place to monitor and configure HA in vSphere 5. I’m not going to go into each of those here as there are a number of them. Surely there will be HA deep dives in the coming months. Suffice it to say, they are all enhancements which stack to provide ease of HA management, troubleshooting, and resiliency.

Another significant advance in vSphere 5 is Auto Deploy, which integrates with Image Builder, vCenter, and Host Profiles. The idea here is centrally managed, stateless hardware infrastructure. An ESXi host PXE boots an image profile from the Auto Deploy server. Unique host configuration is provided by an answer file or VMware Host Profiles (previously an Enterprise Plus feature). Once booted, the host is added to the vCenter host inventory. Statelessness is not a newly introduced concept, so the benefits are strikingly similar to, say, ESXi boot from SAN: no local boot disk (right-sized storage, increased storage performance across many spindles), scaling to support many hosts, and decoupling of the host image from the host hardware – statelessness defined. It may take some time before I warm up to this feature. Honestly, it’s another vCenter dependency, and quite a critical one given the platform services it provides.

For a more thorough list of anticipated vSphere 5 “what’s new” features, take a look at this release from virtualization.info.

vCloud Director 1.5

Up next is a new release of vCloud Director version 1.5 which marks the first vCD update since the product became generally available on August 30th, 2010. This release is packed with several new features.

Fast Provisioning is the space-saving linked clone support missing in the GA release. Linked clones can span multiple datastores and multiple vCenter Servers. This feature will go a long way in closing the parity gap between vCD and VMware’s sunsetting Lab Manager product.

3rd party distributed switch support means vCD can leverage virtualized edge switches such as the Cisco Nexus 1000V.

The new vCloud Messages feature connects vCD with existing AMQP based IT management tools such as CMDB, IPAM, and ticketing systems to provide updates on vCD workflow tasks.

vCD 1.5 adds support for vSphere 5 including Auto Deploy and virtual hardware version 8 (32 vCPU and 1TB vRAM). In this regard, VMware extends new vSphere 5 scalability limits to vCD workloads. Boiled down: Any tier 1 app in the private/public cloud.

Last but not least, vCD now integrates with vShield to provide IPSec VPN and 5-tuple firewall capability.

vShield 5.0

VMware’s message about vShield is that it has become a fundamental component in consolidated private cloud and multi-tenant VMware virtualized datacenters. While traditional security infrastructure can take significant time and resources to implement, there’s an inherent efficiency in leveraging security features baked into and native to the underlying hypervisor.

There are no changes in vShield Endpoint; however, VMware has introduced static routing in vShield Edge (as an alternative to NAT) for external connections, as well as certificate-based VPN connectivity.

Site Recovery Manager 5.0

Another major announcement from VMware is the introduction of SRM 5.0. SRM has already been quite successful in providing simple and reliable DR protection for the VMware virtualized datacenter. Version 5 boasts several new features which enhance functionality.

Replication between sites can be achieved in a more granular, per-VM (or even sub-VM) fashion, between different storage types, and it’s handled natively by vSphere Replication (vSR). There is also more choice in seeding the initial full replica. The result is a simplified RPO.

Another new feature in SRM is Planned Migration, which facilitates the migration of protected VMs from site to site before a disaster actually occurs. This could also be used in advance of datacenter maintenance. Perhaps your policy is to run your business 50% of the time from the DR site. The workflow assistance makes such migrations easier. It’s a downtime avoidance mechanism, which makes it useful in several cases.

Failback can be achieved once the VMs are re-protected at the recovery site and the replication flow is reversed. It’s simply another push of the big button to go the opposite direction.

Feedback from customers has influenced UI enhancements. Unification of sites into one GUI is achieved without Linked Mode or multiple vSphere Client instances. Shadow VMs take on a new look at the recovery site. Improved reporting for audits.

Other miscellaneous notables are IPv6 support, a performance increase in guest VM IP customization, the ability to execute scripts inside the guest VM (in-guest callouts), new SOAP-based APIs on the protected and recovery sides, and a dependency hierarchy for protected multi-tiered applications.

In summary, this is a magnificent day for all of VMware as they have indeed raised the bar with their market leading innovation. Well done!

On Tuesday July 12th, VMware CEO Paul Maritz and CTO Steve Herrod are hosting a large campus and worldwide event where they plan to make announcements about the next generation of cloud infrastructure.

The event kicks off at 9am PDT and is formally titled “Raising the Bar, Part V”. You can watch it online by registering here. The itinerary is as follows:

Sometimes it’s the little things that can make life easier. This post is actually a fork from the one I wrote previously on a DPM issue. Shortly after that post, it was pointed out that I was reading DPM Priority Recommendations incorrectly. Indeed that was the case.

Where did I go wrong? Look at the priority descriptions between DRS and DPM, where both slide bars are configured with the same aggressiveness:

My brief interpretation was that a higher recommendation meant a higher number (i.e., that priority 4 is a higher recommendation than priority 3). It’s actually the opposite that is true. A higher recommendation is a lower number.
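A tiny sketch makes the semantics concrete: with a lower number meaning a stronger recommendation, a slider set at Priority 3 applies recommendations rated 1 through 3 and ignores 4 and 5. The function name here is mine, purely for illustration:

```python
# Illustrative sketch of DRS/DPM priority semantics as I now understand them:
# priority 1 is the strongest recommendation, priority 5 the weakest.
# The function name is hypothetical, not a real API.

def should_apply(recommendation_priority: int, slider_priority: int) -> bool:
    """Apply any recommendation at or stronger than (i.e., numerically at
    or below) the slider's threshold."""
    return recommendation_priority <= slider_priority

print(should_apply(2, 3))  # True  - priority 2 is stronger than the threshold
print(should_apply(4, 3))  # False - priority 4 is weaker than the threshold
```

Had my original reading been correct, the comparison would have pointed the other way – which is exactly the ambiguity the UI leaves open.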

I believe there is too much left open to interpretation on the DPM screen as to what is higher and what is lower. The DRS configuration makes sense because it’s clear what is going to be applied; there’s no definition of high or low to be (mis-)interpreted. The fix? Make the DPM configuration screen mirror the DRS configuration screen. Development consistency goes a long way. As a frequent user of the tools, I expect it. I view UI inconsistency as sloppy.

If you are a VMware DPM product manager, please see my previous post VMware DPM Issue.

I’ve been running into a DPM issue in the lab recently. Allow me to briefly describe the environment:

3 vCenter Servers 4.1 in Linked Mode

1 cluster with 2 hosts

ESX 4.1, 32GB RAM, ~15% CPU utilization, ~65% Memory utilization, host DPM set for disabled meaning the host should never be placed in standby by DPM.

ESXi 4.1, 24GB RAM, ~15% CPU utilization, ~65% Memory utilization, host DPM set for automatic meaning the host is always a candidate to be placed in standby by DPM.

Shared storage

DRS and DPM enabled for full automation (both configured at Priority 4, almost the most aggressive setting)

Up until recently, the ESX and ESXi hosts weren’t as loaded and DPM was working reliably. Each host had 16GB RAM installed. When aggregate load was light enough, all VMs were moved to the ESX host and the ESXi host was placed in standby mode by DPM. Life was good.

There has been much activity in the lab recently. The ESX and ESXi host memory was upgraded to 32GB and 24GB respectively. Many VMs were added to the cluster and powered on for various projects. The DPM configuration remained as is. Now what I’m noticing is that, with a fairly heavy memory load on both hosts in the cluster, DPM moves all VMs to the ESX host and places the ESXi host in standby mode. This places a tremendous amount of memory pressure and overcommit on the solitary ESX host. This extreme condition is observed by the cluster and, nearly as quickly, the ESXi host is taken back out of standby mode to balance the load. Then, maybe about an hour later, the process repeats itself.

I then configured DPM for manual mode so that I could examine the recommendations being made by the calculator. The VMs were being evacuated for the purposes of DPM via a Priority 3 recommendation, which is halfway between the Conservative and Aggressive settings.

What is my conclusion? I’m surprised at the perceived increase in aggressiveness of DPM. In order to avoid the extreme memory overcommit, I’ll need to configure the DPM slide bar for Priority 2. In addition, I’d like to get a better understanding of the calculation. I have a difficult time believing the amount of memory overcommit being deemed acceptable in a neutral configuration (Priority 3), which falls halfway between conservative and aggressive. On top of that, I’m not a fan of a host continuously entering and exiting standby mode, along with the flurry of vMotion activity which results. This tells me that the calculation isn’t accounting for the amount of memory pressure which actually occurs once a host goes into standby mode, or coincidentally there are significant shifts in the workload patterns shortly after each DPM operation.

“Haven’t we learned from Hollywood what happens when the machines become self-aware?”

I got a good chuckle. He took my comment of VMware becoming “self-aware” exactly where I wanted it to go. A reference to The Terminator series of films in which a sophisticated computer defense system called Skynet becomes self-aware and things go downhill for mankind from there.

Further testing in the lab revealed that this condition is triggered by a vCenter VM and/or a VMware Update Manager (VUM) VM. I understand from other colleagues on the Twitterverse that they’ve seen the same symptoms occur with patch staging.

The workaround is to manually place the host in maintenance mode, at which time it has no problem whatsoever evacuating all VMs, including infrastructure VMs. At that point, the host in maintenance mode can be remediated.

VMware Update Manager has apparently become self-aware in that it detects when its infrastructure VMs are running on the same host hardware which is to be remediated. Self-awareness in and of itself isn’t bad; however, the way it’s integrated into the feature is. Unfortunately for the humans, this is a step backwards in functionality and a reduction in efficiency for a task which was once automated. Previously, a remediation task had no problem evacuating all VMs from a host, infrastructure or not. What we have now is… well… consider the following pre- and post-“self-awareness” remediation steps:

Update 5/5/10: I received this response back on 3/5/10 from VMware but failed to follow up with finding out if it was ok to share with the public. I’ve received the blessing now so here it is:

[It] seems pretty tactical to me. We’re still trying to determine if this was documented publicly, and if not, correct the documentation and our processes.

We introduced this behavior in vSphere 4.0 U1 as a partial fix for a particular class of problem. The original problem is in the behavior of the remediation wizard if the user has chosen to power off or suspend virtual machines in the Failure response option.

If a stand-alone host is running a VM with VC or VUM in it and the user has selected those options, the consequences can be drastic – you usually don’t want to shut down your VC or VUM server when the remediation is in progress. The same applies to a DRS disabled cluster.

In DRS enabled cluster, it is also possible that VMs could not be migrated to other hosts for configuration or other reasons, such as a VM with Fault Tolerance enabled. In all these scenarios, it was possible that we could power off or suspend running VMs based on the user selected option in the remediation wizard.

To avoid this scenario, we decided to skip those hosts totally in first place in U1 time frame. In a future version of VUM, it will try to evacuate the VMs first, and only in cases where it can’t migrate them will the host enter a failed remediation state.

One work around would be to remove such a host from its cluster, patch the cluster, move the host back into the cluster, manually migrate the VMs to an already patched host, and then patch the original host.

It would appear VMware intends to grant us back some flexibility in future versions of vCenter/VUM. Let’s hope so. This implementation leaves much to be desired.

Update 5/6/10: LucD created a blog post titled Counter the self-aware VUM. In this blog post you’ll find a script which finds the ESX host(s) that is/are running the VUM guest and/or the vCenter guest and will vMotion the guest(s) to another ESX host when needed.

We can put a man on the moon and we can hot migrate virtual machines with SMP and gigs of RAM, but we can’t create anti-affinity rules with three or more VMs. This has been a thorn in my side since 2006, long before I requested it fixed in February 2007 on the VMTN Product and Feature Suggestions forum.

VMware updated KB article 1006473 on 3/26 outlining anti-affinity rule behavior when using three or more VMs:

“This is expected behavior, as anti affinity rules can be set only for 2 virtual machines.

When a third virtual machine is added any rule becomes disabled (with 2.0.2 or earlier).

There has been a slight change in behavior with VirtualCenter 2.5, wherein input validation occurs, where a third virtual machine added produces a warning message indicating a maximum of two virtual machines only can be added to this rule.

To workaround this, create more rules to cover all of the combinations of virtual machines.”

That last sentence is what has been burning my cookies for the longest time. In my last environment, I had several NLB VMs which could not be on the same host, for load balancing and redundancy purposes. Rather than create a minimal number of rules to intelligently handle all of the VMs, I was left with no choice but to create a separate rule for each potentially deadly combination.
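The combinatorics behind that complaint are simple: keeping n VMs mutually apart with two-VM rules requires one rule per pair, or n*(n-1)/2 rules in total. A quick sketch with made-up VM names:

```python
# Why two-VM anti-affinity rules don't scale: separating n VMs requires
# a rule for every pair, n*(n-1)/2 rules. VM names below are hypothetical.
from itertools import combinations

nlb_vms = ["nlb01", "nlb02", "nlb03", "nlb04"]
rules = list(combinations(nlb_vms, 2))

for a, b in rules:
    print(f"anti-affinity rule: {a} <-> {b}")
print(len(rules))  # 6 rules for just 4 VMs
```

A single four-member rule would replace all six of these; at eight VMs the pairwise approach balloons to 28 rules, which is exactly why this limitation stings.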