Archive for the ‘Management’ Category

Thanks to a spate of upgrades to vSphere 5.1, I recently (re)discovered the following inconvenient result when applying an update to a DRS cluster from Update Manager (5.1.0.13071, using vCenter Server Appliance 5.1.0 build 947673):

Immediately I thought: “Great! I left a host-only ISO connected to these VMs.” However, that assumption was as flawed as Update Manager’s assumption that the workloads cannot be vMotion’d without disconnecting the removable media. In fact, the removable media indicated was connected to a shared ISO repository available to all hosts in the cluster. However, I was to blame and not Update Manager, as I had forgotten that Update Manager’s default response to removable media is to abort the process. Cluster remediation is a powerful feature made possible by Distributed Resource Scheduler (DRS) in Enterprise (and above) vSphere editions, and it may be new to many (especially uplifted “Advanced AK” users), so it seemed like something worth reviewing and blogging about.

Why is this a big deal?

More to the point, why does this seem to run contrary to a “common sense” response?

First, the manual process for remediating a host in a DRS cluster would include:

Applying “Maintenance Mode” to the host,

Selecting the appropriate action for “powered-off and suspended” workloads, and

Allowing DRS to choose placement and finally vMotion those workloads to an alternate host.

In the case of VMs with removable media attached, this set of actions will result in the workloads being vMotion’d (without warning or hesitation) so long as the other hosts in the cluster have access to the removable media source (i.e. shared storage, not “Host Device.”) However, in the case of Update Manager remediation, the following are documented road blocks to a successful remediation (without administrative override):

A CD/DVD drive is attached (any method),

A floppy drive is attached (any method),

HA admission control prevents migration of the virtual machine,

DPM is enabled on the cluster,

EVC is disabled on the cluster,

DRS is disabled on the cluster (preventing migration),

Fault Tolerance (FT) is enabled for a VM on a host in the cluster.

Therefore it is “by design” that a scheduled remediation would have failed – even if the removable media would be eligible for vMotion. To assist in the evaluation of “obstacles to successful deferred remediation” a cluster remediation report is available (see below).

In fact, the report will list all possible road blocks to remediation whether or not matching overrides are selected (potentially misleading, certainly not useful for predicting the outcome of the remediation attempt). While this too is counterintuitive, it serves as a reminder of the show-stoppers to successful remediation. For the offending “removable media” override, the appropriate check-box can be found on the options page just prior to the remediation report:

Disabling removable media during Update Manager driven remediation.

The inclusion of this override allows Update Manager to slog through the remediation without respect to the attached status of removable media. Likewise, the other remediation overrides will enable successful completion of the remediation process; these overrides are:

Maintenance Mode Settings:

VM Power State prior to remediation: Do not change, Power off, Suspend

Temporarily disable any removable media devices;

Retry maintenance mode in case of failure (delay and attempts);

Cluster Settings:

Temporarily Disable Distributed Power Management (forces “sleeping” hosts to power-on prior to next steps in remediation);

These settings are available at the time of remediation scheduling and as host/cluster defaults (Update Manager Admin View.)

SOLORI’s Take: So while it follows that the remediation process is NOT as similar to the manual process as one might think, it still can be made to function accordingly (almost.) There IS a big difference between disabling removable media and making vMotion-aware decisions about hosts. Perhaps VMware could take a few cycles to determine whether or not a host is bound to a removable media device (either through Host Device or local storage resource) and make a more intelligent decision about removable media.

vSphere already has the ability to identify point-resource dependencies; it would be nice to see this information more intelligently correlated where cluster management is concerned. Currently, instead of “asking” DRS for a dependency list, it seems to just ask the hosts “do you have removable media plugged into any VMs?” – and if the answer is “yes” it stops right there… Still, not very intuitive for a feature (DRS) that’s been around since Virtual Infrastructure 3 and vCenter 2.
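As a rough illustration of the “more intelligent decision” suggested above, the check could hinge on where the media is backed rather than on whether it is attached at all. The sketch below is purely hypothetical: the function name and the backing labels are invented for this example and do not correspond to any Update Manager or vSphere API.

```shell
# Hypothetical helper: classify a removable-media backing by whether it
# ties the VM to its current host. The labels are made up for illustration.
media_pins_vm() {
    case "$1" in
        host-device|local-datastore)
            # Backed by a physical drive or storage only this host can see.
            echo "pinned" ;;
        shared-datastore|disconnected)
            # Every host in the cluster can reach it (or nothing is connected).
            echo "portable" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example: media_pins_vm shared-datastore
```

Only the “pinned” case would need the media disconnected before a vMotion; the shared ISO repository described in this post would fall under “portable.”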

If you’re still not quite ready to upgrade from VI 3.x to vSphere, time may be running out on your ESX hosts to stay “current” inside of VI3 unless you act before June 1, 2011. If your VMware VI3 hosts have not been patched since November of 2010, you are at risk of losing update/patching capabilities unless you apply ESX350-201012410-BG before the deadline. This patch ONLY addresses the expiring secure key on the ESX host which will otherwise become invalid on June 1, 2011.

Enablement of Intel Xeon Processor 3400 Series – Support for the Intel Xeon processor 3400 series has been added. Support includes Enhanced VMotion capabilities. For additional information on previous processor families supported by Enhanced VMotion, see Enhanced VMotion Compatibility (EVC) processor support (KB 1003212).

Driver Update for Broadcom bnx2 Network Controller – The driver for bnx2 controllers has been upgraded to version 1.6.9. This driver supports bootcode upgrade on bnx2 chipsets and requires bmapilnx and lnxfwnx2 tools upgrade from Broadcom. This driver also adds support for Network Controller – Sideband Interface (NC-SI) for SOL (serial over LAN) applicable to Broadcom NetXtreme 5709 and 5716 chipsets.

Driver Update for LSI SCSI and SAS Controllers – The driver for LSI SCSI and SAS controllers is updated to version 2.06.74. This version of the driver is required to provide a better support for shared SAS environments.

Newly Supported Guest Operating Systems – Support for the following guest operating systems has been added specifically for this release:

This patch comes with a roll-up approach that VMware describes this way:

Note: As part of the end of availability for some VMware Virtual Infrastructure product releases, the ESX 3.5 Update 5 upgrade package ESX350-Update05.zip has been replaced by ESX350-Update05a.zip in order to remove dependencies upon patches that will no longer be available for download. Hosts upgraded using ESX350-Update05a.zip are equivalent to those upgraded using the older package, but patch bundles released before ESX 3.5 Update 5 will not be required during the upgrade process.

In February, we detailed the installation and first use of the VMware vCenter Mobile Access appliance (version 1.0.41). In that write up, we pointed out that vCMA had some security issues and said the following:

Being HTTP-only, vCMA doesn’t lend itself to secure computing over the public Internet or untrusted intranet. Instead, it is designed to work with security layer(s) in front of it. While it IS possible to add HTTPS to the Apache/Tomcat server delivering its web application, vCMA is meant to be deployed as-is and updated as-is – it’s an appliance.

SSL Connections
By default “https” (or SSL certificate) is enabled in the appliance for the vCMA for enhanced security. You can replace the out-of-the-box certificate with your own, if needed. However, http->https redirection is currently not supported.

Other deployment considerations

The vCMA server comes with a default userid/password. For security reasons, we strongly recommended that you change root password.

If you prefer, you can set a hostname or IP address for the appliance.

Using standard Linux utilities, you can change the date and time in the appliance.

You can also upgrade the hardware version and VMware Tools in the vCMA appliance following standard procedures.

SOLORI’s Take: This welcome change circumvents any additional kludge work necessary to secure the appliance. Using an HTTPS proxy was cumbersome and kludgey in its own right and “hacking” the appliance was tricky and doomed to be reversed by the next appliance update. VMware’s move opens the door for more widespread use of vCMA and (hopefully) more interesting applications of it in the future.

In this In-the-Lab segment we’re going to look at how to recover from a failed ZFS version update in case you’ve become ambitious with your NexentaStor installation after the last Short-Take on ZFS/ZPOOL versions. If you used the “root shell” to make those changes, chances are your grub is failing after reboot. If so, this blog can help, but before you read on, observe this necessary disclaimer:

NexentaStor is an appliance operating system, not a general purpose one. The accepted way to manage the system volume is through the NMC shell and NMV web interface. Using a “root shell” to configure the file system(s) is unsupported and may void your support agreement(s) and/or license(s).

That said, let’s assume that you updated the syspool filesystem and zpool to the latest versions using the “root shell” instead of the NMC (i.e. following a system update where zfs and zpool warnings declare that your pool and filesystems are too old, etc.) In such a case, the resulting syspool will not be bootable until you update grub (this happens automagically when you use the NMC commands.) When this happens, you’re greeted with the following boot prompt:

grub>

Grub is now telling you that it has no idea how to boot your NexentaStor OS. Chances are there are two things that will need to happen before your system boots again:

We’ll update both in the same recovery session to save time (this assumes you know or have a rough idea about your intended boot checkpoint – it is usually the highest numbered rootfs-nmu-NNN checkpoint, where NNN is a three digit number.) The first step is to load the recovery console. This could have been done from the “Safe Mode” boot menu option if grub was still active. However, since grub is blown-away, we’ll boot from the latest NexentaStor CD and select the recovery option from the menu.

Import the syspool

Then, we log in as “root” (empty password.) From this “root shell” we can import the existing (disks connected to active controllers) syspool with the following command:

# zpool import -f syspool

Note the use of the “-f” flag to force the import of the pool. Chances are, the pool will not have been “destroyed” or “exported” so zpool will “think” the pool belongs to another system (your boot system, not the rescue system). As a precaution, zpool assumes that the pool is still “in use” by the “other system” and rejects the import to avoid “importing an imported pool” which would be completely catastrophic.

With the syspool imported, we need to mount the correct (latest) checkpointed filesystem as our boot reference for grub, destroy the local zpool.cache file (in case the pool disks have been moved, but are still all there), update the boot archive to correspond to the mounted checkpoint and install grub to the disk(s) in the pool (i.e. each mirror member).

List the Checkpoints

# zfs list -r syspool

From the resulting list, we’ll pick our highest-numbered checkpoint; for the sake of this article let’s say it’s “rootfs-nmu-013” and mount it.
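If the list is long, picking the highest-numbered checkpoint can be scripted. The following is a minimal sketch, assuming the checkpoints follow the syspool/rootfs-nmu-NNN naming described above; the helper name latest_checkpoint is made up for this example.

```shell
# Read dataset names on stdin (one per line) and print the
# highest-numbered syspool/rootfs-nmu-NNN entry.
latest_checkpoint() {
    grep -E '^syspool/rootfs-nmu-[0-9]{3}([[:space:]]|$)' |
        awk '{ print $1 }' |     # keep only the dataset name column
        sort -t- -k3,3n |        # sort numerically on the NNN part
        tail -n 1
}

# Example (from the recovery shell, after the import):
#   zfs list -H -o name -r syspool | latest_checkpoint
```

The -H and -o name options simply strip the header and extra columns from the zfs list output so that only dataset names reach the filter.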

Install Grub to Each Mirror Disk

Unmount and Reboot

# umount /tmp/syspool
# sync
# reboot

Now, the system should be restored to a bootable configuration based on the selected system checkpoint. A similar procedure can be found on Nexenta’s site when using the “Safe Mode” boot option. If you follow that process, you’ll quickly encounter an error – likely intentional and meant to elicit a call to support for help. See if you can spot the step…

As features are added to ZFS – the ZFS (filesystem) code may change and/or the underlying ZFS POOL code may change. When features are added, older versions of ZFS/ZPOOL will not be able to take advantage of these new features without the ZFS filesystem and/or pool being updated first.

Since ZFS filesystems exist inside of ZFS pools, the ZFS pool may need to be upgraded before a ZFS filesystem upgrade may take place. For instance, in ZFS pool version 24, support for system attributes was added to ZFS. To allow ZFS filesystems to take advantage of these new attributes, ZFS filesystem version 5 (or higher) is required. The proper order to upgrade would be to bring the ZFS pool up to at least version 24, and then upgrade the ZFS filesystem(s) as needed.
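The ordering can be sketched as a small dry-run helper that only prints the commands, pool first and filesystems second. This is an illustration under assumptions: the function name and the pool name “tank” are hypothetical, and on NexentaStor the supported route is the NMC rather than running these by hand.

```shell
# Print (do not run) the upgrade commands in the required order.
# $1 = pool name, $2 = current pool version, $3 = pool version needed
zfs_upgrade_plan() {
    pool=$1
    current=$2
    needed=$3
    # The pool must reach the needed version before its filesystems can.
    if [ "$current" -lt "$needed" ]; then
        echo "zpool upgrade -V $needed $pool"
    fi
    # Then upgrade the filesystems in the pool, recursively.
    echo "zfs upgrade -r $pool"
}

# Example: zfs_upgrade_plan tank 22 24
```

The current pool version can be read with “zpool get version” before deciding whether the first step is needed.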

Systems running a newer version of ZFS (pool or filesystem) may “understand” an earlier version. However, older versions of ZFS will not be able to access ZFS streams from newer versions of ZFS.

For NexentaStor users, here are the current versions of the ZFS filesystem (see “zfs upgrade -v”):

Anyone who’s discussed storage with me knows that I “hate” desktop drives in storage arrays. When using SAS disks as a standard, that’s typically a non-issue because there’s not typically a distinction between “desktop” and “server” disks in the SAS world. Therefore, you know I’m talking about the other “S” word – SATA. Here’s a tale of SATA woe that I’ve seen repeatedly cause problems for inexperienced ZFS’ers out there…

When volumes fail in ZFS, the “final” indicator is data corruption. Fortunately, ZFS checksums recognize corrupted data and can take action to correct and report the problem. But that’s like treating cancer only after you’ve experienced the symptoms. In fact, the failing disk will likely begin to “under-perform” well before actual “hard” errors show up as read, write or checksum errors in the ZFS pool. Depending on the reason for “under-performing” this can affect the performance of any controller, pool or enclosure that contains the disk.

Wait – did he say enclosure? Sure. Just like a bad NIC chattering on a loaded network, a bad SATA device can occupy enough of the available service time for a controller or SAS bus (i.e. JBOD enclosure) to make a noticeable performance drop in otherwise “unrelated” ZFS pools. Hence, detection of such events is an important thing. Here’s an example of an old WD SATA disk failing as viewed from the NexentaStor “Data Sets” GUI:

Something is wrong with device c5t84d0...

Device c5t84d0 is having some serious problems. Busy time is 7x higher than counterparts, and its average service time is 14x higher. As a member of a RAIDz group, the entire group is being held-back by this “under-performing” member. From this snapshot, it appears that NexentaStor is giving us some good information about the disk from the “web GUI” but this assumption would not be correct. In fact, the “web GUI” is only reporting “real time” data so long as the disk is under load. In the case of a lightly loaded zpool, the statistics may not even be reported.

However, from the command shell, historic and real-time access to per-device performance is available. The output of “iostat -exn” shows the count of all errors for devices since the last time counters were reset, and average I/O loads for each:

Device statistics from 'iostat' show error and I/O history.
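With many disks, the error columns are easier to scan with a filter. Here is a minimal sketch, assuming the usual Solaris “iostat -exn” layout in which the last five columns are s/w, h/w, trn, tot and the device name; the helper name flag_error_devices is invented for this example.

```shell
# Read "iostat -exn" output on stdin and print every device whose
# hard (h/w) or transport (trn) error counter is non-zero.
flag_error_devices() {
    awk 'NF >= 15 && $(NF-3) ~ /^[0-9]+$/ && ($(NF-3) > 0 || $(NF-2) > 0) {
        # $NF is the device; $(NF-3) is h/w, $(NF-2) is trn
        print $NF, "hw=" $(NF-3), "trn=" $(NF-2)
    }'
}

# Example: iostat -exn | flag_error_devices
```

Header lines are skipped because their h/w field is not numeric, so the filter can be fed the raw command output.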

The output of iostat clearly shows this disk has serious hardware problems. It indicates hardware errors as well as transmission errors for the device recognized as ‘c5t84d0’ and the I/O statistics – chiefly read, write and average service time – implicate this disk as a performance problem for the associated RAIDz group. So, if the device is really failing, shouldn’t there be a log report of such an event? Yes, and here’s a snip from the message log showing the error:

SCSI error with ioc_status=0x8048 reported in /var/log/messages for failing device.

However, in this case, the log is not “full” with messages of this sort. In fact, it only showed up under the stress of an iozone benchmark (run from the NexentaStor ‘nmc’ console). I can (somewhat safely) conclude this to be a device failure since at least one other disk in this group is of the same make, model and firmware revision as the culprit. The interesting aspect about this “failure” is that it does not result in a read, write or checksum error for the associated zpool. Why? Because the device is only loosely coupled to the zpool as a constituent leaf device. It also implies that the device errors were recoverable by either the drive or the device driver (mapping around a bad/hard error.)

Since these problems are being resolved at the device layer, the ZFS pool is “unaware” of the problem as you can see from the output of ‘zpool status’ for this volume:

Problems with disk device as yet undetected at the zpool layer.

This doesn’t mean that the “consumers” of the zpool’s resources are “unaware” of the problem, as the disk error has manifested itself in the zpool as higher delays, lower I/O throughput and subsequently less pool bandwidth. In short, if the error is persistent under load, the drive has a correctable but catastrophic (to performance) problem and will need to be replaced. If, however, the error goes away, it is possible that the device driver has suitably corrected for the problem and the drive can stay in place.
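Since the pool itself reports nothing, the higher delays have to be spotted at the device level. One rough way is to compare each disk’s average service time against its peers. The sketch below assumes the standard “iostat -xn” column order, where asvc_t is the eighth field and the device name is last; the helper name and the choice of threshold are illustrative only.

```shell
# Read "iostat -xn" (or -exn) output on stdin and print devices whose
# asvc_t is at least $1 times the average asvc_t of the other devices.
slow_devices() {
    awk -v factor="$1" '
        NF >= 11 && $8 ~ /^[0-9.]+$/ {
            svc[$NF] = $8 + 0      # force numeric storage
            sum += $8
            n++
        }
        END {
            if (n < 2) exit        # nothing to compare against
            for (d in svc) {
                others = (sum - svc[d]) / (n - 1)
                if (others > 0 && svc[d] >= factor * others) print d
            }
        }'
}

# Example: iostat -xn | slow_devices 5
```

Feeding it only the members of one RAIDz group keeps the comparison meaningful, since disks in different pools may carry very different loads.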

SOLORI’s Take: How do we know if the drive needs to be replaced? Time will establish an error rate. In short, running the benchmark again and watching the error counters for the device will determine if the problem persists. Eventually, the errors will either go away or they won’t. For me, I’m hoping that the disk fails, giving me an excuse to replace the whole pool with a new set of SATA “eco/green” disks for more lab play. Stay tuned…

SOLORI’s Take: In all of its flavors, 1.5Gbps, 3Gbps and 6Gbps, I find SATA drives inferior to “similarly” spec’d SAS for just about everything. In my experience, the worst SAS drives I’ve ever used have been more reliable than most of the SATA drives I’ve used. That doesn’t mean there are “no” good SATA drives, but it means that you really need to work within tighter boundaries when mixing vendors and models in SATA arrays. On top of that, the additional drive port and better typical sustained performance make SAS a clear winner over SATA (IMHO). The big exception to the rule is economy – especially where disk arrays are used for on-line backup – but that’s another discussion…

Until there is an updated release of the VMware vSphere Client, running the client on a Windows7 system will require a couple of tricks. While the basic process outlined in these notes accomplishes the task well, the use of additional “helper” batch files is not necessary. By adding the path to the “System.dll” library to your user’s environment, the application can be launched from the standard icon without further modification.

First, add the XML changes at the end of the “VpxClient.exe.config” file. The end of your config file will now look something like this:

Once the changes are made, save the “VpxClient.exe.config” file (if your workstation is secured, you may need “Administrator” privileges to save the file.) Next, copy the “System.dll” file from the “%WINDIR%\Microsoft.NET\Framework\v2.0.50727” folder on an XP/Vista machine to a newly created “lib” folder in the VpxClient’s directory. Now, you will need to update the user environment to reflect the path to “System.dll” to complete the “developer” hack.

To do this, right-click on the “Computer” menu item on the “Start Menu” and select “Properties.” In the “Control Panel Home” section, click on “Advanced system settings” to open the “System Properties” control panel. Now, click on the “Environment Variables…” button to open the Environment Variables control panel. If “DEVPATH” is already defined, simply append a semi-colon (“;”) to the existing value, followed by the path to the folder holding your copied “System.dll” file (not including “System.dll” itself). If it does not exist, create a new variable called “DEVPATH” and enter the path string in the “Variable Value” field.

The path begins with either %ProgramFiles% or %ProgramFiles(x86)% depending on whether 32-bit or 64-bit Windows7 is installed, respectively. Once the path is entered into the environment and the “System.dll” file is in place, the vSphere Client will launch and run without additional modification. Remember to remove the DEVPATH modification to the environment when a Windows7 vSphere Client is released.

Note that this workaround is not supported by VMware and that the use of the DEVPATH variable could have unforeseen consequences in your specific computing environment. Therefore, appropriate considerations should be made prior to the implementation of this “hack.” While SOLORI presents this information “AS-IS” without warranty of any kind, we can report that this workaround is effective for our Windows7 workstations, however …

In Medio Stat Veritas

SOLORI's Take and Quick Take posts express my personal opinion unless explicitly attributed to other sources. Where possible, supporting facts are presented to properly frame and ground these opinions, however they are presented "AS-IS" without regard to warranty or promise: expressed or implied.
