Archive for November, 2010

I wanted to pass on some information posted by Joerg Moellenkamp at c0t0d0s0.org – some good news for Sun/ZFS users out there about the availability of Solaris 11 Express 2010.11, links to details on ZFS encryption features in Solaris 11 Express, and clarification of the “production use” guidelines. Here are the pull quotes from his posting:

“Darren (Moffat) wrote three really interesting articles about ZFS encryption: The first one is Introducing ZFS Crypto in Oracle Solaris 11 Express. This blog entry gives you a first overview how to use encryption for ZFS datasets. The second one…”

“There is a long section in the FAQ about licensing and production use: The OTN license just covers development, demo and testing use (Question 14) . However you can use Solaris 11 Express on your production system as well…”

In this In-the-Lab segment we’re going to look at how to recover from a failed ZFS version update in case you’ve become ambitious with your NexentaStor installation after the last Short-Take on ZFS/ZPOOL versions. If you used the “root shell” to make those changes, chances are your grub is failing after reboot. If so, this blog can help, but before you read on, observe this necessary disclaimer:

NexentaStor is an appliance operating system, not a general purpose one. The accepted way to manage the system volume is through the NMC shell and NMV web interface. Using a “root shell” to configure the file system(s) is unsupported and may void your support agreement(s) and/or license(s).

That said, let’s assume that you updated the syspool filesystem and zpool to the latest versions using the “root shell” instead of the NMC (i.e. following a system update where zfs and zpool warnings declare that your pool and filesystems are too old, etc.) In such a case, the resulting syspool will not be bootable until you update grub (this happens automagically when you use the NMC commands.) When this happens, you’re greeted with the following boot prompt:

grub>

Grub is now telling you that it has no idea how to boot your NexentaStor OS. Chances are, two things need to happen before your system boots again: the boot archive must be rebuilt against your intended boot checkpoint, and grub must be reinstalled to the disk(s) in the syspool.

We’ll update both in the same recovery session to save time (this assumes you know or have a rough idea about your intended boot checkpoint – it is usually the highest-numbered rootfs-nmu-NNN checkpoint, where NNN is a three-digit number.) The first step is to load the recovery console. This could have been done from the “Safe Mode” boot menu option if grub was still active. However, since grub is blown away, we’ll boot from the latest NexentaStor CD and select the recovery option from the menu.

Import the syspool

Then, we log in as “root” (empty password.) From this “root shell” we can import the existing (disks connected to active controllers) syspool with the following command:

# zpool import -f syspool

Note the use of the “-f” flag to force the import of the pool. Chances are, the pool will not have been “destroyed” or “exported,” so zpool will “think” the pool belongs to another system (your boot system, not the rescue system). As a precaution, zpool assumes that the pool is still “in use” by the “other system” and the import is rejected to avoid “importing an imported pool” – which would be completely catastrophic.

With the syspool imported, we need to mount the correct (latest) checkpointed filesystem as our boot reference for grub, destroy the local zfs.cache file (in case the pool disks have been moved but are still all present), update the boot archive to correspond to the mounted checkpoint, and install grub to each disk in the pool (i.e. each mirror member).

List the Checkpoints

# zfs list -r syspool

From the resulting list, we’ll pick our highest-numbered checkpoint; for the sake of this article let’s say it’s “rootfs-nmu-013” and mount it.
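A hedged sketch of this step, using the example checkpoint name above and paths typical of NexentaStor’s OpenSolaris base (your checkpoint name and paths may differ):

```shell
# Mount the selected checkpoint at a temporary mount point
mkdir -p /tmp/syspool
mount -F zfs syspool/rootfs-nmu-013 /tmp/syspool

# Remove the stale zpool cache in case the pool disks have moved
rm -f /tmp/syspool/etc/zfs/zpool.cache

# Rebuild the boot archive against the mounted checkpoint
bootadm update-archive -R /tmp/syspool
```

Note that `bootadm update-archive -R` points the boot-archive rebuild at the alternate root we just mounted, rather than the rescue environment’s own root.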

Install Grub to Each Mirror Disk
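A sketch of the grub installation, assuming a two-disk mirror – the device names are examples only, so substitute your actual mirror members; the stage files come from the checkpoint mounted in the previous step:

```shell
# Install grub to each disk in the syspool mirror; the -m flag
# also writes stage1 to the master boot record.
# c1t0d0s0 and c2t0d0s0 are example devices -- substitute your own.
installgrub -m /tmp/syspool/boot/grub/stage1 \
    /tmp/syspool/boot/grub/stage2 /dev/rdsk/c1t0d0s0
installgrub -m /tmp/syspool/boot/grub/stage1 \
    /tmp/syspool/boot/grub/stage2 /dev/rdsk/c2t0d0s0
```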

Unmount and Reboot

# umount /tmp/syspool
# sync
# reboot

Now, the system should be restored to a bootable configuration based on the selected system checkpoint. A similar procedure can be found on Nexenta’s site when using the “Safe Mode” boot option. If you follow that process, you’ll quickly encounter an error – likely intentional and meant to elicit a call to support for help. See if you can spot the step…

As features are added to ZFS – the ZFS (filesystem) code may change and/or the underlying ZFS POOL code may change. When features are added, older versions of ZFS/ZPOOL will not be able to take advantage of these new features without the ZFS filesystem and/or pool being updated first.

Since ZFS filesystems exist inside of ZFS pools, the ZFS pool may need to be upgraded before a ZFS filesystem upgrade may take place. For instance, in ZFS pool version 24, support for system attributes was added to ZFS. To allow ZFS filesystems to take advantage of these new attributes, ZFS filesystem version 4 (or higher) is required. The proper order to upgrade would be to bring the ZFS pool up to at least version 24, and then upgrade the ZFS filesystem(s) as needed.
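The ordering above can be sketched with the stock upgrade commands – “tank” is a hypothetical pool name, and the version numbers are from the example in the preceding paragraph:

```shell
# Upgrade the pool first, to at least version 24 in this example...
zpool upgrade -V 24 tank

# ...then recursively upgrade its filesystems to version 4
zfs upgrade -V 4 -r tank
```

Omitting the `-V` flag upgrades to the latest version the running system supports; pinning a version, as shown, keeps the pool readable by older systems you may still need to import it on.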

Systems running a newer version of ZFS (pool or filesystem) may “understand” an earlier version. However, older versions of ZFS will not be able to access ZFS streams from newer versions of ZFS.

For NexentaStor users, the supported ZFS filesystem versions for your release can be listed with “zfs upgrade -v”.
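The version tables depend on your NexentaStor release, so rather than reproduce them here, they can be generated on your own system – these commands print the supported versions and features without changing anything:

```shell
# List ZFS filesystem versions this system understands
zfs upgrade -v

# List ZFS pool versions this system understands
zpool upgrade -v
```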

I came across a recent post by Chad Sakac (VP, VMware Alliance at EMC) discussing the issue of how vendors drive customer specifications down from broader goals to individual features or implementation sets (I’m sure VCE was not in mind at the time.) When vendors insist on framing the “client argument” in terms of specific features and proprietary approaches, I have to agree that Chad is spot on. Here’s why:

First, it helps when vendors move beyond the “simple thinking” of infrastructure elements as a grid of point solutions and toward an “organic marriage of tools” – often with overlapping qualities. Some marriages begin with specific goals, some develop them along the way and others change course drastically and without much warning. The rigidness of point approaches rarely accommodates growth beyond the set of assumptions that created it in the first place. Likewise, the “laser focus” on specific features detracts from the overall goal: the present and future value of the solution.

When I married my wife, we both knew we wanted kids. Some of our friends married and “never” wanted kids, only to discover a child on the way and subsequent fulfillment through raising them. Still, others saw a bright future strained with incompatibility and the inevitable divorce. Such is the way with marriages.

Second, it takes vision to solve complex problems. Our church (Church of the Highlands in Birmingham, Alabama) takes a very cautious position on the union between souls: requiring that each new couple seeking a marriage give it the due consideration and compatibility testing necessary to have a real chance at a successful outcome. A lot of “problems” we would encounter were identified before we were married, and when they finally popped-up we knew how to identify and deal with them properly.

Couples that see “counseling” as too obtrusive (or unnecessary) have other options. While the initial investment of money is often equivalent, the return on investment is not so certain. Uncovering incompatibilities “after the sale” makes for a difficult and too often doomed outcome (hence the 50% divorce rate.)

This same drama plays out in IT infrastructures where equally elaborate plans, goals and unexpected changes abound. You date (prospecting and trials), you marry (close) and are either fruitful (happy client), disappointed (unfulfilled promises) or divorce. Often, it’s not the plan that failed but the failure to set/manage expectations and address problems that causes the split.

Our pastor could not promise that our marriage would last forever: our success is left to God and the two of us. But he did help us to make decisions that would give us a chance at a fruitful union. Likewise, no vendor can promise a flawless outcome (if they do, get a second opinion), but they can (and should) provide the necessary foundation for a successful marriage of the technology to the business problem.

Third, the value of good advice is not always obvious and never comes without risk. My wife and I were somewhat hesitant on counseling before marriage because we were “in love” and were happy to be blind to the “problems” we might face. Our church made it easy for us: no counseling, no marriage. Businesses can choose to plot a similar course for their clients with respect to their products (especially the complex ones): discuss the potential problems with the solution BEFORE the sale or there is no sale. Sometimes this takes a lot of guts – especially when the competition takes the route of oversimplification. Too often IT sales see identifying initial problems (with their own approach) as too high a risk and too great an obstacle to the sale.

Ultimately, when you give due consideration to the needs of the marriage, you have more options and are better equipped to handle the inevitable trials you will face. Whether it’s an unexpected child on the way, or an unexpected up-tick in storage growth, having the tools in-hand to deal with the problem lessens its severity. The point is, being prepared is better than the assumption of perfection.

Finally, the focus has to be what YOUR SOLUTION can bring to the table: not how you think your competition will come-up short. In Chad’s story, he’s identified vendors disqualifying one another’s solutions based on their (institutional) belief (or disbelief) in a particular feature or value proposition. That’s all hollow marketing and puffery to me, and I agree completely with his conclusion: vendors need to concentrate on how their solution(s) provide present and future value to the customer and refrain from the “art” of narrowly framing their competitors.

Features don’t solve problems: the people using them do. The presence (or absence) of a feature simply changes the approach (i.e. the fallacy of feature parity). As Chad said, it’s the TOTALITY of the approach that derives value – and that goes way beyond individual features and products. It’s clear to me that a lot of counseling takes place between Sakac’s EMC team and their clients to reach those results. Great job, Chad, you’ve set a great example for your team!

Virtual machines were once relegated to the second-class status of single-core vCPU configurations. To get multiple process threads, you had to add one “virtual CPU” for each thread. This approach, while functional, had potentially serious software licensing ramifications. This topic drew some attention on Jason Boche’s blog back in July, 2010 with respect to vSphere 4.1.

With vSphere 4.0 Update 2 and vSphere 4.1 you have the option of using an advanced configuration setting to change the “virtual cores per socket,” allowing thread-count needs to have a lesser impact on OS and application licensing. The advanced configuration parameter name is “cpuid.coresPerSocket” (default 1) and acts as a divisor for the virtual hardware setting “CPUs,” which must be an integral multiple of the “cpuid.coresPerSocket” value. More on the specifics and limitations of this setting can be found in “Chapter 7, Configuring Virtual Machines” (page 79) of the vSphere Virtual Machine Administrator Guide for vSphere 4.1. [Note: See also VMware KB1010184.]
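For illustration, a virtual machine presenting two sockets with three cores each (six vCPUs total) would carry .vmx entries along these lines – the values are an example only, and on a running system the parameter is normally set through the VI Client’s advanced configuration dialog rather than by editing the file:

```
numvcpus = "6"
cpuid.coresPerSocket = "3"
```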

The value of “cpuid.coresPerSocket” is effectively ignored when “CPUs” is set to 1. In case “cpuid.coresPerSocket” is an imperfect divisor, the power-on operation will fail, with an error logged in the VI Client’s task history.

While the configuration guide clearly states (as Jason Boche rightly pointed out in his blog):

The number of virtual CPUs must be divisible by the number of cores per socket. The coresPerSocket setting must be a power of two.

– Virtual Machine Configuration Guide, vSphere 4.1

We’ve found that “cpuid.coresPerSocket” simply needs to be a perfect divisor of the “CPUs” value. This tracks much better with prior versions of vSphere, where “odd numbered” socket/CPU counts were allowed; odd numbers of cores-per-socket therefore work, provided the division of CPUs by coresPerSocket is integral. Suffice it to say, if the manual says “power of two” (1, 2, 4, 8, etc.), then those are likely the only “supported” configurations available. Any other configuration that “works” (i.e. 3, 5, 6, 7, etc.) will likely be unsupported by VMware in the event of a problem.

That said, odd values of “cpuid.coresPerSocket” do work just fine. Since SOLORI has a large number of AMD-only eco-systems, it is useful to test configurations that match the physical core count of the underlying processors (i.e. 2, 3, 4, 6, 8, 12). For instance, we were able to create a single, multi-core virtual CPU with 3-cores (CPUs = 3, cpuid.coresPerSocket = 3) and run Windows Server 2003 without incident:

Windows Server 2003 with virtual "tri-core" CPU

It follows, then, that we were likewise able to run a 2P virtual machine with a total of 6-cores (3-per CPU) running the same installation of Windows Server 2003 (CPUs = 6, cpuid.coresPerSocket = 3):

Virtual Dual-processor (2P), Tri-core (six cores total)

Here are the relevant vmware log messages associated with this 2P, six total core virtual machine boot-up:

It’s clear from the log that each virtual core spawns a new virtual machine monitor thread within the VMware kernel. Confirming the distribution of cores from the OS perspective is somewhat nebulous due to the mismatch of the CPU’s ID (follows the physical CPU on the ESX host) and the “arbitrary” configuration set through the VI Client. CPU-z shows how this can be confusing:

CPU#1 as described by CPU-z

CPU#2 as described by CPU-z

Note that CPU-z identifies the first 4-cores with what it calls “Processor #1” and the remaining 2-cores with “Processor #2” – this appears arbitrary due to CPU-z’s “knowledge” of the physical CPU layout. In (virtual) reality, this assessment by CPU-z is incorrect in terms of cores per CPU; however, it does properly demonstrate the existence of two (virtual) CPUs. Here’s the same VM with a “cpuid.coresPerSocket” of 6 (again, not 1, 2, 4 or 8, as supported):

How does this help with per-CPU licensing in a virtual world? It effectively evens the playing field between physical and virtual configurations. In the past (VI3 and early vSphere 4) multiple virtual threads were only possible through the use of additional virtual sockets. This paradigm did not track with OS licensing and CPU-socket-aware application licensing since the OS/applications would recognize the additional threads as CPU sockets in excess of the license count.

Also, in NUMA systems where core/socket/memory affinity is a potential performance issue, addressing physical/virtual parity is potentially important. This could have performance implications for AMD 2400/6100 and Intel 5600 systems where 6 and 12 cores/threads are delivered per physical CPU socket.


SOLORI's Take and Quick Take posts express my personal opinion unless explicitly attributed to other sources. Where possible, supporting facts are presented to properly frame and ground these opinions, however they are presented "AS-IS" without regard to warranty or promise: expressed or implied.

Comments are open to all registered users and may be edited for decorum. Spam is deleted with prejudice.