Archive for the ‘Operating Systems’ Category

Our open storage partner, Nexenta Systems Inc., hit a milestone this month by releasing NexentaStor 4.0.1 for general availability. This release is significant mainly because it is the first commercial release of NexentaStor based on the Open Source Illumos kernel and not Oracle’s OpenSolaris (now closed source). With this move, NexentaStor’s adhering to the company’s promise of “open source technology” that enables hardware independence and targeted flexibility.

Some highlights in 4.0.1:

Faster Install times

Better HA Cluster failover times and “easier” cluster manageability

Support for large memory host configurations – up to 512GB of DRAM per head/controller

Note, the 18TB Community Edition EULA is still hampered by the “non-commercial” language, restricting it’s use to home, education and academic (ie. training, testing, lab, etc.) targets. However, the “total amount of Storage Space” license for Community is a deviation from the Enterprise licensing (typical “raw” storage entitlement)

2.2 If You have acquired a Community Edition license, the total amount of Storage Space is limited as specified on the Site and is subject to change without notice. The Community Edition may ONLY be used for educational, academic and other non-commercial purposes expressly excluding any commercial usage. The Trial Edition licenses may ONLY be used for the sole purposes of evaluating the suitability of the Product for licensing of the Enterprise Edition for a fee. If You have obtained the Product under discounted educational pricing, You are only permitted to use the Product for educational and academic purposes only and such license expressly excludes any commercial purposes.

– NexentaStor EULA, Version 4.0; Last updated: March 18, 2014

For those who operate under the Community license, this means your total physical storage is UNLIMITED, provided your space “IN USE” falls short of 18TB (18,432 GB) at all times. Where this is important is in constructing useful arrays with “currently available” disks (SATA, SAS, etc.) Let’s say you needed 16TB of AVAILABLE space using “modern” 3TB disks. The fact that your spinning disks are individually larger than 600GB indicates that array rebuild times might run afoul of failure PRIOR to the completion of the rebuild (encountering data loss) and mirror or raidz2/raidz3 would be your best bet for array configuration.

SOLORI Note: Richard Elling made this concept exceedingly clear back in 2010, and his “ZFS data protection comparison” of 2, 3 and 4-way mirrors to raidz, raidz2 and raidz3 is still a great reference on the topic.

By “raw” licensing standards, the 3-way mirror would require a 76TB license while the raidz2 volume would require a 51TB license – a difference of 25TB in licensing (around $5,300 retail). However, under the Community License, the “cost” is exactly the same, allowing for a considerable amount of flexibility in array loadout and configuration.

Why do I need 54TiB in disk to make 16TB of “AVAILABLE” storage in Raidz2?

The RAID grouping we’ve chosen is 6-disk raidz2 – that’s akin to 4 data and 2 parity disks in RAID6 (without the fixed stripe requirement or the “write hole penalty.”) This means, on average, one third of the space consumed on-disk will be in the form of parity information. Therefore, right of the top, we’re losing 33% of the disk capacity. Likewise, disk manufacturers make TiB not TB disks, so we lose 7% of “capacity” in the conversion from TiB to TB. Additionally, we like to have a healthy amount of space reserved for new block allocation and recommend 30% unused space as a target. All combined, a 6-disk raidz array is, at best, 43% efficient in terms of capacity (by contrast, 3-way mirror is only 22% space efficient). For an array based on 3TiB disks, we therefore get only 1.3TB of usable storage – per disk – with 6-disk raidz (by contrast, 10-disk raidz nets only 160GB additional “usable” space per disk.)

SOLORI’s Take: If you’re running 3.x in production, 4.0.1 is not suitable for in-place upgrades (yet) so testing and waiting for the “non-disruptive” maintenance release is your best option. For new installations – especially inside a VM or hypervisor environment as a Virtual Storage Appliance (VSA) – version 4.0.1 presents a better option over it’s 3.x siblings. If you’re familiar with 3.x, there’s not much new on the NMV side outside better tunables and snappier response.

There have always been things about virtualizing the enterprise that have concerned me. Most boil down to Uncle Ben’s admonishment to his nephew, Peter Parker, in Stan Lee’s Spider-Man, “with great power comes great responsibility.” Nothing could be more applicable to the state of modern virtualization today.

Back in “the day” when all this VMware stuff was scary and “complicated,” it carried enough “voodoo mystique” that (often defacto) VMware admins either knew everything there was to know about their infrastructure, or they just left it to the experts. Today, virtualization has reached such high levels of accessibility that I think even my 102 year old Nana could clone a live VM; now that is scary.

Enter Veeam Backup, et al

Case in point is Veeam Backup and Recovery 6 (VBR6). Once an infrastructure exceeds the limits of VMware Data Recovery (VDR), it just doesn’t get much easier to backup your cadre of virtual machines than VBR6. Unlike VDR, VBR6 has three modes of access to virtual machine disks:

Network – VBR6 backup server/proxy accesses virtual machine disks from ESXi hosts similar in a manner similar to the way the vSphere Client grants access to virtual machine disks across the LAN – slower, with more overhead;

For block-based storage, option (1) appears to be the best way to go: it’s fast with very little overhead in the data channel. For those of us with grey hair, think VMware Consolidated Backup proxy server and you’re on the right track; for everyone else, think shared disk environment. And that, boys and girls, is where we come to the point of today’s lesson…

Enter Windows Server, Updates

For all of its warts, my favorite aspect of VMware Data Recovery is the fact that it is a virtual appliance based on a stripped-down Linux distribution. Those two aspects say “do not tamper” better than anything these days, so admins – especially Windows admins – tend to just install and use as directed. At the very least, the appliance factor offers an opportunity for “special case” handling of updates (read: very controlled and tightly scripted).

The other “advantage” to VMDR is that is uses a relatively safe method for accessing virtual machine disks: something more akin to VBR6’s “virtual appliance” mode of operation. By allowing the ESXi host(s) to “proxy” access to the datastore(s), a couple of things are accomplished:

Unlike “Direct SAN Access” mode, no additional initiators need to be added to the target(s)’ ACL;

If the host can access the VMDK, it stands a good chance of being backed-up fairly efficiently.

However, VBR6 installs onto a Windows Server and Windows Server has no knowledge of what VMFS looks like nor how to handle VMFS disks. This means Windows disk management needs to be “tweaked” to ignore VMFS targets by disabling “automount” in VBR6 servers and VCB proxies. For most, it also means keeping up with patch management and Windows Update (or appropriate derivative). For active backup servers with a (pre-approved, tested) critical update this might go something like:

And the Wheels Came Off…

Service Pack 1 for Windows Server 2008 R2 requires a BCD update, so existing installations of VCB or VBR5/6 will fail to update. In an environment where there is no VCB or VBR5/6 testing platform, this could result in a resume writing event for the patching guy or the backup administrator if they follow Microsoft’s advice and “fix” SP1. Why?

Done, right? Possibly in more ways than one. By GLOBALLY enabling automount, rebooting Windows Server and installing SP1, you’ve opened-up the potential for Windows to write a signature to the VMFS volumes holding your critical infrastructure. Fortunately, it doesn’t have to end that way.

Avoiding the Avoidable

Veeam’s been around long enough to have some great forum participants from across the administrative spectrum. Fortunately, a member posted a solution method that keeps us well away from VMFS corruption and still solves the SP1 issue in a targeted way: temporarily mounting the “hidden” system partition instead of enabling the global automount feature. Here’s my take on the process (GUI mode):

On the pop-up, click the “Add…” button and accept the default drive letter offered, click “OK”;

Now “try again” the installation of Service Pack 1 and reboot;

Once SP1 is installed, re-run Disk Management;

Right-click on the “System Reserved” partition and select “Change Drive Letter and Paths..”

Click the “Remove” button to unmap the drive letter;

Click “Yes” at the “Are you sure…” prompt;

Click “Yes” at the “Do you want to continue?” prompt;

Reboot (for good measure).

This process assumes that there are no non-standard deployments of the Server 2008 R2 boot volume. Of course, if there is no separate system reserved partition, you wouldn’t encounter the SP1 failure to install issue…

SOLORI’s Take: The takeaway here is “consider your environment” (and the people tasked with maintaining it) before deploying Direct SAN Access mode into a VMware cluster. While it may represent “optimal” backup performance, it is not without its potential pitfalls (as demonstrated herein). Native access to SAN LUNs must come with a heavy dose of respect, caution and understanding of the underlying architecture: otherwise, I recommend Virtual Appliance mode (similar to Data Recovery’s take.)

While no VMFS volumes were harmed in the making of this blog post, the thought of what could have happened in a production environment chilled me into writing this post. Direct access to the SAN layer unlocks tremendous power for modern backup: just be safe and don’t forget to heed Uncle Ben’s advice! If the idea of VMFS corruption scares you beyond your risk tolerance, appliance mode will deliver acceptable results with minimal risk or complexity.

The latest update of NexentaStor may not go too smoothly if you are using Windows Server 2008 AD servers and delegating shares via NexentaStor. While the latest update includes a long sought after fix in AD capabilities (see pull quote below) it may require a tweak to the CIFS Server settings to get things back on track.

Domain Group Support

It is now possible to allow Domain groups as members of local groups. When a Windows client authenticates with NexentaStor using a domain account, NexentaStor consults the domain controller for information about that user’s membership in domain groups. NexentaStor also computes group memberships based on its _local_ groups database, adding both local and domain groups based on local group memberships, which are allowed to be indirect. NexentaStor’s computation of group memberships previously did not correctly handle domain groups as members of local groups.

In the past, some of NexentaStor’s in-place upgrades have reset the “lmauth_level” of the associated SMB share server from its user configured value back to a “default” of four (4). This did not work very well in an AD environment where the servers were Windows Server 2008 and running their native authentication mode. The fix was to change the “lmauth_level” to two (2) via the NMV or NMC (“sharectl set -p lmauth_level=2 smb”) and restart the service. If you have this issue, the giveaway kernel log entries are as follows:

However, the rules have changed in some applications; Nexenta’s new guidance is:

Summary Description CIFS Issue

A recent patch release by Microsoft has necessitated a changed to the CIFS authorization setting. Without changing this setting, customers will see CIFS disconnects or the appliance being unable to join the Active Directory domain. If you experience CIFS disconnects or problems joining your Active Directory domain, please modify the ‘lmauth_level’ setting.

# sharectl set -p lmauth_level=4 smb

– NexentaStor 3.1.3 Release Notes

While this may work for others out there it does not universally work for any of my tested Windows Server 2008 R2, native AD mode servers. Worse, it appears to work with some shares, but not all; this can lead to some confusion about the actual cause (or resolution) of the problem based on the Nexenta release notes. Fortunately (or not, depending on your perspective), the genesis of NexentaStor is clearlyheading toward an intersection with Illumos although the current kernel is still based on Open Solaris (134f), and a post from OpenIndiana points users to the right solution.

(Jonathan Leafty) I always thought it was weird that lmauth_level had to be set to 2 so I
bumped it back to the default of 3 and restarted smb and it worked...
(Gordon Ross) Glad you found that. I probably should have sent a "heads-up" when the
"extended security outbound" enhancement went in. People who have
adjusted down lmauth_level should put it back the the default.

Following the advice for OpenIndiana re-enabled all previously configured shares. This mode is also the default for Solaris, although NexentaStor continues to use a different one. According to the man pages for smb on Nexenta (‘man smb(4)’) the difference between ‘lmauth_level=3’ and ‘lmauth_level=4’ is as follows:

lmauth_level

Specifies the LAN Manager (LM) authentication level. The LM compatibility level controls the type of user authentication to use in workgroup mode or
domain mode. The default value is 3.

This illustrates either a continued dependency on LAN Manager (absent in ‘lmauth_level=4’) or a bug as indicated in the OpenIndiana thread. Either way, more testing to determine if this issue is unique to my particular 2008 AD environment or this is a general issue with the current smb/server facility in NexentaStor…

SOLORI’s Take: So while NexentaStor defaults back to ‘lmauth_level=4’ and ‘lmauth_level=2’ is now broken (for my environment), the “default” for OpenIndiana and Solaris (‘lmauth_level=3’) is a winner; as to why – that’s a follow-up question… Meanwhile, proceed with caution when upgrading to NexentaStor 3.1.3 if your appliance is integrated into AD – testing with the latest virtual appliance for the win.

While working on an article on complex VSA’s (i.e. a virtual storage appliance with PCIe pass-through SAS controllers) an old issue came back up again: NexentaStor virtual machines still have a problem installing VMware Tools since it branched from Open Solaris and began using Illumos. While this isn’t totally Nexenta’s fault – there is no “Nexenta” OS type in VMware to choose from – it would be nice if a dummy package was present to allow a smooth installation of VMware Tools; this is even the case with the latest NexentaStor release: 3.1.2.

VMware Tools bombs-out at SUNWuiu8 package failure. Illumos-based NexentaStor has no such package.

Instead, we need to modify the vmware-config-tools.pl script directly to compensate for the loss of the SUNWuiu8 package that is explicitly required in the installation script.

Commenting out the SUNWuiu8 related section allows the tools to install with no harm to the system or functionality.

Note the full “if” stanza for where the VMware Tools installer checks for ‘tools-for-solaris’ must be commented out. Since the SUNWuiu8 package does not exist – and more importantly is not needed for Illumos/Nexenta – removing a reference to it is a good thing. Now the installation can proceed as normal.

After the changes, installation completes as normal.

That’s all there is to getting the “Oracle Solaris” version of VMware Tools to work in newer NexentaStor virtual machines – now back to really fast VSA’s with JBOD-attached storage…

SOLORI’s Note: There is currently a long-standing bug that affects NexentaStor 3.1.x running as a virtual machine. Currently there is no known workaround to keep NexentaStor from running up a 50% cpu utilization from ESXi’s perspective. Inside the NexentaStor VM we see very little CPU utilization, but from the performance tab, we see 50% utilization on every configured vCPU allocated to the VM. Nexenta is reportedly looking into the cause of the problem.

I looked through this and there is nothing that stands out other that a huge number of interrupts while idle. I am not sure where those interrupts are coming from. I see something occasionally called volume-check and nmdtrace which could be causing the interrupts.

View Transfer Server supports Server 2008 R2 but does not support the use of the “default” virtual LSI Logic SAS controller. If you’ve already carved-out a cloning template using the LSI Logic SAS template, it is not necessary to create a new template (or fresh installation) just to spool-up a Transfer Server. In fact, it will take you TWO re-boots from clone completion to LSI Logic Parallel replacement.

CAUTION: You must configure the virtual machine that hosts View Transfer Server with an LSI Logic Parallel SCSI controller. You cannot use a SAS or VMware paravirtual controller.

On Windows Server 2008 virtual machines, the LSI Logic SAS controller is selected by default. You must change this selection to an LSI Logic Parallel controller before you install the operating system.

– VMware View Upgrades (EN-000526-00), Page 13

Here’s the process to take you from completed Server 2008/R2 clone with LSI Logic SAS to LSI Logic Parallel – by-passing the Windows blue screen at boot:

Clone your Server 2008/R2 server as normal,

Shutdown clone and edit settings,

Change Options>Advanced>Boot Options to “Force BIOS Setup” on next reboot;

Following-up on the last installment of managing CIFS shares, there has been a considerable number of questions as to how to establish domain user rights on the share. From these questions it is apparent that the my explanation about root-level share permissions could have been more clear. To that end, I want to look at default shares from a Windows SBS Server 2008 R2 environment and translate those settings to a working NexentaStor CIFS share deployment.

Evaluating Default Shares

In SBS Server 2008, a number of default shares are promulgated from the SBS Server. Excluding the “hidden” shares, these include:

Address

ExchangeOAB

NETLOGON

Public

RedirectedFolders

SYSVOL

UserShares

Printers

Therefore, it follows that a useful exercise in rights deployment might be to recreate a couple of these shares on a NexentaStor system and detail the methodology. I have chosen the NETLOGON and SYSVOL shares as these two represent default shares common in all Windows server environments. Here are their relative permissions:

NETLOGON

From the Windows file browser, the NETLOGON share has default permissions that look like this:

Looking at this same permission set from the command line (ICALCS.EXE), the permission look like this:

The key to observe here is the use of Windows built-in users and NT Authority accounts. Also, it is noteworthy that some administrative privileges are different depending on inheritance. For instance, the Administrator’s rights are less than “Full” permissions on the share, however they are “Full” when inherited to sub-dirs and files, whereas SYSTEM’s permissions are “Full” in both contexts.

SYSVOL

From the Windows file browser, the NETLOGON share has default permissions that look like this:

Looking at this same permission set from the command line (ICALCS.EXE), the permission look like this:

Note that Administrators privileges are truncated (not “Full”) with respect to the inherited rights on sub-dirs and files when compared to the NETLOGON share ACL.

Create CIFS Shares in NexentaStor

On a ZFS pool, create a new folder using the Web GUI (NMV) that will represent the SYSVOL share. This will look something like the following:Read the rest of this entry ?

I wanted to pass-on some information posted by Joerg Moellenkamp at c0t0d0s0.org – some good news for Sun/ZFS users out there about Solaris Express 2010.11 availability, links to details on ZFS encryption features in Solaris 11 Express and clarification on “production use” guidelines. Here’s the pull quotes from his posting:

“Darren (Moffat) wrote three really interesting articles about ZFS encryption: The first one is Introducing ZFS Crypto in Oracle Solaris 11 Express. This blog entry gives you a first overview how to use encryption for ZFS datasets. The second one…”

“There is a long section in the FAQ about licensing and production use: The OTN license just covers development, demo and testing use (Question 14) . However you can use Solaris 11 Express on your production system as well…”

In Medio Stat Veritas

SOLORI's Take and Quick Take posts express my personal opinion unless explicitly attributed to other sources. Where possible, supporting facts are presented to properly frame and ground these opinions, however they are presented "AS-IS" without regard to warranty or promise: expressed or implied.

Comments are open to all registered users and may be edited for decorum. Spam is deleted with prejudice.