Posts Tagged ‘ESX’

Regardless of what the vSphere host Advanced Setting Disk.MaxLUN has stated as its definition for years, “Maximum number of LUNs per target scanned for” is technically not correct. In fact, it’s quite misleading.

The Disk.MaxLUN attribute specifies the maximum LUN number up to which the ESX Server system scans on each SCSI target as it is discovering LUNs. If you have a LUN 131 on a disk that you want to access, for example, then Disk.MaxLUN must be at least 132. Don’t make this value higher than you need to, though, because higher values can significantly slow VMkernel bootup.

The 128 LUN limit refers only to the total number of LUNs that the ESX Server system is able to discover. The system intentionally stops discovering LUNs after it finds 128 because of various service console and management interface limits. Depending on your setup, you can easily have a situation in which Disk.MaxLUN is high (255) but you see few LUNs, or a situation in which Disk.MaxLUN is low (16) but you reach the 128 LUN limit because you have many targets.

Note the last sentence of the first paragraph quoted above from the KB article. Keep the value as small as possible for your environment when using block storage. vSphere ships with this value configured out of the box for maximum compatibility, which means the maximum value of 256. Assuming you don’t assign LUN numbers approaching 255 in your environment, this value can be ratcheted down immediately in your build documentation or automated deployment scripts. Doing so will decrease the elapsed time spent rescanning the fabric for block devices/VMFS datastores. This tweak may be of particular interest at DR sites when using Site Recovery Manager to carry out a Recovery Plan test, a Planned Migration, or an actual DR execution. It allows more efficient use of RTO (Recovery Time Objective) time, especially where multiple recovery plans are run consecutively.
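For those who would rather script the change than click through the vSphere Client, here is a minimal sketch using the ESXi 5.x shell. The value 64 is an assumption for illustration only; substitute the highest LUN number you actually assign, plus one.

```sh
# Check the current value (256 on a fresh install):
esxcli system settings advanced list -o /Disk/MaxLUN

# Ratchet it down so rescans stop probing LUN numbers that will never exist.
# 64 is an assumed example covering LUN numbers 0-63; adjust for your environment.
esxcli system settings advanced set -o /Disk/MaxLUN -i 64

# Subsequent rescans now have far fewer LUN numbers to probe per target:
esxcli storage core adapter rescan --all
```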

Here’s a discussion that has somewhat come full circle for me and could prove to be handy for those with lab or production environments alike.

A little over a week ago I was having lunch with a former colleague and naturally a TPS (Transparent Page Sharing) discussion broke out. We talked about how it worked and how effective it was with small memory pages (4KB in size) as well as large memory pages (2MB in size). The topic was brought up with a purpose in mind.

Many moons ago, VMware virtualized datacenters consisted mainly of Windows 2000 Server and Windows Server 2003 virtual machines which natively leveraged small memory pages – an attribute built into the guest operating system itself. Later, Windows Vista as well as 2008 and its successors came onto the scene allocating large memory pages by default (again – at the guest OS layer) to boost performance for certain workload types. To maintain flexibility and feature support, VMware ESX and ESXi hosts have supported large pages by default, provided the guest operating system requests them. Those operating systems that still used the smaller memory pages were supported by the hypervisor as well. This support and configuration remains the default today in vSphere 5.1 in an advanced host-wide setting called Mem.AllocGuestLargePage (1 to enable and support both large and small pages – the default; 0 to disable and force small pages). VMware released a small whitepaper covering this subject several years ago titled Large Page Performance which summarizes lab test results and provides the steps required to toggle large pages in the hypervisor as well as within Windows Server 2003.
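As a point of reference, here is a minimal sketch of inspecting and toggling that setting from the ESXi 5.x shell (the same option can be edited through the host’s Advanced Settings in the vSphere Client); treat it as illustrative rather than a recommendation to flip the default:

```sh
# Sketch only: inspect and toggle Mem.AllocGuestLargePage on an ESXi 5.x host.
# 1 (default) = back guest memory with 2MB large pages when the guest asks for them.
# 0 = force 4KB small pages, giving TPS far more identical pages to collapse.
esxcli system settings advanced list -o /Mem/AllocGuestLargePage
esxcli system settings advanced set -o /Mem/AllocGuestLargePage -i 0
```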

As legacy Windows platforms were slowly but surely replaced by their Windows Server 2008, R2, and now 2012 successors, something began to happen. Consolidation ratios gated by memory (a very typical mainstream constraint in most environments I’ve managed and shared stories about) started to slip. Part of this can be attributed to the larger memory footprints assigned to the newer operating systems. That makes sense, but it only explains a portion of the story. The balance of the memory has evaporated as a result of modern guest operating systems using large 2MB memory pages, which will not be consolidated by the TPS mechanism (until a severe memory pressure threshold is crossed, but that’s another story discussed here and here).

For some environments, many I imagine, this is becoming a problem which manifests itself as an infrastructure capacity growth requirement as guest operating systems are upgraded. Those with chargeback models where the customer or business unit paid up front at the door for their VM or vApp shells are now getting pinched because compute infrastructure doesn’t spread as thin as it once did. This will be most pronounced in the largest of environments. A pod or block architecture that once supplied infrastructure for 500 or 1,000 VMs now fills up with significantly fewer.

So when I said this discussion has come full circle, I meant it. A few years ago Duncan Epping wrote an article called KB Article 1020524 (TPS and Nehalem) and a portion of this blog post more or less took place in the comments section. Buried in there was a comment I had made while being involved in the discussion (although I don’t remember it). So I was a bit surprised when a Google search dug that up. It wasn’t the first time that has happened and I’m sure it won’t be the last.

Back to reality. After my lunchtime discussion with Jim, I decided to head to my lab which, from a guest OS perspective, was all Windows Server 2008 R2 or better, plus a bit of Linux for the appliances. Knowing that the majority of my guests were consuming large memory pages, I wanted to see how much more TPS savings would result if I forced small memory pages on the host. So I evacuated a vSphere host using maintenance mode, configured Mem.AllocGuestLargePage to a value of 0, then placed all the VMs back onto the host. Shown below are the before and after results.

A decrease in physical memory utilization of nearly 20% per host – TPS is alive again:

A 124% increase in Shared memory in Tier 1 virtual machines:

A 90% increase in Shared memory in Tier 3 virtual machines:

Perhaps what was most interesting was the manner in which TPS consolidated pages once small pages were forced. The impact was not realized right away, nor was it a gradual gain in memory efficiency as vSphere scanned for duplicate pages. Rather, it seemed to happen in a batch, almost all at once, 12 hours after large pages had been disabled and VMs had been moved back onto the host:

So for those of you who may be scratching your heads wondering what has been happening to your consolidation ratios lately, perhaps this has some or everything to do with it. Is there an action item to be carried out here? That depends on what your top priority is when weighing infrastructure performance in one hand against maximized consolidation in the other.

For those on a lean infrastructure budget (a home lab would be an ideal fit here), consider forcing small pages to greatly enhance TPS opportunities and stretch your lab dollar, which has been getting consumed by modern operating systems and an increasing number of VMware and 3rd party appliances.

Can you safely disable large pages in production clusters? That’s a performance question I can’t answer globally. You may or may not see a performance hit to your virtual machines depending on their workloads. Remember that forcing small memory pages works against AMD Rapid Virtualization Indexing (RVI) and Intel Extended Page Tables (EPT), which rely on large pages to keep TLB overhead down, so some of the hardware-assisted memory virtualization benefit is given up. Due diligence testing is required for each environment. As it is a per-host setting, testing with the use of vMotion really couldn’t be easier. Simply disable large pages on one host in a cluster, migrate the virtual machines in question to that host, and let them simmer. Compare performance metrics before and after. Query your users for performance feedback (phrase the question in a way that implies you added horsepower instead of asking the opposite, “did the application seem slower?”).
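If you want hard numbers to go with the anecdotes, one approach (a sketch only; the file names and sample counts below are arbitrary examples) is to capture esxtop in batch mode before and after the change and compare the shared and saved memory counters:

```sh
# Capture roughly 5 minutes of memory counters before the change
# (-b = batch mode, -d = delay between samples in seconds, -n = number of samples).
esxtop -b -d 10 -n 30 > /tmp/mem_before.csv

# ...disable large pages, vMotion the test VMs onto the host, wait for TPS to scan...

# Capture the same window afterwards and compare the host and per-VM
# shared/zero/saved memory columns between the two files.
esxtop -b -d 10 -n 30 > /tmp/mem_after.csv
```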

That said, I’d be curious to hear if anyone in the community disables large pages in their environments as a regular habit or documented build procedure, and what the impact has been, if any, on both memory utilization and performance.

Update 10/20/14: VMware announced last week that inter-VM TPS (memory page sharing between VMs, not to be confused with memory page sharing within a single VM) will no longer be enabled by default. This default ESXi configuration change will take place in December 2014.

VMware KB Article 2080735 explains that Inter-Virtual Machine TPS will no longer be enabled by default starting with the following releases:

The change is in response to new research which leveraged TPS to gain unauthorized access to data. Under certain circumstances, a data security breach may occur, which effectively makes TPS across VMs a vulnerability.

Although VMware believes the risk of TPS being used to gather sensitive information is low, we strive to ensure that products ship with default settings that are as secure as possible.
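For those who evaluate the risk and still want the pre-change behavior, the KB describes the mechanism behind the new default: a per-VM salting scheme. As a hedged sketch (verify the option name and values against KB 2080735 for your exact build), the host-wide control can be inspected and reverted like this:

```sh
# Sketch, to be verified against KB 2080735 for your specific build.
# In the patched releases, Mem.ShareForceSalting = 2 (the new default) gives each
# VM a unique salt, restricting TPS to pages within a single VM.
# Setting it to 0 restores the previous inter-VM page sharing behavior.
esxcli system settings advanced list -o /Mem/ShareForceSalting
esxcli system settings advanced set -o /Mem/ShareForceSalting -i 0
```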

This issue isn’t specific to Jetstress, Exchange, Microsoft, or a specific fabric type, storage protocol or storage vendor. Exceeding the virtual disk capacities listed above, per host, results in the symptoms discussed earlier and memory allocation errors. In fact, if you take a look at the KB article, there’s quite a laundry list of possible symptoms depending on what task is being attempted:

An ESXi/ESX 3.5/4.0 host has more than 4 terabytes (TB) of virtual disks (.vmdk files) open.

After virtual machines are migrated by vSphere HA from one host to another due to a host failover, the virtual machines fail to power on with the error: vSphere HA unsuccessfully failed over this virtual machine. vSphere HA will retry if the maximum number of attempts has not been exceeded. Reason: Cannot allocate memory.

Adding a VMDK to a virtual machine running on an ESXi/ESX host where heap VMFS-3 is maxed out fails.

When you try to manually power on a migrated virtual machine, you may see the error: The VM failed to resume on the destination during early power on.
Reason: 0 (Cannot allocate memory).
Cannot open the disk ‘<<Location of the .vmdk>>’ or one of the snapshot disks it depends on.

The virtual machine fails to power on and you see an error in the vSphere Client: An unexpected error was received from the ESX host while powering on VM vm-xxx. Reason: (Cannot allocate memory)

A similar error may appear if you try to migrate or Storage vMotion a virtual machine to a destination ESXi/ESX host on which heap VMFS-3 is maxed out.

Cloning a virtual machine using the vmkfstools -i command fails and you see the error: Clone: 43% done. Failed to clone disk: Cannot allocate memory (786441)

While VMware continues to raise the scale and performance bar for its vCloud Suite, this virtual disk and heap size limitation becomes a constraint for monster VMs or vApps. Fortunately, there’s a fairly painless resolution (at least up until a certain point): increase the heap size beyond its default value on each host in the cluster and reboot each host. The advanced host setting to configure is VMFS3.MaxHeapSizeMB.
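As a minimal sketch using the ESXi 5.x shell (the same setting can also be changed through the vSphere Client’s Advanced Settings dialog), where 256 reflects the pre-5.1 U1 ceiling discussed below:

```sh
# Check the current VMFS heap ceiling (80MB by default on vSphere 5.0/5.1):
esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB

# Raise it to the maximum allowed in these releases; a host reboot is required
# before the new heap size takes effect.
esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 256
```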

Let’s take another look at the default heap size, this time with the addition of its maximum allowable value:

After increasing the heap size and performing a reboot, the ESX(i) kernel will consume additional memory overhead equal to the amount of the heap size increase in MB. For example, on vSphere 5, increasing the heap size from 80MB to 256MB will consume an extra 176MB of base memory which cannot be shared with virtual machines or other processes running on the host.

Readers may have also noticed an overall decrease in the amount of open virtual disk capacity per host supported in newer generations of vSphere. While I’m not overly concerned at the moment, I’d bet someone out there has a corner case requiring greater than 25TB or even 32TB of powered on virtual disk per host. With two of VMware’s core value propositions being innovation and scalability, I would tip-toe lightly around the phrase “corner case” – it shouldn’t be used as an excuse for its gaps while VMware pushes for 100% data virtualization and vCloud adoption. Short term, the answer may be RDMs. Longer term: vVOLS.

Updated 9/14/12: There are some questions in the comments section about what types of storage the heap size constraint applies to. VMware has confirmed that heap size and max virtual disk capacity per host apply to VMFS only. The heap size constraint does not apply to RDMs, nor does it apply to NFS datastores.

Updated 4/30/13: VMware has released vSphere 5.1 Update 1 and as Cormac has pointed out here, heap issue resolution has been baked into this release as follows:

VMFS heap can grow up to a maximum of 640MB compared to 256MB in earlier releases. This is identical to the way that VMFS heap size can grow up to 640MB in a recent patch release (patch 5) for vSphere 5.0. See this earlier post.

Maximum heap size for VMFS in vSphere 5.1U1 is set to 640MB by default for new installations. For upgrades, it may retain the values set before upgrade. In such cases, please set the values manually.

There is also a new heap configuration option, “VMFS3.MinHeapSizeMB”, which allows administrators to reserve the memory required for the VMFS heap during boot time. Note that “VMFS3.MinHeapSizeMB” cannot be set to more than 255MB, but if additional heap is required it can grow up to 640MB. This alleviates the heap consumption issue seen in previous versions, allowing the ~60TB of open storage on VMFS-5 volumes per host to be accessed.
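For hosts upgraded in place rather than freshly installed, here is a hedged sketch of verifying and manually setting the values as advised above; 640 and 255 are the ceilings quoted in the release notes, not recommendations for every environment:

```sh
# Verify whether pre-upgrade heap values were carried over to 5.1 Update 1:
esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB
esxcli system settings advanced list -o /VMFS3/MinHeapSizeMB

# If so, set them manually (ceilings shown; a reboot applies the change):
esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 640
esxcli system settings advanced set -o /VMFS3/MinHeapSizeMB -i 255
```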

When reached for comment, Monster VM was quoted as saying “I’m happy about these changes and look forward to a larger population of Monster VMs like myself.”

According to the agreement, Cirrus Tech extends its portfolio with StarWind storage virtualization software and will offer it to their customers as a dedicated storage platform that delivers a highly available, high-performance, scalable storage infrastructure capable of supporting heterogeneous server environments; as cloud storage for private clouds; and as a robust solution for building Disaster Recovery (DR) plans.

“Companies are increasingly turning to cloud services to gain efficiencies and respond faster to today’s changing business requirements,” said Artem Berman, Chief Executive Officer of StarWind Software, Inc. “We are pleased to combine our forces with Cirrus Tech in order to deliver our customers a wide range of innovative cloud services that will help their transition to a flexible and efficient shared IT infrastructure.”

“Every business needs to consider what would happen in the event of a disaster,” shares Cirrus CEO Ehsan Mirdamadi. “By bringing StarWind’s SAN solution to our customers, we are helping them to ease the burden of disaster recovery planning by offering powerful and affordable storage options. You never want to think of the worst, but when it comes to your sensitive data and business critical web operations, it’s always better to be safe than sorry. Being safe just got that much easier for Cirrus customers.”

About Cirrus Hosting
Cirrus Tech Ltd. has been a leader in providing affordable, dependable VHS and VPS hosting services in Canada since 1999. They have hosted and supported hundreds of thousands of websites and applications for Canadian businesses and clients around the world. As a BBB member with an A+ rating, Cirrus Tech is a top-notch Canadian web hosting company with professional support, rigorous reliability and easily upgradable VPS solutions that grow right alongside your business. Their Canadian data center is at 151 Front Street in Toronto.

About StarWind Software Inc.
StarWind Software is a global leader in storage management and SAN software for small and midsize companies. StarWind’s flagship product is SAN software that turns any industry-standard Windows Server into a fault-tolerant, fail-safe iSCSI SAN. StarWind iSCSI SAN is qualified for use with VMware, Hyper-V, XenServer, and Linux and Unix environments. StarWind Software focuses on providing small and midsize companies with affordable, highly available storage technology which previously was only available in high-end storage hardware. Advanced enterprise-class features in StarWind include Automated HA Storage Node Failover and Failback (High Availability), Replication across a WAN, CDP and Snapshots, Thin Provisioning and Virtual Tape management.

Since 2003, StarWind has pioneered the iSCSI SAN software industry and is the solution of choice for over 30,000 customers worldwide in more than 100 countries, ranging from small and midsize companies to governments and Fortune 1000 companies.

VMware has unveiled a point release update to several of their products tied to the vSphere 5 virtual cloud datacenter platform plus a few new product launches.

vCenter 5.0 Update 1 – Added support for new guest operating systems such as Windows 8, Ubuntu, and SLES 11 SP2, the usual resolved issues and bug fixes, plus some updates around vRAM licensing limits. One other notable item: no compatibility at this time with vSphere Data Recovery (vDR) 2.0 according to the compatibility matrix.

ESXi 5.0 Update 1 – Added support for new AMD and Intel processors, Mac OS X Server Lion, updated chipset drivers, resolved issues and bug fixes. One interesting point to be made here is that according to the compatibility matrix, vCenter 5.0 supports ESXi 5.0 Update 1. I’m going to stick with the traditional route of always upgrading vCenter before upgrading hosts as a best-practice habit until something comes along to challenge that logic.

Site Recovery Manager 5.0.1 – Added support for vSphere 5.0 Update 1, plus a “Forced Failover” feature which allows VM recovery in cases where storage arrays fail at the protected site, a situation which, in the past, led to unmanageable VMs that could not be shut down, powered off, or unregistered. Added IP customization for some Ubuntu platforms. Many bug fixes, oh yes. VMware also brought back an advanced feature not seen since SRM 4.1: a configurable option, storageProvider.hostRescanCnt, allowing repeated host scans during testing and recovery. This option was removed from SRM 5.0 but has been restored in the Advanced Settings menu in SRM 5.0.1 and can be particularly useful in troubleshooting a failed Recovery Plan. Right-click a site in the Sites view, select Advanced Settings, then select storageProvider. See KB 1008283. Storage arrays certified on SRM 5.0 (e.g. Dell Compellent Storage Center) are automatically certified on SRM 5.0.1.

View 5.0.1 – Added support for vSphere 5.0 Update 1, a new Connection Server, Agent, and Clients, and fixed known issues. Ahh.. let’s go back to that new clients bit. A new bundled Mac OS X client with support for PCoIP! I don’t have a Mac, so those who would admit to calling me a friend will have to let me know how sharp that v1.4 Mac client is. As mentioned in earlier release notes, Ubuntu got plenty of love this week, including a new View PCoIP version 1.4 client for Ubuntu Linux. I might just have to deploy an Ubuntu desktop somewhere to test this client. But wait, there’s more: new releases of the View client for Android and iPad tablets. The Android client adds fixes for Ice Cream Sandwich devices, security stuff, and updates for the Kindle Fire (I need to get this installed on my wife’s Fire). The updated iPad client improves both connection times and external display support, but for the most part Apple fans are flipping out simply over something shiny and new. Lastly, VMware created a one-stop-shop web portal for all client downloads which can be fetched at http://www.vmware.com/go/viewclients/

There are a lot of versions in play here, which weaves somewhat of a tangled web of compatibility touch points to identify before diving head first into upgrades. I think VMware has done a great job this time around with releasing products that are, for the most part, compatible with other currently shipping products, which provides more flexibility in tactical approach and timelines. Add to that, some time ago they migrated the two-dimensional .PDF-based compatibility matrix to an online portal offering interactive input, making the set of results customized for the end user. The only significant things missing in the vSphere 5.0U1 compatibility picture IMO are vCloud Connector, vDR, and, based on the results from the compatibility matrix portal, vCenter Operations (the output showed no compatibility with vSphere 5.x, which didn’t look right to me). I’ve taken the liberty of creating a component compatibility visual roadmap including most of the popular and currently shipping products for vSphere 5.0 and above. If you’ve got a significant amount of infrastructure to upgrade, this may help you get the upgrade order sorted out quickly. One last thing – Lab Manager and ESX customers should pay attention to the Island of Misfit Toys. In early 2013 the Lab Manager ride comes coasting to a stop. Lab Manager and ESX customers should be formulating solid migration plans with an execution milestone coming soon.

I receive a lot of communication from recruiters, some of which I’m allowed to share, so I’ve decided to try something. On the Jobs page, I’ll pass along virtualization and cloud centric opportunities – mostly US based but in some cases throughout the globe. Only recruiter requests will be posted. I won’t syndicate content easily found on the various job boards. If you’re currently on the bench or looking for a new challenge, you may find it here. Don’t tell them Jason sent you. I receive no financial gain or benefit otherwise but I thought I could do something with these opportunities other than deleting them. Best of luck in your search.