1. vSphere 5.0 or later
2. vSphere Enterprise Plus licensing (to support Network I/O Control)
3. VMs range from Business Critical Applications (BCAs) to non-critical servers
4. Software licensing for applications hosted in the environment is based on either a per-vCPU or per-host model, where DRS “Must” rules can be used to isolate VMs to licensed ESXi hosts

5. The cluster can be expanded with up to 14 more hosts (to the 32-host cluster limit) in the event the average VM size is greater than anticipated or the customer experiences growth

6. Having 2 x 10Gb connections should comfortably support the IP Storage / vMotion / FT and network data with minimal possibility of contention. In the event of contention, Network I/O Control will be configured to minimize any impact (see Example VMware vNetworking Design w/ 2 x 10GB NICs)

7. RAM is one of the most common bottlenecks in a virtual environment. With 16 physical cores and 256GB RAM, this equates to 16GB of RAM per physical core. For the average-sized VM (1 vCPU / 4GB RAM), this meets the CPU overcommitment target (up to 400%) with no RAM overcommitment, minimizing the chance of RAM becoming the bottleneck

8. In the event of a host failure, up to 64 virtual machines will be impacted (based on the assumed average VM size), which is minimal compared to a four-socket ESXi host, where 128 VMs would be impacted by a single host outage

9. If using four-socket ESXi hosts, the cluster size would be approximately 10 hosts, and 20% of cluster resources would have to be reserved for HA to meet the N+2 redundancy requirement. This cluster size is less efficient from a DRS perspective, and the HA overhead would equate to higher CapEx and, as a result, a lower ROI

10. The solution supports virtual machines of up to 16 vCPUs and 256GB RAM, although VMs of this size would be discouraged in favour of a scale-out approach (where possible)

11. Using smaller hosts (either single-socket, or fewer cores per socket) would not meet the requirement to support virtual machines of up to 16 vCPUs and 256GB RAM, would likely require multiple clusters, and would require additional 10Gb and 1Gb cabling compared to the two-socket configuration

12. The two-socket configuration allows the cluster to be scaled (expanded) at a very granular level (if required) to reduce CapEx and minimize wasted/unused cluster capacity compared to adding larger hosts

13. Enabling features such as Distributed Power Management (DPM) is more attractive and lower risk for larger clusters, and may result in lower environmental costs (ie: Power / Cooling)
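The sizing arithmetic behind the points above can be sanity-checked with a quick back-of-the-envelope sketch. The figures are taken directly from the justifications (16 cores and 256GB RAM per two-socket host, a 1 vCPU / 4GB average VM, and the 10-host four-socket alternative); the variable names are mine:

```python
# Sanity-check the cluster sizing arithmetic from the justifications above.
cores_per_host = 16        # two-socket host, 8 cores per socket
ram_per_host_gb = 256
vm_vcpu, vm_ram_gb = 1, 4  # average VM size (1 vCPU / 4GB RAM)

# RAM per physical core
ram_per_core = ram_per_host_gb / cores_per_host        # 16GB per core

# VMs per host with no RAM overcommitment
vms_per_host = ram_per_host_gb // vm_ram_gb            # 64 VMs (host failure impact)

# Resulting CPU overcommitment (vCPU : physical core)
cpu_overcommit_pct = vms_per_host * vm_vcpu / cores_per_host * 100  # 400%

# HA overhead for N+2 on the ~10-host four-socket alternative
ha_overhead_pct = 2 / 10 * 100                         # 20% of cluster resources

print(ram_per_core, vms_per_host, cpu_overcommit_pct, ha_overhead_pct)
```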

1. Business critical applications are excluded
2. Block based storage
3. VAAI is supported and enabled
4. VADP backups are being utilized
5. vSphere 5.0 or later
6. Storage DRS will not be used
7. SRM is in use
8. LUNs & VMs will be thin provisioned
9. Average size VM will be 100GB and be 50% utilized
10. Virtual machine snapshots will be used, but not retained for > 24 hours
11. Change rate of average VM is <= 15% per 24 hour period
12. Average VM has 4GB Ram
13. No Memory reservations are being used
14. Storage I/O Control (SIOC) is not being used
15. Under normal circumstances storage will not be over committed at the storage array level.
16. The average maximum IOPS per VM is 125 at a 16KB I/O size (i.e. <= 2MBps per VM)
17. The underlying storage has sufficient performance to cater for the average maximum IOPS per VM
18. A separate swap file datastore will be configured per cluster
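Assumption 16 can be verified with a one-line conversion, using the 25-VMs-per-datastore figure from the justifications that follow (variable names are mine):

```python
# Check assumption 16: 125 IOPS at a 16KB I/O size per VM.
iops, io_size_kb = 125, 16
mbps_per_vm = iops * io_size_kb / 1024       # ~1.95 MBps, i.e. <= 2MBps per VM
datastore_mbps = 25 * mbps_per_vm            # ~49 MBps for a fully populated 25-VM datastore
print(round(mbps_per_vm, 2), round(datastore_mbps, 1))
```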

1. In the worst-case scenario, where every VM has used 100% of its VMDK capacity (100GB) and has a snapshot of up to 15% of its size, the 3TB datastore will still have 197GB remaining, so it will not run out of space. (Each VM also has a 4GB swap file, as it has 4GB RAM and no memory reservation, but swap files reside on the separate swap file datastore.)
2. Queue depth applies on a per-datastore (LUN) basis; as such, having 25 VMs per LUN allows for a minimum of 1.28 concurrent I/O operations per VM based on the standard queue depth of 32, although it is unlikely all VMs will issue I/O concurrently, so the average will be much higher.
3. Thin provisioning minimizes the impact of situations where customers demand a lot of disk space up front but end up using only a small portion of it
4. Using Thin provisioning for VMs increases flexibility as all unused capacity of virtual machines remains available on the Datastore (LUN).
5. VAAI automatically raises an alarm in vSphere if usage of a thin-provisioned datastore reaches >= 75% of its capacity
6. The impact of SCSI reservations causing performance issues (increased latency) as thin-provisioned virtual machines (VMDKs) grow is unlikely to be a problem for 25 low-I/O VMs, and with VAAI it is no longer an issue, as the Atomic Test & Set (ATS) primitive alleviates SCSI reservation contention.
7. As the VMs are low I/O it is unlikely that there will be any significant contention for the queue depth with only 25 VMs per datastore
8. The VAAI UNMAP primitive provides automated space reclamation to reduce wasted space from files or VMs being deleted
9. Virtual machines will be Thin provisioned for flexibility, however they can also be made Thick provisioned as the sizing of the datastore (LUN) caters for worst case scenario of 100% utilization while maintaining free space.
10. Having <=25 VMs per datastore (LUN) allows for more granular SRM fail-over (datastore groups)
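The capacity and queue depth figures in justifications 1 and 2 can be reproduced with a short sketch (figures from the assumptions and justifications above; variable names are mine):

```python
# Worst-case datastore capacity (justification 1) and queue depth per VM (justification 2).
datastore_gb = 3 * 1024              # 3TB datastore
vms_per_datastore = 25
vmdk_gb = 100                        # average VM, 100% utilized in the worst case
snapshot_gb = vmdk_gb * 0.15         # snapshot of up to 15% of VM size
# Swap files (4GB per VM) live on the separate swap file datastore, so are excluded.

worst_case_used_gb = vms_per_datastore * (vmdk_gb + snapshot_gb)  # 2875GB
free_gb = datastore_gb - worst_case_used_gb                        # 197GB remaining

queue_depth = 32                     # standard per-LUN queue depth
min_concurrent_io_per_vm = queue_depth / vms_per_datastore         # 1.28

print(free_gb, min_concurrent_io_per_vm)
```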

Alternatives

1. Use larger Datastores (LUNs) with more VMs per datastore
2. Use smaller Datastores (LUNs) with fewer VMs per datastore

Implications

1. When performing an SRM fail-over, the most granular fail-over unit is a single datastore, which may contain up to 25 virtual machines.

2. The solution (day 1) does not provide CapEx savings on disk capacity, but will allow (if desired) overcommitment in the future

Thanks to James Wirth (VCDX#83) @JimmyWally81 for his contributions to this example decision.

I am very pleased to announce that I have decided to take on a new challenge and will be joining the innovative team at Nutanix starting July this year in the Solutions and Performance engineering team.

It was only a few short months ago when I first discovered what Nutanix was all about, after previously seeing the classic “No SAN” advertisements on various blogs and at VMworld in 2012, and embarrassingly I have to admit I did not make the time to look into the solution.

Since then, I have spent a lot of time looking into the Nutanix solution and have spoken to a number of people in the industry, including several members of the Nutanix family. It has become obvious to me why Nutanix is one of the most successful and fastest-growing start-ups in the industry, although I'm not sure I'd call Nutanix a “start-up” any more.

The linear scale-out solution provided by Nutanix aligns perfectly with the virtualization best practices most of us have known for many years, and combines PCIe SSD (Fusion-io) with SATA SSDs and high-capacity SATA drives into a high-performance, hyper-converged 2RU platform.

Over my many years in the industry I can recall countless scenarios where the Nutanix solution would have been a perfect fit, and solved numerous problems, both at the technical/architectural level and importantly at the business level for both SMB and Enterprise customers.

Now with the release of a wider range of Nutanix blocks including the NX-1000 and NX-6000, the solution is becoming more and more attractive.

In my role I will be part of the team responsible for creating high-performance solutions and developing best practice guides, reference architectures and case studies for things like virtualization of business critical applications on the Nutanix platform.

A lot of people are already aware of how good the platform is for virtual desktops, but I will be focused not only on showing how good the solution is for VDI, but on a wider range of workloads, including Business Critical Applications, server and Big Data workloads.

I am very much looking forward to being a significant part of this exciting company, which already boasts exceptional talent, including two VCDXs in Jason Langone @langonej & Lane Leverett @wolfbrthr. So I am very pleased to be working alongside such talent and to be the third VCDX in the Nutanix family.

As I have been doing for the last year or so, I intend to continue to share my experience with the virtualization community via Twitter, blogging, VMUGs etc, which will now include (but not be limited to) the Nutanix platform.

So stay tuned, as the Nutanix team and I have a number of very interesting projects coming up in the next few weeks and months which I can't wait to share with you.

If you're not already familiar with what Nutanix is all about, here are a couple of quick introductory YouTube videos which I highly recommend you take the time to watch (as I wish I had sooner!)