In the storage module, we talked a lot about protocols, IOPS, capacity, tuning, zoning, multipathing, and RAID types. Furthermore, we applied rating indexes such as cost, complexity, and flexibility to the protocols and technologies. Understanding this information aids in formulating design choices for the project. A few interesting tidbits I pulled out of today’s discussions:

VMFS block size selection has a negligible effect on the design. Don’t stress over 1MB, 2MB, 4MB, or 8MB. If you want the most flexibility, choose 8MB and be done. There is no measurable performance impact, and the amount of disk space wasted by placing 10-20 VMs’ worth of small files on a 500GB LUN is negligible, with or without sub-block allocation.

This one I was a bit surprised at: “Don’t adjust LUN queue depth”. Rationale: if you are having performance issues in the form of queuing, it is likely the result of a larger problem elsewhere, such as front end port saturation on the array. I will agree that leaving queue depth at its factory defaults on the HBA and VMkernel side (Disk.SchedNumReqOutstanding) makes implementation that much easier, but that doesn’t mean it’s always the right solution. In my VCDX design submission, I increased queue depth from the default of 32 to 64 (128 is the max). By the way, the best whitepaper on this discussion I’ve ever found is Scalable Storage Performance. It’s quite short and, in my opinion, explains the key details better than the VMware KB articles.
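For reference, the two knobs involved here are typically adjusted like this on an ESX 4.x host (a hedged sketch, not a recommendation; the module and option names vary by HBA vendor, with the QLogic qla2xxx driver shown, and a host reboot is needed for the module option to take effect):

```shell
# Raise the HBA LUN queue depth from the default of 32 to 64
# (QLogic example; use the module/option matching your HBA driver)
esxcfg-module -s ql2xmaxqdepth=64 qla2xxx

# Match the VMkernel's per-LUN outstanding request limit
# (Disk.SchedNumReqOutstanding) to the new HBA queue depth
esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
```

Keeping the two values in sync matters: the effective queue depth is bounded by the lower of the two settings when multiple VMs share the LUN.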

The VM design module was fairly predictable. It basically focused on right-sizing and choosing the appropriate virtual hardware. It touched a bit on security and naming conventions. Nothing earth shattering here.

Final thoughts… good class, I’d recommend it. I felt many ties to the VCDX design process, and hopefully this class will be beneficial to those in pursuit of that certification (or the VCAP-DCD). My feedback on the class was to add more days. We got through the entire 400-page classroom book in three days, but we flew through some sections. The biggest casualty of the time crunch was probably the labs. I felt that we did not spend a lot of time hashing out design decisions, justifications, and upstream/downstream impacts. This is where you need to spend considerable time thinking through each design decision, not only for the benefit of your customer, but also because you’re going to get hammered by the VCDX panel on decisions as well. Ample time spent here is a good practice.

The instructor is sitting the VCAP-DCD BETA exam the same day as I am. He seemed a bit curious as to whether or not this course will cover all of the exam questions we both will face. I’ll solicit his feedback after the exam to see what he thought, and then compare it with my own impressions. I’ll post a recap of my exam experience, as I usually do, so stay tuned if you’re interested in the VCAP-DCD.

I missed the release (probably due to VMworld 2010 excitement), but a month later I discovered that Dutch rock star Rob de Veij has provided the community with an update to his free VMware virtualization utility RVTools. Version 2.9.5 boasts the following new features:

- red – no heartbeat; guest operating system may have stopped responding.
- yellow – intermittent heartbeat, may be due to guest load.
- green – guest operating system is responding normally.

Feedback for the next release: the heartbeat status isn’t entirely reliable. For instance, the Celerra VMs show a heartbeat status of red when, in reality, the vSphere Client reports VMware Tools as not installed.

The VCDX3 logos have been released by VMware for use on business cards and professional communications. Going forward, VMware will be making these new logos available on shirts, certificates, business cards, etc.

Are you looking for something fun and exciting to do in November? How about a free technology event with a tie to virtualization? Nexus Information Systems, a regional leader in sales and service of hardware, storage, networking, and managed services, has cooked up something for you!

On Wednesday November 10th, Nexus is hosting an all day event called GeekFest 2010 at their offices in Minnetonka, MN. Here is what they are saying about it:

The day comprises different technology and industry sessions focused on challenging data center solutions and services. GeekFest is a FREE event where you can register to come and go to just individual sessions or register for an all-day pass. Attendees to GeekFest will be exposed to the newest technologies from both industry-leading and up & coming providers.

What else is cool? GeekFest 2010 has a special guest moderator: Greg Schulz of Storage IO, a 25+ year technology veteran, storage industry analyst, and vExpert.

Day 2 of 3 is in the books. We started the morning with Module 4, VMware vSphere Virtual Datacenter Design. The discussion included topics such as:

- vCenter Server requirements, sizing, placement, and high availability
- vCenter and VUM database sizing and placement
- Clusters
  - Size
  - HA, failover, isolation, design
  - DRS
  - FT
  - DPM
- Resource Pools
  - Shares
  - Reservations
- A lot of networking, including the standard vSwitch, vNetwork Distributed Switch, and the Cisco Nexus 1000V
  - FCoE
  - VLANs
  - PVLANs
  - Load balancing policies
  - Link State and Beacon Probing network failure detection (beacons are sent once per second per pNIC per VLAN; beacons are sent whether or not beacon probing is enabled – an advanced VMkernel setting permanently disables beacons)

We accomplished a lab or two today as well. We made some design decisions around vCenter, databases, and networking. Along with those design decisions, we provided justifications and impacts. This process is very familiar to me as I spent a lot of time providing information like this when filling out the VCDX defense application.

One thing I noticed tonight which I hadn’t seen before is that VMware posted an Adobe Flash demonstration of the new VCAP-DCD exam. Take a look. This will help candidates be better prepared overall for the exam experience. Exam time is valuable – you don’t want to waste it trying to learn the UI.

Tomorrow we start with Module 6 VMware vSphere Storage Design. I expect a lot of time spent here, as the options for storage are vast. The instructor hails from EMC and I’m sure he has plenty to say about storage. On the subject of storage, the instructor passed along some tidbits on NAS device offerings from QNAP. In particular, take a look at the TS-239 PRO II Turbo NAS. At 83.6MB/s throughput, it beats the pants off any other consumer-based NAS appliance on the market (even Iomega). Cisco also rebrands this NAS model as the NSS322, so you can find it there as well. Lastly, take a look at smallnetbuilder.com. This site reviews wireless equipment as well as NAS appliances for the public. They have a nice chart rating most of the NAS appliances out there, and it is there where you can see how fast the QNAP unit above screams compared to the competition.

I was researching FT documentation to find out more about asymmetric logging traffic between primary and secondary FT VMs when I stumbled onto a KB article which the document mentioned. VMware KB 1011965 talks about changing the traffic pattern on the FT logging network, which is particularly helpful for an FT-protected VM with high read disk I/O. Normally, all disk I/O traverses the FT logging network from the primary to the secondary VM. For FT-protected VMs with read-heavy disk I/O patterns, the FT logging network may become saturated, depending on the bandwidth (1Gb vs. 10Gb) and on the number of protected VMs on that network, not only between two hosts, but between all the hosts in the cluster, or perhaps spanning clusters, depending on how far the FT network is stretched. The workaround makes the secondary VM issue disk reads directly to the shared disk (out of band) instead of receiving that data over the FT logging network, while staying within vLockstep tolerances.

Given the many restrictions on FT, particularly the 1 vCPU requirement, you may not have run into FT logging network saturation. However, when some of these FT restrictions are lifted, I expect disk I/O to scale up on FT-protected VMs. FT will become more popular, and I can see where this tweak may come in handy, particularly for those looking to get more mileage out of the 1Gb network infrastructure to which FT networks are tied.

The workaround in the KB article is applied at the VM level by adding the following line to the .vmx configuration:

replay.logReadData = checksum

In addition, the VM must be powered off before making the .vmx change, and then unregistered and re-registered on the host afterward.
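From the host console, the power-off, edit, and re-register sequence can be sketched like this (the VM ID, datastore, and paths are illustrative placeholders, not from the KB article):

```shell
# List registered VMs to find the target VM's ID
vim-cmd vmsvc/getallvms

# Power off the FT-protected VM (substitute the real VM ID)
vim-cmd vmsvc/power.off <vmid>

# Append the workaround line to the VM's configuration file
echo 'replay.logReadData = checksum' >> /vmfs/volumes/datastore1/ftvm/ftvm.vmx

# Unregister and re-register so the host re-reads the .vmx
vim-cmd vmsvc/unregister <vmid>
vim-cmd solo/registervm /vmfs/volumes/datastore1/ftvm/ftvm.vmx
```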

I wouldn’t call the configuration itself very scalable, as it’s hidden and could become an administrative burden to document and track. Perhaps we’ll see this tweak move to a spot somewhere in the GUI, and maybe gain the option of a host- or cluster-level configuration.

Today was day 1 of 3 for my VMware vSphere Design Workshop training. I’ve been looking forward to this training since the spring of this year, when I scheduled it. The timing couldn’t be better since I’m scheduled to sit the VMware VCAP-DCD BETA exam in November. I’m told by the instructor, an EMC employee of eight years as well as a CLARiiON and SRM specialist, that this is the VMware-recommended classroom training for the VCAP-DCD exam. To be perfectly honest, I haven’t looked at the exam blueprint yet, but I intend to tomorrow. My hope and expectation at this point is that the class is going to cover the blueprint objectives. Beyond the introductions, I don’t think we were 30 minutes into the class before the conversation had already turned to Duncan Epping and Chad Sakac, along with their respective blogs. By then, I knew I was in for a great three days.

The scope of the course covers vSphere 4.0 Update 1. I was slightly disappointed by this in that it’s covering a release that is nearly one year old; however, if the exam objectives and the exam itself are based on 4.0 Update 1, then the training is appropriate. That said, the instructor is willing to notify the class of any changes through the current version, 4.1. Looking more closely at the scope, the following areas will be covered:

- ESX
- ESXi
- Storage
- Networks
- Virtual Machines
- vCenter Server and related databases
- DRS
- HA
- FT
- Resource Pools
- Design Process
- Design Decisions
- Best Practices
- Two comprehensive design case studies to apply knowledge in the lab:
  - SMB
  - Enterprise

Design is a different discipline than Administration. Administration focuses on tactical things: installation, configuration, tools, the CLI, the Service Console, clients, etc. Having said that, there is ample opportunity for working in a vSphere lab to master the various administrative tasks covered by the VCAP-DCA blueprint. In fact, as most may know by now, the DCA exam is lab based. Design is different. It’s a step above the tools and the CLI, which are generally abstracted away from the logical design discussion. The focus shifts to the big virtual datacenter picture and the components involved in architecting a solution which meets customer requirements and other variables used as design criteria input for the engagement. As mentioned above, there are a series of paper-based labs which follow two design case studies: SMB and Enterprise.

It is just a three-day class, and we covered quite a bit of ground today. Much of the time was spent on Design Methodology, Criteria, Approach, and VMware’s Five-Step Design Process:

1. Initial Design Meeting
2. Current-State Analysis
3. Stakeholder and SME Training
4. Design Sessions
5. Design Deliverables

Having years of consulting experience under his belt, the instructor volunteered helpful insight toward what he often referred to as the consultative approach to discovery. We talked about phases of the engagement, design meetings to hold, who to invite, who not to invite, and the value and persuasion power of food. We got into some conversations about hypervisor choices (ESX vs. ESXi), with a sprinkling of hardware tangents (NUMA, PCIe, processors, storage, etc.). We closed the day with discussions on resource planning, peaks, and averages, as well as our first lab exercise, which was to decide on a hardware standard (blade versus rack mount) and plan for capacity in terms of number of hosts and cluster sizes, given data from customer interviews.

I’ll close here with an infrastructure design growth formula and practical application:

The scenario: Contoso, Inc. has a consolidation ratio of 30:1 on an existing cluster. Contoso expects 25 percent annual growth of a 200 VM cluster over the next four years.
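The compound-growth arithmetic behind a scenario like this can be sketched as follows (a minimal sketch; the function names are mine, not from the course material):

```python
import math

def future_vm_count(current_vms, annual_growth, years):
    """Project the VM count with compound annual growth,
    rounding up since you can't run a fraction of a VM."""
    return math.ceil(current_vms * (1 + annual_growth) ** years)

def hosts_required(vm_count, consolidation_ratio):
    """Hosts needed at a given VMs-per-host consolidation ratio."""
    return math.ceil(vm_count / consolidation_ratio)

vms = future_vm_count(200, 0.25, 4)   # 200 * 1.25^4 -> 489 VMs
hosts = hosts_required(vms, 30)       # 489 / 30 -> 17 hosts
print(vms, hosts)
```

For the Contoso scenario, 200 VMs growing 25 percent annually for four years lands at roughly 489 VMs, which at a 30:1 consolidation ratio requires 17 hosts, before adding any headroom for HA failover capacity.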