Today, I ran into an issue where I was upgrading ESXi 6.0 servers to 6.5 Update 1 using an HPE custom ISO. Here’s another example of how PowerCLI can make you more productive.

Conflicting VIBs problem

While working with a customer on a vSphere 6.0 to 6.5 upgrade, I prepared everything as it should be. I got the latest custom ISO from HPE for ESXi 6.5 Update 1, created a VUM baseline, and attached it to the clusters in question. Upon scanning the ESXi hosts with VMware Update Manager, I received a warning that the HPE custom ISO was incompatible.

Note that there aren’t actually four conflicting VIBs. The warning listed each problematic module twice, so there are really only two.

Basically, these conflicting modules should be removed prior to upgrading the ESXi hosts.

Removing conflicting VIBs the manual way

There’s nothing special about how to remove them via ESXCLI. You need the name of each conflicting module, and you need to enable SSH on the ESXi hosts. Then, run the following command:

esxcli software vib remove --vibname conflicting-vib-name

In the case above, they are named scsi-qla2xxx and scsi-lpfc820.

When you run the command, check the output for an indication that the server needs to be rebooted. If so, reboot the server before proceeding with the upgrade.
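
If you have more than a couple of VIBs to remove, it can help to generate the commands rather than type them. The following is a minimal Python sketch, not part of the original procedure: the helper name `vib_remove_commands` is hypothetical, and it only builds the command strings shown above for you to paste into an SSH session. It does not connect to any host.

```python
import shlex


def vib_remove_commands(vib_names):
    """Build one 'esxcli software vib remove' command per VIB name.

    Hypothetical helper for illustration; it only returns strings.
    """
    commands = []
    for name in vib_names:
        name = name.strip()
        if not name:
            continue  # ignore blank entries
        # shlex.quote leaves simple names untouched and quotes anything unusual
        commands.append("esxcli software vib remove --vibname " + shlex.quote(name))
    return commands


# The two conflicting VIBs named in this post
conflicting_vibs = ["scsi-qla2xxx", "scsi-lpfc820"]

for command in vib_remove_commands(conflicting_vibs):
    print(command)
```

Run each printed command on the host, and reboot if the output says a reboot is required.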

Removing conflicting VIBs the PowerCLI way

It’s even easier with PowerCLI to remove these conflicting VIBs. You don’t have to enable SSH on all your ESXi hosts. First, make a text file with the names of the conflicting VIBs, one name per line.

Next, run the following commands after connecting to your vCenter server via PowerCLI:

You could obviously make a variable of all your ESXi hosts and do them all at once, but you might not want to leave your ESXi hosts sitting there waiting for a reboot for a while. It’s your call how to handle that part, but this is how you can remove conflicting VIBs at a basic level.

Hope this helps!

Accurate timekeeping is important in almost every environment. If time is not synced across your environment, authentication errors can occur, services and applications may not function properly, and timestamps in event logs and alerts can be off, which hinders troubleshooting. You’re probably already aware that this is a big deal. Beyond just referencing KB articles, I want to spend time discussing NTP timekeeping in general, as well as practical methods and strategies that work and, in my experience, what doesn’t work.

This will be a series of posts to try to address all major considerations with timekeeping via NTP, beginning with timekeeping within virtual machines.

NTP – Accuracy vs. Internal Synchronization

Obviously, you need your internal clocks to be accurate and synced with your authentication sources. Protocols like Kerberos, for good reason, don’t allow for much clock skew, in order to protect against authentication replay attacks. For example, Active Directory’s default tolerance for clock skew is five minutes.
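
To make the tolerance concrete, here is a small Python sketch that compares two clock readings against the five-minute default mentioned above. The function name `within_skew_tolerance` is hypothetical, and treating a skew of exactly five minutes as acceptable is an assumption of this sketch, not a statement about how Active Directory handles the boundary.

```python
from datetime import datetime, timedelta

# Default tolerance cited in this post
MAX_SKEW = timedelta(minutes=5)


def within_skew_tolerance(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew.

    Assumption: a skew exactly equal to max_skew counts as acceptable.
    """
    return abs(client_time - server_time) <= max_skew


server = datetime(2017, 9, 1, 12, 0, 0)
print(within_skew_tolerance(server + timedelta(minutes=3), server))  # True
print(within_skew_tolerance(server - timedelta(minutes=7), server))  # False
```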

But sometimes those two goals conflict with each other. In these cases, which is more important? For almost all environments, the priority should be that clocks are synced with each other, over how accurate the clocks actually are.

Why? Simple – application and service availability. Chances are, if clocks are skewed too much within your environment, services and applications will become inaccessible to some or all users. In the real world, a lack of razor-sharp clock accuracy is usually an annoyance, not a downtime event. That may not be the case for everyone, such as real-time stock trading companies, but it’s true for the most part. When configuring timekeeping, usually through NTP, if you must choose between better internal synchronization and better accuracy relative to the real time, choose synchronization.

When would these goals come into conflict? As an example, VMs could be set to synchronize their clocks with their ESXi host via VMware Tools, or they could be configured within their OS to use an external NTP server. It’s theoretically possible that, for some reason, your ESXi host’s clock is more trustworthy than your domain controllers’ clocks more often than not. For most customers though, even if that were true, prioritize synchronization over clock accuracy. Allowing VMware Tools to sync the VM’s clock to the host effectively means VMs running on different hosts could have different times. Maybe the NTP service stopped on one ESXi host. Maybe the hosts aren’t configured consistently. It doesn’t matter why. Prioritize synchronization instead by configuring each VM’s OS to synchronize to the same NTP servers somehow, some way.

How many NTP servers, and which ones?

When configuring anything for NTP, whether it be an ESXi server or guests, the question always comes up – how many servers should an NTP client be set to use?

Many people know some obvious ones. More than one, right? Of course. Providing more than one offers redundancy in case an NTP server fails. However, I’ve encountered many environments where there were just two configured. Of course three would be better just for resiliency, but configuring two NTP servers has risks beyond that.

Remember that NTP clients function by polling all their configured NTP servers, and then adopting the time that best reflects the consensus across them. With only two servers there is no majority, so if the two provide different values, the NTP client has no way to tell which one is wrong and may adopt a value that’s a compromise between them. In a scenario where NTP server 1 is twenty minutes fast but NTP server 2 is correct, the NTP client will likely adopt a value that’s ten minutes fast, which is incorrect, and worse, may cause clock skew within the environment. I recommend you instead use an odd number of NTP servers greater than one, and generally speaking, the more the merrier.
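
The arithmetic behind that example can be sketched in a few lines of Python. This is a deliberately simplified toy model, not the real NTP selection algorithm: averaging stands in for the "compromise" a client might reach between two servers, and the median stands in for majority consensus among three. The function names are hypothetical.

```python
from statistics import mean, median


def compromise_offset(offsets_minutes):
    """Toy model: split the difference between all servers."""
    return mean(offsets_minutes)


def consensus_offset(offsets_minutes):
    """Toy model: side with the majority by taking the median."""
    return median(offsets_minutes)


# Offsets from true time in minutes: server 1 is 20 minutes fast, the rest are correct
two_servers = [20.0, 0.0]
three_servers = [20.0, 0.0, 0.0]

print(compromise_offset(two_servers))   # 10.0 -> client ends up 10 minutes fast
print(consensus_offset(three_servers))  # 0.0  -> the bad server is outvoted
```

With a third server in the mix, the two that agree outvote the one that is wrong, which is the reason for preferring an odd number of servers.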

But which ones? Diversity that improves availability is good, but diversity that will be more likely to result in disparate values is bad. Using NTP servers that are for example on separate compute, storage, and physical sites is good. Mixing and matching for example internal and external NTP servers that are managed by different people on the same NTP client is generally bad, although it might be the best alternative among non-optimal choices.

In my next post in this series, I’ll go into specifics on how I generally apply these considerations to VMware environments.

Here are my notes for 2VB-601. I took these notes to help me prepare for the exam as I went through the Deploy and Manage VSAN course and the recommended documentation. If I already knew the info, I often didn’t put it in my notes.

Hope these help!

Lifespans of SSD drives

SLC – 100,000 writes

MLC – 3,000-10,000 writes

TLC – 1,000 writes

eMLC – 20,000-30,000 writes

NVMe

Specification developed specifically for SSDs, more parallelism, better performance

3D Cross Point

PCIe NVMe cards

Improved even more on performance

HDDs

Slower but higher capacity than SSDs

15K, 10K, and 7.2K RPM drives; higher RPM = lower latency

On Friday, I sat the VMware VSAN Specialist 2VB-601 Exam. I’ll be deploying more VSAN soon, so I used this recently released 2VB-601 exam as a guide to thoroughly learn the product. Passing 2VB-601 along with a VCP6 version of a VMware certification grants you the VMware VSAN Specialist 2017 badge. This isn’t a full certification, but it acknowledges candidates with VSAN knowledge and skills.

2VB-601 Exam Format

The 2VB-601 exam consists of 60 multiple choice questions, and you have 105 minutes to complete the exam. It is very comparable to VMware VCP exams as far as format goes. If you’ve taken VCP exams before, you certainly know the drill here. The top score for the exam is 500, and passing is 300, just like VCP exams.

2VB-601 Exam Resources

Unfortunately, there aren’t a ton of affordable learning resources such as books out there for this exam. However, if you follow the exam guide and read the documents provided in VMware’s study guide, along with getting hands-on experience, you can certainly pass 2VB-601. I also highly recommend the VMware Hands-on Labs pertaining to VSAN. You could also build your own lab using EvalExperience, included with VMUG Advantage.

VSAN 6.6 Deploy and Manage training isn’t necessary, but I did attend it to fulfill a partner requirement. If you’re a partner and need to do the same, this will definitely help.

2VB-601 Exam Experience

The 2VB-601 exam is very straightforward. VMware VCP exams are notorious for sometimes asking rote memorization type questions, such as the exact word-for-word options for configuration choices. Generally speaking, I did not find that to be the case with this exam. Most questions are fair and are more conceptual in nature. You do need to know what needs to be done or what happens in various scenarios. I also didn’t find myself wondering which exam objective a question related to, like I do on VCP exams for more questions than I’d like. I felt virtually every question was fair game.

With that said, I found the exam wasn’t nearly as difficult as any of the VCP exams I’ve taken, which are numerous at this point. If you look over the objectives in the exam guide and feel you know them, and you have hands-on experience with VSAN, I would still recommend reading over the more substantial documents in the guide prior to sitting the exam. Beyond that, you should be in good shape.

I passed it on my first attempt with a 456, which is the highest I’ve ever gotten on any VMware exam. I finished the exam with 45 minutes to spare, so time won’t be an issue. I found the questions mostly fell in the category of “you know it or you don’t”.

I’ll be posting resources to help study for the exam for those of you who wish to take the exam.

Hope this helps!

I get a lot of questions about the finer points of vSphere networking. I wanted to provide a consolidated list of general recommendations and info about some of the options within vSphere networking as a quick reference.

Please keep in mind the recommendations below for vSphere networking are very general. You should not assume any recommendation below is the right choice for your environment!

When using Virtual Distributed Switches (VDS) in vSphere networking configurations, what should distributed port group port binding be set to – static, dynamic, or ephemeral?

Short answer – Generally use static for most workloads, and consider ephemeral for infrastructure type workloads related to vCenter, such as vCenter, PSCs, domain controllers/DNS servers, etc.

Long answer – Dynamic was deprecated in vSphere 5.0. Don’t use it. Use static or ephemeral. Static has a few advantages; a VM will always stay on the same virtual switch port even if powered off, so statistics are easier to get for a particular virtual NIC. Static binding also generally results in less load on vCenter and ESXi hosts, as ports aren’t constantly allocated and unallocated. The downside is that VMs consume VDS ports even when powered off, but a VDS has a lot of ports to eat through before that matters. Ephemeral basically is no binding at all. This reduces the number of VDS ports consumed, as powered off VMs don’t use up VDS ports. However, ephemeral slows down operations within the VDS, as ports are allocated and unallocated when VMs are powered on and off. One advantage ephemeral does have is that it does not require vCenter to be available for VMs to make use of those ports; static sometimes does. vCenter is the control plane for the VDS, and is therefore mainly responsible for port bindings, hence the recommendation for vCenter and related workloads to use ephemeral port binding to avoid chicken and egg type scenarios.

When using Virtual Distributed Switches (VDS) in vSphere networking configurations, what should distributed port group port allocations be set to – Elastic or Static?

Generally, use the default 8 for number of ports, and set allocation to elastic. This should keep the number of unused virtual switch ports to a minimum while allowing the port groups to scale up and down as needed.

What load balancing should be used in vSphere networking – Virtual Port ID, MAC hash, IP Hash, LACP, or Load Based Teaming?

Short answer – there’s no right answer for everyone. Read the long answer.

Long answer – There are MANY MANY considerations when selecting a load balancing method. I want to throw out some caveats first, and then give some general rules of thumb. Take these caveats into account before reading my general recommendations! One or more of them might force you in a specific direction or rule out some of the options.

Caveats

You can only use LACP and Load Based Teaming if you’re using VDS.

If you want to use port mirroring for any reason, LACP doesn’t support it.

If you’re using Host Profiles to configure host networking, LACP can’t be configured using them, which is an important consideration when using stateless Auto Deploy.

The only two load balancing modes that can in any way grant a single vNIC more bandwidth than a single physical NIC in the team are LACP and IP Hash.

Both LACP and IP Hash require special switch port configurations.

LACP does not support beacon probing.

The presence of NSX drastically impacts which modes you can, can’t, should, and shouldn’t use.

Load based teaming is not supported with logical switching or edge gateways.

General recommendations

If you are using standard switches, and no VM’s vNIC requires more bandwidth than a single physical NIC in the team, use virtual port ID. It keeps CPU utilization the lowest, and requires no special switch configuration, making it generally less problematic than IP Hash.

If you’re using Virtual Distributed Switches, and no VM’s vNIC requires more bandwidth than a single physical NIC in the team, use Load Based teaming. While it costs more in CPU than Virtual Port ID, it provides a worthwhile enhancement to network performance via better load balancing, and it also requires no special switch configuration, making it generally less problematic than IP Hash and LACP.
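
The caveats and the two general recommendations above can be condensed into a small decision helper. This is a Python sketch of the rules of thumb in this post only; the function name `recommend_teaming` and its parameters are hypothetical, and it deliberately ignores the other caveats (NSX, port mirroring, Host Profiles, beacon probing), which you still need to weigh yourself.

```python
def recommend_teaming(switch_type, vnic_needs_more_than_one_pnic):
    """Return a starting-point load balancing recommendation.

    switch_type: "standard" or "distributed"
    vnic_needs_more_than_one_pnic: True if any single vNIC needs more
        bandwidth than one physical NIC in the team can provide.
    """
    if switch_type not in ("standard", "distributed"):
        raise ValueError("switch_type must be 'standard' or 'distributed'")

    if vnic_needs_more_than_one_pnic:
        # Only IP Hash and LACP can give one vNIC more than one pNIC's
        # bandwidth, and LACP is only available on a VDS. Both need
        # special physical switch port configuration.
        if switch_type == "standard":
            return "IP Hash"
        return "IP Hash or LACP"

    if switch_type == "standard":
        return "Virtual Port ID"
    return "Load Based Teaming"


print(recommend_teaming("standard", False))     # Virtual Port ID
print(recommend_teaming("distributed", False))  # Load Based Teaming
```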

Should Network I/O Control (NIOC) be used in vSphere networking?

If you are converging host management, VM, vMotion, Fault Tolerance, and/or storage traffic on the same physical NICs, NIOC should be enabled. NIOC helps ensure that no traffic type can overwhelm the others. This is especially important when IP storage traffic shares physical links with other traffic, regardless of whether it’s iSCSI, NFS, VSAN, or other hyperconverged traffic such as Nutanix.

If you aren’t converging different types of traffic on the same physical NICs, there’s little reason not to enable NIOC assuming you’re using Virtual Distributed Switches.

Beacon Probing looks like a better failover detection in vSphere Networking. Should I use it?

Generally speaking, no. It requires three uplinks within the team minimum to use. Generally, making use of Link State Tracking within switches if possible is a better solution.

Alright, Day 2 General Session Time! They hinted at a big announcement, so let’s see what they got for today!

Michael Dell and Pat Gelsinger start off with a “fireside chat” to answer questions submitted by attendees.

Acknowledged disappointing customer satisfaction for support.

Skyline from VMware for proactive support coming.

New AI and machine learning along with quantum computing are creating a new human-machine interactive atmosphere that isn’t just IT focused. It also will involve C level executives to be successful.

Everyone needs to integrate machine learning and AI into their product.

The next 30 years are going to make the previous 30 years look boring.

Today in tech will be the slowest tech day for the rest of your life.

Reinforced VMware as an open ecosystem for competition, not VMware as some kind of subsidiary of DellEMC. “If it’s good for VMware, it’s good for Dell.” The ecosystem is growing, which I actually agree with.

Pivotal – Cloud Foundry is used in over 50% of Fortune 500 companies.

Pivotal Container Services announced, which will include Kubernetes and NSX.

Have you ever gone through your vCenter and configured alarms to email? If you have, you know that if anything ever screamed for automation within vSphere, it’s this, as it is extremely tedious. You really want to use a PowerCLI script to manage vCenter alarm email actions!

For one, the action must be set individually on each alarm. On top of that, you configure each alarm with the email address(es) you want to be sent alerts. You can also set repeated email actions on each alarm. While this provides granularity and customization opportunities, it creates a boring snoozefest, inviting confusion and human error in configuring them. Plus, vCenter contains many alerts, and many people do not know which ones you should configure for emails, and how critical each one is.

About a year ago, I found a good PowerCLI script that does this using a CSV of the alarms. I cannot seem to find who made it now. If you stumble on this and know who originally made that script, please comment below. I very much want people recognized for their hard work.

Not to simply take the script, I added additional functionality to it, to provide end to end configuration, including vCenter SMTP email server settings, as well as the ability to configure multiple vCenter servers in one fell swoop. Also, I really love the design of this script for several reasons:

Simplicity rules this script. Even if you want the script to do something else, it’s easy to follow and adapt.

One can easily adapt the script to new versions of vSphere. Each version of vSphere adds, removes, or changes alarms. I can easily dump those alarm names into a new CSV and set their values.

It provides a three-tiered priority system for email frequency. Don’t like them? It’s easy to change them within the script.

Ongoing maintenance of the alerts is a snap. Change the values within the script, and simply rerun it.

The CSV file provides an opportunity to track changes to alerts and a deliverable document. I intend also to maintain the CSV files here as I receive feedback on the alarms.

Ready to use a PowerCLI script to configure vCenter alarm email actions?

This PowerCLI script to configure vCenter alarm email actions is very straightforward. You simply set the variables at the beginning of the script, which include info such as the vCenter servers, the CSV file to be used, the vCenter user name and password, the SMTP server and port to use, etc.

Next, set the value in the Priority column of the CSV for each alert according to how you want it configured. The values are as follows:

Low Priority – Removal of all email alert actions, and reconfigured to send one email without repeat emails for non-critical alerts.

Medium Priority – Removal of all email alert actions, and reconfigured to send one repeated email daily for more serious alerts.

High Priority – Removal of all email alert actions, and reconfigured to send one repeated email every four hours for the most serious alerts. (I originally set this to hourly, but customer feedback said this was far too much.)

Disabled – Removal of all email alert actions. Use this on alerts that are far too chatty (looking at you VM CPU and memory usage!), and you wish to turn emails for them off.

Blank (aka no value) – Leave the alarm as is in case you manually configured the alert and wish to keep it that way.
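
To show how the Priority values translate into behavior, here is a short Python sketch of the mapping described above. It is not the PowerCLI script itself and does not touch vCenter; it only reads CSV text and reports what would be done for each alarm. The `Priority` column comes from this post, while the `Alarm` column name, the alarm names in the sample, and the function name `plan_alarm_actions` are assumptions made for illustration.

```python
import csv
import io

# Repeat interval in minutes per priority; None means one email, no repeat
REPEAT_MINUTES = {
    "Low": None,
    "Medium": 24 * 60,  # daily
    "High": 4 * 60,     # every four hours
}


def plan_alarm_actions(csv_text):
    """Return (alarm, action, repeat_minutes) tuples for each CSV row."""
    plans = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["Alarm"]  # assumed column name
        priority = (row.get("Priority") or "").strip()
        if priority == "":
            plans.append((name, "leave unchanged", None))
        elif priority == "Disabled":
            plans.append((name, "remove email actions", None))
        elif priority in REPEAT_MINUTES:
            plans.append((name, "replace email actions", REPEAT_MINUTES[priority]))
        else:
            raise ValueError("Unknown priority %r for alarm %r" % (priority, name))
    return plans


# Placeholder alarm names, not taken from a real vCenter
sample_csv = (
    "Alarm,Priority\n"
    "Example alarm A,High\n"
    "Example alarm B,Medium\n"
    "Example alarm C,Low\n"
    "Example alarm D,Disabled\n"
    "Example alarm E,\n"
)

for plan in plan_alarm_actions(sample_csv):
    print(plan)
```

Raising an error on an unrecognized value is a design choice in this sketch, so that a typo in the Priority column is caught instead of silently leaving an alarm unconfigured.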

These CSV files include the alarms for each version, the values I’ve found to work best for each one based on customer feedback so far, a notes field with any relevant info that may help you decide how to configure each one, and an IfUsed column that gives the value I would recommend for each alarm I’ve set to Disabled, in case you want to enable it.

When to Run PowerCLI Script to Configure vCenter Alarm Email Actions

I designed the PowerCLI script to configure vCenter alarm email actions not just for initial configurations, but also to set any changes that are made. That’s why the script removes any email alarm actions for Low, Medium, High, and Disabled. If you wish for the script to not change an alarm’s configuration, leave the Priority column blank for the alarm.

This way, you will have a current, documented configuration of alarms that you can simply update before running the script again.