Archive

Something has been bugging me for some time now about vCenter disk performance statistics. Basically, vCenter shows each SCSI LUN with a unique ID, as per the following screenshot. When viewed through the disk performance view it’s impossible to tell what is what, unless of course you know the NAA ID off by heart!

I was working on a project this weekend putting a Tier 1 SQL server onto our vSphere 4.0 infrastructure, so insight into disk performance statistics was key. I decided I needed to sort this out and set about identifying each datastore and amending the SCSI LUN ID name. Here is how I did it.

Identify the LUN

First of all, navigate to the datastore view from the home screen within vCenter.

Click on the datastore you want to identify and then select the configuration tab.

Click on the datastore properties and then select Manage Paths.

Note down the LUN ID (in this case 2) and also note down the capacity.

Change the SCSI LUN ID

Now navigate to the home screen and select Hosts and Clusters.

Select a host, change to the configuration tab and then select the connecting HBA.

At the bottom, identify the LUN using the ID and capacity noted earlier and rename the start of the ID. I chose to leave the unique identifier in there in case it is needed in the future.

Now when you look at the vCenter disk performance charts you will see the updated SCSI LUN ID making it much more meaningful and useable.
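The matching step can be sketched in a few lines of Python. This is only an illustration of the idea, assuming you have noted the LUN ID and capacity for each datastore by hand. The datastore names, NAA IDs and the `friendly_name` helper are made up for the example and are not part of any VMware API or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    name: str         # datastore name as shown in vCenter
    lun_id: int       # LUN ID noted from Manage Paths
    capacity_gb: int  # capacity noted from the datastore properties


@dataclass(frozen=True)
class ScsiLun:
    naa_id: str       # unique identifier shown under the storage adapter
    lun_id: int
    capacity_gb: int


def friendly_name(lun, datastores):
    """Prefix the NAA ID with the datastore name when exactly one datastore
    matches on LUN ID and capacity; otherwise return the NAA ID unchanged."""
    matches = [d for d in datastores
               if d.lun_id == lun.lun_id and d.capacity_gb == lun.capacity_gb]
    if len(matches) != 1:
        return lun.naa_id
    return f"{matches[0].name} - {lun.naa_id}"


# Hypothetical example data, not taken from a real environment.
datastores = [Datastore("SQL_DATA_01", 2, 500), Datastore("SQL_LOGS_01", 3, 200)]
luns = [ScsiLun("naa.6000000000000001", 2, 500),
        ScsiLun("naa.6000000000000002", 7, 100)]

for lun in luns:
    print(friendly_name(lun, datastores))
```

Keeping the NAA ID at the end of the new name mirrors the choice above to leave the unique identifier in place in case it is needed later.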

Raw Device Mappings

If you have Raw Device Mappings (RDM) attached to your virtual machine then these too are capable of showing up in the vCenter disk performance stats. The process for changing the name of the SCSI LUN ID is the same, but identifying them is slightly different. To do so, carry out the following.

Edit the settings of the VM, select the RDM file, select Manage Paths and then note down the LUN ID for the RDM. Use this to identify the LUN under the Storage Adapter configuration and change it accordingly.

Having made these changes, I can now utilise the vCenter disk counters to complement ESXTOP and my SAN monitoring tools. Now I have a full end-to-end view of exactly what is happening on the storage front, which is invaluable when virtualising Tier 1 applications like SQL 2008.

There are a plethora of metrics you can look at within vCenter. If you would like to understand what they all mean then check out the following VMware documentation.

In today’s economic climate it’s currently the done thing to sweat existing assets for as long as you possibly can. At the moment I am working on a vSphere deployment and we are recycling some of our existing ESX 3.5 U4 hosts as part of the project. So over the weekend I was testing out vMotion between a new host with the Intel Xeon X7460 processor and an old host with the Xeon 7350 processor. I was getting the following error message displayed, which pointed to a feature mismatch relating to the SSE4.1 instruction set. Thankfully the error pointed me to VMware KB article 1993.

Within this KB article it immediately refers you to using Enhanced vMotion Compatibility (EVC) to overcome CPU compatibility issues. I had never used EVC in anger and wanted to read up on it a bit more before making any further changes. A quick read of page 190 on the vSphere basic configuration guide gives a very good brief overview for those new to EVC.

So I was referred to VMware KB article 1003212, which is the main reference for EVC processor support. Quite quickly I was able to see that EVC was supported for the Intel Xeon 7350 and 7460 using the Intel® Xeon® Core™2 EVC baseline. In essence, as far as vMotion is concerned, all processors in the cluster would be equal to an Intel® Xeon® Core™2 (Merom) processor and its feature set. This basically masks the SSE4.1 instruction set on the Intel Xeon 7460 that was causing me the problem.

So I set about enabling my current cluster for EVC; however, when I went to apply the appropriate baseline I was getting the following error displayed. The error related to the host that was currently running 3 Windows 2008 R2 x64 servers. These servers were obviously using the advanced features of the Intel Xeon 7460 and as such that host could not be activated for EVC.

All virtual machines in the cluster that are running on hosts with a feature set greater than the EVC mode you intend to enable must be powered off or migrated out of the cluster before EVC is enabled. (For example, consider a cluster containing an Intel Xeon Core 2 host and an Intel Xeon 45nm Core 2 host, on which you intend to enable the Intel Xeon Core 2 baseline. The virtual machines on the Intel Xeon Core 2 host can remain powered on, but the virtual machines on the Intel Xeon 45nm Core 2 host must be powered off or migrated out of the cluster.)
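The rule in that KB extract boils down to a set comparison, which a short Python sketch can illustrate. The feature sets below are deliberately simplified (real CPUs expose many more flags) and the host names are hypothetical. This is not how vCenter implements EVC, just a way to see why only the newer host was a problem.

```python
# Simplified, illustrative feature sets for the example only.
BASELINE_CORE2 = {"sse3", "ssse3"}  # Intel Xeon Core 2 (Merom) baseline, no SSE4.1

hosts = {
    "old-host (Xeon 7350)": {"sse3", "ssse3"},
    "new-host (Xeon 7460)": {"sse3", "ssse3", "sse4.1"},
}


def features_above_baseline(host_features, baseline):
    """Features the host offers that the EVC baseline would hide."""
    return host_features - baseline


def hosts_needing_vm_shutdown(hosts, baseline):
    """Hosts whose running VMs must be powered off or migrated out of the
    cluster before the baseline can be applied."""
    return sorted(name for name, feats in hosts.items()
                  if features_above_baseline(feats, baseline))


print(hosts_needing_vm_shutdown(hosts, BASELINE_CORE2))
```

With these sample sets, only the host with SSE4.1 is listed, which matches the error I was seeing.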

Now here is the catch-22: my new vCenter server is virtual and sits on the ESX host giving me the EVC error message. I had to power it off to configure EVC, but I can’t configure the EVC setting on the cluster without vCenter. How was I going to get round this? Luckily VMware have another KB article dealing with exactly this situation. The aptly titled “Enabling EVC on a cluster when vCenter is running in a virtual machine” was exactly what I was looking for. Although it involved creating a whole new HA / DRS cluster complete with new resource groups, etc., it was a lot cheaper than buying a large number of expensive Intel processors. It worked perfectly, rectifying my issue and allowing me to use all servers as intended.

Moral of the story… Check out VMware KB article 1003212 for processor compatibility before buying servers, and always configure your EVC settings on the cluster before adding any hosts to the cluster. If it’s too late and you have VMs created already, well, just follow the steps above and you should be fine.

I’m currently working with my colleagues on an upgrade of our VI 3.5 infrastructure to vSphere Enterprise Plus. We have recently been mulling over some of the design elements we will have to consider, and one of the ones that came up was virtual Distributed Switches (vDS). We like the look of it: it saves us having to configure multiple hosts with standard vSwitches and it also has some nice benefits such as enhanced network vMotion support, inbound and outbound traffic shaping and Private VLANs.

One of the questions that struck me was, what happens if your vCenter server fails? What happens to your networking configuration? Surely your vCenter server couldn’t be a single point of failure for your virtual networking, could it?

Well, I did a bit of digging about, chatted to a few people on Twitter, and the answer is no, it would not result in a loss of virtual networking. In vSphere vDS the switch is split into two distinct elements, the control plane and the data plane. Previously both elements were host based and configured as such through connection to the host, either directly using the VI client or through vCenter. In vSphere, because the control plane and data plane have been separated, the control plane is now managed using vCenter only and the data plane remains host based. Hence when your vCenter server fails the data plane is still active as it’s host based, whereas the control plane is unavailable as it’s vCenter based.

One thing I was not aware of was where all this vDS information is stored. Mike Laverick over at RTFM informed me that the central config for a vDS is stored on shared VMFS within a folder called .dvsData. I’ve since learnt that this location is chosen automatically by vCenter and you can use the net-dvs command to determine that location. It will generally be on shared storage that all ESX hosts participating in the vDS have access to. As a backup to this .dvsData folder, a local database copy is located in /etc/vmware/dvsData.db, which I imagine only comes into play if your vCenter server goes down or if your ESX host loses connectivity to the shared VMFS with the .dvsData folder. You can read more about this over at RTFM.

Those of you who have heard of Alan Renouf will undoubtedly know of his talents in the dark art of VMware CLI / PowerShell. For those of you who don’t know him, I suggest you check out his web site to sample some of the many great articles and scripts he’s already produced.

His latest PowerShell creation has received a lot of attention in the last couple of days, and with good reason. The Daily Report is a configurable script where you can set thresholds and variables such as snapshot age, datastore free space thresholds or the number of days to look at for vCenter warnings and errors. When run, the script goes off and examines your Virtual Infrastructure based on these variables and then emails you a nice HTML report on the following items.

·VMs created in the last x number of days and who created them.

·VMs deleted in the last x number of days and who deleted them.

·Datastores which have less than x% of free space remaining.

·VMs that have CD-Rom or Floppy drives connected.

·VMs with no VMware Tools installed.

·Snapshots that are older than x number of days.

·Current state of vCenter Services.

·vCenter events that have been logged in the last x number of days.

·Windows events on the vCenter server that relate to VMware.

·Hosts in maintenance mode or a disconnected state.
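To give a feel for what threshold-driven checks like these look like, here is a minimal Python sketch of two of them: datastore free space and snapshot age. It is not Alan’s script (which is PowerShell) and does not talk to vCenter at all. The function names, thresholds and sample data are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def low_space_datastores(datastores, min_free_percent):
    """datastores: list of (name, capacity_gb, free_gb) tuples.
    Returns (name, free_percent) for each datastore below the threshold."""
    flagged = []
    for name, capacity_gb, free_gb in datastores:
        free_percent = 100.0 * free_gb / capacity_gb
        if free_percent < min_free_percent:
            flagged.append((name, round(free_percent, 1)))
    return flagged


def old_snapshots(snapshots, max_age_days, now):
    """snapshots: list of (vm_name, created) tuples.
    Returns the VMs whose snapshot is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [vm for vm, created in snapshots if created < cutoff]


# Hypothetical sample data.
datastores = [("DS01", 500, 40), ("DS02", 1000, 400)]
snapshots = [("vm-a", datetime(2009, 1, 1)), ("vm-b", datetime(2009, 2, 8))]
now = datetime(2009, 2, 10)

print(low_space_datastores(datastores, min_free_percent=10))
print(old_snapshots(snapshots, max_age_days=14, now=now))
```

The point is simply that each report item is a filter over inventory data with a configurable threshold, which is what makes the report easy to tune to your own environment.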

Get yourself over to Alan’s site, download a copy of the script and give it a try. I did today and the results were enough for me to go ahead and implement this as a scheduled task. If you’d like to see more features in Alan’s Daily Report script then give him some feedback; there are a few good suggestions on the blog post already and I’m sure the next version isn’t far away. Great work Alan, keep it up!

Duncan Epping over at Yellow Bricks has posted a most interesting article which I read tonight when reviewing my RSS feeds. It instantly struck a chord with me because recently we have been having issues with an ESX host at a site remote from our recently upgraded vCenter 2.5 U3 server.

The article refers to an issue with vCenter Update 3 in combination with firewalls using stateful inspection. The problem occurs because of SOAP timeouts, and this behavior did not exist in VC 2.0.x or 2.5 GA, as they used a different mechanism to communicate with ESX. The official KB article hasn’t been released yet but a temporary workaround has been published by Richard. If you run into any of the aforementioned issues, head over to Richard’s website and try out the workaround until the fix or official KB article is released.

When conducting operations with this particular host using the VI client attached to vCenter server, we get an “An error occurred communicating to the remote host” pop-up more often than not. I have been looking through the logs in vCenter for this host and it appears that, as well as manual tasks, our overnight PlateSpin protection replication jobs are also getting this message when executing. This might explain some of the issues we’ve been having with some of our newer replication jobs not completing.

I’ve had a quick look at VMwareWolf’s workaround and have asked Richard whether you need to create a dummy VM on each host or just the hosts that experience the problem, as it’s not completely clear. If I get a response I’ll let you all know what it is; meantime we look forward to an official KB hot fix release from VMware.

*UPDATE – 09/02/09* Richard at VMwareWolf came back to me and informed me that the dummy VM only needs to be set up on the affected host. I set the workaround up yesterday and it appears to have resolved our issue for a job that was consistently reporting the error. A permanent fix is still outstanding from VMware.

It would appear that some new products have appeared on the VMware product page under the “Coming in 2009” banner. Although these products were demonstrated at VMworld 2008 in Las Vegas, they weren’t publicly announced through the usual communication channels, i.e. the customary VMware press release. They just seem to have appeared on the site!?

The majority of the products listed above you will have heard of before such as Fault Tolerance, AppSpeed and VMSafe. The new ones that catch the eye are highlighted above with links to the introductory data sheets listed below.

Now unfortunately I didn’t get the chance to go to VMworld 2008 in Las Vegas so never had the chance to see any of the demonstrations of the products. However, I’ve seen some videos and had a quick look at the product information and, as always, have a few observations.

VMware vCenter CapacityIQ

VAC partners have had access to the VMware capacity planner tool for some time now. One of the things that always frustrated me was that normal customers had no access to this tool. Customers would have to turn to products like PlateSpin PowerRecon in order to examine those “what if” scenarios that may require scaling up your virtual environment.

I like the idea of proactive capacity management, helping to combat the constant under- and over-provisioning you invariably get in virtual environments. Combine this with the capacity forecasting and you’re in a position to better utilise your current investments as well as budget better for future procurement. This is going to be a key factor in 2009: squeezing more out of what you’ve got and only buying when absolutely necessary. Ahhh, the good old credit crunch!!!

VMware vCenter Data Recovery

I really like the look of this one and am presuming it is the VCB replacement. What I like about this over VCB is that it is a virtual machine deployed within your infrastructure, as opposed to the SAN-attached Windows storage server that was required for VCB. Managing it all from the single interface that is vCenter is great news; slowly but surely vCenter is maturing into the centralised management tool it should be.

This is the one that I’ll be keeping an eye out for in the near future. I’m wondering what the licensing and pricing will be for this one: will it be free, or is it intended as a competitor to products such as vRanger, Veeam Backup and esXpress? One has to presume that full support for ESX 3.5i will be included, as currently only vRanger does that, as announced on Eric Sloof’s blog post today.

VMware vCenter Config Control

Not sure how much I can comment on this one as I don’t run that big a VMware environment, so configuration control isn’t such an issue for me. That is, until something goes wrong :)

Although it’s not too important for me, I can see how configuration control could be a massive issue in large-scale environments. I’m thinking back to the VMworld 2007 talk that I saw on HSBC’s virtualisation setup in the UK and the sheer size of their implementation. Due to the current lack of configuration control software out there, a lot of these companies have likely put in place strict processes and policies to govern configuration management. Will they need something like this in their setup? Maybe not desperately, but I’d imagine anything that can lessen the load on red tape and control processes will be welcomed with open arms. Depending on the price, that is!!!

I follow a lot of people on Twitter who write about virtualisation and have their own blog. So up popped a message from Jason Boche over at www.boche.net about a new article he’d put up on the site about VMotion performance.

Written by a gentleman called Simon Long, it is an excellent article. It just goes to show what happens when something doesn’t work as quickly as you’d like: some people have the determination and spend the time finding out why. This post is a gem and one of those little tweaks that will come in very handy in the future.

I’ll set the scene a little….

I’m working late, I’ve just installed Update Manager and I’m going to run my first updates. Like all new systems, I’m not always confident so I decided “Out of hours” would be the best time to try.

I hit “Remediate” on my first Host then sat back, cup of tea in hand, and watched to see what happened…. The Host’s VMs were slowly migrated off 2 at a time onto other Hosts.

“It’s gonna be a long night” I thought to myself. So whilst I was going through my Hosts one at a time, I also fired up Google and tried to find out if there was any way I could speed up the VMotion process. There didn’t seem to be any article or blog posts (that I could find) about improving VMotion Performance so I created a new Servicedesk Job for myself to investigate this further.

3 months later whilst at a product review at VMware UK, I was chatting to their Inside Systems Engineer, *********, and I asked him if there was a way of increasing the number of simultaneous VMotions from 2 to something more. He was unsure, so did a little digging and managed to find a little info that might be helpful and fired it across for me to test.

After a few hours of basic testing over the quiet Christmas period, I was able to increase the number of simultaneous VMotions…Happy Days!!

But after some further testing it seemed as though the number of simultaneous VMotions is actually set per Host. This means if I set my vCenter server to allow 6 VMotions, and I then place 2 Hosts into maintenance mode at the same time, there would actually be 12 VMotions running simultaneously. This is certainly something you should consider when deciding how many VMotions you would like running at once.

Here are the steps to increase the number of Simultaneous VMotion Migrations per Host.

1. RDP to your vCenter Server.
2. Locate the vpxd.cfg (Default location “C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter”)
3. Make a Backup of the vpxd.cfg before making any changes
4. Edit the file using WordPad and insert the following lines between the <vpxd></vpxd> tags;

A Cold Migration has a cost of 1 and a Hot Migration aka VMotion has a cost of 4. I first set mine to 12 as I wanted to see if it would now allow 3 VMotions at once (3×4 = 12). I now permanently have mine set to 24, which gives me 6 simultaneous VMotions per Host (6×4 = 24).

I am unsure on the maximum value that you can use here, the largest I tested was 24.
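The arithmetic above can be written down as a short Python sketch. The migration costs (1 and 4) and the values 12 and 24 come from the text; the name `cost_limit` is just a label chosen here for the per-Host value set in the config file, and the functions are illustrative rather than anything vCenter exposes.

```python
COLD_MIGRATION_COST = 1  # cost of a Cold Migration, as stated above
VMOTION_COST = 4         # cost of a Hot Migration (VMotion), as stated above


def simultaneous_vmotions(cost_limit):
    """Number of VMotions one Host can run at once for a given cost limit."""
    return cost_limit // VMOTION_COST


def total_vmotions(cost_limit, hosts_in_maintenance):
    """The limit applies per Host, so the total scales with the number of
    Hosts being evacuated at the same time."""
    return simultaneous_vmotions(cost_limit) * hosts_in_maintenance


print(simultaneous_vmotions(12))                   # 3 VMotions per Host
print(simultaneous_vmotions(24))                   # 6 VMotions per Host
print(total_vmotions(24, hosts_in_maintenance=2))  # 12 running at once
```

The last line is the case to watch: a setting that looks modest for one Host doubles as soon as two Hosts enter maintenance mode together.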

Time Difference = 3.063 Mins!! I wasn’t expecting it to be that much. Imagine if you had a 50 Host cluster…how much time would it save you?
I tried the same test again but only migrating 6 VMs instead of 16.

Migrating off 6 VMs with only 2 simultaneous VMotions allowed. Time taken = 2.24

Now don’t get me wrong, these tests are hardly scientific and would never be deemed a completely fair test, but I think you get the general idea of what I was trying to get at.

I’m hoping to explore VMotion Performance further by looking at maybe using multiple physical NICs for VMotion and teaming them using EtherChannel, or maybe even using 10Gbit Ethernet. Right now I don’t have the spare Hardware to do that but this is definitely something I will try when the opportunity arises.

I was looking into an issue following an upgrade to vCenter Server 2.5 last weekend. So I set about searching through the file system for the log files on the server with very little luck to be honest.

I then found two excellent posts from Rick Blythe, a.k.a. the vmwarewolf. The posts detail the locations of the logs and what each one means. They are ones that I’m going to keep handy for all those strange little issues where insight into the logs might give a clue to the problem.

Well this weekend I had the job of updating our Virtual Center deployment from 2.0.2 to 2.5 update 3. The primary reason for this was to prepare for the introduction of ESX 3.5 hosts into the Virtual infrastructure.

Due to the importance of our Virtual infrastructure I decided to do as much reading and preparation as possible to ensure it all went smoothly. It’s easier to convince our change management team to let us make changes if we get them right first time, so this one was important to facilitate an easier path for future change.

So what did I read to ensure I’d covered everything? Here are a few links to get you started.

The above PDF is a brilliant guide to the process you should follow, including rollback. You should read this thoroughly so you understand all the pre-requisites and can avoid those silly problems that could cause your upgrade to fail.

Of course vCenter Server 2.5 introduces Update Manager. Although we can’t use it as we have ESX 3.0.1 and 3.0.2 hosts (it supports 3.0.3, 3.5 and 3.5i only), I decided to install it anyway so it’s there for the future.

Here are some of the links I used to plan out my update manager deployment.

One of the main mistakes that people tend to make is to not give the SQL accounts the correct permissions on the new Update Manager database and the MSDB database. Make sure you cover this one or your upgrade will fail.

It all went quite smoothly. I initially had a couple of issues which appeared to be related to me attempting to do a custom install. I wanted to ensure I could go through all settings and customise as required; the install however failed with various MSI error messages. I started the install again and didn’t choose the custom setup this time. This however created an issue whereby the Update Manager database appeared to install as SQL Server 2005 Express. I wanted to put it on the same SQL 2000 server as our Virtual Center database but I never got the option as far as I can remember. I have today uninstalled Update Manager and re-installed it using the “use an existing database” option and the SQL 2000 database. It worked fine the second time around.

There were no immediate problems following the upgrade. I had read some horror stories about issues with the Virtual Center Agent on the host not updating, but luckily for me it wasn’t an issue. Good luck with your upgrade.