This morning vCenter greeted me with high CPU utilization on all my VDI ESXi hosts. The vCenter alarm shows that it started at around 1 AM. My gut feeling told me it was a scheduled network vulnerability scan or a misconfigured McAfee ePO virus scan. Looking through various logs did not reveal any evidence of that, but I spent about 3 hours looking carefully because I wanted to pin the blame on another group (I’m sure someone can relate).

I started looking closely at “taskhost” in Task Manager, since it was the one consuming the most CPU on the multiple machines I sampled. The problem is that “taskhost.exe” is the host process for scheduled tasks, and you really need Process Explorer to get more information about what it is running. I don’t know of any other tool that will do it.

I looked for a taskhost.exe with high CPU utilization. The image does not show multiple taskhost processes, but on one of my VMs there were 2 or 3, depending on the tasks running at the time. I hovered my mouse over taskhost.exe to see the details. The line I am interested in is the last one, “\Microsoft\Windows\RAC\RacTask”, which is the task’s path in Task Scheduler.

Checking the details of the task shows that it can be triggered in two ways. Mine was triggered by Event ID 1007. I did not have time to investigate the event, and besides, I don’t think I need Reliability Monitor in my VDI environment taking up precious CPU resources.

THE FIX:

– Finding it was the hard part; the fix is really easy. Just end the running task via the GUI or the command line. Once I issued the command, I saw CPU utilization in Task Manager drop immediately.

SCHTASKS.EXE /end /tn \Microsoft\Windows\RAC\RacTask

– To stop it from recurring, I disabled the task:

SCHTASKS.EXE /change /disable /tn \Microsoft\Windows\RAC\RacTask

Now, to push it to all virtual machines you can use any tool of your choice. If you want something free, you can use PsExec from Microsoft Sysinternals. Just read up on how to do it; there are plenty of resources online.

DISCLAIMER: Sorry, I did not really test my PsExec syntax; I’m just doing it from memory. We used PDQ Deploy (there’s a free version) for this particular one; GFI LanGuard is another tool that can do it.

Just to be sure, I added “c:\Windows\System32\schtasks.exe /change /disable /tn \Microsoft\Windows\RAC\RacTask” to the startup scripts for the machines PDQ missed. It would be a good idea to update my gold image. I use the old Quest vWorkspace Desktop Optimizer, but I guess the tool leaves this setting untouched.
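If you would rather script the two steps above than type them per machine, here is a minimal Python sketch. It only builds the same two SCHTASKS command lines (end, then disable) for each machine and prints them. It assumes you target remote machines with schtasks’ own /s option instead of PsExec, and the host names are made up for illustration. I have not run this against a real VDI pool, so treat it as a starting point.

```python
import subprocess

# Task path taken from the post.
RAC_TASK = r"\Microsoft\Windows\RAC\RacTask"


def rac_task_commands(host=None, task=RAC_TASK):
    """Return two schtasks argument lists: end the running task, then disable it.

    When host is given, schtasks' /s option points the command at that machine.
    """
    remote = ["/s", host] if host else []
    end_cmd = ["schtasks.exe", "/end", *remote, "/tn", task]
    disable_cmd = ["schtasks.exe", "/change", "/disable", *remote, "/tn", task]
    return [end_cmd, disable_cmd]


def apply_to_hosts(hosts, dry_run=True):
    """Print the commands for every host; run them only when dry_run is False."""
    results = {}
    for host in hosts:
        for cmd in rac_task_commands(host):
            print(" ".join(cmd))
            if not dry_run:
                # Needs Windows and admin rights on the target machine.
                results[(host, cmd[1])] = subprocess.run(cmd).returncode
    return results


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    apply_to_hosts(["vdi-001", "vdi-002"], dry_run=True)
```

With dry_run=True nothing is executed, so you can eyeball the generated commands before letting it loose.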

There are cases where you need to install the View Agent on a physical box or an unmanaged desktop source: when you don’t control the VM infrastructure, when the VDI is in the cloud, or when you don’t have vCenter (or a vCenter license) managing the ESXi hosts. One could argue that if you have a license for VDI, you have a license for all the components needed to run VDI.

In my special use case, the View Connection Server cannot talk to the backend vSphere management infrastructure because it is on a completely separate network. In other words, the virtual machine network and the vSphere network are physically separated and do not talk to each other. There is one NIC from each ESXi host to the virtual machine network to expose the Win7 VMware View VMs. There would be zero attack footprint from the virtual machine network to the vSphere network infrastructure. The only way to attack the vSphere infrastructure would be through some kind of VMware Tools-to-hypervisor vulnerability exposed on the VM itself that can reach the underlying hypervisor. I don’t know of such a vulnerability, but that doesn’t mean there is none, and it guarantees nothing about the future. The possibility of such an attack exists. I don’t know what kind of sandboxing technique VMware uses to protect VMware Tools. The other attack is the pretty obvious one: being on the vSphere network itself, duh!!

Enough blah blah blah. This will take you to the GUI install and prompt you to supply the View Connection Server IP or FQDN.

For the past couple of weeks, a newly created site-to-site VPN has been behaving inconsistently. Some machines cannot be pinged through the VPN while more than half can. I can ping in one direction, yet the ping coming back from the remote end fails. On one machine I ran a continuous ping and it did not succeed until 2 minutes had passed. Another weird part: typically you can issue “clear crypto isakmp sa” to reset all VPN connections, but with this one the only course of action was to reboot one or both ASA endpoints, which, as you can imagine, is not a pretty fix and would be frowned upon. The only special thing about this config is that I am specifically using IKEv2 on both ends. I mean, why not? Both are 5520s running the same latest firmware, so there should be no conflict or compatibility issue.

After wrestling with the debug output for days and looking at the ciphertext side in Wireshark, I finally narrowed it down to one error, “Need to send a DPD message to peer”, about which there is next to no information on the web. After reading a couple of sources I realized that IKEv2 has a built-in feature to detect neighbor state. DPD and keepalives are just products born of the shortcomings of the original IKEv1. I changed my VPN config:

Some might ask if I tried “isakmp keepalive disable”. Yes, I tried it, but the output of “sh crypto isakmp sa detail | in DPD” still showed DPD on with its default threshold of 10 and retry of 2, even after a reboot. And even with keepalive disabled I was still getting inconsistent VPN behavior.

This combination of three, for whatever reason, cripples Windows Update. The issue is still unsolved for me due to the lack of any McAfee logs that point to the signature causing it. As a workaround for the time being, I disable IPS every Patch Tuesday to get the updates. Here are the symptoms.

The hard part is not having a clear log that points to the root cause. Another issue is whether to call McAfee or the VMware View team. This is going to require more time to diagnose properly in the near future.

VMware customization scripts do not complete when McAfee HIPS IPS is enabled. When creating a Windows 7 master image for VMware View, or just a regular VM, make sure that IPS is disabled on your golden image.

I have loved VDP from the start. We used Veeam Backup before, but we found that VDP is better suited to our environment. No offense to Veeam, I think they wrote a very good piece of software, but as far as simplicity, future-proofing and $$ go, you can’t beat VDP.

In the past I got errors, and they were mainly related to stale snapshots. My goal is just to share how I troubleshoot and delete stale snapshots. You must be really, really careful when manually deleting snapshots in the datastore. The best bullet-proof advice is to make sure the VM is “ON” and running while you delete stale vmdk files. Why? VMFS locks the files a running VM is using, so any attempt to delete, rename or move an in-use vmdk will fail, which is what we want.

1. Delete all the snapshots you have in Snapshot Manager. 99 percent of the time I clean up all snapshots on the server as a best practice. RUN VDP

2. Run “Consolidate” if necessary. RUN VDP

3. If you still get a VSS error or something like “Cannot take a quiesced snapshot”: STALE SNAPSHOTS – Navigate to the datastore where this particular VM resides and look for something like “VM000002.vmdk”. Remember that we deleted all snapshots, so there should be nothing like this there. Before you delete it, double-check the date.

4. VMware agent and third-party VSS contention. Be aware of any third-party backup agent that has its own VSS, like Backup Exec. The VMware agent can use the Backup Exec agent without a problem; the issue is the order in which you install them. The Backup Exec agent should be installed last. Gotcha: be aware of this when upgrading the VMware agent; you will need to reinstall the Backup Exec agent if you start getting backup errors.
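The hunt for leftovers in step 3 can be sketched as a small read-only Python helper. It assumes the VM’s datastore folder is reachable as an ordinary path from wherever you run it, and that leftover snapshot disks end in six digits (optionally followed by “-delta”) before “.vmdk”, like the “VM000002.vmdk” example above. Both are my assumptions, so adjust the pattern to what you actually see. It deletes nothing; it only lists candidates with their dates so you can double-check them first.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumption: snapshot-style disks end in six digits, optionally followed by
# "-delta", before ".vmdk" (e.g. "VM000002.vmdk" or "VM-000002-delta.vmdk").
SNAPSHOT_VMDK = re.compile(r"\d{6}(-delta)?\.vmdk$", re.IGNORECASE)


def is_snapshot_vmdk(filename):
    """True when the file name looks like a snapshot disk under the assumed pattern."""
    return SNAPSHOT_VMDK.search(filename) is not None


def list_stale_candidates(vm_folder):
    """Return (name, modified time) pairs, oldest first. Nothing is deleted."""
    found = []
    for path in Path(vm_folder).iterdir():
        if path.is_file() and is_snapshot_vmdk(path.name):
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            found.append((path.name, modified))
    return sorted(found, key=lambda item: item[1])


if __name__ == "__main__":
    # "." is a placeholder; point it at the VM's folder on the datastore.
    for name, modified in list_stale_candidates("."):
        print(f"{modified:%Y-%m-%d %H:%M}  {name}")
```

A VM whose own name happens to end in six digits will also match, so every hit is a candidate to check by hand, not something to delete automatically.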