I need some help diagnosing what I think are CPU constraint issues. I have a client with a Citrix XenApp farm (5 desktop servers and 9 application servers) spread across 12 ESX hosts running ESX 4.0. All Citrix servers, both desktop and application, have 2 vCPUs allocated, and 54% of the VMs running on the ESX hosts are allocated 2 vCPUs. The problem we are running into is that the Citrix application servers crawl when users try to access the published applications. I personally think it is a multiple-vCPU issue, but I would like an opinion on whether my findings are correct.

My questions are:
What are the best practices for Citrix on ESX?
Is it typical to have multiple vCPUs per server in a Citrix environment?
For ESXTOP readings, should I be looking at %RDY and the overall performance of the host?

2 Answers
Take a look at some of the performance papers at Virtual Reality Check. They've done a lot of benchmarking and testing of various configs for terminal servers and VDI. In general, the "old" guidance of 1 vCPU per Citrix server is no longer true. Additionally, scheduling in 4.0 is much better than in older versions, so scheduling is not likely to be the issue. %RDY is the metric to monitor to verify CPU contention. My bet would be on poor network performance or poor disk performance before CPU issues.

The best practice has been debated for a while; some say 1 vCPU, some say 2+. IMO, 2 vCPUs is pretty good. The problem with 1 vCPU is that a single logon can hog all the CPU, which affects everyone else on the server.

What OS are your app servers running? How overcommitted (if at all) are your ESX hosts' resources? Also, is it the virtual desktops accessing the virtual application servers that are having issues? Try accessing the virtual apps from a physical system and see if they're any better.
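One rough way to answer the overcommit question for CPU is to compare the vCPUs allocated on a host with its physical cores. The snippet below is only a sketch: the function name, the VM list, and the 8-core host are made-up illustration values, not figures from this farm.

```python
def vcpu_overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by physical cores for one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Hypothetical host: 8 physical cores running ten VMs (six 2-vCPU, four 1-vCPU).
example_vms = [2, 2, 2, 2, 2, 1, 1, 1, 1, 2]
ratio = vcpu_overcommit_ratio(example_vms, 8)
print(f"{ratio:.2f} vCPUs per physical core")
```

A higher ratio only means more potential contention; whether it actually hurts shows up in the ready time, not in the ratio itself.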

If you think it's a CPU scheduling problem, meaning you think you've over-provisioned the VM and it's waiting for two cores to free up before it can execute, then %RDY is what you'd want to monitor. I think consistently above 500 is when you have an issue; note that esxtop shows %RDY as a percentage, so that number presumably refers to the CPU ready value in milliseconds on the vCenter performance chart. Don't forget to monitor your disk latency as well.
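If the ready figure comes from a vCenter performance chart rather than esxtop, it is a summation in milliseconds over the sample interval, so it helps to convert it to a percentage before comparing it with %RDY. Here is a minimal sketch of that conversion, assuming the 20-second real-time chart interval; the function name and the choice to average across vCPUs are my own for illustration.

```python
def ready_ms_to_percent(ready_ms, interval_s=20.0, num_vcpus=1):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    interval_s: chart sample interval in seconds (assumed 20 s, real-time chart).
    num_vcpus:  divide by this to get a per-vCPU average when the value is
                summed over all of the VM's vCPUs (an assumption of this sketch).
    """
    if interval_s <= 0 or num_vcpus <= 0:
        raise ValueError("interval_s and num_vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * num_vcpus)


# 500 ms of ready time in a 20 s sample is 2.5% ready.
print(ready_ms_to_percent(500))
# 2000 ms summed across a 2-vCPU VM averages 5% per vCPU.
print(ready_ms_to_percent(2000, num_vcpus=2))
```

Longer chart intervals (daily, weekly) use larger sample periods, so pass the matching interval_s or the percentage will be overstated.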

Finally, what A/V do you have? We had a lot of performance problems at first, and it all came back to McAfee swapping excessively (some sort of bug). We switched to MS Forefront (we were going down this route anyway) and a lot of those issues went away.