Virtual management? It's complicated

Load balancing VMs by hand

Sysadmin Blog

Virtualisation is complicated. I am not talking here about implementation, or even the concepts or technologies underpinning virtualisation. I am talking about the realities of managing and maintaining a virtualised infrastructure.

My particular quest of late has been one of decreasing power utilisation, a difficult task in an almost entirely virtualised environment. Virtualisation is mature enough now to have many different vendors offering many tiers of products to help you meet your goals. All these management tools and virtualisation platforms exist only to solve the same basic set of problems. Strip away the particular approaches taken by specific vendors, and we can take the time to look at solving the problems themselves.

Once you know how to solve the problem the hard way, gauging the effectiveness of tools designed to take the grunt work off your hands will probably be easier. From a power management standpoint, critical workloads with a stable resource requirement are probably bad candidates for virtualisation. I can best deal with their power consumption requirements by tailoring a piece of hardware specifically to them. (Or more accurately, by introducing a new class of server into my fleet designed for ultra-low-power use.) What then of dynamic workloads? (Services with bursty resource requirements, or which are very cyclical, such as those that are only operational during the day?)

I think these are ideal candidates for virtualisation; combining many of these workloads with offset usage patterns would keep server utilisation high. This becomes a game of load balancing, and load balancing is the art of knowing your network. It takes some very careful consideration and planning to deal with virtualising dynamic workloads. It is easy to look at a CPU graph without understanding the impact the workload has on other system resources. CPU usage and disk I/O are the traditional banes of virtualisation, but both network and even RAM bandwidth can become constrained by an improper mix of virtual machines (VMs) on a host.
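To make the point about offset usage patterns concrete, here is a minimal Python sketch. Everything in it is hypothetical: the two VMs, their hourly figures, the resource names and the host capacity are invented for illustration, not measurements from any real system. It sums the hourly demand of each resource across VMs and checks the combined peak against what a host can supply.

```python
from typing import Dict, List

# A profile maps a resource name to 24 hourly demand values (hypothetical units).
Profile = Dict[str, List[float]]


def combined_peaks(vms: List[Profile]) -> Dict[str, float]:
    """For each resource, return the highest hourly total across all VMs."""
    resources = vms[0].keys()
    return {
        r: max(sum(hour) for hour in zip(*(vm[r] for vm in vms)))
        for r in resources
    }


def fits(vms: List[Profile], capacity: Dict[str, float]) -> bool:
    """True only if every resource stays within host capacity in every hour."""
    peaks = combined_peaks(vms)
    return all(peaks[r] <= capacity[r] for r in capacity)


# Invented example: one VM busy during office hours, one busy overnight.
day_vm = {
    "cpu": [10] * 8 + [60] * 10 + [10] * 6,
    "disk_iops": [50] * 8 + [300] * 10 + [50] * 6,
}
night_vm = {
    "cpu": [60] * 6 + [10] * 14 + [60] * 4,
    "disk_iops": [400] * 6 + [50] * 14 + [400] * 4,
}

peaks = combined_peaks([day_vm, night_vm])
host = {"cpu": 80, "disk_iops": 400}
print(peaks, fits([day_vm, night_vm], host))
```

In this made-up case the CPU peaks never coincide, so the pair looks comfortable on a CPU graph, yet the combined disk I/O exceeds what the host can deliver. That is exactly the trap of judging a mix of VMs by one resource alone.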

Sadly, there are no magic bullets to solve this problem. Unless you are planning on massively overspecifying the hardware you run your workloads on, you need to do the legwork of workload profiling. Establish a baseline for the systems currently supplying that service to your network. Look at total resource consumption, and try to identify patterns that recur over time. To accomplish this, I find that virtualising the workload onto its own server for a while is a great help.
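One way to turn raw measurements into a baseline is to average them by hour of day and see which hours stand out. The Python sketch below assumes you have already exported samples as (hour, value) pairs from whatever monitoring you use; the sample data, function names and threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def hourly_baseline(samples):
    """samples: iterable of (hour_of_day, value). Returns {hour: mean value}."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def busy_hours(baseline, threshold):
    """Hours whose average demand is at or above the chosen threshold."""
    return [hour for hour, avg in baseline.items() if avg >= threshold]


# Invented CPU readings (percent) taken at 03:00 and 09:00 on two days.
samples = [(9, 70), (9, 80), (3, 5), (3, 15)]
baseline = hourly_baseline(samples)
print(baseline, busy_hours(baseline, 50))
```

Run the same calculation for disk, network and memory figures as well as CPU. A pattern that recurs across several days or weeks is what you are looking for; a single day tells you very little.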

Virtualisation tools often offer you some reasonably good reporting on the resource consumption of individual VMs. Run your VM on its own server for a while until you have it profiled, and the configurations tweaked for running in a virtual environment. After a few days or weeks of this, you should be safe to put it in the cage with the other hamsters and introduce it to a shared resource environment.

To get really complicated, you can start trying to power off systems, or at least send them into a lower power state, when your network is not under load. This requires some really advanced load balancing, where you take into account not only resource division but also how to align those resources so that whole systems can be powered down for long periods. In this sense VDI is almost optimal.
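A rough way to estimate how many hosts could be powered down is to pack the off-hours load onto as few hosts as possible and compare that with the daytime figure. The Python sketch below uses a first-fit decreasing heuristic on a single resource. This is a simplification chosen for illustration: a real placement has to respect every resource at once, and the load figures are invented.

```python
def hosts_needed(loads, capacity):
    """First-fit decreasing: place each load on the first host with room.

    Returns the number of hosts the heuristic uses, which is not
    guaranteed to be the true minimum.
    """
    free = []  # remaining capacity on each powered-on host
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError(f"load {load} exceeds host capacity {capacity}")
        for i, room in enumerate(free):
            if load <= room:
                free[i] -= load
                break
        else:
            # No existing host has room, so power on another one.
            free.append(capacity - load)
    return len(free)


# Invented per-VM demand (percent of one host) by day and by night.
day_loads = [60, 50, 40, 40, 30, 30]
night_loads = [10, 10, 5, 5, 5, 5]

day_hosts = hosts_needed(day_loads, 100)
night_hosts = hosts_needed(night_loads, 100)
print(day_hosts, night_hosts, day_hosts - night_hosts)
```

With these made-up numbers the daytime load needs three hosts and the overnight load fits on one, so two hosts are candidates for a lower power state after hours.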

If your workers are primarily the 9-5 type, then it should be possible to set some basic scripts up to help you in your quest. Remember that an operating system which “suspends” itself while virtualised actually tells the virtualisation software to suspend it. Suspended VMs have their RAM written to a file on disk, ready to be woken up later. (In this way, “suspending” a virtualised operating system actually becomes a form of “hibernating” it.) If you set your VDI guests up such that their power management will suspend the operating systems after a given period of inactivity after hours, then in theory there will come a point where all those user VMs are suspended.
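The decision logic behind such scripts might look something like the Python sketch below. It deliberately talks to no hypervisor: how you read idle time, suspend a guest or query VM state depends entirely on your platform, so those pieces are left out. The 9-5 window, the 30-minute idle limit, the state strings and the host and VM names are all assumptions made for illustration, and weekends are ignored for brevity.

```python
from datetime import datetime, time


def after_hours(now, start=time(9), end=time(17)):
    """True outside the assumed 09:00-17:00 working window."""
    return not (start <= now.time() < end)


def should_suspend(idle_minutes, now, idle_limit=30):
    """Suspend a guest only after hours and only once it has sat idle."""
    return after_hours(now) and idle_minutes >= idle_limit


def hosts_safe_to_power_down(vm_states):
    """vm_states: {host: {vm: state}}. Hosts whose VMs are all suspended."""
    return [
        host
        for host, vms in vm_states.items()
        if vms and all(state == "suspended" for state in vms.values())
    ]


# Invented inventory: one host fully suspended, one still in use.
inventory = {
    "host-a": {"vdi-1": "suspended", "vdi-2": "suspended"},
    "host-b": {"vdi-3": "running", "vdi-4": "suspended"},
}
candidates = hosts_safe_to_power_down(inventory)
print(candidates)
```

A host only qualifies once every guest on it is suspended, which is why placing the 9-5 desktops together matters: a single always-on VM on the same host keeps it awake all night.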