Everything You Always Wanted to Know about XenMotion

I met with a group of consultants and salespeople from a large systems integrator a few weeks ago to discuss server virtualization with Citrix XenServer. During the discussion, someone asked for a demonstration of XenMotion. Unfortunately, I did not have the necessary hardware (two XenServer hosts) with me at the time to demonstrate live migration of virtual machines with XenMotion. After that meeting, it occurred to me that the best way to have a XenMotion demo available at any time is to post one on the Official Citrix Blog.

XenMotion is a feature of Citrix XenServer Enterprise that gives an administrator the ability to move a running virtual machine from one XenServer host to another. Virtual machines can be moved from server to server without service interruption, which allows zero-downtime server maintenance. Administrators can also move running application workloads to take advantage of available compute power.

XenMotion is a very popular feature of Citrix XenServer Enterprise that works in conjunction with Resource Pools. You can live migrate a virtual machine to any other XenServer host running in the same resource pool. For more on Resource Pools, see this document in the KnowledgeBase.

One of our sales engineers, Adam Lotz, recorded a video of XenMotion in his lab. Adam sent me the video, and I uploaded it to YouTube. In this video, Adam has two XenServer hosts running in his lab with multiple VMs on each.

In the first 30 seconds of the video, Adam live migrates a running CPS 4.5 virtual machine from one XenServer host to the other. The low resolution offered by YouTube admittedly makes it difficult to see, but you can watch the highlighted virtual machine pop over from one host to the other in a matter of seconds.

At about the 1:07 mark, Adam brings up a command window and starts to ping citrix.com. While the virtual machine is running, the ping time is consistently 1 ms. At about the 1:31 mark, Adam clicks and drags the running CPS 4.5 virtual machine from one host to the other to start the XenMotion. He clicks Yes when the Confirm dialog box pops up at 1:36. The ping time then increases to 11 ms for two pings, drops to 9 ms, and rises to 10 ms. The virtual machine then pops over to the other XenServer host, and the live migration is complete by about 1:45, nine seconds after he confirmed the dialog box. Next, Adam expands the command window and highlights the ping times to show the impact of the XenMotion. The ping time jumped to 49 ms while the live migration actually occurred, then immediately dropped back down to 1 ms. No packets were lost at all.

(Since Adam narrates the video live as he shows it, he built in some talk time in places, so there are stretches of 10-20 seconds where nothing happens on the screen.)

I plan to upload the video as an attachment so you can see the whole video more clearly at a higher resolution. There is currently an issue with adding attachments, but once that is resolved I will upload the video file.

If you would like to see another video demonstration of XenMotion, Doug Brown and Chas Setchell of 2Virtualize include XenMotion in the extensive technical overview video on Doug’s site. You can see the entire video “Citrix XenServer Enterprise v4 Technical Overview Video – DABCC-TV – 12-3-07 – Episode #2” at this link.

Summary
This FAQ article includes questions related to XenMotion using XenServer 4.0.1. Refer to CTX115716 – Citrix XenServer 4.0.1H FAQ for the full set of FAQs.

Q: What is XenMotion?

A: XenMotion is a feature that allows you to move a running virtual machine (VM) from one physical XenServer Enterprise server to another without any downtime.

Q: Which of your products support XenMotion?

A: XenMotion is only provided in our XenServer Enterprise product.

Q: What are the requirements to enable XenMotion?

A: You need at least two XenServer Enterprise servers running in a resource pool.

The XenServer Enterprise servers must have similar processor configurations, some type of remote shared storage such as iSCSI or Network File System (NFS), and a gigabit network connecting them.

Q: How similar do the processors need to be on the XenServer Enterprise servers?

A: To use XenMotion, the processors must be the same type, but can have slight differences (such as CPU speed). So, for example, all the systems would need to have Intel Xeon 51xx series processors. They could be different speeds, so you can mix systems with Xeon 5130 and Xeon 5140 processors. The same is true of AMD processors.

Q: Can you XenMotion a VM between an Intel and AMD system?

A: No. You can only XenMotion a VM between systems with the same processor manufacturer and type.

Q: Does XenMotion require you to have the same exact configurations for your server systems?

A: While you do need to have the same type of processor in each system, other configurations can differ. You can have different amounts of memory, different storage controllers, and different network controllers in each system.

Q: What type of storage does a VM need to be stored on to enable XenMotion?

A: A VM must be stored on remote shared storage to allow for XenMotion.

Examples include NFS-based storage and iSCSI-based storage (accessed through a software iSCSI initiator).

Q: What networking speed is required for XenMotion?

A: We recommend that you use Gigabit Ethernet between your physical servers.
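
The requirements in the answers above can be collected into a simple pre-flight check. What follows is only a minimal sketch in Python, not a Citrix tool or API: the Host record, its field names, and the sample host and storage names are all hypothetical, and the rules encode only what this FAQ states (same resource pool, same processor manufacturer and type, VM on storage shared by both hosts, Gigabit Ethernet recommended).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical description of a XenServer host (not a Citrix API object)."""
    name: str
    pool: str
    cpu_vendor: str            # e.g. "Intel" or "AMD"
    cpu_family: str            # e.g. "Xeon 51xx"; must match between hosts
    cpu_model: str             # e.g. "Xeon 5130"; speed differences are allowed
    nic_mbps: int              # link speed between the hosts
    shared_storage: frozenset  # names of shared storage visible to this host


def xenmotion_blockers(source, dest, vm_storage):
    """Return the FAQ requirements that this source/destination pair fails."""
    problems = []
    if source.pool != dest.pool:
        problems.append("hosts are not in the same resource pool")
    if source.cpu_vendor != dest.cpu_vendor:
        problems.append("different CPU manufacturers")
    elif source.cpu_family != dest.cpu_family:
        problems.append("different CPU types")
    if vm_storage not in (source.shared_storage & dest.shared_storage):
        problems.append("VM is not on storage shared by both hosts")
    return problems


def xenmotion_warnings(source, dest):
    """Return recommendations (not hard requirements) that are not met."""
    if min(source.nic_mbps, dest.nic_mbps) < 1000:
        return ["network is slower than the recommended Gigabit Ethernet"]
    return []


if __name__ == "__main__":
    sr = frozenset({"nfs-sr"})
    xs1 = Host("xs1", "lab-pool", "Intel", "Xeon 51xx", "Xeon 5130", 1000, sr)
    xs2 = Host("xs2", "lab-pool", "Intel", "Xeon 51xx", "Xeon 5140", 1000, sr)
    print(xenmotion_blockers(xs1, xs2, "nfs-sr"))
```

Memory size, storage controllers, and network controllers are deliberately left out of the check, since the FAQ says those may differ between hosts.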

Q: How much downtime occurs during a XenMotion?

A: The actual downtime during a XenMotion is generally 100-150 ms. This downtime is so slight that services running in the VM are not interrupted. Most of the 100-150 ms downtime is caused by your network switching equipment moving traffic to a new port.

This document applies to:

XenServer 4.0

XenExpress 4.0

XenEnterprise 4.0

For those who want to dive deep into the theory and thought process behind live migration in Xen, I also found a link to the original white paper on live migration of virtual machines on Xen, written by a number of the developers of the Xen open source hypervisor project, including Ian Pratt (founder of the Xen Project). You can read the entire document at this link.

Below is a portion of the text from the introduction:

Migrating operating system instances across distinct physical hosts is a useful tool for administrators of data centers and clusters: It allows a clean separation between hardware and software, and facilitates fault management, load balancing, and low-level system maintenance.

By carrying out the majority of migration while OSes continue to run, we achieve impressive performance with minimal service downtimes; we demonstrate the migration of entire OS instances on a commodity cluster, recording service downtimes as low as 60ms. We show that our performance is sufficient to make live migration a practical tool even for servers running interactive loads.

In this paper we consider the design options for migrating OSes running services with liveness constraints, focusing on data center and cluster environments. We introduce and analyze the concept of writable working set, and present the design, implementation and evaluation of high performance OS migration built on top of the Xen VMM.

A very interesting portion of this paper is the section covering a set of phases for live migrating the work sets in memory.

Moving the contents of a VM’s memory from one physical host to another can be approached in any number of ways. However, when a VM is running a live service it is important that this transfer occurs in a manner that balances the requirements of minimizing both downtime and total migration time. The former is the period during which the service is unavailable due to there being no currently executing instance of the VM; this period will be directly visible to clients of the VM as service interruption. The latter is the duration between when migration is initiated and when the original VM may be finally discarded and, hence, the source host may potentially be taken down for maintenance, upgrade or repair. It is easiest to consider the trade-offs between these requirements by generalizing memory transfer into three phases:

Push phase: The source VM continues running while certain pages are pushed across the network to the new destination. To ensure consistency, pages modified during this process must be re-sent.

Stop-and-copy phase: The source VM is stopped, pages are copied across to the destination VM, then the new VM is started.

Pull phase: The new VM executes and, if it accesses a page that has not yet been copied, this page is faulted in (“pulled”) across the network from the source VM.
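
To get a feel for the trade-off between downtime and total migration time, here is a toy Python model that combines a push phase (repeated rounds that re-send dirtied pages) with a short final stop-and-copy phase. It is a sketch under stated assumptions, not the Xen implementation: the function name, parameters, and every number in the example are made up for illustration, the pull phase is not modeled, and the cap on dirtied pages is only a rough stand-in for the writable working set the paper describes.

```python
def simulate_migration(total_pages, link_pages_per_ms, dirty_pages_per_ms,
                       wws_pages, max_rounds=10, stop_threshold=100):
    """Toy model of a push phase followed by stop-and-copy.

    All parameters are illustrative assumptions:
      total_pages        - memory pages that make up the VM
      link_pages_per_ms  - pages the network can transfer per millisecond
      dirty_pages_per_ms - pages the running VM modifies per millisecond
      wws_pages          - cap on distinct pages dirtied in any one round
      max_rounds         - limit on push rounds (0 means pure stop-and-copy)
      stop_threshold     - stop pushing once this few pages remain
    """
    to_send = total_pages
    push_ms = 0.0
    rounds = 0
    # Push phase: the VM keeps running, so pages dirtied during a round
    # have to be sent again in the next round.
    while to_send > stop_threshold and rounds < max_rounds:
        round_ms = to_send / link_pages_per_ms
        push_ms += round_ms
        rounds += 1
        to_send = min(wws_pages, int(dirty_pages_per_ms * round_ms))
    # Stop-and-copy phase: the VM is paused while the remainder is copied,
    # so this transfer time is the downtime clients would notice.
    downtime_ms = to_send / link_pages_per_ms
    return {"rounds": rounds,
            "downtime_ms": downtime_ms,
            "total_ms": push_ms + downtime_ms}


if __name__ == "__main__":
    live = simulate_migration(100_000, 25, 5, 10_000)
    cold = simulate_migration(100_000, 25, 5, 10_000, max_rounds=0)
    print("push + stop-and-copy:", live)
    print("stop-and-copy only:  ", cold)
```

With these made-up numbers, pushing first takes slightly longer overall than a pure stop-and-copy but shrinks the pause from seconds to a few milliseconds, which is the balance between downtime and total migration time that the excerpt describes.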

(This document was written in 2005, so many things may have changed since then. The paper is specific to the Xen open source project itself.)

This powerful XenMotion feature for live migration is one of the most popular capabilities of Citrix XenServer 4.0. If you have any specific questions about XenMotion, please post them in the comments. As I find more resources (especially visuals and demos) I will post them on the blog.