To live-migrate an application or a container, make sure that the files the migrated processes access, or may access, are available on both nodes, source and destination. This can be achieved either with a shared file system such as NFS, GlusterFS or Ceph, or by using rsync to copy the files from one box to the other. The rest of this article assumes that the file system is the same on both sides.
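
As a minimal sketch of the rsync option, the Python snippet below builds an rsync command line that mirrors a directory to the destination node. The helper name `build_rsync_command`, the path and the host name are hypothetical and only serve the illustration; `-a` (archive mode) and `--delete` are standard rsync options. The snippet only constructs the argument list, so running it (for example with `subprocess.run`) is left to the caller.

```python
def build_rsync_command(src_dir, dest_host, dest_dir):
    """Return an rsync argument list that mirrors src_dir to dest_host:dest_dir."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    src = src_dir.rstrip("/") + "/"
    dest = f"{dest_host}:{dest_dir.rstrip('/')}/"
    # -a preserves permissions, ownership, timestamps and symlinks;
    # --delete removes files on the destination that no longer exist on the source.
    return ["rsync", "-a", "--delete", src, dest]


# Hypothetical path and host name, for illustration only.
cmd = build_rsync_command("/srv/app", "dst-node", "/srv/app")
print(" ".join(cmd))
```

Using the same path on both sides keeps the file-system layout identical, which is what the rest of the article assumes.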

Another thing to take care of is networking. The general rule is that the IP addresses your application uses must be available on the destination host. The reason is that when restoring TCP sockets, CRIU tries to bind() and connect() them back using their original addresses and ports, and if the requested IP address is not available on the destination side, the respective system call fails. In addition, CRIU locks the connections during migration, and there are two options here.
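
One way to check the address requirement in advance is to try binding a socket to the address on the destination host. The sketch below does that in Python; it is not something CRIU itself provides, the helper name `ip_is_available` is hypothetical, and it assumes that a successful bind() is a good enough proxy for "the address is configured here". On hosts where the `net.ipv4.ip_nonlocal_bind` sysctl is enabled, bind() succeeds even for addresses that are not configured, so the check is not meaningful there.

```python
import socket


def ip_is_available(ip):
    """Return True if a TCP socket can be bound to `ip` on this host."""
    # Pick the address family from the textual form of the address.
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        try:
            # Port 0 lets the kernel pick any free port; only the address matters.
            sock.bind((ip, 0))
        except OSError:
            # Typically EADDRNOTAVAIL when the address is not configured locally.
            return False
    return True


# Run this on the destination host with the addresses the application uses.
print(ip_is_available("127.0.0.1"))
```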

The first is when your app shares the network with the host. In this case CRIU locks connections using iptables rules, so make sure the rules are available on the destination side. The second is when the app lives in a net namespace (a container). In this case CRIU calls action scripts to lock the network, and how to lock it is up to you. With Docker, the daemon handles this through the libnetwork library.

The image directories would contain two copies of the application's memory, which may take up a lot of space. CRIU can perform disk-less migration to address this.

Another issue with this way of doing live migration is that the tasks remain frozen while their memory is copied to the remote host. If there is a lot of memory, the freeze time can be long. CRIU can shorten it by doing iterative migration.

If you're live-migrating a shell job, remember that the --shell-job option must be used at both stages, dump and restore. See more details about shell jobs here.
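
To illustrate the point about passing the option at both stages, here is a small Python sketch that builds matching `criu dump` and `criu restore` command lines. The helper name `build_criu_commands`, the PID and the images directory are hypothetical; `-t` (process tree root), `-D` (images directory) and `--shell-job` are existing criu options. The commands are only constructed here, not executed.

```python
def build_criu_commands(pid, images_dir, shell_job=True):
    """Return (dump, restore) argument lists that agree on --shell-job."""
    dump = ["criu", "dump", "-t", str(pid), "-D", images_dir]
    restore = ["criu", "restore", "-D", images_dir]
    if shell_job:
        # The option has to be present on both sides, so add it in one place.
        dump.append("--shell-job")
        restore.append("--shell-job")
    return dump, restore


# Hypothetical PID and directory, for illustration only.
dump_cmd, restore_cmd = build_criu_commands(1234, "/tmp/criu-images")
print(" ".join(dump_cmd))
print(" ".join(restore_cmd))
```

Building both command lines from one function is a simple way to avoid passing the option on only one of the two hosts.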