
Emulab Installation Documentation

Importing and Customizing an Image

Importing an OS Image from Utah's Emulab

Introduction

An easy way to get new images into your Emulab is to import them from Utah. We often make images available at http://www.emulab.net/downloads/images-STD and this document describes how to take an image from there and make it yours!

The basic idea is that you take one of Utah's exported images, load it onto one of your nodes, perform any necessary site-specific changes (such as adding missing drivers), then save a local copy. In fact, if you do not need to perform any site-specific changes (the image boots just fine) then you do not even need to save a new copy; you can just copy the image file from Utah into place. We call this method "express import" because, well, everything needs a name.

An older and more involved method is also described later in this document. Hopefully you will not have to use this method, but if you run into problems (for example, the image does not boot because of a missing driver), or if your testbed has not been updated to at least the stable-20120409 tag of the emulab-stable repository, you should use the old method. We call this the "slow import" method.

Whichever method you use, please read through this document before starting.

In addition, before starting here, you should have already completed the instructions on the MFS Import page. You cannot proceed until you have working MFSs (ones that boot on your particular nodes).

Image Types

When we talk about images, we are referring to images created with Emulab's Frisbee disk imaging system. By (our) convention, these files end in .ndz. There are two basic types of images: whole-disk and partition.

Whole-disk images include an x86 DOS-style Master Boot Record (MBR) which includes a partition table describing one or more partitions on the disk. Each partition has a type, offset and size. The type is used by the imaging system to determine what OS is in the partition.

Partition images include the contents of a single DOS partition. The important thing to understand here is that a partition image cannot be properly restored to a disk unless the disk already has a valid MBR which defines the partitions.

There is also a third, now deprecated, format called the Combo image. This is a whole disk image that contains multiple OS installs (typically FreeBSD and Linux) in multiple partitions. These were traditionally used as the default image on nodes so that free nodes would come ready to boot with either FreeBSD or Linux. For the most part, combo images just added a layer of confusion (in the form of "image IDs" vs. "OS IDs") to disk imaging in Emulab. In place of combo images, you can now specify multiple images to load on a node. In other words, don't worry about combo images if you are setting up a new testbed.

MBR versions

We currently support three MBR formats for partition images. The first two define three partitions (with an "empty" fourth partition). The original "V1" format had two 3GB partitions and one 128MB partition. The intent was that the first two could be used for OSes (the notorious combo image with FreeBSD and Linux) and the third was a swap partition, and everything would fit on a 9GB disk with room left for a user-definable fourth partition.

The newer, "V2" image recognizes that it is no longer 1998 and that disks (and OSes) have gotten larger. It has two 6GB partitions and one 1GB partition all to fit on a 13GB disk. Yes, that is still pathetically small, but we wanted this format to work with even our oldest machines (pc600s).

The newest, "V3" image recognizes that it is no longer possible to build a kernel on a 6GB partition. It has one 18GB partition and one 3GB partition for swap.

Why does the version matter? Two reasons. First, if you load a V2 image on a disk with a V1 MBR, it won't fit. While the end result might boot, you will be missing the back half of your filesystem. Second, if you load a V1 image on a V2 MBR, it will fit but might not boot because the disk offset of the partition may be different (for partitions two and higher).
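The "does it fit" half of that reasoning can be sketched as a small shell check. This is only an illustration built from the partition sizes quoted above. The function names part1_mb and image_fits are made up for this example and are not part of Emulab.

```shell
# Size in MB of the first partition for each MBR version, taken from the
# descriptions above (V1: 3GB, V2: 6GB, V3: 18GB).
part1_mb() {
    case "$1" in
        1) echo 3072 ;;
        2) echo 6144 ;;
        3) echo 18432 ;;
        *) echo "unknown MBR version: $1" >&2; return 1 ;;
    esac
}

# image_fits IMAGE_VERSION MBR_VERSION
# Succeeds if partition 1 of IMAGE_VERSION is no larger than partition 1
# of MBR_VERSION. Returns 2 for an unknown version.
image_fits() {
    img_mb=$(part1_mb "$1") || return 2
    mbr_mb=$(part1_mb "$2") || return 2
    [ "$img_mb" -le "$mbr_mb" ]
}

if image_fits 2 1; then
    echo "V2 image fits V1 MBR"
else
    echo "V2 image does not fit V1 MBR"
fi
```

Note that this only covers size. As described above, an image that fits may still fail to boot if the partition offset differs.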

At any rate, the V1, V2, and V3 MBRs are all available from the downloads directory, if you need them. Typically you won't if you are using the "express" version of the import instructions.

Images in the database

The Emulab database has a record for each image in the system. In addition to the name of the image and its unique ID, it also contains info about whether it is a whole-disk image or not, what MBR should be used, and what OS is in each partition.

Normally, an image is created in Emulab through the web interface by customizing an existing image. We also support an XML-format description of images that can be loaded via a command line tool (load-descriptors). When we make an image available for download, we also include an .xml description file.

Getting Started

Whichever method you use, the initial steps are the same, so let's get those out of the way.

Step One: Get an Image

Download the desired image and the corresponding XML description from http://www.emulab.net/downloads/images-STD. We have an assortment of FreeBSD and Linux (Fedora, Ubuntu, CentOS) images available for both 32- and 64-bit architectures. We cannot provide Windows images due to licensing.

Now place the .ndz file in /usr/testbed/images (be sure to back up any existing image of the same name):

boss> sudo cp $IMAGE.ndz /usr/testbed/images
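One way to make the backup part of that step hard to forget is to wrap the copy in a small helper. The following is a minimal sketch, not an Emulab tool: the function name install_image and the .bak suffix are arbitrary choices for illustration.

```shell
# install_image SRC DESTDIR
# Copy SRC into DESTDIR, first saving any existing file of the same name
# as NAME.bak (the .bak suffix is just an illustrative choice).
install_image() {
    src=$1
    destdir=$2
    dest="$destdir/$(basename "$src")"
    if [ -e "$dest" ]; then
        # Keep the old image, preserving its timestamps and mode.
        cp -p "$dest" "$dest.bak" || return 1
    fi
    cp "$src" "$dest"
}
```

On boss this would have to run from a shell with permission to write to /usr/testbed/images, since sudo cannot call a shell function directly.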

Step Two: Import the Descriptor

Create the database descriptor using the XML file you downloaded in the previous step.

boss> wap /usr/testbed/sbin/load-descriptors $IMAGE.xml

If this complains that the image already exists, don't worry. There is no reason to create it twice.

If it complains that no node types were supplied, then rerun it with the force option:

boss> wap /usr/testbed/sbin/load-descriptors -f $IMAGE.xml

You will set the node types for this image later on.

Step Three: Allocate a Node

You need a node to work with. If you are adding an image to an already running Emulab (i.e., this is not your first image import), then just create a one-node experiment in the emulab-ops project. It does not matter what OS you tell it to boot in the NS file. Wait for the experiment to swap in and then skip the next paragraph.

If, on the other hand, you are in the process of installing your Emulab and this is your first image, you should have nodes sitting in the hwdown experiment since you have just completed adding nodes to your Emulab. The easiest approach is to force the node to reload the image. To make things go a little smoother though, this command ensures that the DNS entries are in place:

boss> named_setup

If you are using the Express method then continue with the next section, otherwise skip down to the Slow section.

Express Method

Force the node to load the image:

boss> wap os_load -i $IMAGE pcXXX

where pcXXX is the name of one of the nodes sitting in hwdown.

Now is the time to be watching the console of your node! If all goes well, it will boot up just fine. If it does, you are done! There is no need to take a snapshot since each time the image is loaded onto a node, the Frisbee loader will localize the image. There is more info on the MFS Import page regarding what is localized.

Of course if you really want to take a snapshot, then by all means do so, as described in the Taking an Image Snapshot section, but first update the image descriptor as described in the next section.

Updating the Node Types

If this is the first time you have imported this particular image, or if you have added new node types to your Emulab and confirmed that the image boots on them okay, you need to edit the image descriptor to update the list of node types that can boot this image. This is done via the web interface, in red dot mode. Log into the web interface and go red dot. Then in the Experimentation drop down menu, click on List ImageIDs. Then find the image descriptor; it will typically be named the same as the .ndz file you downloaded. Click on it.

In the upper left is another menu; click on Edit Image Descriptor.

Now click on the types you want to add or subtract. You have to have at least one node type selected. Then click on submit.

Taking an Image Snapshot

To take a snapshot of an image from your node, go back to the image descriptor in the web interface (described in the previous section) and choose the Snapshot Node Disk to Image option. On the next page you will be prompted for the node. Fill this in and confirm.

This operation will take a while and is another good time to be watching the console. The node will reboot into the Frisbee MFS and you should see the login prompt. The image capture process will start in a minute or two and run in the background, writing the image to /proj/emulab-ops/images. You won't see anything on the console until the snapshot is complete, but you should see the .ndz file slowly growing. At some point the node will reboot.

The snapshot process is officially complete when you receive the email message from Emulab that includes the log. Hopefully the message says it completed okay! Note that the email message is sent to the user you are logged in as, so if you are doing this as elabman be sure to watch for that email.

Testing the Image

The easiest way to test the image is to create a new single node experiment. For example:
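A minimal sketch of such an NS file, written out from the shell, is shown here. The file name test-image.ns and the node name node0 are arbitrary choices. The sketch assumes the usual Emulab NS extensions tb-set-node-os and tb-set-hardware for selecting the OS image and the node type.

```shell
# Write a single-node NS file. FEDORA10-STD and pc2400 are placeholders
# for your own image ID and node type.
cat > test-image.ns <<'EOF'
set ns [new Simulator]
source tb_compat.tcl

set node0 [$ns node]
tb-set-node-os $node0 FEDORA10-STD
tb-set-hardware $node0 pc2400

$ns run
EOF
```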

But of course replace FEDORA10-STD and pc2400 with the appropriate image ID and node type. Swap in this new experiment and if all goes well, you are officially done and the image is ready for use.

Slow Method

First off, make sure you have read and completed the three steps in the Getting Started section of this document.

Load the New Image

Force the node into the admin MFS using the node_admin command:

boss> node_admin on pcXXX

Once the node is in the admin MFS, you should be able to ssh as root from your boss machine.

Log in to your "boss" machine. You will use ssh from the boss machine to do the rest.

For the benefit of cutting and pasting the following commands, we will set some shell variables for values that might change. And to avoid confusion of sh vs bash vs csh vs whatever, we will just use bash, so once you are logged in to boss, start by running bash and setting some environment variables:

where <nodename> is whatever node you have allocated above, and <image> is the basename of whatever image you downloaded from Utah. For example, if you downloaded FBSD82-STD.ndz and FBSD82-STD.xml, then set IMAGE=FBSD82-STD.
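As a sketch, the variable settings described above might look like the following in bash, using the same "set, then export" form as the DSK example later in this section. The node name pc1 is purely hypothetical. FBSD82-STD is the example image name from the text.

```shell
# Run these inside bash on boss. pc1 is a hypothetical node name;
# substitute the node you actually allocated.
NODE=pc1; export NODE
IMAGE=FBSD82-STD; export IMAGE

# Print the values back as a quick visual check.
echo "node=$NODE image=$IMAGE"
```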

Figure out what the node root disk is. You can look through the boot time output of FreeBSD on the node to find it. If you missed that output, you can ssh into the node and run dmesg:

sudo ssh $NODE dmesg

It will likely be either "da0" (SCSI, SAS or some HW RAID controllers), "ad0" (old IDE), or "ad4" (SATA). Other types of RAID controllers may have a variety of names like "ar", "aacd", "twed", etc., depending on the controller you have. If you cannot find anything in the output that looks like a disk, you may have an unsupported disk controller. Contact emulab-admins@googlegroups.com if this happens (and be sure to have your "dmesg" output handy!).
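If the dmesg output is long, a filter for the device name prefixes listed above can help narrow things down. The following is a rough sketch only. The function name find_disks is made up, the sample lines are hypothetical text in the style of FreeBSD boot output, and the prefix list covers only the names mentioned here.

```shell
# find_disks: read dmesg output on stdin and print the unique device names
# that match the disk prefixes mentioned above (da, ad, ar, aacd, twed).
find_disks() {
    grep -E -o '^(da|ad|ar|aacd|twed)[0-9]+' | sort -u
}

# Hypothetical sample lines in the style of FreeBSD boot output.
printf '%s\n' \
    'ad4: 238475MB <WDC WD2500JS> at ata2-master SATA150' \
    'em0: Ethernet address: 00:11:22:33:44:55' \
    'da0 at mpt0 bus 0 target 0 lun 0' \
    'da0: 70007MB (143374744 512 byte sectors)' | find_disks
```

Against a real node, the input would come from the dmesg command shown above piped into find_disks on boss. A match is only a hint, so check the surrounding dmesg lines before trusting it.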

Assuming the root disk was found, set a shell variable for it to use in the following steps:

DSK=<your-disk-here>; export DSK # e.g., "da0"

Make really, really sure that your NODE variable is set correctly. This would be a truly excellent time to make sure that $NODE is set correctly. If it were somehow incorrectly set and you wound up sshing into boss or ops or your desktop or laptop or ..., you could wipe out the hard drive! So:

echo $NODE

Make sure the echoed value is not null, "localhost", "boss", or any machine that is important to you. It should be the name of the node you chose. To make really sure:

sudo ssh $NODE hostname

and again make sure the value is not "boss"!
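The same sanity checks can be scripted so they are harder to skip. This is a minimal sketch under the assumption that your critical machines are named boss and ops. The function name check_node is hypothetical, and you should extend the list with any other hostnames that matter at your site.

```shell
# check_node NAME
# Refuse obviously dangerous targets before running anything destructive.
# The names listed here are assumptions; add your own important hosts.
check_node() {
    case "$1" in
        ""|localhost|boss|boss.*|ops|ops.*)
            echo "refusing to use node '$1'" >&2
            return 1 ;;
    esac
    return 0
}
```

Calling check_node "$NODE" before each ssh command, and continuing only if it succeeds, gives a cheap guard against an unset or mistyped variable. It does not replace checking the hostname reported by the node itself.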

Install the appropriate MBR on the node. First, determine if you need an MBR or not:

grep wholedisk $IMAGE.xml

If an attribute line is returned and the value is "1" then no MBR is needed. Otherwise, you will need to make sure you have the correct version of the MBR handy.
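That decision can be sketched as a shell test. The exact layout of the descriptor is an assumption here: the sketch assumes the wholedisk attribute and its value sit on a single line, with the value wrapped in a value element as in the sample comment. Check your own .xml file before relying on it. The function name needs_mbr is made up for this example.

```shell
# needs_mbr XMLFILE
# Succeeds (exit 0) if the descriptor does NOT mark the image as whole-disk,
# meaning a matching MBR must already be on the disk.
# Assumed line format (verify against your own descriptor):
#   <attribute name='wholedisk'><value>1</value></attribute>
needs_mbr() {
    if grep wholedisk "$1" | grep -q '>1<'; then
        return 1    # whole-disk image: it carries its own MBR
    fi
    return 0        # partition image, or attribute missing
}
```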

Image loading should take anywhere from 45 seconds to several minutes.

NOTE: If the ssh returns with "Killed" then imageunzip ran out of memory. By default, imageunzip will consume memory without bound for buffering of pending disk writes. If imageunzip grows too big, the system will kill it. In this case, retry the imageunzip with "-W <num-MB>" where <num-MB> is the maximum number of MB to use for disk buffering. Using about half of the available physical memory should be safe (e.g., if the machine you are loading has 2GB of memory, try "-W 1024").
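The "about half of physical memory" rule of thumb is simple arithmetic. As a small sketch, with the hypothetical helper name buffer_limit_mb:

```shell
# buffer_limit_mb TOTAL_MB
# Print half of the given physical memory size in MB, for use as the
# argument to imageunzip -W.
buffer_limit_mb() {
    echo $(( $1 / 2 ))
}

# A machine with 2GB (2048MB) of memory:
echo "-W $(buffer_limit_mb 2048)"
```

For the 2GB example this prints "-W 1024", matching the value suggested above.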

NOTE: if you get back:

WARNING: requested zeroing in slice mode, will NOT zero outside of slice!

don't worry, it is a bogus warning from an old build of imageunzip and is harmless. You should probably update to a newer MFS however.

Customize the New Image

Customizing the image consists of two parts:

Fixing things that cause the image not to boot. This could mean installing a new kernel, changing device driver loads, etc.

Localizing various files that are specific to your testbed, such as the timezone, ssh host keys, root ssh keys, etc.

If your testbed has been updated to at least the stable-20120409 tag of the emulab-stable repository, you no longer need to do the second step (localization) since the MFSs will do this automatically, so skip that step.

Customization is done from the MFS by mounting the disk filesystems. The following assumes the FreeBSD-based admin MFS. If you have the Linux MFS, tweak accordingly.

First, log in as root from boss and set the magic DSK and PART variables as you did on boss (note the csh syntax, as the root login shell on BSD is csh):

FreeBSD

Linux MFS users: Linux maps partition numbers onto BSD partitions in a somewhat less-than-straightforward way, so manually mounting each BSD partition as described above can be a hassle. Instead, you can use the mount_bsd_slice script to do the same thing more easily:

/etc/testbed/mount_bsd_slice /dev/${DSK}${PART} /mnt

The MFS has a much scaled-down set of binaries. To get access to a more full-featured system, you can run binaries from the disk image itself (FreeBSD MFS only):

/mnt/boot/kernel: Images used at Utah run a custom FreeBSD kernel with fewer drivers built in. Unless you have the same hardware we do, you might run into some problems. To see if you have a Utah image, do:

ls -d /mnt/boot/kernel.GENERIC

If that directory exists, then you do have a Utah image and you need to install that generic kernel as the standard:

/mnt/etc/localtime: Copy the correct file over from /mnt/usr/share/zoneinfo. For example:

cp -p /mnt/usr/share/zoneinfo/MST7MDT /mnt/etc/localtime

/mnt/etc/master.passwd: Set the root password. This is again an optional step as per-experiment root passwords are set on every node at swapin time. But if you do choose to set one now, make sure you copy the changed version into the etc/emulab subdirectory as well:

/mnt/etc/ssh/ssh_host*: As mentioned in the introduction, we use the same host key for all images. If you want to do that, and if you correctly localized your MFSes, then you have already generated a set of site-specific host keys, and you can copy them to the disk with:

cp -p /etc/ssh/ssh_host* /mnt/etc/ssh/

and then skip to the next bullet item. If you did not generate host keys for your MFSes, you can generate keys now with:

This installs them in the disk image, but you will still have to go back and install these same keys in the sources for your frisbee/freebsd MFSes later using the MFS customization instructions. So save the keys from /mnt/etc/ssh off somewhere (not in the running MFS!).

/mnt/etc/emulab/{client,emulab}.pem: These should have been created on your boss node when you did the boss setup. So return to your boss node and copy them over (you cannot "pull" the files over because boss does not trust the node):

The last step updates some files, in particular /etc/fstab, with the current disk type. Now you can skip to testing.

Linux

Mount the Linux filesystem:

mount -t ext2fs /dev/${DSK}s${PART} /mnt

If you are using the Linux MFS, run this command instead:

mount -t ext3 /dev/${DSK}${PART} /mnt

Now you can update the necessary files as follows.

/mnt/etc/localtime: Copy the correct file over from /mnt/usr/share/zoneinfo. For example:

cp -p /mnt/usr/share/zoneinfo/MST7MDT /mnt/etc/localtime

/mnt/etc/shadow: Set the root password. This is again an optional step as per-experiment root passwords are set on every node at swapin time. There is no easy way to do this for a Linux image from the FreeBSD MFS, so you might want to copy the file up to boss and fix up the password file there. From boss:

You will have to find the appropriate password hash in one of your other images. You can directly use the hash from a FreeBSD /etc/master.passwd file (but don't use the hash from your boss machine!).

/mnt/etc/ssh/ssh_host*: As mentioned in the introduction, we use the same host key for all images. If you want to do that, and if you correctly localized your MFSes, then you have already generated a set of site-specific host keys, and you can copy them to the disk with:

cp -p /etc/ssh/ssh_host* /mnt/etc/ssh/

and then skip to the next bullet item. If you did not generate host keys for your MFSes, you will have to get them from another image and copy them in.

/mnt/etc/emulab/{client,emulab}.pem: These should have been created on your boss node when you did the boss setup. So return to your boss node and copy them over (you cannot "pull" the files over because boss does not trust the node):