
Setting up Shared Node support

Version Info

In order to use Emulab's shared node support, you must be running from a recent version of emulab-stable (at least tag stable-20100914) or emulab-devel. Please see our Git Repository page for more information on how to get an Emulab Git account, as well as the instructions on how to update your testbed.

Supported Images

The only image currently supported is Fedora15 with OpenVZ container-based virtualization for Linux. This image is available from Utah's download directory. Instructions on how to import the FEDORA15-OPENVZ-STD image can be found on the Image Import page.

Turning on Shared Node Support

Okay, so it isn't a toggle, but more like a series of steps you have to perform to get things ready.

Edit the FEDORA15-OPENVZ-STD image descriptor in the web interface, and mark the set of node types that the image will run on.

Edit the following Site Variables (in red-dot mode, look in the Administration drop down menu for "Edit Site Variables"). poolnodetype is the default node type that should be used for shared nodes, and should typically be the beefiest node type you have.

general/minpoolsize

general/maxpoolsize

general/poolnodetype
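For example, the settings might end up looking something like this (these values are purely illustrative; pick numbers that match your hardware, and a node type that actually exists on your testbed):

```
general/minpoolsize  = 1        # minimum number of physical hosts in the shared pool
general/maxpoolsize  = 5        # maximum number of physical hosts in the shared pool
general/poolnodetype = pc3000   # placeholder type; use your beefiest node type
```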

Go to the Edit Node Type web page for the type you selected for poolnodetype and make sure that virtnode_capacity is set to something reasonable. This is the maximum number of VMs that can be built on any single shared node. If you have really fast machines with lots of memory, set this to 15 to start with, and adjust it as necessary.

Update the node_auxtypes database table to reflect the virtnode_capacity you selected in the previous step (replace NN below with that value). On boss:

mysql> update node_auxtypes set count='NN' where type='pcvm';

Setting up your routing

The default control IP addresses for your VMs are in the 172.16/12 network. Typically, you would insert a route on your control switch hardware so that these packets can reach your boss and ops nodes.

If you cannot arrange this, then the easiest workaround is to add routes manually to boss and ops. This works if boss, ops, and your nodes' control net ifaces are all on one switch -- i.e., within the same broadcast domain -- since then the packets do not need to pass through the router. On boss, add a route to the 172.16/12 network directly out your control net iface, without specifying a gateway (something like route add -net 172.16.0.0/12 -interface <ctrlnetiface>). Do the same thing on your ops node. You can arrange for this to happen on boot by adding these routes to /etc/rc.conf.
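On a FreeBSD boss or ops node, the boot-time route could look like the following rc.conf fragment (em0 is a placeholder; substitute your actual control net iface):

```
# /etc/rc.conf fragment -- persistent route to the VM control network
static_routes="vmnet"
route_vmnet="-net 172.16.0.0/12 -interface em0"
```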

Swapping in the Shared Node holding experiment

The physical nodes that will make up the Shared Pool are held within a special experiment in the emulab-ops project. This experiment must be initially swapped in by a testbed administrator via the web interface. First, create an NS file using the following fragment, but be sure to set the value of the PoolNodeType variable to whatever you chose for the site variable above.

You can use as many nodes as you like, but for the initial setup, try going with just a couple of nodes.
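A minimal NS file along these lines might look like the sketch below. The node type "pc3000" is a placeholder for your poolnodetype value, and the exact sharing-mode incantation may vary with your Emulab version, so treat this as a starting point rather than a definitive file:

```tcl
source tb_compat.tcl
set ns [new Simulator]

# Must match the general/poolnodetype site variable ("pc3000" is a placeholder).
set PoolNodeType "pc3000"

# A couple of physical hosts to seed the pool; use unique vhostXX names.
set vhost1 [$ns node]
tb-set-hardware $vhost1 $PoolNodeType
tb-set-node-sharingmode $vhost1 "shared_local"

set vhost2 [$ns node]
tb-set-hardware $vhost2 $PoolNodeType
tb-set-node-sharingmode $vhost2 "shared_local"

$ns run
```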

Once you have your NS file ready, go to the web interface, and in red-dot, go to the shared-nodes experiment in the emulab-ops project. In the settings list you will notice a toggle named Locked Down. If the toggle is on, turn it off so that you will be allowed to modify the experiment. Then choose Modify Experiment from the menu, and insert your NS file into the form. Then click on Modify at the bottom of the form. If all goes well, the experiment will swap in okay. If it does, be sure to turn the Locked Down toggle back on to prevent accidentally swapping the experiment out.

If the experiment fails to swap in, it is time to ask for help: post a question on the emulab-admins help forum. Join the forum if necessary.

An experiment using shared nodes

The next step is to test that shared node support is working properly and that you can swap in an experiment that uses shared nodes. Use the following NS fragment for testing:
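A test fragment along the following lines should exercise the shared pool. This is a sketch under a few assumptions: two VMs (hardware type pcvm) connected by a shaped link so you can verify the delay, and the nonfatal failure action discussed below. Adjust names and link parameters to taste:

```tcl
source tb_compat.tcl
set ns [new Simulator]

# Two virtual nodes that will land on shared hosts.
set node1 [$ns node]
tb-set-hardware $node1 pcvm
tb-set-node-failure-action $node1 "nonfatal"

set node2 [$ns node]
tb-set-hardware $node2 pcvm
tb-set-node-failure-action $node2 "nonfatal"

# A shaped link; a simple ping test should show the configured delay.
set link0 [$ns duplex-link $node1 $node2 100Mb 10ms DropTail]

$ns rtproto Static
$ns run
```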

We set the failure action to nonfatal to make it easier to debug failed nodes; the experiment will continue to swap in even if the VMs fail to boot, which makes it possible to find out what the problem is. You can log into the physical hosts in the shared pool and look at the log files that are stored in /var/emulab/logs.

Using the Begin Experiment menu option, begin a new experiment using the NS file above. If you have any problems with the swap in, it is time to ask for help (see above).

If the experiment does swap in okay, then you should ssh to the nodes and verify that the link was set up and that it has the proper delay during a simple ping test.

Adjusting the size of the Pool

The size of the shared node pool is currently adjusted by using the Modify Experiment page. In red-dot mode, look for the shared-nodes experiment in the emulab-ops project. Click on the Modify Experiment link in the left hand menu. If you do not see that option, then look in the list of settings for a line labeled "Locked Down". You will want to click on Toggle to flip the setting. We do this to prevent accidental swap out of the experiment, lest you annoy a bunch of users when their containers all get zapped!

Okay, now click on Modify Experiment. Edit the NS script as needed, but be aware that while adding nodes is easy, removing nodes that have containers on them is likely to cause grief for people running experiments. To find out how the shared nodes are currently utilized, use the "Show Shared Node Pool" option on the "Administration" drop down menu.

Okay, to add a node, add another batch of these lines (be sure to use unique vhostXX names):
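Each batch might look something like this sketch (same caveats as the holding-experiment fragment above; PoolNodeType and the sharing-mode command are assumptions to adapt to your setup):

```tcl
# One more shared host; use the next unused vhostXX name.
set vhost3 [$ns node]
tb-set-hardware $vhost3 $PoolNodeType
tb-set-node-sharingmode $vhost3 "shared_local"
```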

If you have more than one node type or you want a specific node, be sure to add tb-set-hardware or tb-fix-node statements as needed. To delete a node, just remove its block of NS statements from the file (or comment them out). DO NOT RENAME any of the other nodes in the file though; that will cause you a lot of grief.

Once you have changed the NS code, click on the form's Modify button, and wait for the experiment to be modified. When that has finished, be sure to lock the experiment down again by clicking on the "Locked Down" toggle.

Note that we made an initial effort at automatically adjusting the size of the pool; see the pool_daemon in the tbsetup directory. However, this daemon is currently turned off since it is somewhat fragile.