Emulab FAQ: Setting Up a New Emulab: I've added new nodes to the testbed but I cannot ping them, what is wrong?

This question is too open-ended to answer directly, but here are some tools at your disposal for figuring out what is wrong.

You really, really want to have a console

First, there is a reason why we spend up to $50 a machine to put serial consoles on every node: the console is invaluable for debugging these problems. So, if you have a serial console, connect to it, reboot the machine, and watch what happens.

If you do not have a machine with a serial console, get one. Just run a serial cable from the machine to your boss or ops node, or to any machine that has an available serial connector. Use tip, minicom, hyperterm, or whatever you like to connect to the node at 115200 baud (one stop bit, no parity). Believe me, it is worth the hassle to do this.

If you still insist on not having a serial console, at least have a console. Configure your MFSes and images to use the VGA for a console (see the README files that come with the MFSes and generic images for instructions). Then haul your butt over to the machine room and hook up a monitor and keyboard. Now reboot the node and frantically scribble down the messages that scroll past on the screen. (Have I mentioned that you really want a serial console?)

Debugging boot problems without a console

If you don't have a console of any sort, you have to figure out what happened from second-hand information. Nodes contact boss from a number of places and for a variety of reasons when they boot. Thus there are numerous log files on boss that you can look at to see how far a node's boot has progressed.

/usr/testbed/log/dhcpd.log. Nodes boot using PXE, and the first thing the PXE BIOS does is use DHCP to discover its address and to learn what boot program to download. Look in here for the node's MAC address to see if it requested and received its IP info.
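If the log is large, a small script can pull out the lines for one node. The following is only a sketch: it assumes the MAC address appears in the log as plain text, and since we do not want to guess at the exact spelling, it tries a few common ones (colon-separated, dash-separated, and bare hex digits). The function names and the sample MAC in the comment are made up for illustration.

```python
import re


def mac_variants(mac):
    """Return common lowercase spellings of a MAC address."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError("expected 12 hex digits, got %r" % mac)
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    return {digits, ":".join(pairs), "-".join(pairs)}


def lines_mentioning_mac(log_lines, mac):
    """Yield log lines that contain the MAC in any of the spellings above."""
    variants = mac_variants(mac)
    for line in log_lines:
        lowered = line.lower()
        if any(v in lowered for v in variants):
            yield line.rstrip("\n")


# Example use on boss (the MAC here is a placeholder, not a real node):
#
#   with open("/usr/testbed/log/dhcpd.log", errors="replace") as log:
#       for line in lines_mentioning_mac(log, "00:d0:b7:13:f4:2a"):
#           print(line)
```

No output at all suggests the node never got as far as sending a DHCP request, or that the MAC address you have on record for it is wrong.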

/usr/testbed/log/tftpd.log. DHCP should return /tftpboot/pxeboot.emu as the boot loader to download, and the PXE BIOS should attempt to download this using TFTP. Look in this log to see if the node downloaded pxeboot.emu. (There may be other downloads for the node as well; we'll get to those in a minute.)

/usr/testbed/log/bootinfo.log. The pxeboot loader will contact the boss node to determine what it should do next: either download a memory-based filesystem (MFS) or boot from a disk partition. That request is done via the "bootinfo" protocol. Look in this log for the node IP to see if it requested and received its "marching orders." If the node has just been added, then it should be configured to boot the "admin MFS." Now you can go back to the TFTP log and see if it also requested a series of files from the /tftpboot/freebsd/boot directory. The final request should have been for mfsroot.gz.
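The TFTP check described above (did the node fetch pxeboot.emu, and later mfsroot.gz?) can be sketched in a few lines. This assumes, without having checked the actual log format, that each TFTP log line names both the requesting node (here taken to be its IP address) and the file requested. The helper names are hypothetical.

```python
import re


def mentions(line, token):
    """True if token appears in line as a whole word, so that
    10.0.0.5 does not also match 10.0.0.50."""
    pattern = r"(?<![0-9A-Za-z.])" + re.escape(token) + r"(?![0-9A-Za-z])"
    return re.search(pattern, line) is not None


def tftp_files_seen(log_lines, node_id,
                    filenames=("pxeboot.emu", "mfsroot.gz")):
    """Report which of the expected files show up on lines that
    also mention the node."""
    seen = {name: False for name in filenames}
    for line in log_lines:
        if not mentions(line, node_id):
            continue
        for name in filenames:
            if name in line:
                seen[name] = True
    return seen


# Example use on boss (the IP is a placeholder):
#
#   with open("/usr/testbed/log/tftpd.log", errors="replace") as log:
#       print(tftp_files_seen(log, "10.0.0.5"))
```

If pxeboot.emu shows up but mfsroot.gz does not, the node most likely stalled somewhere between loading pxeboot and finishing the MFS download, which makes the bootinfo log the next place to look.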

/usr/testbed/log/tmcd.log. Once the MFS (or disk-based) OS is running, the Emulab scripts should start requesting self-configuration info via the "Testbed Master Control" protocol. Look in the TMCD's log for the node name to see if it started requesting various info. You should see "status", "fullconfig", and other requests, culminating in its reporting state ISUP.
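Putting the four logs together, here is a rough sketch of a script that checks them in boot order and reports the first one in which the node never appears. The log names and the identifiers to search for (MAC for DHCP, IP for bootinfo, node name for TMCD) come from the descriptions above; searching the TFTP log by IP is an assumption, as are the function names and the sample identifiers. The script only counts lines that mention the node and does not try to parse any log format.

```python
import re
from pathlib import Path

LOG_DIR = Path("/usr/testbed/log")

# (stage, log file, which identifier to search for), in boot order.
STAGES = [
    ("DHCP", "dhcpd.log", "mac"),
    ("TFTP", "tftpd.log", "ip"),      # assumption: requests are logged by IP
    ("bootinfo", "bootinfo.log", "ip"),
    ("TMCD", "tmcd.log", "name"),
]


def count_mentions(path, token):
    """Count lines in the file at path that mention token as a whole word.
    A missing or unreadable log counts as zero mentions."""
    pattern = re.compile(
        r"(?<![0-9A-Za-z.])" + re.escape(token) + r"(?![0-9A-Za-z])",
        re.IGNORECASE,
    )
    try:
        with open(path, errors="replace") as log:
            return sum(1 for line in log if pattern.search(line))
    except OSError:
        return 0


def boot_progress(ids, log_dir=LOG_DIR):
    """Return a list of (stage, log file, matching line count)."""
    return [
        (stage, filename, count_mentions(Path(log_dir) / filename, ids[key]))
        for stage, filename, key in STAGES
    ]


def first_silent_stage(report):
    """Return the first stage with no matching lines, or None."""
    for stage, _filename, hits in report:
        if hits == 0:
            return stage
    return None


# Example use on boss (all three identifiers are placeholders):
#
#   ids = {"mac": "00:d0:b7:13:f4:2a", "ip": "10.0.0.5", "name": "pc1"}
#   report = boot_progress(ids)
#   for stage, filename, hits in report:
#       print("%-9s %-13s %d matching lines" % (stage, filename, hits))
#   print("first silent stage:", first_silent_stage(report))
```

Keep in mind that the counts cover the whole log, so a node that has booted before will have old entries. Check the timestamps on the matching lines against the time of the reboot you are debugging.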