Using DeployStudio Across Subnets—a Path not Taken

At Pivotal Labs we use DeployStudio to rapidly image machines over the network. It was an excellent solution when the DeployStudio server and the client were on the same subnet. It did not work when they were on different subnets.

We found that, with a combination of clever use of tcpdump, a carefully-crafted dhcpd configuration file, and a judicious set of firewall exceptions, we were able to extend DeployStudio so that it worked across subnets.

Unfortunately, it was an epic fail: every third install would cause our firewall (m0n0wall 1.8.0b512) to lock up. We have put the project on ice until we get a new firewall.

Audience

This blog post is intended for IT organizations with the following characteristics:

use DeployStudio to deploy OS X workstations

have multiple subnets

are uncomfortable having a DeployStudio server span multiple networks. The concern is usually security: an attacker who compromises the DeployStudio server would gain access to every network it spans, and the server must run several services, at least one of which (NFS) requires discipline to implement securely

use an ISC DHCP server

are willing to put their firewall to the test

The easy way

See Ryan’s comments below. With a few lines of Cisco configuration (assuming you have a Cisco router), you can easily configure DeployStudio boots across subnets.

The rest of this blog post is the much more difficult path that I took, and I don’t recommend it unless you really enjoy doing things the hard way.

The Hard Way: Start with tcpdump

To make DeployStudio work across subnets, you first need to use tcpdump to capture how it works within a subnet. In this case, we used a laptop (kate-enet), and our DeployStudio server (deploystudio).

First, we started the capture. We captured to a file so that we could examine the output at our leisure. We ran the following command on our deploystudio server:

sudo tcpdump -w /tmp/kate.tcp -s 1536 host kate-enet

Next, we started a network install:

we turned on kate-enet (a 13″ MacBook Air laptop with a thunderbolt ethernet adapter)

we held down the option-key so that we were presented with a choice of boot options

we chose the network install

when the DeployStudio runtime screen came up, we ctrl-c’d the tcpdump; we had what we needed.

Resist the temptation to substitute a hostname for the NFS server’s IP address; that is, leave it as “nfs:10.0.0.64” and do not put “nfs:deploystudio.sf.pivotallabs.com”. IP addresses will work; hostnames won’t.

We used Ruby (irb) to convert the dotted-decimal strings printed by tcpdump into the colon-separated hexadecimal that dhcpd.conf expects. In the following example, we convert “1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48”:
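One way to do that conversion is sketched below. It is a minimal illustration rather than the exact irb session we ran, and the method name is made up for this example. It assumes every dot-separated field is a decimal byte (0 to 255).

```ruby
# Convert a dotted-decimal byte string, as tcpdump prints it, into the
# colon-separated hexadecimal form used in dhcpd.conf.
# The method name is hypothetical, chosen for this sketch.
def dotted_decimal_to_colon_hex(dotted)
  dotted.split(".").map do |field|
    byte = Integer(field, 10) # raises ArgumentError on non-numeric input
    raise ArgumentError, "not a byte: #{field}" unless (0..255).cover?(byte)
    format("%02x", byte)      # two lowercase hex digits per byte
  end.join(":")
end

puts dotted_decimal_to_colon_hex(
  "1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
)
# prints 01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30
```

As a sanity check, the last ten bytes of that result (4e:65:74:42:6f:6f:74:30:35:30) are the ASCII characters “NetBoot050”.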

Troubleshooting

If you’re having problems, you need to check that your TFTP and NFS are working, preferably from a machine that’s on the same subnet as the client that you’re trying to image.

TFTP

In our example, we know that our tftp server is deploystudio.sf.pivotallabs.com, and the file we’re downloading is /private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter. Let’s try from the command line:

NFS

Testing NFS is a little tricky because the NFS path is slightly mangled. Specifically, a “:” is substituted for the second-to-last “/” in the pathname. For example, the DHCP root-path directive “nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” is translated to a pathname of “/net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” for testing purposes on a client machine. We take advantage of automount running on a typical OS X client. First do an ls to make sure we can see the file, then do a cp to make sure we can read the file:
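The path translation described above can be sketched in Ruby. This is an illustration only: the method name is made up, and it assumes the root-path always has the four colon-separated fields seen in our example (scheme, server, exported directory, path inside the export).

```ruby
# Translate a DHCP root-path value of the form
#   nfs:<server>:<exported directory>:<path inside the export>
# into the /net/... pathname that automount exposes on an OS X client.
# The method name and the four-field reading of the format are assumptions.
def root_path_to_automount_path(root_path)
  scheme, server, export_dir, relative = root_path.split(":", 4)
  unless scheme == "nfs" && relative
    raise ArgumentError, "unexpected root-path: #{root_path}"
  end
  # File.join collapses the doubled "/" where the fields meet.
  File.join("/net", server, export_dir, relative)
end

path = root_path_to_automount_path(
  "nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg"
)
puts path
# prints /net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg
```

The resulting pathname is the one to hand to ls and cp on the client.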

Performance

The time required to image a machine will more than double. A typical install will take 40 minutes or more.

Initial Boot-up

Certain operations are much slower. Specifically, the time between selecting the netboot server and being presented with the DeployStudio runtime screen is approximately 7 minutes. We have studied that lag, and more than 4 minutes of it is due to abysmal (3.8kBps) TFTP throughput. We are unclear why there is such a gross lag; running the same tftp transfer on the command line completes about 20x faster (74.7kBps).

We have a firewall that handles traffic between our subnets, and we are aware that TFTP poses challenges for firewalls: it re-negotiates its destination port, and Cisco firewalls have special directives to handle TFTP traffic appropriately.

3 Comments

I’m not certain how your network is set up but I’ve been able to use DeployStudio across subnets for some time now. The way that I’ve set it up is to have the DeployStudio server configured as a “DHCP Helper” (AKA DHCP Relay) in addition to your actual DHCP server.

This is a setting done on the router interface for each subnet. Most routers will allow you to enter multiple DHCP servers into the list of “helpers”. I know at least Cisco, HP, and Juniper do.

Just log into your router and add a DHCP helper to each router interface for the subnets that you want to be able to use DeployStudio in. If your DeployStudio server IP is 192.168.1.200, a DHCP helper entry on a Cisco router would look something like:

config t
! This is the subnet that you want to use DeployStudio in but currently can’t
interface Fast 0/0
ip helper-address 192.168.1.200
end
write mem

What happens is every helper in the list on the router is forwarded a copy of any DHCP request that happens in that subnet. DeployStudio is configured to respond with special information when certain “options” are requested by a system. Your DHCP server will still be used to get an IP address, but DeployStudio will be used if any network bootable drives are requested by the computer. This happens during the bootp process if you hold down the “N” key on a Mac while booting. So it is pretty seamless once you add the helper information to the router.

Hope this info helps.

Ryan.

April 22, 2013 at 4:14 pm

Brian Cunnie says:

Thanks again Ryan. I’ve updated my blog post to say, “Read Ryan’s comment: he has a much better way of doing it”.

April 29, 2013 at 6:36 pm

Brian Cunnie says:

Ryan, thanks for the tip. Once we install a router to manage our inter-subnet traffic (currently our firewall does it, and the firewall does not handle the tftp traffic gracefully), we’ll most likely take the path that you suggest—it’s much easier than crafting custom DHCP records.

I'm a systems administrator at Pivotal Labs. I've worked at a slew of startups and with a slew of UNIXes (OS X, Linux, FreeBSD, OpenBSD, HP-UX, AIX, Solaris/SunOS UTS, Xenix, Ultrix, and even the original UNIX). In my spare time I play rugby.