This means that building a DC has become a repeatable task, and repeatable tasks are begging for automation. This post is the story of how we built our last few datacenters without manually logging in to the majority of our devices.

The Slow Way

A fair amount of effort goes into building a network in a brand new location, usually on a tight timeline. It's critical that the network is up and running early in the build process; without it, there's no connectivity and our platform engineers can't come in and build the hypervisors.

In any new deployment, there are typically around 50 new switches to configure. Most switches have an identical configuration (except for some unique things, like the management IP address), and the new switches will almost always need their software updated to our standard version.

In our early days, deploying a new network meant logging into every switch via the console port, pasting a config from a template, and then upgrading the software. With so many switches to build, it was time-consuming and, let's face it, pretty boring. The whole process was in need of a total overhaul.

Not Touching Anything… Almost

For our automated network deployment to work, we have to address a chicken-and-egg problem: there needs to be some form of networking already in place so the new switches can download their updated code and grab their configuration template.

As a result, a small part of the network still needs to be built by hand. This is typically a small firewall connected to what we call our "out of band" (OOB) internet link, plus a few switches that provide connectivity to the management ports of the other switches. These devices have a very basic configuration, so it's easy to copy and paste it and get some initial connectivity.

Additionally, we need to know the MAC address of each switch, which is printed on the side of the chassis. Fortunately, we have a fantastic datacenter team that flies all over the world to do all the physical labor involved with deploying a new location. These folks have racking and stacking down to a fine art, and part of their process is to record the MAC address of each switch they rack in a file for later use.

The Fast Way, aka Zero Touch Provisioning

The actual automation of the building process is known as Zero Touch Provisioning (ZTP). Most major networking vendors have some form of ZTP support, and the process is pretty simple. There are a few specific configurations needed on the ZTP server to make everything work.

Setting Up DHCP

First, we need a DHCP server. We use good old ISC DHCP running on an Ubuntu server, and configure it to give the switch the information it needs once it boots up. This is the top of our dhcpd.conf file:

This is where the MAC address from the side of each switch's chassis comes into play. We need each switch to pull down the correct configuration template, so the MAC address is used to identify the switch. The dhcpd.conf file will have an entry like the one above for every single switch that we want to ZTP.

Because creating an entry for 50 or so switches by hand would be pretty annoying, we also automate this using a simple Python script, which spits out the appropriate dhcpd.conf file containing all the correct MAC addresses and IP addresses.
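To give a feel for what such a generator might look like, here is a minimal sketch. It is not our actual script: the switch names, MAC addresses, IP addresses, and function names are all made up for illustration. It only emits the standard ISC DHCP `host` block with `hardware ethernet` and `fixed-address`, and it normalizes the MAC address first, since addresses copied off a chassis label tend to arrive in a mix of formats.

```python
import re


def normalize_mac(raw: str) -> str:
    """Return a MAC address as lowercase, colon-separated octets."""
    # Drop the usual separators (colons, dots, dashes, whitespace).
    digits = re.sub(r"[:.\-\s]", "", raw).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def host_entry(name: str, mac: str, ip: str) -> str:
    """Build one ISC DHCP host block that pins an IP to a MAC address."""
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {normalize_mac(mac)};\n"
        f"    fixed-address {ip};\n"
        f"}}\n"
    )


def render_hosts(switches) -> str:
    """Join the host blocks for every (name, mac, ip) tuple."""
    return "\n".join(host_entry(name, mac, ip) for name, mac, ip in switches)


# Hypothetical inventory; in practice this would be read from the
# file the datacenter team fills in while racking the switches.
SWITCHES = [
    ("sw-rack01-a", "00:1B:2C:3D:4E:5F", "10.0.0.11"),
    ("sw-rack01-b", "001b.2c3d.4e60", "10.0.0.12"),
]

if __name__ == "__main__":
    print(render_hosts(SWITCHES))
```

Any vendor-specific ZTP options, such as the one naming the configuration file, would go inside the same `host` block. They are left out here because their names depend on how the option space is declared in the rest of dhcpd.conf.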

Configuration Templates

For this process to be fully automated, each new switch needs to have a configuration template ready to go. To make this happen, we use the Jinja2 templating engine and some Python, which makes it easy to create a whole bunch of templates quickly. We create a template for every device that is going to be deployed and upload the templates to the ZTP server.
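The general shape of this step is one shared template plus a small dictionary of per-device values. The sketch below shows that idea only. To keep it dependency-free it uses `string.Template` from the Python standard library as a stand-in for Jinja2, so the placeholder syntax is `$name` rather than Jinja2's `{{ name }}`. The template text, device names, addresses, and the `em0` management interface are all illustrative assumptions, not our real configuration.

```python
from string import Template

# Hypothetical, heavily trimmed template. A real switch config would
# be far longer; only the per-device values are parameterized.
CONFIG_TEMPLATE = Template(
    "set system host-name $hostname\n"
    "set interfaces $mgmt_interface unit 0 family inet "
    "address $mgmt_ip/$prefix_len\n"
)

# Hypothetical per-device values.
DEVICES = [
    {"hostname": "sw-rack01-a", "mgmt_interface": "em0",
     "mgmt_ip": "10.0.0.11", "prefix_len": 24},
    {"hostname": "sw-rack01-b", "mgmt_interface": "em0",
     "mgmt_ip": "10.0.0.12", "prefix_len": 24},
]


def render_configs(devices) -> dict:
    """Map '<hostname>.conf' to the rendered config for each device.

    substitute() raises KeyError if a device is missing a value, so an
    incomplete inventory fails here instead of on the switch.
    """
    return {
        f"{device['hostname']}.conf": CONFIG_TEMPLATE.substitute(device)
        for device in devices
    }


if __name__ == "__main__":
    for filename, text in render_configs(DEVICES).items():
        print(f"# {filename}")
        print(text)
```

Each rendered string would then be written to a file and uploaded to the ZTP server.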

Voila!

The switch boots up and sends out a DHCP request, which the OOB firewall relays to the ZTP server. The switch then grabs its config template, downloads its software, and that's it!

Here is the console output from a real Juniper QFX switch going through the process:

With the old process, it would take a full day of work to build 50 switches. With the new process, it takes 5 minutes, and the longest part is just waiting for the switch to reboot for its software update.

Instead of manually logging into each device, we now set up a ZTP server, upload the configuration templates, then sit back and watch the network build itself.