Information technology, applied.

Fully Automated Installs

As I get my hardware setup in order, I’ve been planning my first automation step: fully automated provisioning and installation. I am basing my plan on a common use case: a company needs to expand its capacity by installing one or more new hosts, where hosts are physical servers. A fully automated install should be able to handle one new server or a thousand. It should only be a matter of creating the proper configuration files.

I see the process working as follows:

1. Manually install the new server in the rack and plug it into the network.
2. The hypervisor is installed on the host automatically via PXE when it first boots.
3. The hypervisor is configured by an automated system management tool such as Chef or Ansible.
4. A designated number of VMs are installed on the hypervisor.
5. Each VM’s OS is configured.
6. The VM applications are configured per their designated role; for instance, one VM might be designated a web server, one a database, one a Hadoop node.
7. Whatever software packages are needed for that role are installed and configured.
8. Custom data sets are loaded.
9. The servers are joined to whatever cluster they are part of.
10. Every piece of software installed via this process is tested and verified.
11. Every host, every VM, and every application is tested to ensure that the install was correct and is working as expected.
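
The process above can be sketched as a dry-run pipeline. This is only an illustration: the `provision_host()` helper, the action strings, and the role-to-package table are all assumptions, and in a real setup each action would be a PXE boot, a hypervisor call, or a Chef/Ansible run rather than a string.

```python
# Dry-run sketch of the install pipeline. Nothing here touches real
# hardware; provision_host() just returns the ordered list of actions
# the automation would perform for one host and its VMs.

ROLE_PACKAGES = {          # role -> packages (illustrative choices)
    "web": ["httpd"],
    "db": ["mariadb-server"],
    "hadoop": ["hadoop"],
}

def provision_host(host, vm_plan):
    """Return the ordered actions to bring up `host` and the VMs in
    `vm_plan`, a dict mapping VM name -> role."""
    actions = [
        f"pxe-install hypervisor on {host}",   # PXE boot + hypervisor
        f"configure hypervisor on {host}",     # Chef/Ansible
    ]
    for vm, role in vm_plan.items():
        actions.append(f"create VM {vm} on {host}")
        actions.append(f"configure OS on {vm}")
        for pkg in ROLE_PACKAGES[role]:        # role-specific packages
            actions.append(f"install {pkg} on {vm}")
        actions.append(f"load data for role {role} on {vm}")
        actions.append(f"join {vm} to {role} cluster")
        actions.append(f"verify {vm}")         # test the install worked
    return actions

plan = provision_host("host01", {"Web001": "web", "DB001": "db"})
```

Keeping the plan as data like this also makes the “everything is tested” goal easier: the verification stage is an explicit action per VM, not an afterthought.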

I will write about testing in a separate post. It is an important topic that deserves an expanded discussion. My goal is to apply the test-driven development (TDD) process I use for software development to server deployment. This means everything is tested. Just because Chef didn’t produce an error doesn’t mean it worked.

The other thing I would add is that every component used in this process should be under configuration management (CM). All software repositories are locally managed. All software package versions are baselined, and that baseline is only changed through a change management process. And, of course, all the configuration files for the applications and the automated system management tools are kept in a version control system such as Git.

In some limited tests, the biggest problem I have seen so far is how to bootstrap the hostnames. If I install 100 new hosts and I want them to run 100 web server VMs called Web001 through Web100 and 100 database VMs named DB001 through DB100, how do I do that? It seems like I need some way to map hosts to VMs to roles (by hostname). I assume that DHCP will take care of the IP addressing automatically, so I can always refer to the hosts and VMs by hostname.
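
One way the bootstrap could work is to generate the whole host-to-VM-to-role mapping deterministically from nothing but a count. A minimal sketch, with assumptions: the `host001`-style host names and the one-VM-per-role-per-host placement are mine; only the Web001/DB001 naming comes from the question above.

```python
# Generate a deterministic {host: [(vm_name, role), ...]} mapping.
# Because the names are a pure function of the index, the same mapping
# can be regenerated anywhere (kickstart templates, DNS, inventory).

def build_mapping(num_hosts, roles=("Web", "DB")):
    """One VM per role on each host; VM numbers track the host number."""
    mapping = {}
    for i in range(1, num_hosts + 1):
        host = f"host{i:03d}"                      # assumed host naming
        mapping[host] = [(f"{role}{i:03d}", role.lower())
                         for role in roles]
    return mapping

mapping = build_mapping(100)
```

With the mapping generated up front, DHCP/DNS, the kickstart files, and the CM inventory can all be rendered from the same source instead of being hand-maintained in three places.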


5 replies

Hope you won’t take this as spam, but I spent a good part of yesterday reading your very interesting blog (and still have a lot to discover here)! Thank you very much for sharing all these experiences, knowledge, and thoughts :+1:

About your hosts-to-VMs-to-roles mapping question: can’t it be solved by playing with groups and group_vars, like creating a VMs (sub)group for each host or something like that? I’m still getting my head around all this and would be interested to know how you may have solved this issue…

Another blurry point, AFAIC, lies between points 1 and 2. Stepping from one to the other actually involves several (if not many) substeps combining a CMDB, IPAM, a PXE/provisioning system, and finally the CM system. As I see it ATM, it would be:
* enter (bar/QR scan) the host MAC address in the CMDB, or use some kind of MAC discovery system,
* assign it a bare metal class/role,
* assign it an IP address and DNS name (auto/manual),
* (reboot and) install the OS and the CM system.

The thing is, I can’t find any provisioning system that really satisfies my requirements: it’s either hard to install under Debian (Cobbler), or (too) intimate with another CM system (Foreman), or needs the JVM (Razor: WTF, jRuby instead of plain Ruby to build a provisioning server?!), or doesn’t support W$ (many of them), or has several of these downsides (MAAS and others). On this point too, I’d be glad to read your thoughts, and even more to know how you handle this.

Not at all; the more comments, the better. Thanks for the kind words about the blog. With respect to the roles, since I wrote that I have developed a system that works very well for me: I define host-level attributes in the Ansible host_vars/host.example.com file, create many small roles (almost one per major app, such as OSSEC, auditd, iptables, postfix, etc.), and tie them together in the inventory file.

My VM bootstrap process has also become very simple:
1. Set host parameters in the host_vars file. This would be easy to template, but Ansible doesn’t allow host_vars to be retrieved via HTTP. Too bad, since I would just have them dynamically generated from my inventory database.
2. Add static DNS information via an Ansible script.
3. Run Ansible VM creation play:
a. Ansible generates a custom kickstart file for the host and loads it onto my provisioning web server.
b. Ansible tells Xen to spin up a new VM (on whatever host it wants) and boots it from my kickstart file.
c. Once the kickstart install is done, Ansible takes over automatically and finishes the configuration.

It took me a while to get it to this state, but now I can easily spin up and fully configure as many VMs as I need. Generating the inventory files is the only painful part. It’s easy to automate that part too, just not with Ansible.
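
Automating that painful part could be as simple as rendering the inventory from records. A sketch under assumptions: the record fields and grouping-by-role layout are mine, though the `[group]` syntax is standard Ansible INI inventory.

```python
# Render an INI-style Ansible inventory from inventory records.
# The records would normally come from a database query; here they
# are hard-coded to keep the sketch self-contained.

records = [
    {"name": "Web001", "role": "web", "ip": "10.0.0.11"},
    {"name": "Web002", "role": "web", "ip": "10.0.0.12"},
    {"name": "DB001",  "role": "db",  "ip": "10.0.1.11"},
]

def render_inventory(records):
    """Group hosts by role into [group] sections, sorted for stable diffs."""
    groups = {}
    for r in records:
        groups.setdefault(r["role"], []).append(r["name"])
    lines = []
    for group in sorted(groups):
        lines.append(f"[{group}]")
        lines.extend(sorted(groups[group]))
        lines.append("")                 # blank line between sections
    return "\n".join(lines)

inventory = render_inventory(records)
```

Sorting the groups and hosts keeps the generated file stable, so regenerating it only produces a diff (and a commit in version control) when the records actually change.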

The Xen hosts are a different matter. With them I use their MAC address and DHCP to give them an IP and hostname, then boot via PXE, then kickstart, and then Ansible finishes. Whew! Not as easy as the VMs, but it’s automated and repeatable. I just need to link a MAC to a hostname (and IP) and the rest takes care of itself. I think this is essentially the same process you describe.

What I want, but haven’t found yet, is a way to link my RDBMS inventory management to the CM tools. Once I can do that, all I need to do is enter the host or VM in my DB and it will generate all the right files/data automatically.

Oh, and I totally agree about jRuby. Why? Logstash uses it too for some reason. On the topic of languages, I’m surprised no one has built a CM tool using Node.js yet. Most are Python or Ruby (Chef, Puppet, Ansible, Salt). Node’s scalability might work well in this area.

On the bare-metal auto-install front, it seems like we got to the same point… Since I won’t deploy W$ on metal anymore, I may end up giving a shot to my latest discovery in the field, eDeploy, which has an interesting approach and already uses Ansible: https://github.com/enovance/edeploy

For inventory I use a simple custom MySQL database. Ansible doesn’t support plain SQL. However, I saw in those links that it can take a JSON source. I could use Elasticsearch instead of MySQL. The nice part of that is that I’m already using it for log storage. The downside is that I would have to learn NoSQL idioms (I have a lot of experience with SQL). I would also have to write an Elasticsearch plugin, but that didn’t look too hard.

I didn’t try it myself, but as far as I understand, you “just” have to write a script that runs the right SQL queries and returns JSON, make it executable, and use it as your inventory. If your inventory file is executable, Ansible will detect it, run it, and use the output.
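
Such a script might look like the sketch below, with an in-memory SQLite table standing in for the MySQL database so it runs as-is. The `_meta`/`hostvars` JSON layout follows Ansible’s dynamic inventory format; the schema and hostnames are assumptions, and a real script would also answer Ansible’s `--list` and `--host` arguments.

```python
#!/usr/bin/env python
# Executable dynamic inventory sketch: query SQL rows and print the
# JSON structure Ansible expects from a dynamic inventory script.
import json
import sqlite3

def fetch_inventory(conn):
    """Shape (name, role, ip) rows into Ansible's inventory JSON:
    one group per role, plus per-host vars under _meta.hostvars."""
    inv = {"_meta": {"hostvars": {}}}
    for name, role, ip in conn.execute(
            "SELECT name, role, ip FROM hosts ORDER BY name"):
        inv.setdefault(role, {"hosts": []})["hosts"].append(name)
        inv["_meta"]["hostvars"][name] = {"ansible_host": ip}
    return inv

# Stand-in for the MySQL inventory database (schema is assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (name TEXT, role TEXT, ip TEXT)")
conn.executemany("INSERT INTO hosts VALUES (?, ?, ?)",
                 [("Web001", "web", "10.0.0.11"),
                  ("DB001", "db", "10.0.1.11")])

print(json.dumps(fetch_inventory(conn)))
```

Swapping the `sqlite3` connection for a MySQL client would leave the query and the JSON shaping unchanged, which is the appeal of this approach: the database stays the single source of truth and Ansible just consumes its output.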