Sporadically delivered thoughts on Continuous Delivery

Infrastructure as Code - Automation Is Not Enough

Infrastructure automation has become a mainstream theme in our industry, but automation without Infrastructure as Code practices only spurs the growth of chaotic IT sprawl. Organisations that depend on IT are still held back by their inability to quickly and reliably adapt to business challenges and opportunities. IT Ops people continue to be bogged down in firefighting, with barely enough time to keep systems running, leaving little time for fundamental improvements.

Back in the the Iron Age …

Virtualization and cloud (IaaS, Infrastructure as a Service, in particular) have forced the need for automation of some kind. In the old days, the “Iron Age” of IT, infrastructure growth was limited by the hardware purchasing cycle. Since it would take weeks for a new server to arrive, there was little pressure to rapidly install and configure an operating system on it. We would slot in a CD and follow our checklist, and a few days later it would be ready.

But the ability to spin up new virtual machines in minutes required us to get a lot better at automating this process. Server image templates and cloning helped get us over the hump. But now we had a new problem. Because we could, assuming enough overall capacity, spin up new VMs at the drop of a hat, we found ourselves with an ever-growing portfolio of servers. The need to keep a constantly growing and changing number of servers up to date and avoiding Configuration Drift spawned new tools.

Infrastructure as Code is born

CFengine, Puppet, and Chef established a new category of infrastructure automation tool quickly adopted by the early adopters, those nimble organisations who were taking full advantage of IaaS cloud as it emerged. These organisations, whose IT was typically built around Agile and Lean mindsets, evolved “Infrastructure as Code” practices to managing their automated infrastructure.

Enter the enterprise vendors - who get it get it wrong (of course)

As more traditional organisations have adopted virtualization - generally on in-house infrastructure rather than public clouds - they’ve felt the same need for automation to manage their systems. But although some have explored the toolsets used by the early adopters, many turn to traditional vendors of so-called enterprise management toolsets, who have moved to adapt and rebrand their software to catch the latest waves in the industry (“Now with DevOps!”)

The problem is that few of these toolsets are designed to support Infrastructure as Code. Yes, they do automate things. Once you point and click your way through their GUI to create a server template, you can create identical instances to your heart’s content. But when you go back and make tweaks to your template, you don’t have a traceable, easily understood record of the change. You can’t automatically trigger automatic testing of each change, using validation tools from multiple vendors, open source projects, and in-house groups.

In short, rather than using intensive, automatically enforced extreme change management you’re stuck with old-school, manual, “we’d do it more thoroughly if we had time” change management.

The difference is:

Infrastructure automation makes it possible to carry out actions repeatedly, across a large number of nodes. Infrastructure as code uses techniques, practices, and tools from software development to ensure those actions are thoroughly tested before being applied to business critical systems.

What to demand from your tools

Here are some guidelines for choosing configuration management tools that support Infrastructure as Code:

The definitions used to create and update system configurations should be externalizable in a format that can be stored in off the shelf version control systems such as Git, Subversion, or Perforce. This enables the adoption of a wide variety of tools for managing, validating, and testing software source code, rather than locking you into a single vendor’s toolset. It also gives you a history of every change, along with who made it and (hopefully) why, and the ability to roll back.

It should be possible to validate definitions at various levels of granularity, so you can apply a variation of the test pyramid. Quick syntax and code style validations, followed by execution of individual units of configuration, followed by instantiating of VMs that can be validated, etc. This offers the benefits of fast feedback and correction of changes, and is the foundation for Continuous Integration and a building a Continuous Delivery pipeline.

Without the ability to ensure that every change is quickly and easily tested as a matter of course, we’re forced to rely on people to take the time to manually set up and run tests, even when they’re under pressure. Without visibility and openness of configuration changes, we end up locked into the limited toolset of a single vendor, and deprive ourselves of a huge ecosystem of tools for managing software changes.

Bottom line:

The defining characteristic of our move beyond the “Iron Age” into the “Cloud Era” is that infrastructure can now be treated like software. Ensuring we’re able to bring the most effective software development practices to bear is the key to getting the most value out of this shift.