Configuration Management 101: Writing Puppet Manifests

Introduction

In a nutshell, server configuration management (also popularly referred to as IT Automation) is a solution for turning your infrastructure administration into a codebase, describing all processes necessary for deploying a server in a set of provisioning scripts that can be versioned and easily reused. It can greatly improve the integrity of any server infrastructure over time.

In a previous guide, we talked about the main benefits of implementing a configuration management strategy for your server infrastructure, how configuration management tools work, and what these tools typically have in common.

This part of the series will walk you through the process of automating server provisioning using Puppet, a popular configuration management tool capable of managing complex infrastructure in a transparent way, using a master server to orchestrate the configuration of the nodes. We will focus on the language terminology, syntax and features necessary for creating a simplified example to fully automate the deployment of an Ubuntu 14.04 web server using Apache.

This is the list of steps we need to automate in order to reach our goal:

Update the apt cache

Install Apache

Create a custom document root directory

Place an index.html file in the custom document root

Apply a template to set up our custom virtual host

Restart Apache

We will start by looking at the terminology used by Puppet, followed by an overview of the main language features that can be used to write manifests. At the end of this guide, we will share the complete example so you can try it yourself.

Note: this guide is intended to introduce you to the Puppet language and how to write manifests to automate your server provisioning. For a more introductory view of Puppet, including the steps necessary for installing and getting started with this tool, check our How To Install Puppet 4 in a Master-Agent Setup on Ubuntu 14.04 guide.

Getting Started

Before we move to a more hands-on view of Puppet, we need to get acquainted with the key terminology and concepts introduced by this tool.

Puppet Terms

Puppet Master: the master server that controls configuration on the nodes

Puppet Agent Node: a node controlled by a Puppet Master

Manifest: a file that contains a set of instructions to be executed

Resource: a portion of code that declares an element of the system and how its state should be changed. For instance, to install a package we need to define a package resource and ensure its state is set to "installed"

Module: a collection of manifests and other related files organized in a pre-defined way to facilitate sharing and reusing parts of a provisioning

Class: just like with regular programming languages, classes are used in Puppet to better organize the provisioning and make it easier to reuse portions of the code

Facts: global variables containing information about the system, like network interfaces and operating system

Services: used to trigger service status changes, like restarting or stopping a service

Puppet provisioning code is written using a custom DSL (domain-specific language) that is based on Ruby.

Resources

With Puppet, tasks or steps are defined by declaring resources. Resources can represent packages, files, services, users, and commands. They might have a state, which will trigger a system change in case the state of a declared resource is different from what is currently on the system. For instance, a package resource set to installed in your manifest will trigger a package installation on the system if the package was not previously installed.

This is what a package resource looks like:

package { 'nginx':
  ensure => 'installed',
}

You can execute any arbitrary command by declaring an exec resource, like the following:

exec { 'apt-get update':
  command => '/usr/bin/apt-get update',
}

Note that the apt-get update portion on the first line is not the actual command declaration, but an identifier for this unique resource. Often we need to reference other resources from within a resource, and we use their identifier for that. In this case, the identifier is apt-get update, but it could be any other string.

Resource Dependency

When writing manifests, it is important to keep in mind that Puppet doesn't evaluate the resources in the same order they are defined. This is a common source of confusion for those who are getting started with Puppet. Resources must explicitly define dependency between each other, otherwise there's no guarantee of which resource will be evaluated, and consequently executed, first.

As a simple example, let's say you want to execute a command, but you need to make sure a dependency is installed first:
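A minimal sketch of such a dependency could look like the following. The resource title add-repository and the PPA name ppa:example/ppa are hypothetical placeholders, and we assume the add-apt-repository command lives in /usr/bin:

```puppet
# Install the dependency first.
package { 'python-software-properties':
  ensure => 'installed',
}

# Hypothetical command: ppa:example/ppa is a placeholder, not a real repository.
# The require attribute makes Puppet apply the package resource before this one.
exec { 'add-repository':
  command => '/usr/bin/add-apt-repository ppa:example/ppa -y',
  require => Package['python-software-properties'],
}
```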

The require option receives as parameter a reference to another resource. In this case, we are referring to the Package resource identified as python-software-properties.
It's important to notice that while we use exec, package, and such for declaring resources (with lowercase), when referring to previously defined resources, we use Exec, Package, and so on (capitalized).

Now let's say you need to make sure a task is executed before another. For a case like this, we can use the before option instead:
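As a sketch, the dependency can be declared from the other side like this. The resource title install script and the URL are hypothetical placeholders used only for illustration:

```puppet
# The before attribute makes Puppet apply this resource
# ahead of the exec resource it references.
package { 'curl':
  ensure => 'installed',
  before => Exec['install script'],
}

# Hypothetical command: the URL is a placeholder.
exec { 'install script':
  command => '/usr/bin/curl http://example.com/some-script.sh',
}
```

Both require and before express the same relationship; which one to use mostly depends on which resource you find more natural to annotate.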

Manifest Format

Manifests are basically a collection of resource declarations, stored in files with the extension .pp. Below you can find an example of a simple manifest that performs two tasks: it updates the apt cache and then installs vim:
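A minimal version of such a manifest, reusing the exec and package resources introduced earlier, might look like this (the file name is up to you, as long as it ends in .pp):

```puppet
# Refresh the apt cache.
exec { 'apt-get update':
  command => '/usr/bin/apt-get update',
}

# Install vim only after the apt cache has been updated.
package { 'vim':
  ensure  => 'installed',
  require => Exec['apt-get update'],
}
```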

Before the end of this guide we will see a more real-life example of a manifest, explained in detail. The next section will give you an overview of the most important elements and features that can be used to write Puppet manifests.

Writing Manifests

Working with Variables

Variables can be defined at any point in a manifest. The most common types of variables are strings and arrays of strings, but other types are also supported, such as booleans and hashes.

The example below defines a string variable that is later used inside a resource:

$package = "vim"

package { $package:
  ensure => "installed",
}

Using Loops

Loops are typically used to repeat a task using different input values. For instance, instead of creating 10 tasks for installing 10 different packages, you can create a single task and use a loop to repeat the task with all the different packages you want to install.

The simplest way to repeat a task with different values in Puppet is by using arrays, like in the example below:
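A short sketch of this approach, assuming an illustrative list of packages:

```puppet
# The package names here are only examples.
$packages = ['vim', 'git', 'curl']

# Passing an array as the title declares one package resource per element.
package { $packages:
  ensure => 'installed',
}
```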

As of version 4, Puppet supports additional ways for iterating through tasks. The example below does the same thing as the previous example, but this time using the each iterator. This option gives you more flexibility for looping through resource definitions:
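A sketch of the same task written with each, again using an illustrative list of packages:

```puppet
# The package names here are only examples.
$packages = ['vim', 'git', 'curl']

# The block runs once per array element, with $package set to the current value.
$packages.each |String $package| {
  package { $package:
    ensure => 'installed',
  }
}
```

Because the block is regular Puppet code, it can declare several resources per element or combine the loop with conditionals.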

Using Conditionals

Conditionals can be used to dynamically decide whether or not a block of code should be executed, based on a variable or an output from a command, for instance.

Puppet supports most of the conditional structures you can find in traditional programming languages, like if/else and case statements. Additionally, some resources like exec support attributes that work like a conditional, but these only accept a command, whose exit status is used as the condition.

Let's say you want to execute a command based on a fact. In this case, as you want to test the value of a variable, you need to use one of the conditional structures supported, like if/else:
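A minimal sketch, assuming we want to check the operating system family through the $facts hash available in Puppet 4; the messages are illustrative:

```puppet
# $facts['os']['family'] holds values such as 'Debian' or 'RedHat'.
if $facts['os']['family'] != 'Debian' {
  warning('This manifest is not supported on this OS.')
} else {
  notify { 'Good to go!': }
}
```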

Another common situation is when you want to condition the execution of a command on the result of another command. For cases like this you can use onlyif or unless, like in the example below. This command will only be executed when /bin/which php succeeds, that is, when it exits with status 0:
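A sketch of such a resource; the title Test and the echoed message are illustrative assumptions:

```puppet
# The command runs only if the onlyif command exits with status 0.
exec { 'Test':
  command   => '/bin/echo PHP is installed here',
  onlyif    => '/bin/which php',
  logoutput => true,
}
```

Replacing onlyif with unless inverts the check, so the command would run only when /bin/which php fails.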

Working with Templates

Templates are typically used to set up configuration files, allowing for the use of variables and other features intended to make these files more versatile and reusable. Puppet supports two different formats for templates: Embedded Puppet (EPP) and Embedded Ruby (ERB). The EPP format, however, works only with recent versions of Puppet (starting from version 4.0).

Below is an example of an ERB template for setting up an Apache virtual host, using a variable for setting up the document root for this host:
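A sketch of such a template, assuming it is saved as vhost.erb and that a doc_root variable is defined in the manifest that renders it. The ServerAdmin address and the directives inside the Directory block are illustrative assumptions:

```erb
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot <%= @doc_root %>

    <Directory <%= @doc_root %>>
        AllowOverride All
        Require all granted
    </Directory>
</VirtualHost>
```

Inside an ERB template, variables from the manifest are accessed with an @ prefix, so $doc_root becomes @doc_root.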

In order to apply the template, we need to create a file resource that renders the template content with the template function. This is how you would apply this template to replace the default Apache virtual host:
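A minimal sketch, assuming the template lives in a module named apache and that the document root path below is only an example:

```puppet
# Variable read by the template as @doc_root (example value).
$doc_root = '/var/www/example'

# Render apache/templates/vhost.erb and write the result to the vhost file.
file { '/etc/apache2/sites-available/000-default.conf':
  ensure  => 'present',
  content => template('apache/vhost.erb'),
}
```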

Puppet makes a few assumptions when dealing with local files, in order to enforce organization and modularity. In this case, Puppet would look for a vhost.erb template file inside a folder apache/templates, inside your modules directory.

Defining and Triggering Services

Service resources are used to make sure services are initialized and enabled. They are also used to trigger service restarts.

Let's take into consideration our previous template usage example, where we set up an Apache virtual host. If you want to make sure Apache is restarted after a virtual host change, you first need to create a service resource for the Apache service. This is how such a resource is defined in Puppet:

service { 'apache2':
  ensure => running,
  enable => true,
}

Now, when defining the file resource that applies the virtual host template, you need to include a notify option in order to trigger a restart:
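A sketch of how the two resources fit together, assuming the same apache module and template name used in the template section:

```puppet
# Whenever the rendered content changes, Puppet notifies the service.
file { '/etc/apache2/sites-available/000-default.conf':
  ensure  => 'present',
  content => template('apache/vhost.erb'),
  notify  => Service['apache2'],
}

# A notified service resource is restarted after the change is applied.
service { 'apache2':
  ensure => running,
  enable => true,
}
```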

Example Manifest

Now let's have a look at a manifest that will automate the installation of an Apache web server within an Ubuntu 14.04 system, as discussed in this guide's introduction.

The complete example, including the template file for setting up Apache and an HTML file to be served by the web server, can be found on GitHub. The folder also contains a Vagrantfile that lets you test the manifest in a simplified setup, using a virtual machine managed by Vagrant.

Below you can find the complete manifest:

default.pp

$doc_root = "/var/www/example"

exec { 'apt-get update':
  command => '/usr/bin/apt-get update',
}

package { 'apache2':
  ensure  => "installed",
  require => Exec['apt-get update'],
}

file { $doc_root:
  ensure => "directory",
  owner  => "www-data",
  group  => "www-data",
  mode   => '0644', # quoted: Puppet 4 requires file modes to be strings
}

file { "$doc_root/index.html":
  ensure  => "present",
  source  => "puppet:///modules/main/index.html",
  require => File[$doc_root],
}

file { "/etc/apache2/sites-available/000-default.conf":
  ensure  => "present",
  content => template("main/vhost.erb"),
  notify  => Service['apache2'],
  require => Package['apache2'],
}

service { 'apache2':
  ensure => running,
  enable => true,
}

Manifest Explained

line 1

The manifest starts with a variable definition, $doc_root. This variable is later used in a resource declaration.

lines 3-5

This exec resource executes an apt-get update command.

lines 7-10

This package resource installs the apache2 package and declares the apt-get update resource as a requirement, which means the package will only be installed after the required resource has been applied.

lines 12-17

We use a file resource here to create a new directory that will serve as our document root. The file resource can be used to create directories and files, and it's also used for applying templates and copying local files to the remote server. This task can be executed at any point of the provisioning, so we didn't need to set any require here.

lines 19-23

We use another file resource here, this time to copy our local index.html file to the document root inside the server. We use the source parameter to let Puppet know where to find the original file. This nomenclature is based on the way Puppet handles local files; if you have a look at the GitHub example repository, you will see how the directory structure should be created in order to let Puppet find this resource. The document root directory needs to be created before this resource is applied, which is why we include a require option referencing the previous resource.

lines 25-30

A new file resource is used to apply the Apache template and notify the service for a restart. For this example, our provisioning is organized in a module called main, and that's why the template source is main/vhost.erb. We use a require statement to make sure the template resource only gets executed after the package apache2 is installed, otherwise the directory structure used by Apache may not be present yet.

lines 32-35

Finally, the service resource declares the apache2 service, which we notify for a restart from the resource that applies the virtual host template.

Conclusion

Puppet is a powerful configuration management tool that uses an expressive custom DSL for managing server resources and automating tasks. Its language offers advanced features that can give extra flexibility to your provisioning setups. It is important to remember that resources are not evaluated in the same order they are defined, and for that reason you need to be careful when defining dependencies between resources in order to establish the right chain of execution.

In the next guide of this series, we will have a look at Chef, another powerful configuration management tool that leverages the Ruby programming language to automate infrastructure administration and provisioning.

Tutorial Series

Configuration management can drastically improve the integrity of servers over time by providing a framework for automating processes and keeping track of changes made to the system environment. This series will introduce you to the concepts behind Configuration Management and give you a practical overview of how to use Ansible, Puppet and Chef to automate server provisioning.

March 23, 2016

As a broader subject, configuration management (CM) refers to the process of systematically handling changes to a system in a way that allows the system to maintain integrity over time. In this tutorial, we will discuss how configuration management works for servers, and what to consider when choosing a tool for building your configuration management infrastructure.

March 23, 2016

This tutorial will walk you through the process of creating an automated server provisioning using Ansible, a configuration management tool that provides a complete automation framework and orchestration capabilities. We will focus on the language terminology, syntax and features necessary for creating a simplified example to fully automate the deployment of an Ubuntu 14.04 web server using Apache.

March 23, 2016

This guide will focus on the features and characteristics of Puppet, a popular configuration management tool capable of managing complex infrastructures in a transparent way, using a master server to orchestrate the configuration of the nodes. In order to give you a hands-on experience with the tool, we will walk you through the process of creating a simplified provisioning example to fully automate the deployment of a web server using Apache.

April 14, 2016

This tutorial will walk you through the process of automating server provisioning using Chef, a powerful configuration management tool that leverages the Ruby programming language to automate infrastructure administration and provisioning. We will focus on the language terminology, syntax, and features necessary for creating a simplified example to fully automate the deployment of an Ubuntu 14.04 web server using Apache.