Git Workflow and Puppet Environments


Editor's Note: This post has been updated with information on using r10k, a tool developed by Adrien Thebo. For the most current information, read Git Workflows with Puppet and r10k.

One of the features offered by Puppet is the ability to break up infrastructure configuration into environments. With environments, you can use a single Puppet master to serve multiple isolated configurations. For instance, you can adopt the development, testing and production series of environments embraced by a number of software development life cycles and by application frameworks such as Ruby on Rails, so that new functionality can be added incrementally without interfering with production systems. Environments can also be used to isolate different sets of machines. A good example of this functionality would be using one environment for web servers and another for databases, so that changes made to the web server environment don’t get applied to machines that don’t need that configuration.

Mapping the Puppet code base against the environments shows the power of this method. People often use a version control system to manage the code, and create a set of branches that each map to an environment. Adopting the development, testing and production workflow, we can have a puppet.conf that looks something like this:
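Something along these lines, with each environment getting its own modulepath and manifest (a sketch; the base path and server name are illustrative assumptions):

```ini
[main]
    server = puppet.example.com

[development]
    modulepath = /etc/puppet/environments/development/modules
    manifest   = /etc/puppet/environments/development/manifests/site.pp

[testing]
    modulepath = /etc/puppet/environments/testing/modules
    manifest   = /etc/puppet/environments/testing/manifests/site.pp

[production]
    modulepath = /etc/puppet/environments/production/modules
    manifest   = /etc/puppet/environments/production/manifests/site.pp
```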

With this configuration, we could map three Git branches for these environments and set up a central Git repository with post receive hooks. When changes were pushed to this repository, they would be automatically deployed to the puppet master. The example post-receive hook later in this post will work with this kind of environment setup.

Dynamic Environments

While there are benefits to having a fixed set of branches and environments like the configuration outlined above, a static set of environments also has drawbacks. The model constrains everyone to a single workflow. For instance, if multiple people are working on experimental features in the development branch, they each have to contend with everyone else's unrelated, in-progress changes as they merge their code through the development process. In addition, promoting a single feature to testing or production gets more complicated when multiple features share a branch.

Modern distributed version control systems like Git handle these constraints by making branch creation and merging lightweight operations, allowing us to generate Puppet environments on the fly. Puppet can set up environments with explicitly defined sections in the configuration file, but we can also exploit the fact that Puppet sets the $environment variable to the name of the environment it is currently running under. With this in mind, we can write a puppet.conf resembling this:
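Something like the following, where the paths are parameterized on $environment so that any branch name becomes a valid environment (a sketch; the server name is an assumption):

```ini
[main]
    server = puppet.example.com

[master]
    modulepath = $confdir/environments/$environment/modules
    manifest   = $confdir/environments/$environment/manifests/site.pp
```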

This handles the dynamic environment aspect: to create an environment, all we have to do is create a directory of manifests under $confdir/environments. Generating new environments with Git is similarly easy. We can create a central Git repository somewhere, with a post-receive hook that looks something like this:

#!/usr/bin/env ruby
# Puppet Labs is a Ruby shop, so why not do the post-receive hook in Ruby?

require 'fileutils'

# Set this to where you want to keep your environments
ENVIRONMENT_BASEDIR = "/etc/puppet/environments"

# post-receive hooks set GIT_DIR to the current repository. If you want to
# clone from a non-local repository, set this to the URL of the repository,
# such as git@git.host:puppet.git
SOURCE_REPOSITORY = File.expand_path(ENV['GIT_DIR'])

# The GIT_DIR environment variable overrides --git-dir, so we remove it
# to allow us to create new repositories cleanly.
ENV.delete('GIT_DIR')

# Ensure that we have the underlying directories; otherwise the later commands
# may fail in somewhat cryptic ways.
unless File.directory? ENVIRONMENT_BASEDIR
  puts %Q{#{ENVIRONMENT_BASEDIR} does not exist, cannot create environment directories.}
  exit 1
end

# You can push multiple refspecs at once, like 'git push origin branch1 branch2',
# so we need to handle each one.
$stdin.each_line do |line|
  oldrev, newrev, refname = line.split(" ")

  # Determine the branch name from the ref we've received, which is in the
  # form refs/heads/<branch>, and make sure that it doesn't contain any
  # possibly dangerous characters.
  branchname = refname.sub(%r{^refs/heads/(.*$)}) { $1 }
  if branchname =~ /[\W-]/
    puts %Q{Branch "#{branchname}" contains non-word characters, ignoring it.}
    next
  end

  environment_path = "#{ENVIRONMENT_BASEDIR}/#{branchname}"

  if newrev =~ /^0+$/
    # We've received a push with a null revision, something like 000000000000,
    # which means that we should delete the given branch.
    puts "Deleting existing environment #{branchname}"
    if File.directory? environment_path
      FileUtils.rm_rf environment_path, :secure => true
    end
  else
    # We have been given a branch that needs to be created or updated. If the
    # environment exists, update it. Else, create it.
    if File.directory? environment_path
      # Update an existing environment. We fetch and then reset in case
      # someone force-pushed to the branch.
      puts "Updating existing environment #{branchname}"
      Dir.chdir environment_path do
        %x{git fetch --all}
        %x{git reset --hard "origin/#{branchname}"}
      end
    else
      # Instantiate a new environment from the current repository.
      puts "Creating new environment #{branchname}"
      %x{git clone #{SOURCE_REPOSITORY} #{environment_path} --branch #{branchname}}
    end
  end
end
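The trickiest part of the hook is the branch-name vetting. Pulled out on its own, the logic behaves like this (a standalone sketch, not part of the hook; `branch_for` is a name introduced here for illustration):

```ruby
# Reproduce the hook's branch-name extraction and vetting in isolation.
def branch_for(refname)
  # Strip the refs/heads/ prefix, as the hook does.
  branchname = refname.sub(%r{^refs/heads/(.*$)}) { $1 }
  # Reject anything that isn't a plain word-character name; this blocks path
  # traversal and other names unsafe to use as a directory component.
  return nil if branchname =~ /[\W-]/
  branchname
end

puts branch_for("refs/heads/new_feature").inspect   # => "new_feature"
puts branch_for("refs/heads/../../etc").inspect     # => nil
```

Note that the check also rejects hyphens, so branch names intended to become environments must stick to letters, digits, and underscores.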

From here on out, you can use the new_feature environment on your hosts and use Git as you would with any other code base.
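As a concrete sketch of the push side of this workflow, the following stands up a throwaway "central" repository and pushes a feature branch to it, exactly as you would push to the hook-equipped repository (paths and identities here are stand-ins):

```shell
tmp=$(mktemp -d)
git init --bare --quiet "$tmp/central.git"
git clone --quiet "$tmp/central.git" "$tmp/work" 2>/dev/null
cd "$tmp/work"
git checkout --quiet -b new_feature
echo 'node default { }' > site.pp
git add site.pp
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m 'First manifest'
git push --quiet origin new_feature 2>/dev/null
# With the post-receive hook installed on the central repository, this push
# would create $confdir/environments/new_feature on the Puppet master, and an
# agent could then run against it with:
#   puppet agent --test --environment new_feature
```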

This development model also gives us simple access control. With a tool like gitolite, we can allow people to generate new environments to test their own code while denying them the ability to change the production environment. This lets us institute change control by requiring all code to be reviewed by a merge master before inclusion in production, and allows code to be tested and verified before the merge request is made.
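In gitolite, such a policy might be sketched like this (group and user names are hypothetical; rules are matched top down, so the deny on production shields it from everyone but the merge master):

```conf
repo puppet
    RW+ production  = mergemaster
    -   production  = @all
    RW+             = @all
```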

This is ideal for junior sysadmins, as it allows them to experiment with Puppet while preventing accidental pushes of incorrect code. Keep in mind, though, that while this can prevent accidents, it cannot prevent malice. Unless otherwise configured, the Puppet master will run all manifests as one user, so a malicious user could attempt to manipulate branches other than the ones they created or used.

The model of dynamic Puppet environments with Git was pioneered at Portland State University, and one of our Professional Service Engineers, Hunter Haugen, originally wrote up the basic concept on his blog.

Gitolite has been a fundamental underpinning for managing and deploying Puppet manifests, and it's a very powerful tool; its documentation is well worth reading.

If you want to learn more about how environments can be used and configured, see the official Puppet documentation.

Pro Git is an excellent book on Git, and it does a great job of outlining the different Git hooks and how you can use them; the relevant chapter is available online.

Comments

While dynamic environments enable easy development workflows, which is great by the way, it seems like you're still going to end up merging back through a set of standard environments once development code is ready for the wild. Take production: obviously you're going to have those boxes statically set to the production environment, and most likely other important machines through the stack will have appropriate static environments set too. Unless I'm missing something, why would you want to allow a production or QA box to run my-crazy-dev-branch?!

I might be missing the point, but I can't get my head around this workflow. Aren't we mixing the Puppet code lifecycle with the infrastructure lifecycle? I.e., the Production and Development environments are genuinely different, with different packages and accounts.

What I was doing in the past was having Unstable and Stable branches, both containing code for the Production and Development (etc.) environments. Is that overkill?

The Puppet workflow topic is critical for sysadmins trying to bring Puppet and version control into their environments. It would be great if Puppet Labs could provide more formal documentation on various workflow scenarios, with specifics on how Puppet master configurations use git/gitolite hooks (via cron?) and branches to promote code. I have read a lot of conflicting info online.