Deploying your Chef infrastructure with Capistrano

Chef and Capistrano are the perfect pair when it comes to managing and deploying your web application to the cloud.

Chef is strictly responsible for managing the server: installing Ubuntu packages, configuring Nginx, Varnish, and so on. Capistrano is strictly responsible for deploying new application code and running migrations. We're also using capistrano-ext so that we can deploy to multiple stages, which works really well with the Chef environments we've configured.
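To make the multistage piece concrete, here's a minimal sketch of what the shared capistrano-ext configuration looks like. The stage names and application name are assumptions for the example; capistrano-ext loads config/deploy/<stage>.rb for whichever stage you invoke:

```ruby
# config/deploy.rb -- settings shared by every stage
require 'capistrano/ext/multistage'

# Stage names are assumptions; each needs a matching file in config/deploy/
set :stages, %w(staging production)
set :default_stage, "staging"

set :application, "myapp"  # hypothetical application name
```

With that in place, `cap production deploy` loads config/deploy/production.rb (the file shown below) before running the deploy.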

Chef has a deploy resource that you can use to deploy your code, but it's difficult to customize and forces a very opinionated deploy process on you. Something like Capistrano or Vlad is a much better deployment tool, which is why I set the two up to work together and share configuration. There are a few gems out there that do something similar, like capistrano-chef and chef-cap, but they get rid of the beauty of deploying with plain Capistrano.

To deploy an application this way, you first figure out what you need on your server and get that set up with Chef: what kind of database you'll be using, which web server, and so on. I store all of the application configuration with Opscode's Hosted Chef, but you can use any Chef server or even chef-solo. Chef servers can now store encrypted data bags as well as unencrypted ones. Once the proper software and cookbooks are installed and configured with Chef, what remains is a typical Capistrano deploy recipe. The only variation is that you query Chef to figure out which servers each role points to.
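Since the application configuration lives in data bags on the Chef server, the deploy can pull secrets from the same place instead of duplicating them. A minimal sketch; the bag name 'apps', item name 'production', and secret file path are all assumptions for this example:

```ruby
require 'rubygems'
require 'chef/config'
require 'chef/encrypted_data_bag_item'

# Reuse the same knife.rb that the deploy recipe loads
Chef::Config.from_file(File.expand_path("../../../chef/.chef/knife.rb", File.dirname(__FILE__)))

# 'apps', 'production', and the secret path are hypothetical names
secret = Chef::EncryptedDataBagItem.load_secret("/etc/chef/encrypted_data_bag_secret")
item   = Chef::EncryptedDataBagItem.load("apps", "production", secret)

# Expose the decrypted value to the rest of the Capistrano recipe
set :database_password, item["database_password"]
```

This keeps one source of truth: Chef recipes and Capistrano deploys both read the same data bag.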

Here's what the production Capistrano configuration looks like:

require 'rubygems'
require 'chef/config'
require 'chef/knife'
require 'chef/data_bag_item'
require 'chef/search/query'
# Load our Chef config, assuming the chef repository sits alongside this application as 'chef'
config = File.expand_path("../../../chef/.chef/knife.rb", File.dirname(__FILE__))
Chef::Config.from_file(config)
query = Chef::Search::Query.new
# Grab the servers that we've assigned the role of 'web_server' and that are in the production environment
web_servers = query.search(:node, 'role:web_server AND chef_environment:production')[0].collect do |w|
  w["automatic"]["cloud"]["public_hostname"]
end
# We have some crons that need to run on all web servers and then a few that just need to run on one primary server.
cron_runners = query.search(:node, 'role:cron_runner AND chef_environment:production')[0].collect do |w|
  w["automatic"]["cloud"]["public_hostname"]
end
cron_adjuncts = web_servers.reject { |n| n == cron_runners.first }
puts "Deploying to production on #{web_servers.inspect}"
puts "Running primary cronjobs on #{cron_runners.first}"
puts "Running adjunct cronjobs on #{cron_adjuncts.inspect}"
# Here we just pass the public hostnames of our nodes for Capistrano to run on
role :web, *web_servers
role :app, *web_servers
role :cron_primary, cron_runners.first
role :cron_adjunct, *cron_adjuncts
# We just need one app server to run migrations, doesn't matter which one
role :db, web_servers.first, :primary => true
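With the roles wired up from Chef search, the primary/adjunct cron split falls out of ordinary Capistrano tasks. Here's a sketch of how those roles might be consumed; the task names, and the assumption that crontabs are managed with the whenever gem, are mine, not from the original setup:

```ruby
namespace :cron do
  desc "Install the full crontab on the primary cron runner"
  task :update_primary, :roles => :cron_primary do
    # whenever usage is an assumption for this example
    run "cd #{current_path} && bundle exec whenever --update-crontab #{application}"
  end

  desc "Install the reduced crontab on the remaining web servers"
  task :update_adjuncts, :roles => :cron_adjunct do
    run "cd #{current_path} && bundle exec whenever --update-crontab #{application} --set 'role=adjunct'"
  end
end

# Refresh crontabs after every deploy restart
after "deploy:restart", "cron:update_primary", "cron:update_adjuncts"
```

Because the role membership is computed from Chef search at load time, these tasks always target the current pool of nodes without any hardcoded hostnames.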

The biggest win of combining Chef and Capistrano is that we can add and remove web server nodes and Capistrano automatically picks up the latest pool of app servers we have running. And since you're using the regular Chef search functionality, it's easy to pull any other data you need for your deploys.