Building an Automated Website Publishing Pipeline Using CircleCI

This is a re-post from the Reflect blog written by Customer Success Engineer Luc Perkins. Reflect is a modern data platform that combines a powerful API with beautiful visualizations and easy-to-use authoring tools.

I have a confession: I actually kind of enjoy manually deploying and managing websites. I grab a cup of coffee, run make publish or some similar command, gaze into the CLI output, wait for static assets to be generated and rsynced to a server thousands of miles away, and refresh the browser until I see the new changes go live—one of life’s subtle pleasures.

But let’s face it, there’s usually no good reason to do things this way now that continuous deployment and similar automation patterns have become standard practice in our industry. We decided a while back here at Reflect to use the CI tool that we already use for everything else, CircleCI, to automate our website deployment flow, and the benefits were immediately clear.

Our website generation toolkit

Before we get into website deployment, a few words on how we build our site for context.

We use Jekyll as both our static site generator and as the manager of our static asset pipeline. Basically we feed a bunch of Markdown, Sass, and JavaScript into Jekyll and end up with a fully built site in a _site directory. We’ll say a lot more about this in a future post.

Outside of this, we also use: Gulp.js for things like managing BrowserSync for local development; Rake for running our deployment scripts, which are heavily parameterized and benefit greatly from Ruby syntax (rather than Bash); and of course good old Make, which is sort of the main entry point into the project (our CircleCI jobs only ever run Make commands).
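For a sense of how that entry point might look, here's a hypothetical Makefile sketch — the target names and the Rake task are illustrative, not our actual file:

```make
# Hypothetical sketch: target names and the Rake task name are illustrative.
build:
	bundle exec jekyll build

# CircleCI runs `make test` by default; for now it just checks the build.
test: build

# Deployment logic lives in Rake; Make simply delegates to it.
release:
	bundle exec rake deploy
```

The point is that CI and humans alike only ever touch the Make targets, while the interesting logic stays in Rake and Jekyll.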

The infrastructure behind our website

So where does our site run and how do we get it there? Here’s a quick overview of our setup:

We have two dedicated DigitalOcean droplets, one running in New York and the other running in San Francisco.

We have both production and staging environments for the website. At the moment, both the staging and production versions of the site run side by side on the same boxes (which is fine given how light the load is for our staging site).

All assets are served via nginx. We have separate nginx configs for the staging and production sites.

Pro tip: store your nginx configs in the same repo as the website

Initially, our nginx configs simply lived on our droplets. If we needed to update those configs we SSHed into the boxes and made ad hoc changes. But the downsides of this are pretty clear: it’s easy to update one box and forget to update the other, it’s annoying to have to jump onto another box just to know what the config even is, and so on. So we decided to store our nginx configs in the repo alongside the site itself, which means that they’re now subject to version control and easily viewable. So how do we make sure that nginx actually uses those configs? We’ll say more about that below.
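To illustrate, per-environment nginx configs checked into the repo might look something like this — the server names and paths here are placeholders, not our real ones:

```nginx
# config/nginx/production.conf — placeholder names and paths
server {
    listen 80;
    server_name example.com;
    root /var/www/site/production;
}

# config/nginx/staging.conf
server {
    listen 80;
    server_name staging.example.com;
    root /var/www/site/staging;
}
```

Keeping both files side by side in the repo makes a diff between staging and production behavior a one-command affair.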

A few notes on the build environment we specify for our CircleCI builds:

Timestamps for CI output should be for the America/Los_Angeles timezone, because that’s the one we all live in (we’re lobbying behind the scenes for an America/Portland timezone…we’ll let you know how that goes)

Node.js version 7.4.0 will be used (for things like our Gulp.js setup), while Ruby version 2.3.1 will be used for Jekyll

Prior to the build, an additional make setup command will be run. That command looks like this in our Makefile:

```make
setup:
	gem install bundler
	bundle install
	npm install
```
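Taken together, settings like these map onto a CircleCI 1.0-style circle.yml along these lines — a sketch, not our exact file:

```yaml
machine:
  timezone: America/Los_Angeles
  node:
    version: 7.4.0
  ruby:
    version: 2.3.1

dependencies:
  pre:
    - make setup
```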

Our basic development pattern

Although there are always exceptions and edge cases, for the most part all projects at Reflect follow a workflow in which:

the master branch is considered the most up to date

all pull requests are targeted to a development branch rather than to master directly

development is merged into master only as part of a pull request, which enables us to review the collected result of many pull requests before we update master.

The website follows this flow as well. We’ve found this flow to be very friendly to our continuous deployment patterns, as we can confidently push bold changes to development with the assurance that they’ll be vetted one more time before being pushed to master, which can mean anything from updating our libraries in npm to updating Debian packages to, in this case, pushing the website live.

Once the CI environment is all set up, what happens next depends on the branch to which we’re merging:

Whenever changes are merged to master, the production site is updated. That means that all of the following happens in CircleCI:

Jekyll builds the site with JEKYLL_ENV set to production. When our Jekyll environment is set to production, our templates add JavaScript for Disqus and Google Analytics. More on Jekyll environments in the next post!

The make release command is run, which rsyncs the contents of the _site folder to both DigitalOcean droplets, scps our nginx configuration files, and uses SSH to restart nginx and perform some basic housecleaning (like chowning certain directories).

Whenever changes are merged to development, the staging version of the site is updated (this involves the same steps as the production deploy above, but targeting the staging environment).
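As an illustration of the environment gating mentioned above, a template conditional along these lines — a sketch, not Reflect's actual markup — keeps analytics out of non-production builds:

```liquid
{% if jekyll.environment == "production" %}
  <!-- Disqus embed and Google Analytics snippets would go here -->
{% endif %}
```

Jekyll exposes the JEKYLL_ENV value to Liquid as jekyll.environment (it defaults to development), so running `JEKYLL_ENV=production bundle exec jekyll build` is what opts a build into these tags.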

By default, CircleCI runs make test whenever you push to a branch that doesn’t have any behaviors associated with it. In our case, that means that make test is run whenever we push to branches outside of master and development. If make test returns 0 the build passes; otherwise, the build fails.
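This branch-specific behavior can be expressed in the test and deployment sections of a CircleCI 1.0-style circle.yml; here's a hedged sketch, where the staging target name is hypothetical:

```yaml
test:
  override:
    - make test

deployment:
  production:
    branch: master
    commands:
      - make release
  staging:
    branch: development
    commands:
      - make release-staging   # hypothetical target name
```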

At the moment, our make test command does one thing: make build, which simply runs jekyll build. This process exits with a 0 (success) or something else (failure). If we push anything to any branch and Jekyll can’t build it, then CircleCI will let us know via Slack immediately. In the future we’d like to “test” the site more comprehensively by adding things like a link checker and a spell checker to our build.

Keep your CircleCI config minimal

In general, it’s a good idea to include only what is absolutely necessary in your circle.yml config file. You don’t want your various commands sections looking like whole shell scripts. Any time you need to invoke a command, delegate everything either to a single shell script or to a Make command. YAML is a simple key/value markup format, not a scripting engine!

CircleCI and SSH

The only tricky part of getting our setup to work was enabling CircleCI to communicate directly with our DigitalOcean droplets over SSH. Here’s how we did it:

We generated an SSH key pair on one of our local machines and added the public key to ~/.ssh/authorized_keys on both of our current DigitalOcean droplets. We now share the private key with anyone who needs it via 1Password.

We uploaded the private key to CircleCI via the Settings page in their web UI.

You’ve probably encountered this kind of flow if you’ve used AWS or DigitalOcean or similar services. The tricky part for us was making sure that both CircleCI and our engineers can perform the same tasks. After all, sometimes manual publishing and deployment tasks are still necessary, for example if CircleCI goes down at a time when our site is completely borked.

The most important thing that we learned is that you should keep your SSH config locally in your website repo so that you can make sure that every user (including CircleCI!) is using the same config for all ssh, scp, and rsync commands. Our website’s SSH config is stored in a config/ssh-config file, and the -F flag enables us to use that config directly when a command is invoked. Here are some examples:
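(The hostnames, users, and paths below are placeholders, not our real ones.) A config/ssh-config file might contain entries like:

```
Host web-nyc
    HostName 203.0.113.10
    User deploy
    IdentityFile ~/.ssh/id_reflect_website
```

And the -F flag then points each command at that shared config:

```shell
# Sync the built site to a droplet:
rsync -az -e "ssh -F config/ssh-config" _site/ web-nyc:/var/www/site/

# Copy an nginx config onto the box:
scp -F config/ssh-config config/nginx/production.conf web-nyc:/tmp/

# Restart nginx remotely:
ssh -F config/ssh-config web-nyc "sudo service nginx restart"
```

Because every command is pinned to the same config file, there's no drift between what CircleCI runs and what an engineer runs from a laptop.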

In order to ensure that anyone can manage the site manually, each of our engineers who’s responsible for the site keeps the private key locally at ~/.ssh/id_reflect_website. In the eyes of our DigitalOcean droplets, CircleCI and our engineers are considered equals.

Conclusion: make the jump, but keep a few things in mind

The advantages of setting up an automatic website publishing flow are very clear to us in hindsight. It took a few hours to get everything set up but it has without a doubt saved us many, many hours and afforded us peace of mind. If you use a static site generator for your website, we strongly recommend that you make the same investment. But there are some things we found out that you might be able to learn from:

Make sure that the site can still be deployed manually if necessary. There’s plenty that can go wrong, even in a heavily automated setup, and you need to be prepared for scenarios in which your automation framework has issues. At any point in time, several members of your team should be equipped to deploy the site manually and securely, using just a handful of (well-documented!) commands. We have eight employees, and currently half of them can deploy the site at any time by running git checkout master && make release.

Run the site in multiple environments, each hooked into your automation pipeline. For us, that means that master pushes straight to production and development pushes straight to staging, but your practices may vary. Your website is other people’s portal into your organization and should be tested and troubleshot as thoroughly as anything else.

Virtually everything that’s necessary for publishing the site locally should be located within the repo and subject to version control (and thus open to collaboration).