Friday, February 12, 2016

I've played a bit with Hugo, the static website generator written in Go that has been getting a lot of good press lately. At the suggestion of my colleague Warren Runk, I also experimented with hosting the static files generated by Hugo on Google Cloud Storage (GCS). That way there is no need to launch any instances to serve those files. You can of course achieve the same thing with AWS S3.

Notes on GCS setup

You first need to sign up for a Google Cloud Platform (GCP) account. You get a 30-day free trial with a new account. Once you are logged into the Google Cloud console, you need to create a new project. Let's call it my-gcs-hugo-project.

You also need to create a bucket in GCS. If you want to serve your site automatically out of this bucket, you need to give the bucket the same name as your site. Let's assume you call the bucket hugotest.mydomain.com. You will have to verify that you own mydomain.com either by creating a special CNAME in the DNS zone file for mydomain.com pointing to google.com, or by adding a special META tag to the HTML file served at hugotest.mydomain.com (you can achieve the latter by temporarily CNAME-ing hugotest to www.mydomain.com and adding the META tag to the home page for www).

If you need to automate deployments to GCS, it's a good idea to create a GCP Service Account. Click on the 'hamburger' menu in the upper left of the GCP console, then go to Permissions, then Service Accounts. Create a new service account and download its private key in JSON format (the key will be called something like my-gcs-hugo-project-a37b5acd7bc5.json).

Let's say your service account is called my-gcp-service-account1. The account will automatically be assigned an email address similar to my-gcp-service-account1@my-gcs-hugo-project.iam.gserviceaccount.com.

I wanted to be able to deploy the static files generated by Hugo to GCS using Jenkins. So I followed these steps on the Jenkins server as the user running the Jenkins process (user jenkins in my case):

1) Installed the Google Cloud SDK

5) Configured GCS via the gsutil command-line utility (this may actually be redundant since we already configured the project with gcloud, but I leave it here in case you encounter issues with using just gcloud)

$ gsutil config -e

It looks like you are trying to run "/var/lib/jenkins/google-cloud-sdk/bin/bootstrapping/gsutil.py config".

then find the project you will use, and copy the Project ID string from the

second column. Older projects do not have Project ID strings. For such projects,

click the project and then copy the Project Number listed under that project.

What is your project-id? my-gcs-hugo-project

Boto config file "/var/lib/jenkins/.boto" created. If you need to use

a proxy to access the Internet please see the instructions in that

file.

6) Added the service account created above as an Owner for the bucket hugotest.mydomain.com

7) Copied a test file from the local file system of the Jenkins server to the bucket hugotest.mydomain.com (still logged in as user jenkins), then listed all files in the bucket, then removed the test file

$ gsutil cp test.go gs://hugotest.mydomain.com/

Copying file://test.go [Content-Type=application/octet-stream]...

Uploading gs://hugotest.mydomain.com/test.go: 951 B/951 B

$ gsutil ls gs://hugotest.mydomain.com/

gs://hugotest.mydomain.com/test.go

$ gsutil rm gs://hugotest.mydomain.com/test.go

Removing gs://hugotest.mydomain.com/test.go...

8) Created a Jenkins job for uploading all static files for a given website to GCS

Assuming all these static files are checked in to GitHub, the Jenkins job will first check them out, then do something like this (where TARGET is the value selected from a Jenkins multiple-choice dropdown for this job):

BUCKETNAME=$TARGET

# upload all files and disable caching (for testing purposes)

gsutil -h "Cache-Control:private" cp -r * gs://$BUCKETNAME/

# set read permissions for allUsers

for file in `find . -type f`; do
    # remove first dot from file name
    file=${file#"."}
    gsutil acl ch -u allUsers:R gs://${BUCKETNAME}${file}
done

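The first gsutil command does a recursive copy (cp -r *) of all files to the bucket. This will preserve the directory structure of the website. For testing purposes, the gsutil command also sets the Cache-Control header on all files to private, which keeps shared caches (including the caching GCS applies by default to publicly readable objects) from storing the files. Note that a browser may still keep its own copy of a private response.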

The second gsutil command is executed once for each file found under the current directory, and it sets the ACL on the corresponding object in the bucket so that the object has Read (R) permissions for allUsers (by default only owners and other specifically assigned users have Read permissions). This is because we want to serve a public website out of our GCS bucket.
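The path handling in the shell loop above can be sketched in Python as follows. This is only an illustration: the helper names object_url and acl_commands are hypothetical, and the sketch builds the gsutil argument lists without running them.

```python
def object_url(bucket, found_path):
    """Map a path as printed by `find . -type f` to its gs:// URL."""
    # `find .` prefixes every path with "./"; drop only the leading dot,
    # as ${file#"."} does, so the remaining "/" separates bucket and object
    if found_path.startswith("."):
        found_path = found_path[1:]
    return f"gs://{bucket}{found_path}"


def acl_commands(bucket, found_paths):
    """Build one `gsutil acl ch` argument list per file (not executed)."""
    return [
        ["gsutil", "acl", "ch", "-u", "allUsers:R", object_url(bucket, path)]
        for path in found_paths
    ]


if __name__ == "__main__":
    print(object_url("hugotest.mydomain.com", "./post/hello-world/index.html"))
    # prints gs://hugotest.mydomain.com/post/hello-world/index.html
```

Each returned list could be handed to subprocess.run if you wanted to drive gsutil from Python instead of from the shell.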

At this point, you should be able to hit hugotest.mydomain.com in a browser and see your static site in all its glory.

Notes on Hugo setup

I've only dabbled in Hugo in the last couple of weeks, so these are very introductory-type notes.

Installing Hugo on OSX and creating a new Hugo site

$ brew update && brew install hugo

$ mkdir hugo-sites

$ cd hugo-sites

$ hugo new site hugotest.mydomain.com

$ git clone --recursive https://github.com/spf13/hugoThemes themes

$ cd hugotest.mydomain.com

$ ln -s ../themes .

At this point you have a skeleton directory structure created by Hugo (via the hugo new site command) under the directory hugotest.mydomain.com:

$ ls

archetypes config.toml content data layouts static themes

(note that we symlinked the themes directory into the hugotest.mydomain.com directory to avoid duplication)

Configuring your Hugo site and choosing a theme

One file you will need to pay a lot of attention to is the site configuration file config.toml. The default content of this file is deceptively simple:

$ cat config.toml

baseurl = "http://replace-this-with-your-hugo-site.com/"

languageCode = "en-us"

title = "My New Hugo Site"

Before you do anything more, you need to decide on a theme for your site. Browse the Hugo Themes page and find something you like. Let's assume you choose the Casper theme. You will need to become familiar with the customizations that the theme offers. Here are some customizations I made in config.toml, going by the examples on the Casper theme web page:

$ cat config.toml

baseurl = "http://hugotest.mydomain.com/"

languageCode = "en-us"

title = "My Speedy Test Site"

newContentEditor = "vim"

theme = "casper"

canonifyurls = true

[params]

description = "Serving static sites at the speed of light"

cover = "images/header.jpg"

logo = "images/mylogo.png"

# set true if you are not proud of using Hugo (true will hide the footer note "Proudly published with HUGO.....")

hideHUGOSupport = false

# author = "Valère JEANTET"

# authorlocation = "Paris, France"

# authorwebsite = "http://vjeantet.fr"

# bio= "my bio"

# googleAnalyticsUserID = "UA-79101-12"

# # Optional RSS-Link, if not provided it defaults to the standard index.xml

# RSSLink = "http://feeds.feedburner.com/..."

# githubName = "vjeantet"

# twitterName = "vjeantet"

# facebookName = ""

# linkedinName = ""

I left most of the Casper-specific options commented out and only specified a cover image, a logo and a description.

Creating a new page

If you want blog-style posts to appear on your home page, create a new page with Hugo under a directory called post (some themes want this directory to be named post and others want it posts, so check what the theme expects).

Let's assume you want to create a page called hello-world.md (I haven't even mentioned this so far, but Hugo deals by default with Markdown pages, so you will need to brush up a bit on your Markdown skills). You would run:

$ hugo new post/hello-world.md

This creates the post directory under the content directory, creates a file called hello-world.md in content/post, and opens up the file for editing in the editor you specified as the value for newContentEditor in config.toml (vim in my case). The default contents of the md file are specific to the theme you used. For Casper, here is what I get by default:

+++

author = ""

comments = true

date = "2016-02-12T11:54:32-08:00"

draft = false

image = ""

menu = ""

share = true

slug = "post-title"

tags = ["tag1", "tag2"]

title = "hello world"

+++

Now add some content to that file and save it. Note that the draft property is set to false by the Casper theme. Other themes set it to true, in which case the post would not be published by Hugo by default. The slug property is set by Casper to "post-title" by default. I changed it to "hello-world". I also changed the tags list to only contain one tag I called "blog".

At this point, you can run the hugo command by itself, and it will take the files it finds under content, static, and its other subdirectories, turn them into html/js/css/font files and save them in a directory called public:

$ hugo

0 draft content

0 future content

1 pages created

3 paginator pages created

1 tags created

0 categories created

in 55 ms

$ find public

public

public/404.html

public/css

public/css/nav.css

public/css/screen.css

public/fonts

public/fonts/example.html

public/fonts/genericons.css

public/fonts/Genericons.eot

public/fonts/Genericons.svg

public/fonts/Genericons.ttf

public/fonts/Genericons.woff

public/index.html

public/index.xml

public/js

public/js/index.js

public/js/jquery.fitvids.js

public/js/jquery.js

public/page

public/page/1

public/page/1/index.html

public/post

public/post/hello-world

public/post/hello-world/index.html

public/post/index.html

public/post/index.xml

public/post/page

public/post/page/1

public/post/page/1/index.html

public/sitemap.xml

public/tags

public/tags/blog

public/tags/blog/index.html

public/tags/blog/index.xml

public/tags/blog/page

public/tags/blog/page/1

public/tags/blog/page/1/index.html

That's quite a number of files and directories created by hugo. Most of it is boilerplate coming from the theme. Our hello-world.md file was turned into a directory called hello-world under public/post, with an index.html file dropped in it. Note that the Casper theme names the hello-world directory after the slug property in the hello-world.md file.
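The mapping from slug to output file seen in the listing above can be written down as a tiny sketch. The function names are hypothetical, and the layout simply mirrors what Hugo produced here with the Casper theme; other permalink settings would give different paths.

```python
import posixpath


def output_path(section, slug, publish_dir="public"):
    """File generated for a page, following the layout in the listing above."""
    return posixpath.join(publish_dir, section, slug, "index.html")


def page_url(baseurl, section, slug):
    """URL at which that page is then served, relative to baseurl."""
    return f"{baseurl.rstrip('/')}/{section}/{slug}/"


if __name__ == "__main__":
    print(output_path("post", "hello-world"))
    # prints public/post/hello-world/index.html
    print(page_url("http://hugotest.mydomain.com/", "post", "hello-world"))
    # prints http://hugotest.mydomain.com/post/hello-world/
```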

Thursday, February 11, 2016

Some quick notes I jotted down while documenting our Ansible setup. Maybe they will be helpful for people new to Ansible.

Ansible playbooks and roles

Playbooks are YAML files that specify which roles are applied to hosts of a certain type.

Example: api-servers.yml

$ cat api-servers.yml

---
- hosts: api
  sudo: yes
  roles:
    - base
    - tuning
    - postfix
    - monitoring
    - nginx
    - api
    - logstash-forwarder

This says that for each host in the api group we will run tasks defined in the roles listed above.

Example of a role: the base role is one that (in our case) is applied to all hosts. Here is its directory/file structure:

roles/base

roles/base/defaults

roles/base/defaults/main.yml

roles/base/files

roles/base/files/newrelic

roles/base/files/newrelic/newrelic-sysmond_2.0.2.111_amd64.deb

roles/base/files/pubkeys

roles/base/files/pubkeys/id_rsa.pub.jenkins

roles/base/files/rsyslog

roles/base/files/rsyslog/50-default.conf

roles/base/files/rsyslog/60-papertrail.conf

roles/base/files/rsyslog/papertrail-bundle.pem

roles/base/files/sudoers.d

roles/base/files/sudoers.d/10-admin-users

roles/base/handlers

roles/base/handlers/main.yml

roles/base/meta

roles/base/meta/main.yml

roles/base/README.md

roles/base/tasks

roles/base/tasks/install.yml

roles/base/tasks/main.yml

roles/base/tasks/newrelic.yml

roles/base/tasks/papertrail.yml

roles/base/tasks/users.yml

roles/base/templates

roles/base/templates/hostname.j2

roles/base/templates/nrsysmond.cfg.j2

roles/base/vars

roles/base/vars/main.yml

An Ansible role has the following important sub-directories:

defaults - contains the main.yml file which defines default values for variables used throughout other role files; note that the role’s files are checked in to GitHub, so these values shouldn’t contain secrets such as passwords, API keys etc. For those types of variables, use group_vars or host_vars files which will be discussed below.

files - contains static files that are copied over by ansible tasks to remote hosts

handlers - contains the main.yml file which defines actions such as stopping/starting/restarting services such as nginx, rsyslog etc.

meta - metadata about the role; things like author, description etc.

tasks - the meat and potatoes of ansible, contains one or more files that specify the actions to be taken on the host that is being configured; the main.yml file includes all the other files that get executed

Here are 2 examples of task files, one for configuring rsyslog to send logs to Papertrail and the other for installing the newrelic agent:

$ cat tasks/papertrail.yml

- name: copy papertrail pem certificate file to /etc
  copy: >
    src=rsyslog/{{item}}
    dest=/etc/{{item}}
  with_items:
    - papertrail-bundle.pem

- name: copy rsyslog config files for papertrail integration
  copy: >
    src=rsyslog/{{item}}
    dest=/etc/rsyslog.d/{{item}}
  with_items:
    - 50-default.conf
    - 60-papertrail.conf
  notify:
    - restart rsyslog

$ cat tasks/newrelic.yml

- name: copy newrelic debian package
  copy: >
    src=newrelic/{{newrelic_deb_pkg}}
    dest=/opt/{{newrelic_deb_pkg}}

- name: install newrelic debian package
  apt: deb=/opt/{{newrelic_deb_pkg}}

- name: configure newrelic with proper license key
  template: >
    src=nrsysmond.cfg.j2
    dest=/etc/newrelic/nrsysmond.cfg
    owner=newrelic
    group=newrelic
    mode=0640
  notify:
    - restart newrelic

templates - contains Jinja2 templates with variables that get their values from defaults/main.yml or from group_vars or host_vars files. One special variable that we use (and is not defined in these files, but instead is predefined by Ansible) is inventory_hostname which points to the hostname of the target being configured. For example, here is the template for a hostname file which will be dropped into /etc/hostname on the target:

$ cat roles/base/templates/hostname.j2

{{ inventory_hostname }}
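What the template step does with such a file can be approximated in a few lines. The render function below is a hypothetical regex stand-in that handles only plain {{ name }} placeholders; it is not Jinja2, which also supports filters, loops and conditionals.

```python
import re


def render(template, variables):
    """Replace each {{ name }} placeholder with its value from variables."""
    def lookup(match):
        # raises KeyError for an undefined variable instead of leaving a gap
        return str(variables[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lookup, template)


if __name__ == "__main__":
    print(render("{{ inventory_hostname }}", {"inventory_hostname": "api01"}))
    # prints api01
```

The same substitution explains the {{item}} and {{newrelic_deb_pkg}} references in the task files: each is replaced by the current value of the named variable before the module runs.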

Once you have a playbook and a role, there are a few more files you need to take care of:

hosts/myhosts - this is an INI-type file which defines groups of hosts. For example the following snippet of this file defines 2 groups called api and magento.

[api]

api01 ansible_ssh_host=api01.mydomain.co

api02 ansible_ssh_host=api02.mydomain.co

[magento]

mgto ansible_ssh_host=mgto.mydomain.co

The api-servers.yml playbook file referenced at the beginning of this document sets the hosts variable to the api group, so all Ansible tasks will get run against the hosts included in that group. In the hosts/myhosts file above, these hosts are api01 and api02.
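How a group name resolves to hosts can be illustrated with a minimal parser for the inventory snippet above. This is a sketch under simplifying assumptions: the function names are hypothetical, and only plain [group] headers and host key=value lines are handled, whereas real Ansible inventories support considerably more syntax.

```python
INVENTORY = """\
[api]
api01 ansible_ssh_host=api01.mydomain.co
api02 ansible_ssh_host=api02.mydomain.co

[magento]
mgto ansible_ssh_host=mgto.mydomain.co
"""


def parse_inventory(text):
    """Return {group: {host: {variable: value}}} for a simple INI inventory."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = groups.setdefault(line[1:-1], {})
            continue
        if current is None:
            continue  # hosts listed before any group are ignored in this sketch
        host, *pairs = line.split()
        current[host] = dict(pair.split("=", 1) for pair in pairs)
    return groups


def hosts_for(group, inventory):
    """Hosts that a play with `hosts: <group>` would target."""
    return sorted(inventory.get(group, {}))


if __name__ == "__main__":
    print(hosts_for("api", parse_inventory(INVENTORY)))
    # prints ['api01', 'api02']
```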

group_vars/somegroupname - this is where variables with ‘secret’ values get defined for a specific group called somegroupname. The group_vars directory is not checked into GitHub. somegroupname needs to exactly correspond to the group defined in hosts/myhosts.

Example:

$ cat group_vars/api

ses_smtp_endpoint: email-smtp.us-west-2.amazonaws.com

ses_smtp_port: 587

ses_smtp_username: some_username

ses_smtp_password: some_password

datadog_api_key: some_api_key

. . . other variables (DB credentials etc)

host_vars/somehostname - this is where variables with ‘secret’ values get defined for a specific host called somehostname. The host_vars directory is not checked into GitHub. somehostname needs to exactly correspond to a host defined in hosts/myhosts.

Example:

$ cat host_vars/api02

insert_sample_data: false

This overrides the insert_sample_data variable and sets it to false only for the host called api02. This could also be used for differentiating between a DB master and slave for example.
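The override behaviour can be modelled as a merge from lowest to highest precedence. The sketch below covers only the three sources discussed here (role defaults, group_vars, host_vars); Ansible has further sources, such as role vars and extra vars, that rank higher and are left out. Where insert_sample_data is first defined is not stated above, so placing it in the role defaults is an assumption, as is the function name.

```python
def effective_vars(role_defaults, group_vars, host_vars):
    """Merge variable sources from lowest to highest precedence."""
    merged = dict(role_defaults)
    merged.update(group_vars)  # group_vars override role defaults
    merged.update(host_vars)   # host_vars override group_vars
    return merged


if __name__ == "__main__":
    defaults = {"insert_sample_data": True}   # assumed default value
    api_group = {"ses_smtp_port": 587}
    api02_host = {"insert_sample_data": False}

    print(effective_vars(defaults, api_group, {})["insert_sample_data"])
    # prints True (what a host without an override, such as api01, would see)
    print(effective_vars(defaults, api_group, api02_host)["insert_sample_data"])
    # prints False (api02)
```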

Tying it all together

First you need to have ansible installed on your local machine. I used:

$ pip install ansible

To execute a playbook for a given hosts file against all api servers, you would run:

$ ansible-playbook -i hosts/myhosts api-servers.yml

The name that ties together the hosts/myhosts file, the api-servers.yml file and the group_vars/groupname file is in this case api.

You need to make sure you have the desired values for that group in these 3 files:

hosts/myhosts: make sure you have the desired hosts under the [api] group

api-servers.yml: make sure you have the desired roles for hosts in the api group

group_vars/api: make sure you have the desired values for variables that will be applied to the hosts in the api group