Testing Puppet with Docker and Python

In all my past positions I’ve been lucky enough to have a dedicated ops team to handle service deployment, cluster health, and machine management. At my new company, however, there is much more of a “self serve” mentality: each team needs to handle these things itself. On the one hand this is a huge pain in my ass, since really the last thing I want to do is deal with clusters and machines. On the other hand, because we can spin up OpenStack boxes in our data centers at the click of a button, each team has the flexibility to host its own infrastructure and stack.

For the most part my team and I deploy our Java services in Docker containers. Our container is a CentOS 7 base image with a logstash forwarder and some other minor tooling in it, and we run our Java service in the foreground. All we need on our host boxes is Docker itself, plus a bootloader script that we execute to shut down old containers and spin up new ones. To get Docker and our bootloader onto the hosts (and of course to manage things like our Jenkins instances, RMQ clusters, Cassandra nodes, etc.) we are using Puppet.

After deep diving into Puppet, my first question was “how do I test this?”. Most suggestions indicate testing is twofold:

1. Syntax checking

2. Integration testing on isolated machines

The first element is a no-brainer: you run the puppet syntax checker and you get some output. That’s not all that helpful, though, beyond making sure I didn’t fat finger something. And the second point really sucks, because you have to manually check whether everything worked. As an engineer I shudder at the word “manual”, so I set out to create an isolated test framework that my team can use to simulate and automatically test puppet scripts, both locally and on Jenkins.

To do that, I wrote puppety. It’s really stupidly simple. The gist is you have a puppet master in a Docker container that auto-signs anyone who connects, and a puppet agent in another Docker container that connects, syncs, and then runs tests validating the sync was complete.

Puppety structure

If you look at the git repo, you’ll see there are two main folders:

/data
/test

The /data folder is going to map to the /etc/puppet folder on our puppet master. It should contain all the stuff we want to deploy as if we plopped that whole folder onto the puppet root.

The /test folder contains the Python test runners, as well as the Docker containers for both the master and the agent.

Testing a node

If you have a node configuration in an environment, you can test a node by annotating it like so:
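A sketch of what such an annotated node can look like (the exact annotation syntax lives in the repo; the test path here is made up):

```puppet
# Hypothetical node annotation -- the marker comment names the test to run
node 'jenkins' {
  # @test runners/jenkins_test.py
  include role::jenkins
}
```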

Now, when the dockerized puppet container connects, it assumes the role of the jenkins node!

The tests sit in a folder called tests/runners and the test name is the path to the test to run. It’s that simple.

We are also structuring our puppet scripts in terms of roles. Roles use a custom Facter fact that reads /etc/.role/role to find the role name of a machine. That way, when a machine connects to puppet it says “I’m this role”, and puppet can switch on the role to know which configurations to apply.
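On the puppet side that switch can be an ordinary case statement. A sketch, where the role names and class names are illustrative:

```puppet
# Hypothetical site.pp dispatch on the custom role fact
node default {
  case $::role {
    'jenkins':   { include role::jenkins }
    'cassandra': { include role::cassandra }
    default:     { include role::base }
  }
}
```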

Tests are annotated with where they’ll run. Agent tests run after a sync, but master tests run BEFORE the agent starts. This is so you can do any setup on the master you need. Need to drop in some custom data before the agent connects? Perfect place to do it.

Getting test results on jenkins

The fun part about this is that we can output the result of each test into a linked Docker volume. Our Jenkins test runner just looks like:
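Something along these lines, where the image names and mounted paths are illustrative (the real ones are in the repo):

```shell
# Hypothetical Jenkins job body: spin up the master, run the agent against
# it, and collect results from the shared volume
docker run -d --name master -v "$(pwd)/data:/etc/puppet" puppety-master
docker run --rm --link master:puppet \
    -v "$(pwd)/results:/results" puppety-agent
# each test writes its result under /results, which Jenkins publishes
# from ./results on the host
```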

To deploy, we have a cron job on the puppet master that pulls our puppet scripts git repo and merges the data folder into its /etc/puppet folder.
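That cron entry might look something like this; the schedule and paths are illustrative:

```shell
# crontab on the puppet master -- pull the repo and merge data/ into
# /etc/puppet (schedule and paths are made up for illustration)
*/10 * * * * cd /opt/puppet-scripts && git pull -q && cp -r data/. /etc/puppet/
```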

Debugging the containers

Sometimes using puppety goes wrong and it’s nice to see what’s going on. Because each container exposes an entrypoint script, we can pass in a debug flag to get access to a shell so we can run the tests manually:
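For example, something like this; the actual flag name is defined by the entrypoint script, so treat this as a sketch:

```shell
# Hypothetical debug invocation -- the flag is handled by the entrypoint
docker run -it puppety-agent debug
# instead of kicking off the sync, this drops you into a shell
```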

Now we can execute the entrypoint by hand, or run puppet by hand and play around.

Conclusion

All in all this has worked really well for our team. It’s made it easy for us to prototype and play with our infrastructure scripts in a controlled environment locally. And since we can now actually write tests against our infrastructure, we feel more comfortable pushing changes out to prod.