Share this:

Like this:

I have published a new article on InfoQ, Scaling Docker with Kubernetes, where I describe the Kubernetes project and how it allows to run Docker containers across multiple hosts.

Kubernetes is an open source project to manage a cluster of Linux containers as a single system, managing and running Docker containers across multiple hosts, offering co-location of containers, service discovery and replication control.

Included in the article there is an example of scaling Jenkins in a master – multiple slaves architecture, where the slaves are running in Kubernetes. When I finally find the time I will implement a Jenkins Kubernetes plugin that would handle the scaling automatically.

Like this:

Everybody should be building Docker images! but what if you don’t want to write all those shell scripts, which is basically what the Dockerfile is, a bunch of shell commands in RUN declarations; or if you are already using some Puppet modules to build VMs?

It is easy enough to build a new Docker image from Puppet manifests. For instance I have built this Jenkis slave Docker image, so here are the steps.

Otherwise you can just install Puppet in any bare image using the normal installation instructions. Something to have into account is that Docker images are quite simple and may not have some needed packages installed. In this case the centos6 image didn’t have tar installed and some things failed to run. In some CentOS images the centosplus repo needs to be enabled for the installation to succeed.

Once Puppet is installed we can apply any manifest to the server, we just need to put the right files in the right places. If we need extra modules we can copy them from the host, maybe using librarian-puppet to manage them. Note that I’m avoiding to run librarian or any tool in the image, as that would require installing extra packages that may not be needed at runtime.

ADD modules/ /etc/puppet/modules/

The main manifest can go anywhere but the default place is into /etc/puppet/manifests/site.pp. Hiera data default configuration goes into /var/lib/hiera/common.yaml.

After that it’s the usual Docker CMD configuration. In this case we call Jenkins slave jar from a shell script that handles some environment variables, with information about the Jenkins master, so it can be overriden at runtime with docker run -e.

librarian-puppet 1.3.1 and 1.0.8 [changelog] include two important changes. Now there is no need to create a Puppetfile if you have a Modulefile or metadata.json, it will use them by default. Of course you can add a Puppetfile to bring in modules from git, a directory, or github tarballs.

The other change is that all the dependencies’ metadata.json files will be parsed now for transitive dependencies, so it works with the latest Puppet Labs modules and those migrated from the old Modulefile format going forward. That also means that the puppet gem is no longer needed if there are no Modulefile present in your tree of dependencies, which was a source of pain for some users.

The 1.0.x branch is kept updated to run in Ruby 1.8 while 1.1+ requires Ruby 1.9 and uses the Puppet Forge API v3.

Puppet Blacksmith, the gem to automate pushing modules to the Puppet Forge was also updated to use metadata.json besides Modulefile in version 2.2+ [changelog].

In Maestro we typically use a Maestro master server and multiple Maestro agents. Each Maestro Agent is just a small service where the actual work happens, it processes the work sent by the master, via ActiveMQ, and executes the plugins with the data received.

The two main goals of the agent are load distribution and heterogeneous composition support. The more agents running, the more compositions that can be executed in parallel, and compositions can target specific agents based on its features, such as architecture, operating system,… which is a must for development environments. For simplicity each agent can only run one composition at a time, but you could have multiple agent processes running in a single server.

It uses Puppet Facter to gather the machine facts (operating system, memory size, cloud provider data,…) and sends all that information to the master, that can use it to filter what compositions run in the agent. For instance I may want to run a composition in a Windows agent, or in an agent that has some specific piece of software installed. Facter supports external facts so it is really easy to add new filtering capabilities, and not be just limited to what Facter provides out of the box. A small text file can be added to /etc/facter/facts.d/ and Facter would report it to the master server.

Agents are installed alongside with all the tools that may be needed, from Git, to clone repos, to Jenkins swarm to reuse the agents as Jenkins slaves, or mcollective agents to allow updating the agent itself automatically with Puppet when new manifests are deployed to the Puppet master. In our internal environment any commit to Puppet manifests or modules automatically trigger our rspec-puppet tests, the deployment of those manifests to the Puppet master, and a cascading Puppet update of all the machines in our staging environment using MCollective. All our Puppet modules are likewise built and tested on each commit and a new version published to the Puppet Forge automatically using rspec-puppet and Puppet Blacksmith.

Maestro also supports manually assigning agents to pools, and matching compositions with agent pools, so compositions can be limited to run in a predefined set of agents.

The agent process is written in Ruby and runs under JRuby in the JVM, thus supporting multiple operating systems and architectures, and the ability to write extensions in Java or Ruby easily. It connects to the master’s Composition Execution Engine through ActiveMQ using STOMP for messaging.

Plugins

Plugins are small pieces of code written in Java or Ruby that run in the agent to execute the actual work. We have made all plugins available in GitHub so they can be used as examples to create new plugins for custom tasks.

Plugins can be added to Maestro at runtime and automatically show up in the composition editor. The plugin manifest defines the plugin images, what tasks are defined, and what fields in each task. Based on the workload received, the agent downloads and executes the plugin, which just accesses the fields in the workload and do the actual work, whatever it might be, sending output back to LuCEE and populating the composition context.

For instance the Fog plugin can manage multiple clouds, such as EC2, where it can start and stop instances. The plugin receives the fields defined in the composition (credentials, image id,…), calls the EC2 API, streams the status to the Maestro output (successfully created, instance id,…) and puts some data (ids of the instances created, public ips,…) in the composition context for other tasks to use. All of that in less than 100 lines of code.

The agent cloud manager is a service that runs on Google Compute Engine and watches a number of Maestro installations to provide automatic agent scaling. Based on preconfigured parameters such as min/max number of agents for each agent pool, max waiting time,… and the current status of each agent pool queue, the service can start new machines from specific images, suspend them (destroy the instance but keep the disk), or completely destroy them.

Maestro architecture is basically defined by a master server and multiple agents, written in Java and Ruby (JRuby) for the backend and JavaScript for the frontend using AngularJS, and integrating several open source services. It is quite heterogeneous, with multiple languages, build tools, packages,… using the best tool for the job in each part of the stack.

Master

Maestro REST API

The REST API is a webapp written in Java, using Spring, packaged with a Jetty server. It is documented with Swagger annotations that generate a really nice web interface automatically that allows trying all the operations from the browser.

It handles caching, security, based on LDAP or database records, and delegates to the Composition Execution Engine (LuCEE) typically through LuCEE REST API but also via STOMP messaging to avoid continuous polling.

It also implements handlers to execute compositions from Github, Git, SVN,… on commit callbacks.

Built with Maven and Grunt (better for the Javascript parts), using Bower to manage all the Javascript dependencies (angular core, bootstrap, ladda button spinner,…), and Karma + PhantomJS, for headless UI tests without needing a real browser.

Composition Execution Engine (LuCEE)

LuCEE is a webapp that manages the execution of compositions, sending/receiving work to/from the agents through ActiveMQ STOMP queues, and storing state in the PostgreSQL database. LuCEE uses the Ruote workflow engine for work scheduling, and manages the compositions queue and agent routing, so basically checks what compositions need to be executed and decides in what agent to execute them, based on composition requirements, free agents, and other factors ie. prioritizing previously used agents that would likely have a cached copy of sources and dependencies to speed things up.

It is written in Ruby, it was quick to implement a first version, with a simple REST API using Sinatra and a STOMP connector to send messages to the Maestro REST webapp through ActiveMQ.

It is packaged as a JRuby war with Warbler, and both LuCEE and the REST API wars are run in the same Jetty server, all packaged as an RPM for easier deployment.

ActiveMQ

ActiveMQ handles all the comunication between LuCEE, the REST API webapp, and the agents using multiple STOMP queues. All the comunication between LuCEE and agents such as workloads, agent output, agent status,… is sent over a queue so it can be easily scaled across a high number of agents.

LuCEE also pushes changes in the database to the REST API webapp so it can update the caches without needing continuous polling.

PostgreSQL

LuCEE uses PostgreSQL (or MySQL or any other SQL database using Ruby Datamapper) as main storage to save compositions, projects, tasks,… The SQL database is also used by the REST API webapp to store permissions and user data when not using LDAP.

MongoDB

We found that in order to do more complex dashboards and reports we needed to store all sort of unstructured data from the plugins, from run time or status to anything that a plugin developer may want such as GitHub payload data received or test stacktrace. That data is sent by the agents to LuCEE and then stored in MongoDB, and can be queried directly (all your data belong to you) or through a reporting pane in the webapp.