DC/OS is an excellent platform for deploying distributed services such as Apache Cassandra, Kafka, HDFS, or even SQL Server. It includes many additional features such as container orchestration and resource management, but this blog post focuses on deploying and maintaining DC/OS for distributed services.

We have the following goals:

An enterprise-grade deployment of DC/OS.

The ability to perform upgrades to DC/OS and the virtual machines' OS.

Easy deployment and maintenance of virtual machines.

Monitoring of virtual machine health and resurrection of terminated instances.

Complex systems require more than just ease of deployment. We build tools for production that are easy for operators to maintain and developers to consume. We care about "Day 2" and the impact that has on people.

Why BOSH?

BOSH is a great tool for provisioning and maintaining virtual machines and disks for large distributed systems. It deploys VMs through CPIs (Cloud Provider Interfaces) to AWS, Azure, GCE, vSphere, OpenStack and other infrastructures. Because the CPI abstracts the infrastructure, targeting a different one requires very little effort.

The site bosh.io provides prebuilt, regularly CVE-patched Stemcells for Ubuntu and CentOS.

BOSH is also a software packaging and life cycle management tool. A BOSH Manifest is a YAML file that combines a BOSH Release and a Stemcell to describe the software and operating system of a set of servers. BOSH uses this manifest to deploy to the targeted infrastructure, monitors the health of the virtual machines and resurrects any that are destroyed.

BOSH was originally created for deploying Cloud Foundry, a PaaS for deploying, scaling and managing stateless applications. It can also deploy services like HAProxy, Redis and etcd, so it isn't limited to deploying Cloud Foundry.

Overview of How to Deploy DC/OS on BOSH

You will need a BOSH Release for DC/OS, which describes how the software on a DC/OS deployment should look. The release includes the configuration and the declaration of the packages that run on each type of VM (called a job). The BOSH Release for DC/OS is located here and describes the mesos-agent, mesos-public-agent and mesos-master jobs, which BOSH installs onto the VM instances of those jobs at deploy time.

The next step is to create a BOSH Deployment Manifest which contains:

The types of VMs to create (instance type, Stemcell/OS, disk).

The software that should run on these VMs (releases, packages).

The network used by the VMs (IP ranges, subnet, security groups, availability zones).
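The three parts of the manifest listed above can be sketched as a data structure. This is only an illustration, assuming the BOSH v2 manifest layout, in which VM types, disk types and networks are referenced by name and defined in a separate cloud config. The release name dcos, the stemcell alias, the instance counts and the size names are all placeholders rather than values taken from the DC/OS BOSH Release.

```python
import json


def instance_group(job_name, instances, vm_type, disk_type):
    """Build one instance group that runs a single job from the release."""
    return {
        "name": job_name,
        "instances": instances,
        "azs": ["z1"],                      # availability zone (placeholder)
        "vm_type": vm_type,                 # instance type, by name
        "stemcell": "default",              # alias of the Stemcell/OS to use
        "persistent_disk_type": disk_type,  # disk, by name
        "networks": [{"name": "default"}],  # network, by name
        "jobs": [{"name": job_name, "release": "dcos"}],
    }


# Every value below is a placeholder chosen for illustration.
manifest = {
    "name": "dcos",
    "releases": [{"name": "dcos", "version": "latest"}],
    "stemcells": [
        {"alias": "default", "os": "ubuntu-trusty", "version": "latest"}
    ],
    "instance_groups": [
        instance_group("mesos-master", 3, "medium", "10GB"),
        instance_group("mesos-agent", 5, "large", "50GB"),
        instance_group("mesos-public-agent", 1, "medium", "10GB"),
    ],
}

if __name__ == "__main__":
    # Printed as JSON only to inspect the structure; a real manifest is YAML.
    print(json.dumps(manifest, indent=2))
```

Each instance group pairs one job from the release with the VM shape, Stemcell and network it should run on, which mirrors the three items in the list.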

How to Upgrade DC/OS

This is where the real power of the DC/OS BOSH Release is exposed. If you want to upgrade the version of DC/OS in a deployment, simply change the DC/OS version in the deployment manifest and redeploy. All of the version handling is built in.
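As a rough sketch of how small that change is, the snippet below bumps one version field in a manifest held as a Python dictionary. It assumes the DC/OS version is tracked through the version of a release entry named dcos. That name, the helper set_release_version and the version numbers are hypothetical and only serve the example.

```python
import copy


def set_release_version(manifest, release_name, new_version):
    """Return a copy of the manifest with one release pinned to new_version."""
    updated = copy.deepcopy(manifest)
    for release in updated["releases"]:
        if release["name"] == release_name:
            release["version"] = new_version
            return updated
    raise KeyError(f"release {release_name!r} not found in manifest")


# Placeholder manifest fragment; the version numbers are made up.
current = {"name": "dcos", "releases": [{"name": "dcos", "version": "1.0.0"}]}
upgraded = set_release_version(current, "dcos", "1.1.0")
```

The original manifest is left untouched, so the old and new versions can be compared before the edited manifest is saved and redeployed with the BOSH CLI.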

If there is a newer Stemcell (OS) image you would like to use, for example to pick up a CVE patch or a newer kernel, upload the new Stemcell to the BOSH Director and deploy. BOSH will recreate each of the VMs, reattach the persistent storage and start the necessary components.

Scaling is simple too. Need more mesos-agents? Change the instances: value in the deployment manifest and redeploy. BOSH creates one or more VMs, and the install scripts run automatically from the provision server.
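The scaling edit can be sketched the same way. The helper below is hypothetical and assumes a v2-style manifest in which each instance group carries an instances count. It returns the edited manifest together with the number of additional VMs the change asks for.

```python
import copy


def scale_instance_group(manifest, group_name, instances):
    """Return (updated manifest, number of extra VMs the change requests)."""
    if instances < 0:
        raise ValueError("instances must not be negative")
    updated = copy.deepcopy(manifest)
    for group in updated["instance_groups"]:
        if group["name"] == group_name:
            extra = max(0, instances - group["instances"])
            group["instances"] = instances
            return updated, extra
    raise KeyError(f"instance group {group_name!r} not found in manifest")


# Placeholder manifest fragment; the instance counts are made up.
before = {
    "name": "dcos",
    "instance_groups": [
        {"name": "mesos-master", "instances": 3},
        {"name": "mesos-agent", "instances": 5},
    ],
}
after, extra_vms = scale_instance_group(before, "mesos-agent", 8)
```

Only the mesos-agent count changes. The rest of the manifest is carried over as-is, which is why a redeploy leaves the other instance groups alone.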

Summary

Overall, BOSH is a great tool to:

Provision, monitor and recreate servers on most known IaaS providers

Manage software life cycle

Manage DC/OS upgrades

We hope others will find this useful! We are in both dcos-community.slack.com and cloudfoundry.slack.com as @lnguyen so feel free to ask questions there.