Prelude To Operational Simplicity – A Two Act Play

MCP heralded the coming of continuous innovation for cloud infrastructure by including DriveTrain, a lifecycle management system capable of consuming technology incrementally. Instead of shipping a large integrated release after every OpenStack Foundation release, as before, Mirantis embarked with MCP on continuous delivery on the order of every few weeks. DriveTrain can then methodically consume some or all of the latest innovation, bring it through a CI/CD pipeline, validate it in a staging environment, and promote it into production. Thus, the difficulties historically associated with the lifecycle management of OpenStack have been replaced by an automated, repeatable process that runs with little or no downtime.

Even better, this innovation is not limited to OpenStack, but applies to all the open cloud components of MCP, including containers with Kubernetes, SDN services with Mirantis OpenContrail, and more.

Act 1 – OpenStack Upgrades Made “Doubly” Simple

Initially, because there was no major new release of OpenStack or other components to upgrade to just yet, the innovation DriveTrain consumed consisted of new MCP features as well as updates and fixes. Now, with support for the latest OpenStack release, Ocata, that has changed.

A detailed description and demo of DriveTrain completing an OpenStack upgrade from Mitaka, the release MCP initially supported, to Ocata (skipping Newton with a “double upgrade”) can be seen here. As the overview of this process by Jakub Pavlik, one of our best and brightest engineering directors, shows, an OpenStack upgrade and validation that used to take days of careful planning followed by days of downtime can now be automated by DriveTrain within a few hours, with zero workload downtime!

Here is a summary of the highlights:

Preparation and testing of one Ocata VM takes about 40 minutes during which the production Mitaka cloud is live.

After validation, the upgrade to an Ocata-based, highly available production control plane takes about 42 minutes. Running workloads see zero downtime during this step, even though the Mitaka control plane is offline.

Next, the Mitaka compute nodes can connect to the Ocata control plane and cascade through upgrades to Ocata at a later time.

Jakub also demonstrated a rollback to Mitaka, which takes about 9 minutes and works by restoring the original Mitaka database from the first step in the process.

Amazing. Simply amazing.
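To put those numbers side by side, here is a minimal sketch that simply adds up the phase timings quoted in the summary. The constant and function names are ours, and it assumes the phases run strictly one after another and that a rollback, if needed, happens only after the control plane upgrade has finished; the demo itself may sequence things differently.

```python
# Timings in minutes, taken from the summary above.
PREPARE_AND_TEST = 40       # production Mitaka cloud stays live
CONTROL_PLANE_UPGRADE = 42  # control plane offline, workloads keep running
ROLLBACK = 9                # restore of the original Mitaka database


def upgrade_minutes():
    """Elapsed time for preparation plus control plane upgrade (assumed sequential)."""
    return PREPARE_AND_TEST + CONTROL_PLANE_UPGRADE


def upgrade_then_rollback_minutes():
    """Elapsed time if the full upgrade is followed by a rollback (our assumption)."""
    return upgrade_minutes() + ROLLBACK


if __name__ == "__main__":
    print(f"Upgrade: ~{upgrade_minutes()} min")
    print(f"Upgrade plus rollback: ~{upgrade_then_rollback_minutes()} min")
```

Under these assumptions, even an upgrade followed by a full rollback fits comfortably inside two hours, and only the 42-minute control plane step and the rollback touch the production control plane.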

Act 2 – Now With OpenContrail Too

But it’s not just OpenStack that’s benefiting from DriveTrain, and not just in the lab. It’s also great to see Mirantis customers taking advantage of continuous delivery live in their Managed Open Clouds.

Let’s take another example, this time with Mirantis OpenContrail at one of our customers. This is no demo; we are talking about the SDN of a major OpenStack cloud running in production and serving thousands of users. Again, in the past this would have taken days of planning and many days, if not weeks, of downtime.

But not anymore. Overall this upgrade required about 5 hours, including cascading through all the compute nodes. Highlights on this one include:

A major upgrade of Mirantis OpenContrail 3.0.2 to 3.1.1

Over 1000 code changes executed by DriveTrain with zero issues

The pull from GitHub took about 10 minutes, installing the latest OpenContrail Salt formula took about 2 minutes, the DB backup took about 10 minutes, and the upgrade of the Mirantis OpenContrail control plane took 1 hour

80 compute nodes were upgraded in a cascading fashion at ~20/hour (or ~4 hours in total)
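As a rough sanity check on the "about 5 hours" figure, the sketch below adds up the step timings from the highlights and splits the compute nodes into hourly batches. It assumes the steps ran strictly in sequence and that nodes were upgraded in fixed groups of 20; the node names and helper functions are hypothetical and are not part of DriveTrain.

```python
# Step timings in minutes, taken from the highlights above.
STEP_MINUTES = {
    "pull from GitHub": 10,
    "install OpenContrail Salt formula": 2,
    "DB backup": 10,
    "control plane upgrade": 60,
}
COMPUTE_NODES = 80
NODES_PER_HOUR = 20  # approximate rate quoted above


def batches(nodes, size):
    """Yield consecutive groups of at most `size` nodes."""
    for start in range(0, len(nodes), size):
        yield nodes[start:start + size]


def total_hours():
    """Total elapsed hours, assuming every step runs sequentially."""
    control_plane_hours = sum(STEP_MINUTES.values()) / 60
    compute_hours = COMPUTE_NODES / NODES_PER_HOUR
    return control_plane_hours + compute_hours


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    nodes = [f"cmp{i:03d}" for i in range(1, COMPUTE_NODES + 1)]
    for hour, group in enumerate(batches(nodes, NODES_PER_HOUR), start=1):
        print(f"hour {hour}: {group[0]} .. {group[-1]} ({len(group)} nodes)")
    print(f"total: ~{total_hours():.1f} hours")
```

With these inputs the preparation and control plane steps add up to 82 minutes and the compute nodes to 4 hours, for roughly 5.4 hours end to end, which is in line with the figure quoted above.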