Scriptable Runbooks for Releases MVC

Problem to solve

Managing releases in GitLab from a release checklist point of view is difficult. For an example of how this is done in GitLab today, you can see how we managed the 11.4 release at gitlab-org/release/tasks#462 (closed) and gitlab-org/release/tasks#460 (closed). Manual tasks are defined in Markdown in the issue description, then discussed and checked off as the release progresses. This is a pain for a few reasons:

Context is minimal. Detailed instructions could be linked to, but they don't naturally fit in the checklist itself (it would get too long).

Clicking checkboxes is error-prone, and there's no separate way to validate that something actually happened.

It's not possible to see how the plan is changing over time

It's not possible to measure the performance/efficiency of the plan

Target audience

Release Managers and teams involved in executing releases. This audience differs from a pipeline implementer writing a .gitlab-ci.yml in two ways:

The authors of these kinds of pipelines are non-technical, and even editing YAML would be a challenge.

The pipeline consists of manual and automated steps that may require additional approvals.

Our internal customers are the #production team (for runbooks in general) and #delivery team (for release plans)

Proposal

Build a way in GitLab to create an operational runbook that mixes documentation (in Markdown) with scriptable actions, embedded in the same document. We have implemented a version of this on the Configure team (&380), but it has difficult setup requirements (for example, it requires Kubernetes, which limits the availability of the feature): https://docs.gitlab.com/ee/user/project/clusters/runbooks/. We can leverage this feature, but it would need to be more generally available.
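To make the "documentation plus scriptable actions in one document" idea concrete, here is a minimal sketch of how such a document might be split into prose and action blocks. It is not the existing runbooks implementation: the `runbook-action` fence tag, the `Block` type, and `parse_runbook` are hypothetical names chosen for illustration, and the sketch only parses the document; it does not execute anything.

```python
import re
from dataclasses import dataclass

# Built programmatically so the fence marker never appears literally in this example.
FENCE = "`" * 3


@dataclass
class Block:
    kind: str  # "doc" for Markdown prose, "action" for a scriptable step
    body: str


# Assumption: a fenced block tagged "runbook-action" marks an executable step.
ACTION_RE = re.compile(
    rf"^{FENCE}runbook-action\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def parse_runbook(text: str) -> list[Block]:
    """Split a runbook into alternating documentation and action blocks."""
    blocks = []
    pos = 0
    for match in ACTION_RE.finditer(text):
        prose = text[pos:match.start()].strip()
        if prose:
            blocks.append(Block("doc", prose))
        blocks.append(Block("action", match.group(1).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append(Block("doc", tail))
    return blocks


# Hypothetical runbook content, for illustration only.
SAMPLE = "\n".join([
    "Tag the release",
    "Confirm the changelog is final before tagging.",
    FENCE + "runbook-action",
    "git tag v11.4.0",
    FENCE,
    "Then announce the tag in the release channel.",
])

blocks = parse_runbook(SAMPLE)
for block in blocks:
    print(block.kind, "|", block.body.splitlines()[0])
```

Keeping actions as tagged blocks inside ordinary Markdown means the document still renders as readable instructions anywhere Markdown is displayed, while a runner could pick out only the `action` blocks.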

Note that ChatOps already supplies a mechanism for running a script securely. Perhaps we can reuse/expand this capability for this feature.

Metrics on runbooks are not included in this iteration, but it should eventually be possible to generate a value stream map for a runbook, or to show the percentage of automated versus manual steps and how that changes over time. Embedded approval tasks (requiring approval from specific people) are also a possibility for the future.
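As a rough illustration of the "% automated" metric mentioned above, the following sketch computes the share of automated steps for several revisions of one runbook. The `Step` type, the revision labels, and the step names are all invented for the example; no such data model exists in the proposal.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    automated: bool


def percent_automated(steps: list[Step]) -> float:
    """Share of steps that are automated, as a percentage (0.0 for an empty plan)."""
    if not steps:
        return 0.0
    return 100.0 * sum(step.automated for step in steps) / len(steps)


# Hypothetical revisions of the same runbook over time.
revisions = {
    "v1": [
        Step("tag release", False),
        Step("deploy to staging", False),
        Step("notify team", False),
        Step("run QA suite", True),
    ],
    "v2": [
        Step("tag release", True),
        Step("deploy to staging", True),
        Step("notify team", False),
        Step("run QA suite", True),
    ],
}

trend = {rev: percent_automated(steps) for rev, steps in revisions.items()}
print(trend)
```

Because the runbook is a versioned document, a trend like this could be derived from its history rather than tracked by hand.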

Further details

You can look at a release as a kind of state machine for deploying traditional applications. This could be done in something like Excel (which more people are still using than you might expect), or a workflow tool. You'd be able to see the overall status, which items are on track, and which are failing. Typically, releases will have an overall due date and a work-back plan for delivering. Some items may depend on or block other items.
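The state-machine view, including items that depend on or block other items, could be sketched roughly as follows. The state names, the allowed transitions, and the `Item` and `Plan` types are assumptions made for illustration, not a proposed design.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Assumed transition table: a failed item may be retried, a done item is final.
ALLOWED = {
    State.PENDING: {State.RUNNING},
    State.RUNNING: {State.DONE, State.FAILED},
    State.FAILED: {State.RUNNING},
    State.DONE: set(),
}


@dataclass
class Item:
    name: str
    depends_on: list[str] = field(default_factory=list)
    state: State = State.PENDING


class Plan:
    def __init__(self, items: list[Item]):
        self.items = {item.name: item for item in items}

    def blocked_by(self, name: str) -> list[str]:
        """Names of dependencies that are not done yet."""
        return [
            dep for dep in self.items[name].depends_on
            if self.items[dep].state is not State.DONE
        ]

    def transition(self, name: str, new_state: State) -> None:
        item = self.items[name]
        if new_state not in ALLOWED[item.state]:
            raise ValueError(f"{name}: {item.state.value} -> {new_state.value} not allowed")
        if new_state is State.RUNNING and self.blocked_by(name):
            raise ValueError(f"{name} is blocked by {self.blocked_by(name)}")
        item.state = new_state

    def status(self) -> dict[str, str]:
        return {name: item.state.value for name, item in self.items.items()}


# Hypothetical two-item release plan.
plan = Plan([
    Item("build packages"),
    Item("deploy to staging", depends_on=["build packages"]),
])
plan.transition("build packages", State.RUNNING)
print(plan.status())
print(plan.blocked_by("deploy to staging"))
```

An explicit transition table is what separates this from a checkbox list: an item cannot be marked as started while something it depends on is still outstanding.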

Manual (human does something). A sub-version of this might be a pure approval step.

There are a couple of ways in which what we build can be better than Excel:

Value stream analysis (how long are things taking, how much is automated, what tasks are taking a long time and could be opportunities to improve efficiency). This is what release orchestration products on the market primarily offer.

Integration with our releases feature, tying in with capabilities like #56030 (evidence collection in releases) or calls out to pipelines. This is powerful, and takes advantage of our 'single-application' nature to offer better features.
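To illustrate the value stream analysis point above (how long things take, how much of that time is manual, and which steps are the slowest), here is a small sketch over recorded step timings. The `StepRun` record, the timestamps, and the step names are hypothetical; real data would presumably come from whatever executes the runbook.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StepRun:
    name: str
    automated: bool
    started: datetime
    finished: datetime

    @property
    def duration(self) -> timedelta:
        return self.finished - self.started


def value_stream(runs: list[StepRun]) -> dict:
    """Summarize total time, the share spent in manual steps, and steps by duration."""
    total = sum((run.duration for run in runs), timedelta())
    manual = sum((run.duration for run in runs if not run.automated), timedelta())
    slowest_first = sorted(runs, key=lambda run: run.duration, reverse=True)
    return {
        "total": total,
        # Dividing two timedeltas yields a plain float ratio.
        "manual_share": manual / total if total else 0.0,
        "slowest_first": [run.name for run in slowest_first],
    }


# Hypothetical timings for one release run.
t0 = datetime(2018, 10, 22, 9, 0)
runs = [
    StepRun("build packages", True, t0, t0 + timedelta(minutes=20)),
    StepRun("QA sign-off", False, t0 + timedelta(minutes=20), t0 + timedelta(minutes=110)),
    StepRun("deploy", True, t0 + timedelta(minutes=110), t0 + timedelta(minutes=120)),
]

summary = value_stream(runs)
print(summary)
```

In this made-up run, the manual sign-off dominates the elapsed time, which is the kind of signal that would point to an automation opportunity.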