Add Certificate Generation and Management To Orders

Barbican can be used to automate and streamline secure generation and storage
of SSL certificates and associated private keys from client information (such
as a CSR). This feature requires being able to coordinate between multiple
possible certificate authorities (CA) for generation, Barbican for secure
certificate storage and notification once generation is completed. This
blueprint details a plugin-based approach to satisfy these requirements.

Generating and managing SSL certificates within Barbican via its orders
resource requires that Barbican perform these actions: (1) accept certificate
information from a client, (2) initiate a certificate generation order with
one of several possible certificate authorities (CA), (3) check on the
progress of each CA order periodically, (4) securely store certificate
information once generated by the CA, and finally (5) notify clients that the
certificate is ready to provision. This wiki page details the flows involved:
https://wiki.openstack.org/wiki/Barbican/Blueprints/ssl-certificates.

Barbican should be capable of interacting with a variety of different CAs,
each having their own custom interaction workflows. Barbican should be able to
store state on behalf of this workflow between steps (such as while waiting
for a CA to generate a certificate). Barbican should also provide a means for
a workflow process to schedule a later call back into itself, such as to
retry a failed attempt to create a certificate order with the CA.

Barbican should be capable of notifying clients about issues or generated
certificates via a variety of eventing mechanisms available to a deployment.
Examples include surfacing CADF events via Ceilometer, or issuing tickets into
a corporate ticketing system.

Barbican should provide CA workflow processes the ability to securely store
certificate information in Barbican without needing to directly interact
with its datamodel.

Barbican should provide CA workflow processes the ability to reschedule
themselves at a future time, perhaps to reattempt a failed process such as
establishing an order with the CA.

Barbican should allow for validation of the certificate order fields prior
to initiating the CA ordering process. Barbican should allow for clients to
update information about their launched CA order, such as providing
corrections, cancellations or approvals. This interface could also support
revoking a certificate once it is created and installed in Barbican.

This blueprint proposes using a plugin approach to interact with specific CAs.
The plugin contract will expose processing methods that represent workflow
actions, such as ‘request_certificate(…)’ or ‘check_request_status(…)’.
This blueprint defers the specific names and usage of these methods to the
implementing CRs.
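
As a rough illustration of the plugin contract described above, the sketch
below defines an abstract base class with the two example methods named in
this blueprint. The class names, the argument lists and the returned status
strings are assumptions made for illustration only; the implementing CRs
would define the real contract.

```python
import abc


class CertificatePluginBase(abc.ABC):
    """Hypothetical CA plugin contract; only the method names come from the blueprint."""

    @abc.abstractmethod
    def request_certificate(self, order_id, order_meta):
        """Submit a new certificate order to the CA and return a status string."""

    @abc.abstractmethod
    def check_request_status(self, order_id, order_meta):
        """Ask the CA about a pending order and return a status string."""


class ToyCAPlugin(CertificatePluginBase):
    """Illustrative stand-in for a vendor plugin; it never contacts a real CA."""

    def request_certificate(self, order_id, order_meta):
        # A real plugin would send the CSR in order_meta to its CA here.
        return "waiting"

    def check_request_status(self, order_id, order_meta):
        # A real plugin would query the CA; this toy always reports success.
        return "issued"
```

Barbican would look up the plugin for the requested CA and call these
methods from its worker tasks, so vendor-specific behavior stays inside the
plugin.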

Because the workflow/state handling required to interact with a given CA is
expected to vary significantly between CA vendors, it is expected that the
plugin will need to manage its own state machine. However, the CA plugins
shouldn’t have to manage persisting/retrieving state machine data for a given
order instance. Hence Barbican should pass a dict into the plugin's methods
in which the plugin can record and look up its state. For example, this might
include a ‘state’ key that keeps track of the current state machine state for
an order. Barbican would then store this dict as plugin-specific metadata
about the order.

A separate scheduled process run from the worker nodes (via oslo incubator’s
periodic_task feature) would poll CAs for updates to pending certificate
orders, generating RPC tasks for each update, which would eventually invoke the
CA plugin method above.
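
A minimal sketch of the polling step, under stated assumptions: orders are
plain dicts with ‘id’ and ‘status’ keys, a standard-library queue stands in
for the RPC queue, and the task name is made up. The real process would be
driven by the periodic task feature mentioned above rather than called
directly.

```python
import queue


def poll_pending_orders(orders, task_queue):
    """Enqueue one status-check task per pending order and return the count."""
    enqueued = 0
    for order in orders:
        if order["status"] == "PENDING":
            # Each task would later be picked up by a worker, which invokes
            # the CA plugin's status-check method for that order.
            task_queue.put(("check_certificate_status", order["id"]))
            enqueued += 1
    return enqueued
```

Keeping the poller limited to enqueuing tasks means the CA interaction
itself still runs through the normal worker path.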

Also proposed is a plugin for surfacing certificate events from Barbican.
Since CA plugins will know best within their workflows when an event should be
issued (say when a certificate is generated) the proposal is to pass the event
plugin into the CA plugin’s methods. This inversion of control (IoC) approach
would allow plugins to be in control of event sequencing, and would allow
separation from how the events are handled within and from Barbican. This
plugin could have domain specific methods that make sense for certificate
processing work, such as ‘notify_certificate_is_generated()’. The default
out-of-the-box plugin would just log events. A provided optional event plugin
implementation would create CADF events for Ceilometer to handle.
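
The inversion of control described above might look like the sketch below.
Only the method name ‘notify_certificate_is_generated’ comes from this
blueprint; its arguments, the class name, the logger name and the stand-in
CA plugin function are assumptions for illustration.

```python
import logging

LOG = logging.getLogger("certificate_events")


class LoggingEventPlugin:
    """Hypothetical default event plugin: it only logs each event."""

    def notify_certificate_is_generated(self, order_id, container_ref):
        LOG.info("Certificate for order %s is ready: %s", order_id, container_ref)


def check_request_status(order_id, event_plugin):
    """Stand-in CA plugin method; the plugin decides when to raise the event."""
    container_ref = "containers/" + order_id  # placeholder reference
    event_plugin.notify_certificate_is_generated(order_id, container_ref)
    return "issued"
```

A deployment that wanted CADF events or tickets would supply a different
object with the same method, and the CA plugin code would not change.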

The CA workflow plugins also know best when to store a generated certificate.
Therefore another IoC adapter would be passed into the CA plugin’s
method providing specific methods such as ‘store_certificate()’ which would be
implemented as a Barbican Container repository save call.
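
One possible shape for such a storage adapter is sketched below. The class
name, the constructor argument and the container layout are all assumptions;
in particular, the injected callable merely stands in for the Container
repository save call and is not Barbican's actual repository API.

```python
class CertificateStoreAdapter:
    """Hypothetical adapter handed to CA plugins; it hides the data model."""

    def __init__(self, save_container):
        # save_container stands in for the Container repository save call.
        self._save_container = save_container

    def store_certificate(self, order_id, certificate_pem, intermediates_pem=None):
        """Bundle the certificate data and delegate persistence to Barbican."""
        container = {
            "order_id": order_id,
            "type": "certificate",
            "certificate": certificate_pem,
        }
        if intermediates_pem is not None:
            container["intermediates"] = intermediates_pem
        return self._save_container(container)
```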

Finally, an IoC adapter would be passed into the CA plugin’s methods allowing
the plugin to invoke one of its methods at a future time, by enqueuing an RPC
call that launches the CA plugin’s method.
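
The rescheduling adapter could be sketched as below. This is an assumption
throughout: a heap stands in for delayed RPC messages, time is passed in
explicitly to keep the example deterministic, and the class and method names
are invented.

```python
import heapq
import itertools


class RetryScheduler:
    """Hypothetical adapter; a heap stands in for delayed RPC messages."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal due times

    def retry_later(self, method_name, order_id, now, delay_seconds):
        """Record that method_name should run for order_id after the delay."""
        due = now + delay_seconds
        heapq.heappush(self._heap, (due, next(self._counter), method_name, order_id))
        return due

    def due_tasks(self, now):
        """Pop and return every (method_name, order_id) whose time has come."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, _, method_name, order_id = heapq.heappop(self._heap)
            ready.append((method_name, order_id))
        return ready
```

A plugin whose CA is unavailable could call ‘retry_later’ with its own
method name and return, leaving the worker free until the retry is due.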

The certificate generation process detailed above is a workflow that could be
expressed using a workflow framework. Several of them are discussed below.
They all generally aim to execute a series of preconfigured tasks sequentially
or in parallel until completion, potentially reverting the entire sequence if
errors occur.

The certificate processing state machine does not seem to be a good fit for
this approach for two reasons:

(1) Certificate processing involves potentially long delays (multiple days)
between CA interaction state machine steps, including polling the CA for
status updates or waiting for client updates. Delays and scheduled polling
behaviors do not appear to be integrated into the existing workflow approaches
below, so Barbican would need to implement such logic anyway.

(2) Some state steps may need to be repeated, such as retrying initiating an
order with the CA when the CA is unavailable. So in this case, the order of
tasks executed is not known a priori, and when an error happens tasks
completed up to that point should not always be reverted.

It is desirable, though not required, that the workflow process cooperate
with the Barbican worker processes, including being able to react to RPC
calls from the queue or to scheduled update requests.

TaskFlow (https://wiki.openstack.org/wiki/TaskFlow) is an OpenStack Python
library that can compose tasks and workflows using Python classes. These are
intended to be run from within an ‘engine’, which manages the execution of the
configured tasks. Hence TaskFlow might be best run as a separate process or
node from the Barbican workers. It seems to be the most capable for this
blueprint’s needs, but still has the ‘fitness’ issues mentioned above.

Mistral (https://wiki.openstack.org/wiki/Mistral) is an OpenStack Workflow as
a Service project. It is at an early stage, however, and may not be available by
the Juno or K releases. It does not let projects upload and execute custom
code, but rather calls back to Barbican to perform its tasks.

Lower level frameworks such as machinist
(https://pypi.org/project/machinist/0.1) can operate state machines as
defined via Python objects. Like TaskFlow this might be best run as a
process separate from the Barbican workers. This framework seemed less
flexible than TaskFlow.

The current Orders entity would need to have a new relationship added to store
a CA plugin’s metadata, similar to the key/value metadata store that was added
to the Secrets entity. This would be different than the Order’s ‘meta’
attribute, which is used to store user-provided ordering information.

No database migrations should be required as the metadata is both optional and
controlled by the plugins themselves.

Barbican will interact with a third-party system (a certificate authority).
This interaction is only initiated by Barbican, but care must be taken to
validate both the user provided information (via the certificate plugin) as
well as the response information back from the CA.

This blueprint calls for using a plugin approach to surface events from the
certificate generation process, in particular to notify when a certificate has
been generated and is ready to install, and when an error has occurred. Errors
could include the CA rejecting the order, being temporarily unavailable, or
rate-limiting the number of requests made by the client. In some cases,
Barbican would need to re-attempt the request at some point in the future.

This blueprint will probably be the first use case for event generation and
notifications in Barbican.

The impact of this certificate processing should be minimal, since even
though it could take days to approve and generate a certificate, the vast
majority of that time is spent waiting on either the CA to update a given
certificate order, or else for users to provide corrections or approvals.
When Barbican is processing a state machine step the computation load should
be minimal.

Also, it is expected that not many certificate orders will be processed
concurrently. Even if the load does increase over time, all certificate
processing is performed asynchronously on worker nodes, so additional delays
will be accommodated and are likely to be a fraction of the overall
certificate workflow period.