This guide assumes that the undercloud is already installed and ready
to deploy an overcloud and that the appropriate repositories
containing Ceph packages, including ceph-ansible if applicable, have
been enabled and installed as described in
Installing the Undercloud.

TripleO can deploy and configure Ceph as if it were a composable
OpenStack service and configure OpenStack services like Nova, Glance,
Cinder, Cinder Backup, and Gnocchi to use it as a storage backend.

Prior to Pike, TripleO deployed Ceph with puppet-ceph. With the
Pike release it is possible to use TripleO to deploy Ceph with
either ceph-ansible or puppet-ceph, though puppet-ceph is
deprecated. To deploy Ceph in containers, use ceph-ansible;
ceph-ansible only supports containerized Ceph deployments, and it is
not possible to deploy a containerized Ceph with puppet-ceph.

To deploy with Ceph, include the appropriate environment file on the
deploy commandline. For puppet-ceph use “environments/puppet-ceph.yaml”
as in the following:
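The command below is a sketch; it assumes the templates are installed
in the default /usr/share/openstack-tripleo-heat-templates location:

  openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph.yaml

For ceph-ansible, the equivalent environment file (in releases that
ship it) is “environments/ceph-ansible/ceph-ansible.yaml”:

  openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml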

When using ceph-ansible to deploy Ceph in containers, the process
described in the Container Image Preparation documentation
will configure the deployment to use the appropriate Ceph docker
image. However, it is also possible to override the default docker
image. For example:
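A minimal sketch, assuming the DockerCephDaemonImage parameter and an
illustrative image tag (adjust both to match the release in use):

  parameter_defaults:
    DockerCephDaemonImage: ceph/daemon:tag-stable-3.0-luminous-centos-7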

In both the puppet-ceph and ceph-ansible examples above, at least one
Ceph storage node is required. The following example will configure
one Ceph storage node on servers matching the ceph-storage
profile. It will also set the default pool size, the number of times
that an object should be written for data protection, to one. These
parameter_defaults may be saved in an environment file
“~/my-ceph-settings.yaml” and added to the deploy commandline:
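A sketch of such a file follows; CephPoolDefaultSize is the pool size
parameter referenced above, while CephStorageCount and
OvercloudCephStorageFlavor are assumed here to be the node count and
flavor parameters (verify the names against the templates in use):

  parameter_defaults:
    CephStorageCount: 1
    OvercloudCephStorageFlavor: ceph-storage
    CephPoolDefaultSize: 1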

The values above are only appropriate for a development or POC
deployment. The default pool size is three, but if there are fewer
than three Ceph OSDs the cluster will never reach status HEALTH_OK
because it has no place to make additional copies. Thus, a POC
deployment with fewer than three OSDs should override the default
pool size. A production deployment, however, should replace both of
the values set to one above with three, or greater, in order to have
at least three storage nodes and at least three copies of each
object.
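The OSD layout itself is passed to ceph-ansible as a disks
configuration. A minimal sketch of the layout described below,
assuming the CephAnsibleDisksConfig parameter and illustrative device
names:

  parameter_defaults:
    CephAnsibleDisksConfig:
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      dedicated_devices:
        - /dev/sde
        - /dev/sde
        - /dev/sde
      osd_scenario: non-collocated   # journals on a dedicated device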

The above will produce three OSDs which run on /dev/sdb, /dev/sdc,
and /dev/sdd which all journal to /dev/sde. This same setup will
be duplicated per Ceph storage node and assumes uniform hardware. If
you do not have uniform hardware see Provisioning of node-specific Hieradata.

The parameter_defaults like the above may be saved in an environment
file “~/my-ceph-settings.yaml” and added to the deploy commandline:
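For example (the template path shown is the default and may differ
per installation):

  openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e ~/my-ceph-settings.yaml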

The playbooks provided by ceph-ansible are triggered by a Mistral
workflow. A new CephAnsibleExtraConfig parameter has been added to
the templates and can be used to provide arbitrary config variables
consumed by ceph-ansible. The pre-existing template params consumed
by the TripleO Pike release to drive puppet-ceph continue to work
and are translated, when possible, into their equivalent
ceph-ansible variable.
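As an illustration only, a sketch of passing an arbitrary ceph-ansible
group variable through this parameter (journal_size is used here
purely as an example of a variable consumed directly by ceph-ansible):

  parameter_defaults:
    CephAnsibleExtraConfig:
      journal_size: 5120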

The group variables ceph_osd_docker_memory_limit, which corresponds
to docker run ... --memory, and ceph_osd_docker_cpu_limit, which
corresponds to docker run ... --cpu-quota, may be overridden
depending on the hardware configuration and the system needs. Below is
an example of setting custom values for these parameters:
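The sketch below uses illustrative values that should be sized to the
hardware in use:

  parameter_defaults:
    CephAnsibleExtraConfig:
      ceph_osd_docker_memory_limit: 5g
      ceph_osd_docker_cpu_limit: 1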

When collocating Ceph OSD services on the same nodes which run Nova
compute services (also known as “hyperconverged deployments”),
variations of the above may be made to ensure Ceph does not consume
resources Nova may need.

ceph-ansible 3.2 and newer

As of ceph-ansible 3.2, the ceph_osd_docker_memory_limit and
ceph_osd_docker_cpu_limit are set by default to the max memory
and CPU of the host in order to ensure Ceph does not run out of
resources unless the user specifically overrides these values. The
3.2 version also introduced the boolean is_hci flag, which may
be set when using bluestore to automatically tune the bluestore
cache size as below:
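A sketch of such a configuration, assuming the CephAnsibleExtraConfig
and CephAnsibleDisksConfig parameters and illustrative device names:

  parameter_defaults:
    CephAnsibleExtraConfig:
      is_hci: true
    CephAnsibleDisksConfig:
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
        - /dev/nvme0n1
      osd_scenario: lvm
      osd_objectstore: bluestore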

Because /dev/nvme0n1 is in a higher performing device class, e.g.
it is an SSD and the other devices are spinning HDDs, the above will
produce three OSDs which run on /dev/sdb, /dev/sdc, and
/dev/sdd and they will use /dev/nvme0n1 as a bluestore WAL device.
The ceph-volume tool does this by using the “batch” subcommand.
This same setup will be duplicated per Ceph storage node and assumes
uniform hardware. If you do not have uniform hardware see
Provisioning of node-specific Hieradata. If the bluestore WAL data will reside
on the same disks as the OSDs, then the above could be changed to the
following:
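For example, dropping the faster device from the list (a sketch under
the same assumptions as above):

  parameter_defaults:
    CephAnsibleDisksConfig:
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      osd_scenario: lvm
      osd_objectstore: bluestore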

osd_scenario: lvm is used above to default new
deployments to bluestore, as configured by ceph-volume;
it is only available with ceph-ansible 3.2, or newer,
and with Luminous, or newer. The parameters to support
filestore with ceph-ansible 3.2 are backwards-compatible,
so existing filestore deployments should not simply have
their osd_objectstore or osd_scenario parameters
changed without taking steps to maintain both backends.

Filestore or ceph-ansible 3.1 (or older)

Ceph Luminous supports both filestore and bluestore, but bluestore
deployments require ceph-ansible 3.2, or newer, and ceph-volume.
For older versions, if the osd_scenario is either collocated or
non-collocated, then ceph-ansible will use the ceph-disk tool,
in place of ceph-volume, to configure Ceph’s filestore backend
instead of bluestore. A variation of the above example which uses
filestore and ceph-disk is the following:
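A sketch under the same assumptions, with the filestore-specific
values being the notable difference:

  parameter_defaults:
    CephAnsibleDisksConfig:
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      dedicated_devices:
        - /dev/nvme0n1
        - /dev/nvme0n1
        - /dev/nvme0n1
      osd_scenario: non-collocated
      osd_objectstore: filestore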

The above will produce three OSDs which run on /dev/sdb,
/dev/sdc, and /dev/sdd, and which all journal to three
partitions which will be created on /dev/nvme0n1. If the
journals will reside on the same disks as the OSDs, then
the above should be changed to the following:
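For example (a sketch under the same assumptions):

  parameter_defaults:
    CephAnsibleDisksConfig:
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      osd_scenario: collocated
      osd_objectstore: filestore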

To resolve the difference between existing nodes deployed with
filestore and new nodes which default to bluestore, use Provisioning
of node-specific Hieradata to map each filestore node’s machine
unique UUID to the filestore parameters, so that only those nodes are
passed the filestore parameters, and then set the default Ceph
parameters, e.g. those found in ~/my-ceph-settings.yaml, to the
bluestore parameters.
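A hedged sketch of such a mapping, assuming the NodeDataLookup
parameter described in that document and an illustrative machine
unique UUID (the exact syntax should be checked against the
node-specific Hieradata documentation for the release in use):

  parameter_defaults:
    NodeDataLookup:
      32E87B4C-C4A7-418E-865B-191684A6883B:   # illustrative system UUID
        devices:
          - /dev/sdb
          - /dev/sdc
          - /dev/sdd
        osd_scenario: collocated
        osd_objectstore: filestore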

Be sure to set every existing Ceph filestore server to the filestore
parameters by its machine unique UUID. If the above is not done and
the default parameter is set to osd_scenario=lvm for the existing
nodes which were configured with ceph-disk, then these OSDs will not
start after a restart of the systemd unit or a system reboot.

The example above makes bluestore the new default and filestore an
exception per node. An alternative approach is to keep the default of
filestore and ceph-disk and use Provisioning of node-specific Hieradata for
adding new nodes which use bluestore and ceph-volume. A benefit of
this is that there wouldn’t be any configuration change for existing
nodes. However, every scale operation with Ceph nodes would require
the use of Provisioning of node-specific Hieradata. While the example above,
which makes filestore and ceph-disk the per-node exception, requires
more work up front, it simplifies future scale up once completed. If
the cluster will be migrated to all bluestore, through node scale down
and scale up, then the number of items in ~/my-node-settings.yaml
could be reduced with each scale down and scale up operation until the
full cluster uses bluestore.

The number of OSDs in a Ceph deployment should proportionally affect
the number of Ceph PGs per Pool as determined by Ceph’s
pgcalc. When the appropriate default pool size and PG number are
determined, the defaults should be overridden using an example like
the following:

  parameter_defaults:
    CephPoolDefaultSize: 3
    CephPoolDefaultPgNum: 128

In addition to setting the default PG number for each pool created,
each Ceph pool created for OpenStack can have its own PG number.
TripleO supports customization of these values by using a syntax like
the following:
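A hedged sketch, assuming the CephPools parameter and illustrative PG
numbers; the rule_name entry on the volumes pool stands in for the
extra pool-creation options mentioned below:

  parameter_defaults:
    CephPools:
      - name: volumes
        pg_num: 1024
        rule_name: replicated_rule
      - name: images
        pg_num: 128
      - name: vms
        pg_num: 256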

In the above example, PG numbers for each pool differ based on the
OpenStack use case from pgcalc. The example above also passes
additional options as described in the ceph osd pool create
documentation to the volumes pool used by Cinder.

TripleO runs the ceph-ansible site-docker.yml.sample playbook by
default. The values in this playbook should be overridden as described
in this document and the playbooks themselves should not be modified.
However, it is possible to specify which playbook is run using the
following parameter:
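For example, assuming the CephAnsiblePlaybook parameter and the
default ceph-ansible installation path on the undercloud:

  parameter_defaults:
    CephAnsiblePlaybook: /usr/share/ceph-ansible/site-docker.yml.sample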

For each TripleO Ceph deployment, the above playbook’s output is logged
to /var/log/mistral/ceph-install-workflow.log. The default verbosity
of the playbook run is 0. The example below sets the verbosity to 3:

  parameter_defaults:
    CephAnsiblePlaybookVerbosity: 3

During the playbook run, temporary files, like the Ansible inventory
and the ceph-ansible override parameters described in this document,
are stored on the undercloud in a directory that matches the pattern
/tmp/ansible-mistral-action*.
This directory is deleted at the end of each Mistral workflow which
triggers the playbook run. However, the temporary files are not
deleted when the verbosity is greater than 0. This option is helpful
when debugging.

The Ansible environment variables may be overridden using an example
like the following:
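A sketch, assuming the CephAnsibleEnvironmentVariables parameter and
illustrative values:

  parameter_defaults:
    CephAnsibleEnvironmentVariables:
      ANSIBLE_SSH_RETRIES: '6'
      DEFAULT_FORKS: '25'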

In the above example, the number of SSH retries is increased from the
default to prevent timeouts. Ansible’s fork number is automatically
limited to the number of possible hosts at runtime. TripleO uses
ceph-ansible to configure Ceph clients in addition to Ceph servers so
when deploying a large number of compute nodes ceph-ansible may
consume a lot of memory on the undercloud. Lowering the fork count
will reduce the memory footprint while the Ansible playbook is running
at the expense of the number of hosts configured in parallel.

The desired options from the ceph-ansible examples above, whether they
customize the ceph.conf, container, OSD, or Ansible settings, may be
combined under one parameter_defaults setting, saved in an environment
file “~/my-ceph-settings.yaml”, and added to the deploy commandline:
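For example:

  openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e ~/my-ceph-settings.yaml

Ansible also needs SSH access to the overcloud nodes before it can
configure Ceph on them. A hedged sketch of enabling that access,
assuming the enable-ssh-admin.sh script shipped under the
deployed-server scripts directory of tripleo-heat-templates and
illustrative IP addresses:

  OVERCLOUD_HOSTS="192.168.24.8 192.168.24.42" \
    /usr/share/openstack-tripleo-heat-templates/deployed-server/scripts/enable-ssh-admin.sh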

In the example above, the OVERCLOUD_HOSTS variable should be set to
the IPs of the overcloud hosts which will be Ceph servers or which
will host Ceph clients (e.g. Nova, Cinder, Glance, Gnocchi, Manila,
etc.). The enable-ssh-admin.sh script configures a user on the
overcloud nodes that Ansible uses to configure Ceph.

Note

Neither puppet-ceph nor ceph-ansible reformats the OSD disks; both
expect the disks to be clean in order to complete successfully.
Consequently, when reusing the same nodes (or disks) for new
deployments, it is necessary to clean the disks before every new
attempt. One option is to enable the automated cleaning functionality
in Ironic, which will zap the disks every time a node is released.
The same process can be executed manually or only for some target
nodes; see the cleaning instructions in the Ironic doc.
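A hedged sketch of enabling that automated cleaning on the undercloud,
assuming the clean_nodes option of undercloud.conf (confirm against
the undercloud configuration reference for the release in use):

  [DEFAULT]
  clean_nodes = true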