The Ceph Dashboard is a built-in web-based Ceph management and monitoring
application to administer various aspects and objects of the cluster. It is
implemented as a Ceph Manager Daemon module.

The original Ceph Dashboard that was shipped with Ceph Luminous started
out as a simple read-only view into various run-time information and performance
data of a Ceph cluster. It used a very simple architecture to achieve the
original goal. However, there was growing demand for more web-based
management capabilities, to make Ceph easier to administer for users who
prefer a WebUI over the command line.

The new Ceph Dashboard module replaces the previous one and adds a built-in
web-based monitoring and administration application to the Ceph
Manager. The architecture and functionality of this new module is derived from
and inspired by the openATTIC Ceph management and monitoring tool. The development is actively driven by the team
behind openATTIC at SUSE, with a lot of support from
companies like Red Hat and other members of the Ceph
community.

The dashboard module’s backend code uses the CherryPy framework and a custom
REST API implementation. The WebUI implementation is based on
Angular/TypeScript, merging both functionality from the original dashboard as
well as adding new functionality originally developed for the standalone version
of openATTIC. The Ceph Dashboard module is implemented as a web
application that visualizes information and statistics about the Ceph cluster
using a web server hosted by ceph-mgr.

Multi-User and Role Management: The dashboard supports multiple user
accounts with different permissions (roles). The user accounts and roles
can be modified on both the command line and via the WebUI. The dashboard
supports various methods to enhance password security, e.g. by enforcing
configurable password complexity rules, forcing users to change their password
after the first login or after a configurable time period. See
User and Role Management for details.

Single Sign-On (SSO): the dashboard supports authentication
via an external identity provider using the SAML 2.0 protocol. See
Enabling Single Sign-On (SSO) for details.

SSL/TLS support: All HTTP communication between the web browser and the
dashboard is secured via SSL. A self-signed certificate can be created with
a built-in command, but it’s also possible to import custom certificates
signed and issued by a CA. See SSL/TLS Support for details.

Auditing: the dashboard backend can be configured to log all PUT, POST
and DELETE API requests in the Ceph audit log. See Auditing API Requests
for instructions on how to enable this feature.

Internationalization (I18N): the dashboard can be used in different
languages that can be selected at run-time.

Currently, Ceph Dashboard is capable of monitoring and managing the following
aspects of your Ceph cluster:

Monitoring: Enable creation, re-creation, editing and expiration of
Prometheus’ silences, list the alerting configuration of Prometheus and all
configured and firing alerts. Show notifications for firing alerts.

Configuration Editor: Display all available configuration options,
their description, type and default values and edit the current values.

OSDs: List all OSDs, their status and usage statistics as well as
detailed information like attributes (OSD map), metadata, performance
counters and usage histograms for read/write operations. Mark OSDs
up/down/out, purge and reweight OSDs, perform scrub operations, modify
various scrub-related configuration options, select different profiles to
adjust the level of backfilling activity. List all disks associated with an
OSD. Set and change the device class of an OSD, display and sort OSDs by
device class. Deploy new OSDs on new disks/hosts.

Device management: List all hosts known by the orchestrator. List all
disks and their properties attached to a node. Display disk health information
(health prediction and SMART data). Blink enclosure LEDs.

iSCSI: List all hosts that run the TCMU runner service, display all
images and their performance characteristics (read/write ops, traffic).
Create, modify and delete iSCSI targets (via ceph-iscsi). Display the
iSCSI gateway status on the landing page and info about active initiators.
See Enabling iSCSI Management for instructions on how to configure
this feature.

If you have installed ceph-mgr-dashboard from distribution packages, the
package management system should have taken care of installing all the required
dependencies.

If you’re installing Ceph from source and want to start the dashboard from your
development environment, please see the files README.rst and HACKING.rst
in directory src/pybind/mgr/dashboard of the source code.

If different certificates are desired for each manager instance for some reason,
the name of the instance can be included as follows (where $name is the name
of the ceph-mgr instance, usually the hostname):
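The elided commands are likely of the following form (assuming the Nautilus-era dashboard CLI; dashboard.crt and dashboard.key are placeholder file names):

```
$ ceph dashboard set-ssl-certificate $name -i dashboard.crt
$ ceph dashboard set-ssl-certificate-key $name -i dashboard.key
```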

This might be useful if the dashboard will be running behind a proxy which does
not support SSL for its upstream servers or other situations where SSL is not
wanted or required. See Proxy Configuration for more details.
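The SSL support referred to above can be disabled with a command along these lines (a sketch; mgr/dashboard/ssl is the relevant configuration key in Ceph releases of this era):

```
$ ceph config set mgr mgr/dashboard/ssl false
```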

Warning

Use caution when disabling SSL as usernames and passwords will be sent to the
dashboard unencrypted.

Note

You need to restart the Ceph manager processes manually after changing the SSL
certificate and key. This can be accomplished by either running ceph mgr fail mgr
or by disabling and re-enabling the dashboard module (which also triggers the
manager to respawn itself):
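A hedged sketch of the elided commands (the manager instance name varies per cluster; check ceph mgr dump for the active instance):

```
$ ceph mgr fail <mgr-name>             # fail over the active manager, or:
$ ceph mgr module disable dashboard
$ ceph mgr module enable dashboard
```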

Since each ceph-mgr hosts its own instance of dashboard, it may also be
necessary to configure them separately. The IP address and port for a specific
manager instance can be changed with the following commands:
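The elided commands are presumably of this form ($name is the ceph-mgr instance name; ssl_server_port applies when SSL is enabled):

```
$ ceph config set mgr mgr/dashboard/$name/server_addr $IP
$ ceph config set mgr mgr/dashboard/$name/server_port $PORT
$ ceph config set mgr mgr/dashboard/$name/ssl_server_port $PORT
```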

In order to be able to log in, you need to create a user account and associate
it with at least one role. We provide a set of predefined system roles that
you can use. For more details please refer to the User and Role Management
section.

To create a user with the administrator role you can use the following
commands:
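The elided commands are likely of this form (valid for the Nautilus-era dashboard CLI; newer releases instead read the password from a file passed via -i rather than accepting it on the command line):

```
$ ceph dashboard ac-user-create <username> <password> administrator
```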

You can now access the dashboard using your (JavaScript-enabled) web browser, by
pointing it to any of the host names or IP addresses and the selected TCP port
where a manager instance is running: e.g., http(s)://<$IP>:<$PORT>/.

You should then be greeted by the dashboard login page, requesting your
previously defined username and password.

In a typical default configuration with a single RGW endpoint, this is all you
have to do to get the Object Gateway management functionality working. The
dashboard will try to automatically determine the host and port of the Object
Gateway by obtaining this information from the Ceph Manager’s service map.

If multiple zones are used, it will automatically determine the host within the
master zone group and master zone. This should be sufficient for most setups,
but in some circumstances you might want to set the host and port manually:
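The elided commands for setting the host and port manually are probably these (placeholders to be filled with your RGW endpoint):

```
$ ceph dashboard set-rgw-api-host <host>
$ ceph dashboard set-rgw-api-port <port>
```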

If you are using a self-signed certificate in your Object Gateway setup, then
you should disable certificate verification in the dashboard to avoid refused
connections, e.g. caused by certificates signed by unknown CA or not matching
the host name:

$ ceph dashboard set-rgw-api-ssl-verify False

If the Object Gateway takes too long to process requests and the dashboard runs
into timeouts, then you can set the timeout value to your needs:
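The elided command is presumably the following (the value is in seconds; the default at the time of writing was 45):

```
$ ceph dashboard set-rest-requests-timeout <seconds>
```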

The Ceph Dashboard can manage iSCSI targets using the REST API provided by the
rbd-target-api service of the Ceph iSCSI Gateway. Please make sure that it’s
installed and enabled on the iSCSI gateways.

Note

The iSCSI management functionality of Ceph Dashboard depends on the latest
version 3 of the ceph-iscsi project.
Make sure that your operating system provides the correct version, otherwise
the dashboard won’t enable the management features.

If the ceph-iscsi REST API is configured in HTTPS mode and is using a self-signed
certificate, then you need to configure the dashboard to avoid SSL certificate
verification when accessing the ceph-iscsi API.

To disable API SSL verification run the following command:

$ ceph dashboard set-iscsi-api-ssl-verification false

The available iSCSI gateways must be defined using the following commands:
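The elided gateway-definition commands are likely these (the URL embeds the rbd-target-api credentials; gateway names are placeholders):

```
$ ceph dashboard iscsi-gateway-list
$ ceph dashboard iscsi-gateway-add <scheme>://<username>:<password>@<host>[:port]
$ ceph dashboard iscsi-gateway-rm <gateway_name>
```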

Grafana requires data from Prometheus. Although
Grafana can use other data sources, the Grafana dashboards we provide contain
queries that are specific to Prometheus. Our Grafana dashboards therefore
require Prometheus as the data source. The Ceph Prometheus module also
exports its data only in Prometheus' common format. The Grafana dashboards
rely on metric names from the Prometheus module and the Node exporter. The
Node exporter is a
separate application that provides machine metrics.

Note

Prometheus’ security model presumes that untrusted users have access to the
Prometheus HTTP endpoint and logs. Untrusted users have access to all the
(meta)data Prometheus collects that is contained in the database, plus a
variety of operational and debugging information.

However, Prometheus’ HTTP API is limited to read-only operations.
Configurations can not be changed using the API and secrets are not
exposed. Moreover, Prometheus has some built-in measures to mitigate the
impact of denial of service attacks.

Please see Prometheus’ Security model
<https://prometheus.io/docs/operating/security/> for more detailed
information.

Grafana and Prometheus are likely to be bundled and installed by some
orchestration tools along with Ceph in the near future, but currently you will
have to install and configure both manually. After you have installed Prometheus
and Grafana on your preferred hosts, proceed with the following steps.

Enable the Ceph Exporter, which comes as a Ceph Manager module, by running:
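The elided command is almost certainly the following (the exporter is the prometheus manager module):

```
$ ceph mgr module enable prometheus
```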

In newer versions of Grafana (starting with 6.2.0-beta1) a new setting named
allow_embedding has been introduced. This setting needs to be explicitly
set to true for the Grafana integration in Ceph Dashboard to work, as its
default is false.

[security]
allow_embedding = true

After you have set up Grafana and Prometheus, you will need to configure the
connection information that the Ceph Dashboard will use to access Grafana.

You need to tell the dashboard at which URL the Grafana instance is running/deployed:
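The elided command is likely this one (the URL points at your Grafana server, e.g. https://<grafana-host>:3000):

```
$ ceph dashboard set-grafana-api-url <grafana-server-url>
```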

Ceph Dashboard embeds the Grafana dashboards via iframe HTML elements.
If Grafana is configured without SSL/TLS support, most browsers will block the
embedding of insecure content into a secured web page, if the SSL support in
the dashboard has been enabled (which is the default configuration). If you
can’t see the embedded Grafana dashboards after enabling them as outlined
above, check your browser’s documentation on how to unblock mixed content.
Alternatively, consider enabling SSL/TLS support in Grafana.

If you are using a self-signed certificate in your Grafana setup, then you should
disable certificate verification in the dashboard to avoid refused connections,
e.g. caused by certificates signed by unknown CA or not matching the host name:

$ ceph dashboard set-grafana-api-ssl-verify False

You can also access the Grafana instance directly to monitor your cluster.

The Ceph Dashboard supports external authentication of users via the
SAML 2.0 protocol. You need to create
the user accounts and associate them with the desired roles first, as authorization
is still performed by the Dashboard. However, the authentication process can be
performed by an existing Identity Provider (IdP).

Note

Ceph Dashboard SSO support relies on onelogin’s
python-saml library.
Please ensure that this library is installed on your system, either by using
your distribution’s package management or via Python’s pip installer.

To configure SSO on Ceph Dashboard, you should use the following command:
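The elided SSO setup command is presumably of this shape (arguments in braces are optional; idp_metadata may be a URL, file path, or XML string, per the dashboard CLI of this era):

```
$ ceph dashboard sso setup saml2 <ceph_dashboard_base_url> <idp_metadata> {<idp_username_attribute>} {<idp_entity_id>} {<sp_x_509_cert>} {<sp_private_key>}
$ ceph dashboard sso enable saml2
$ ceph dashboard sso status
```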

If you are using Prometheus for monitoring, you have to define alerting rules.
To manage them you need to use the Alertmanager.
If you are not using the Alertmanager yet, please install it, as it is required
in order to receive and manage alerts from Prometheus.

The Alertmanager capabilities can be consumed by the dashboard in three different
ways:

Use the notification receiver of the dashboard.

Use the Prometheus Alertmanager API.

Use both sources simultaneously.

All three methods are going to notify you about alerts. You won’t be notified
twice if you use both sources, but you need to consume at least the Alertmanager API
in order to manage silences.

Use the notification receiver of the dashboard

This allows you to get notifications as configured in the Alertmanager.
You will be notified inside the dashboard once a notification is sent out,
but you are not able to manage alerts.

Add the dashboard receiver and the new route to your Alertmanager
configuration. This should look like:

Please make sure that the Alertmanager considers the dashboard's SSL
certificate valid. For more information about the correct configuration,
see the <http_config> documentation.

Use the API of Prometheus and the Alertmanager

This allows you to manage alerts and silences. This will enable the “Active
Alerts”, “All Alerts” as well as the “Silences” tabs in the “Monitoring”
section of the “Cluster” menu entry.

Alerts can be sorted by name, job, severity, state and start time.
Unfortunately, it is not possible to know when the Alertmanager sent out a
notification for an alert, as that depends on your Alertmanager configuration.
The dashboard will therefore notify the user about any visible change to an
alert.

Silences can be sorted by id, creator, status, start, updated and end time.
Silences can be created in various ways; it is also possible to expire them:

Create from scratch

Based on a selected alert

Recreate from expired silence

Update a silence (which recreates and expires it, the default Alertmanager behaviour)

To be able to see all configured alerts, you will need to configure the URL to
the Prometheus API. Using this API, the UI will also help you in verifying
that a new silence will match a corresponding alert.

In this section we show a full example of the commands needed to create a
user account that can manage RBD images, view and create Ceph pools, and
has read-only access to all other scopes.
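A hedged sketch of the elided example (the user name bob and role name rbd-pool-manager are hypothetical; rbd-image and pool are the dashboard's scope names, and read-only is one of the predefined system roles):

```
$ ceph dashboard ac-role-create rbd-pool-manager
$ ceph dashboard ac-role-add-scope-perms rbd-pool-manager rbd-image read create update delete
$ ceph dashboard ac-role-add-scope-perms rbd-pool-manager pool read create
$ ceph dashboard ac-user-create bob <password>
$ ceph dashboard ac-user-set-roles bob rbd-pool-manager read-only
```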

In a Ceph cluster with multiple ceph-mgr instances, only the dashboard running
on the currently active ceph-mgr daemon will serve incoming requests. Accessing
the dashboard’s TCP port on any of the other ceph-mgr instances that are
currently on standby will perform an HTTP redirect (303) to the currently active
manager’s dashboard URL. This way, you can point your browser to any of the
ceph-mgr instances in order to access the dashboard.

If you want to establish a fixed URL to reach the dashboard or if you don’t want
to allow direct connections to the manager nodes, you could set up a proxy that
automatically forwards incoming requests to the currently active ceph-mgr
instance.

If you are accessing the dashboard via a reverse proxy configuration,
you may wish to service it under a URL prefix. To get the dashboard
to use hyperlinks that include your prefix, you can set the
url_prefix setting:
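The elided command is likely this (so the dashboard pages become reachable at http://$IP:$PORT/$PREFIX/):

```
$ ceph config set mgr mgr/dashboard/url_prefix $PREFIX
```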

If the dashboard is behind a load-balancing proxy like HAProxy,
you might want to disable the redirection behaviour to prevent situations in
which internal (unresolvable) URLs are published to the frontend client. Use the
following command to get the dashboard to respond with an HTTP error (500 by
default) instead of redirecting to the active dashboard:

$ ceph config set mgr mgr/dashboard/standby_behaviour "error"

To reset the setting to the default redirection behaviour, use the following command:
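The elided reset command is presumably the counterpart of the one shown above:

```
$ ceph config set mgr mgr/dashboard/standby_behaviour "redirect"
```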

Below you will find an example configuration for SSL/TLS pass through using
HAProxy.
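The elided example probably resembles the following minimal sketch (host names, ports, and backend names are placeholders; traffic is passed through in TCP mode while health checks use SSL, so consult the HAProxy documentation for the exact options):

```
defaults
  log global
  option log-health-checks
  timeout connect 5s
  timeout client 50s
  timeout server 450s

frontend dashboard_front_ssl
  mode tcp
  bind *:443
  option tcplog
  default_backend dashboard_back_ssl

backend dashboard_back_ssl
  mode tcp
  option httpchk GET /
  http-check expect status 200
  server mgr1 <mgr1-host>:8443 check check-ssl verify none
  server mgr2 <mgr2-host>:8443 check check-ssl verify none
  server mgr3 <mgr3-host>:8443 check check-ssl verify none
```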

Please note that this configuration works under the following conditions: if
the dashboard fails over, the front-end client might receive an HTTP redirect
(303) response and be redirected to an unresolvable host. This happens when
the failover occurs between two HAProxy health checks: the previously active
dashboard node now responds with a 303 that points to the new active node. To
prevent this situation, you should consider disabling the redirection
behaviour on standby nodes.

Ceph Dashboard can manage NFS Ganesha exports that use
CephFS or RadosGW as their backstore.

To enable this feature in Ceph Dashboard there are some assumptions that need
to be met regarding the way NFS-Ganesha services are configured.

The dashboard manages NFS-Ganesha configuration files stored in RADOS objects
in the Ceph cluster; NFS-Ganesha must store part of its configuration in the
Ceph cluster.

These configuration files must follow some conventions.
Each export block must be stored in its own RADOS object named
export-<id>, where <id> must match the Export_ID attribute of the
export configuration. Then, for each NFS-Ganesha service daemon there should
exist a RADOS object named conf-<daemon_id>, where <daemon_id> is an
arbitrary string that should uniquely identify the daemon instance (e.g., the
hostname where the daemon is running).
Each conf-<daemon_id> object contains the RADOS URLs to the exports that
the NFS-Ganesha daemon should serve. These URLs are of the form:

%url rados://<pool_name>[/<namespace>]/export-<id>

Both the conf-<daemon_id> and export-<id> objects must be stored in the
same RADOS pool/namespace.

To enable the management of NFS-Ganesha exports in Ceph Dashboard, we only
need to tell the Dashboard in which RADOS pool and namespace the
configuration objects are stored. Then, Ceph Dashboard can access the objects
by following the naming convention described above.
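The elided command is likely the following (pool name and namespace are placeholders matching the convention described above):

```
$ ceph dashboard set-ganesha-clusters-rados-pool-namespace <pool_name>[/<namespace>]
```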

Ceph Dashboard also supports the management of NFS-Ganesha exports belonging
to different NFS-Ganesha clusters. An NFS-Ganesha cluster is a group of
NFS-Ganesha service daemons sharing the same exports. Different NFS-Ganesha
clusters are independent and do not share their export configuration with
each other.

Each NFS-Ganesha cluster should store its configuration objects in a
different RADOS pool/namespace to isolate the configuration from each other.

To specify the locations of the configuration of each NFS-Ganesha cluster we
can use the same command as above but with a different value pattern:
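The "different value pattern" is presumably a comma-separated list of per-cluster locations, along these lines (cluster IDs, pools, and namespaces are placeholders):

```
$ ceph dashboard set-ganesha-clusters-rados-pool-namespace <cluster_id>:<pool_name>[/<namespace>](,<cluster_id>:<pool_name>[/<namespace>])*
```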

Its associated REST API endpoints will reject any further requests (404 Not Found).

The main purpose of this plug-in is to allow ad-hoc customizations of the workflows exposed
by the dashboard. Additionally, it could allow for dynamically enabling experimental
features with minimal configuration burden and no service impact.

By default, debug mode is disabled. This is the recommended setting for
production deployments. If required, debug mode can be enabled without
restarting. With debug mode disabled, the dashboard uses CherryPy's production
environment, while when enabled, it uses test_suite defaults (please refer to
CherryPy Environments for more
details).
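The debug mode is controlled with commands of this form (a sketch; debug status reports the current setting):

```
$ ceph dashboard debug status
$ ceph dashboard debug enable
$ ceph dashboard debug disable
```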

It also adds a request UUID (unique_id) to CherryPy on versions that do not
support this, and additionally prints the unique_id in error responses and
log messages.

If you are unable to log into the Ceph Dashboard and you receive the following
error, run through the procedural checks below:

Check that your user credentials are correct. If you are seeing the
notification message above when trying to log into the Ceph Dashboard, it
is likely you are using the wrong credentials. Double check your username
and password, and ensure the caps lock key is not enabled by accident.

If your user credentials are correct, but you are experiencing the same
error, check that the user account exists:

$ ceph dashboard ac-user-show <username>

This command returns your user data. If the user does not exist, it will
print:

Error ENOENT: User <username> does not exist

Check if the user is enabled:

$ ceph dashboard ac-user-show <username> | jq .enabled
true

Check if enabled is set to true for your user. If the user is not
enabled, run:

When an error occurs on the backend, you will usually receive an error
notification on the frontend. Run through the following scenarios to debug.

Check the Ceph Dashboard/mgr logfile(s) for any errors. These can be
identified by searching for keywords, such as 500 Internal Server Error,
followed by traceback. The end of a traceback contains more details about
the exact error that occurred.