Che on OpenShift: Admin Guide

RAM prerequisites

Single-user prerequisites

The Che server pod uses up to 1 GB of RAM. The initial request for RAM is 256 MB. The Che server pod rarely uses more than 800 MB of RAM.

Workspaces use 2 GB of RAM.

Multi-user prerequisites

You must have at least 5 GB of RAM to run multi-user Che. The Keycloak authorization server and PostgreSQL database require the extra RAM. Multi-user Che distributes RAM as follows:

Che server: approximately 750 MB

Keycloak: approximately 1 GB

PostgreSQL: approximately 515 MB

Workspaces: 2 GB of RAM per workspace. The total workspace RAM depends on the size of the workspace runtime(s) and the number of concurrent workspace pods.
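As a rough sizing aid, the figures above can be added up for a given number of concurrent workspaces. The sketch below is only an estimate: the function name estimate_che_ram_mb is hypothetical, the "approximately 1 GB" for Keycloak is taken as 1024 MB, and every workspace is assumed to use the 2 GB (2048 MB) listed above.

```shell
# Hypothetical helper: estimate total RAM (in MB) for multi-user Che.
# The figures come from the list above; 1 GB is treated as 1024 MB.
estimate_che_ram_mb() {
  workspaces="$1"      # number of concurrent workspace pods
  che_server_mb=750
  keycloak_mb=1024
  postgres_mb=515
  workspace_mb=2048
  echo $(( che_server_mb + keycloak_mb + postgres_mb + workspaces * workspace_mb ))
}
```

For example, estimate_che_ram_mb 1 prints 4337, which fits within the 5 GB minimum stated above.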

Setting default workspace RAM limits

The default workspace RAM limit and the RAM allocation request can be configured by passing the CHE_WORKSPACE_DEFAULT__MEMORY__LIMIT__MB and CHE_WORKSPACE_DEFAULT__MEMORY__REQUEST__MB parameters to a Che deployment.

For example, to limit the amount of RAM used by workspaces to 2048 MB and to request the allocation of 1024 MB of RAM, set CHE_WORKSPACE_DEFAULT__MEMORY__LIMIT__MB to 2048 and CHE_WORKSPACE_DEFAULT__MEMORY__REQUEST__MB to 1024.
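A minimal sketch of one way to pass these parameters to a running deployment, assuming Che runs as the dc/che deployment configuration referenced elsewhere in this guide. The set_workspace_ram function name is hypothetical; oc set env is the standard command for updating environment variables on a deployment configuration.

```shell
# Hypothetical helper: set the default workspace RAM limit and request (in MB)
# on the Che deployment configuration. Assumes the deployment is dc/che.
set_workspace_ram() {
  limit_mb="$1"
  request_mb="$2"
  oc set env dc/che \
    "CHE_WORKSPACE_DEFAULT__MEMORY__LIMIT__MB=${limit_mb}" \
    "CHE_WORKSPACE_DEFAULT__MEMORY__REQUEST__MB=${request_mb}"
}
```

Calling set_workspace_ram 2048 1024 applies the values from the example above.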

Setting up a multi OpenShift project

To create workspace objects in different namespaces for each user, set the CHE_INFRA_OPENSHIFT_PROJECT variable to NULL.

To create resources on behalf of the currently logged-in user, use the user’s OpenShift tokens.

How the Che server uses PVCs and PVs for storage

Che server, Keycloak and PostgreSQL pods, and workspace pods use Persistent Volume Claims (PVCs), which are bound to the physical Persistent Volumes (PVs) with ReadWriteOnce access mode. When the deployment YAML files run, they define the Che PVCs. You can configure workspace PVC access mode and claim size with Che deployment environment variables.

Storage requirements for Che infrastructure

Che server: 1 GB to store logs and initial workspace stacks.

Keycloak: 2 PVCs, 1 GB each to store logs and Keycloak data.

PostgreSQL: 1 GB PVC to store the database.

Storage strategies for Che workspaces

The workspace PVC strategy is configurable:

| Strategy | Details | Pros | Cons |
|---|---|---|---|
| unique (default) | One PVC per workspace volume or user-defined PVC | Storage isolation | An undefined number of PVs is required |
| common | One PVC for all workspaces in one OpenShift Project. Sub-paths pre-created | Easy to manage and control storage | Workspaces must be in a separate OpenShift Project if PV does not support ReadWriteMany (RWX) access mode |
| per-workspace | One PVC for one workspace. Sub-paths pre-created | Easy to manage and control storage | Workspace containers must all be in one pod if PV does not support ReadWriteMany (RWX) access mode |

Unique PVC strategy

How the unique PVC strategy works

Every Che Volume of a workspace gets its own PVC, which means that workspace PVCs are created when a workspace starts for the first time. Workspace PVCs are deleted when the corresponding workspace is deleted.

User-defined PVCs are created with a few modifications:

They are provisioned with generated names to guarantee that they do not conflict with other PVCs in the namespace.

Subpaths of volume mounts that reference user-defined PVCs are prefixed with {workspace id}/{PVC name}. This keeps the data structure on the PV the same across PVC strategies.

Enabling a unique strategy

If you have already deployed Che with another strategy, set the CHE_INFRA_KUBERNETES_PVC_STRATEGY variable to unique in dc/che.
Note that existing workspace data is not migrated: workspaces switch to a new unique PVC per Che Volume, and the existing PVCs are not cleaned up.

If applying the che-server-template.yaml configuration, pass -p CHE_INFRA_KUBERNETES_PVC_STRATEGY=unique to the oc new-app command.

Common PVC strategy

How the common PVC strategy works

All workspaces (within one OpenShift Project) use the same PVC to store data declared in their volumes (projects and workspace logs by default, plus any additional volumes that a user defines).

User-defined PVCs are ignored, and volumes that reference them are replaced with a volume that references the common PVC.
The corresponding container volume mounts are relinked to the common volume, and their subpaths are prefixed with '{workspaceId}/{originalPVCName}'.

The user-defined PVC name is used as the Che Volume name. This means that if a Machine is configured to use a Che Volume with the same name as a user-defined PVC, both use the same shared folder in the common PVC.

A PV that is bound to the che-claim-workspace PVC has one subdirectory per workspace (${PV}/${ws-id}), and each workspace subdirectory holds one directory per volume.

Volumes can be anything that a user defines as volumes for workspace machines. The volume name is equal to the directory name in ${PV}/${ws-id}.

When a workspace is deleted, a corresponding subdirectory (${ws-id}) is deleted in the PV directory.
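To make the layout concrete, the sketch below builds the directory that holds one volume of one workspace on such a PV. The function name common_pvc_volume_dir and the example mount point are hypothetical; only the ${PV}/${ws-id}/<volume name> pattern comes from the description above.

```shell
# Hypothetical helper: print the directory that holds one volume of one
# workspace on a PV used with the common strategy: ${PV}/${ws-id}/<volume name>.
common_pvc_volume_dir() {
  pv_root="$1"    # mount point of the PV (placeholder, for example /pv0001)
  ws_id="$2"      # workspace ID
  volume="$3"     # Che Volume name or user-defined PVC name
  # Strip a trailing slash from the PV root so the result has no "//".
  printf '%s/%s/%s\n' "${pv_root%/}" "$ws_id" "$volume"
}
```

For example, common_pvc_volume_dir /pv0001 workspace123 projects prints /pv0001/workspace123/projects.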

Enabling the common strategy

If you have already deployed Che with another strategy, set the CHE_INFRA_KUBERNETES_PVC_STRATEGY variable to common in dc/che.
Note that existing workspace data is not migrated: workspaces switch to the common PVC, and the existing PVCs are not cleaned up.

If applying the che-server-template.yaml configuration, pass -p CHE_INFRA_KUBERNETES_PVC_STRATEGY=common to the oc new-app command.

Restrictions on using the common PVC strategy

When the common strategy is used and the workspace PVC access mode is ReadWriteOnce (RWO), only one OpenShift node can use the PVC at a time. If there are several nodes, you can still use the common strategy, but the workspace PVC access mode must be ReadWriteMany (RWX) so that multiple nodes can use the PVC simultaneously.

To change the access mode for workspace PVCs, pass the CHE_INFRA_KUBERNETES_PVC_ACCESS_MODE=ReadWriteMany environment variable to the Che deployment, either when initially deploying Che or through a Che deployment update.

Another restriction is that only pods in the same namespace can use the same PVC. The CHE_INFRA_KUBERNETES_PROJECT environment variable should not be empty. It should be either the Che server namespace where objects can be created with the Che service account (SA) or a dedicated namespace where a token or a user name and password need to be used.

Per-workspace PVC strategy

How the per-workspace PVC strategy works

The per-workspace strategy works similarly to the common PVC strategy. The only difference is that all workspace volumes (but not all workspaces) use the same PVC to store data (projects and workspace logs by default and any additional volumes that a user can define).

Enabling a per-workspace strategy

If you have already deployed Che with another strategy, set the CHE_INFRA_KUBERNETES_PVC_STRATEGY variable to per-workspace in dc/che.
Note that existing workspace data is not migrated: workspaces switch to one common PVC per workspace, and the existing PVCs are not cleaned up.

If applying the che-server-template.yaml configuration, pass -p CHE_INFRA_KUBERNETES_PVC_STRATEGY=per-workspace to the oc new-app command.
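The three strategies are enabled on an existing deployment in the same way and differ only in the value of CHE_INFRA_KUBERNETES_PVC_STRATEGY. The sketch below wraps that step in a hypothetical set_pvc_strategy function that rejects unknown strategy names; it assumes the deployment configuration is dc/che and uses oc set env as one possible way to set the variable.

```shell
# Hypothetical helper: switch the workspace PVC strategy on an existing
# Che deployment (assumed to be dc/che). Rejects unknown strategy names.
set_pvc_strategy() {
  case "$1" in
    unique|common|per-workspace)
      oc set env dc/che "CHE_INFRA_KUBERNETES_PVC_STRATEGY=$1"
      ;;
    *)
      echo "unknown PVC strategy: $1" >&2
      return 1
      ;;
  esac
}
```

For example, set_pvc_strategy per-workspace switches the deployment to the per-workspace strategy. As noted above, existing workspace data is not migrated.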

Updating your Che deployment

To update a Che deployment:

Change the image tag:

You can change the image tag in one of the following ways:

On the command line, edit the image tag by running:

$ oc edit dc/che

In the OpenShift web console, edit the image:tag line in the YAML file under Deployments.

Debug mode

Private Docker registries

Che server logs

Logs are persisted in a PV. The PVC che-data-volume is created and bound to a PV after Che deploys to OpenShift.

To retrieve logs, do one of the following:

Run the oc logs dc/che command.

Run the oc describe pvc che-data-volume command to find the PV. Next, run the oc describe pv $pvName command with the PV name to get the local path of the logs directory. Be careful with the permissions for that directory: if they are changed, the Che server can no longer write to its log files.

In the OpenShift web console, select Pods > che-pod > Logs.
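The PVC-to-PV lookup described above can also be scripted. The sketch below is one possible approach, not the only one: the function names are hypothetical, the PVC name is passed as an argument (for example che-data-volume), and the PV name is read from the .spec.volumeName field of the PVC.

```shell
# Hypothetical helper: print the name of the PV bound to a PVC in the
# current project. The PVC name is passed as the first argument.
pv_for_pvc() {
  oc get pvc "$1" -o jsonpath='{.spec.volumeName}'
}

# Hypothetical helper: describe the PV that backs the given PVC.
describe_pv_for_pvc() {
  pv_name="$(pv_for_pvc "$1")" || return 1
  if [ -z "$pv_name" ]; then
    echo "PVC $1 is not bound to a PV" >&2
    return 1
  fi
  oc describe pv "$pv_name"
}
```

For example, describe_pv_for_pvc che-data-volume prints the details of the PV that stores the Che server logs, including its path.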

You can also configure the Che master to write JSON-encoded logs to its output instead of storing them. Log collection systems such as Logstash can then pick them up. To switch from plain text to JSON logging, set the CHE_LOGS_APPENDERS_IMPL environment variable to json. See the logging docs for more information.

Workspace logs

Workspace logs are stored in a PV bound to the che-claim-workspace PVC. Workspace logs include logs from the workspace agent, the bootstrapper, and other agents if applicable.

Che master states

The Che master has three possible states:

RUNNING

PREPARING_TO_SHUTDOWN

READY_TO_SHUTDOWN

The PREPARING_TO_SHUTDOWN state means that no new workspace startups are allowed. This state has one of two outcomes:

If your infrastructure does not support workspace recovery, all running workspaces are forcibly stopped.

If your infrastructure does support workspace recovery, any workspaces that are currently starting or stopping are allowed to finish that process. Running workspaces do not stop.

For workspaces that did not stop, Che automatically falls back to a shutdown in which all workspaces are stopped.

If you want a full shutdown with workspaces stopped, request it with the shutdown=true parameter. When the preparation process is finished, the READY_TO_SHUTDOWN state is set, which allows you to stop the current Che master instance.

Che workspace termination grace period

The default grace termination period of OpenShift workspace pods is 0. This setting terminates pods almost instantly and significantly decreases the time required for stopping a workspace.

To increase the grace termination period, use the following environment variable: CHE_INFRA_KUBERNETES_POD_TERMINATION__GRACE__PERIOD__SEC.

If the terminationGracePeriodSeconds variable is explicitly set in the OpenShift recipe, the CHE_INFRA_KUBERNETES_POD_TERMINATION__GRACE__PERIOD__SEC environment variable does not override the recipe.

Auto-stopping a workspace when its pods are removed

Che Server includes a job that automatically stops workspace runtimes if their pods have been terminated. Pods are terminated when, for example, users remove them from the OpenShift console, administrators terminate them to prevent misuse, or an infrastructure node crashes.

The job is disabled by default to avoid problems in configurations where Che Server cannot interact with the Kubernetes API without user intervention.

The job cannot function with the following Che Server configuration:

Che Server communicates with the Kubernetes API using a token from the OAuth provider.

The job can function with the following Che Server configurations:

Workspace objects are created in the same namespace where Che Server is located.

The cluster-admin service account token is mounted to the Che Server pod.

To enable the job, set the CHE_INFRA_KUBERNETES_RUNTIMES__CONSISTENCY__CHECK__PERIOD__MIN environment variable to a value greater than 0. The value is the time period in minutes between checks for runtimes without pods.
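A minimal sketch of enabling the job, assuming the Che deployment configuration is dc/che. The enable_runtime_consistency_check function name is hypothetical, and the check that the period is a whole number greater than 0 is added here for illustration only.

```shell
# Hypothetical helper: enable the runtime consistency check job with a period
# in minutes. Assumes dc/che; the period must be a whole number greater than 0.
enable_runtime_consistency_check() {
  case "$1" in
    ''|*[!0-9]*)
      echo "period must be a positive whole number" >&2
      return 1
      ;;
  esac
  if [ "$1" -le 0 ]; then
    echo "period must be greater than 0" >&2
    return 1
  fi
  oc set env dc/che \
    "CHE_INFRA_KUBERNETES_RUNTIMES__CONSISTENCY__CHECK__PERIOD__MIN=$1"
}
```

For example, enable_runtime_consistency_check 5 makes the job look for runtimes without pods every five minutes.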

Updating Che without stopping active workspaces

The differences between a Recreate update and a Rolling update:

| Recreate update | Rolling update |
|---|---|
| Che downtime | No Che downtime |
| - | New deployment starts in parallel and traffic is hot-switched |

Performing a recreate update

To perform a recreate update:

Ensure that the new master version is fully API compatible with the old workspace agent version.

Set the deployment update strategy to Recreate.

Make a POST request to the /api/system/stop API to start suspending the WS master. From this point, all new attempts to start workspaces are refused, and all workspace starts and stops already in progress are allowed to finish. Note that this method requires system admin credentials.

Make periodic GET requests to the /api/system/state API until it returns the READY_TO_SHUTDOWN state. You can also check for "System is ready to shutdown" in the server logs.

Perform the new deployment.
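The two API calls in the steps above can be sketched with curl. This is an illustration under stated assumptions, not a definitive procedure: CHE_URL and CHE_TOKEN are placeholder variables for the Che server URL and a system admin access token, bearer-token authentication is assumed, and the function names are hypothetical. The polling helper only looks for the READY_TO_SHUTDOWN string in the response body and makes no assumption about the response format.

```shell
# Hypothetical helper: ask the Che master to suspend. Pass "true" as the first
# argument to request a full shutdown that also stops workspaces.
# CHE_URL and CHE_TOKEN are assumed to be set by the caller.
request_che_stop() {
  url="${CHE_URL}/api/system/stop"
  if [ "$1" = "true" ]; then
    url="${url}?shutdown=true"
  fi
  curl -fsS -X POST -H "Authorization: Bearer ${CHE_TOKEN}" "$url"
}

# Hypothetical helper: poll the state API until it reports READY_TO_SHUTDOWN.
# Returns 0 when the state is reached and 1 when the attempts run out.
wait_ready_to_shutdown() {
  interval="${1:-10}"    # seconds between polls
  attempts="${2:-60}"    # give up after this many polls
  while [ "$attempts" -gt 0 ]; do
    if curl -fsS -H "Authorization: Bearer ${CHE_TOKEN}" \
        "${CHE_URL}/api/system/state" | grep -q 'READY_TO_SHUTDOWN'; then
      return 0
    fi
    attempts=$(( attempts - 1 ))
    sleep "$interval"
  done
  return 1
}
```

A typical sequence would be request_che_stop followed by wait_ready_to_shutdown, and then the new deployment once the second call returns successfully.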

Performing a rolling update

To perform a rolling update:

Ensure that the new master is fully API compatible with the old ws agent versions and that the database is compatible. Database migrations cannot be used in this update mode.

Known issues

Workspaces may fall back to the stopped state when they are started five to thirty seconds before the network traffic is switched to the new pod. This happens when the bootstrappers use the Che server route URL to notify the Che Server that bootstrapping is done. Because traffic is already switched to the new Che server, the old Che server cannot get the bootstrapper’s report, and the workspace fails to start after the waiting timeout is reached. If the old Che server is killed before this timeout, the workspaces can be stuck in the STARTING state. The terminationGracePeriodSeconds parameter must define enough time to cover the workspace start timeout, which is eight minutes, plus some additional time. Typically, setting terminationGracePeriodSeconds to 540 seconds is enough to cover all timeouts.

Users may experience problems with WebSocket reconnections or miss events published over the WebSocket connection, for example when a workspace is STARTED but the dashboard displays that it is STARTING. In this case, reload the page to restore the connections and the actual workspace states.
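For the first issue, the grace period can be set on the pod template of the Che server. The sketch below is one way to do this with oc patch, assuming the deployment configuration is dc/che; the function name is hypothetical. The default of 540 seconds is the eight-minute (480-second) workspace start timeout plus 60 seconds of additional time.

```shell
# Hypothetical helper: set terminationGracePeriodSeconds on the Che server pod
# template. Assumes the deployment is dc/che. Defaults to 540 seconds:
# the 8 minute (480 s) workspace start timeout plus 60 s of additional time.
set_che_termination_grace_period() {
  seconds="${1:-540}"
  oc patch dc/che -p \
    "{\"spec\":{\"template\":{\"spec\":{\"terminationGracePeriodSeconds\":${seconds}}}}}"
}
```

Calling set_che_termination_grace_period without arguments applies the 540-second value mentioned above.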

Updating with database migrations or API incompatibility

If the new version of the Che server contains DB migrations, but the old and new versions are still API compatible, the recreate update type may be used without stopping running workspaces.

API-incompatible versions should be updated with a full stop of all workspaces. This means that /api/system/stop?shutdown=true must be called prior to the update.

Deleting deployments

The fastest way to completely delete Che and its infrastructure components is to delete the project and namespace.

To delete Che and components:

$ oc delete namespace che

You can use selectors to delete particular deployments and associated objects.

To remove all Che server related objects:

$ oc delete all -l=app=che

To remove all Keycloak related objects:

$ oc delete all -l=app=keycloak

To remove all PostgreSQL-related objects:

$ oc delete all -l=app=postgres

PVCs, service accounts and role bindings should be deleted separately because oc delete all does not delete them.

To delete Che server PVC, ServiceAccount and RoleBinding:

$ oc delete pvc -l=app=che
$ oc delete sa -l=app=che
$ oc delete rolebinding -l=app=che

To delete Keycloak and PostgreSQL PVCs:

$ oc delete pvc -l=app=keycloak
$ oc delete pvc -l=app=postgres

Monitoring Che Master Server

The master server emits metrics in Prometheus format by default on port 8087 of the Che server host
(this can be customized with the che.metrics.port configuration property).

You can configure your own Prometheus deployment to scrape the metrics (as per convention, the
metrics are published on the <CHE_HOST>:8087/metrics endpoint).
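Before pointing Prometheus at the endpoint, it can be useful to confirm that it responds. The sketch below fetches the endpoint with curl and prints only the metric samples, dropping the "# HELP" and "# TYPE" comment lines of the Prometheus text format. The che_metrics function name is hypothetical, the host is a placeholder, and plain HTTP is assumed.

```shell
# Hypothetical helper: fetch the Che master metrics and print only the metric
# samples (lines that are not "# HELP" or "# TYPE" comments).
# The host is a placeholder; the port defaults to 8087 as described above.
che_metrics() {
  che_host="$1"
  port="${2:-8087}"
  curl -fsS "http://${che_host}:${port}/metrics" | grep -v '^#'
}
```

For example, che_metrics che.example.com prints the current samples, where che.example.com stands in for your Che server host.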

Che’s Helm chart can optionally install Prometheus and Grafana servers preconfigured to collect
the metrics of the Che server. When you set the global.metricsEnabled value to true while
installing Che’s Helm chart, Prometheus and Grafana servers are deployed automatically.
The servers are accessible on the prometheus-<CHE_NAMESPACE>.domain and grafana-<CHE_NAMESPACE>.domain
domains, respectively. The Grafana server is preconfigured with a sample dashboard showing the memory
usage of the Che server. You can log in to the Grafana server using the predefined username admin
with the default password admin.

Creating workspace objects in personal namespaces

You can register the OpenShift server as an identity provider when Che is installed in multi-user mode. This allows you to create workspace objects in the OpenShift namespace of the user that is logged in to Che through Keycloak.

To create a workspace object in the namespace of the user that is logged into Che:

Register, inside Keycloak, an OpenShift identity provider that points to the OpenShift console of the cluster.

Configure Che to use the Keycloak identity provider to retrieve the OpenShift tokens of the Che users.

Every workspace action, such as start or stop, creates an OpenShift resource in the OpenShift user account. A notification message is displayed that allows you to link the Keycloak account to your OpenShift user account.

For non-interactive workspace actions, such as stopping a workspace on idling or shutting down the Che server, the dedicated OpenShift account configured for the Kubernetes infrastructure is used instead. See Setting up the project workspace for more information.

Configuring Che

Set the CHE_INFRA_OPENSHIFT_PROJECT variable to NULL to ensure a new distinct OpenShift namespace is created for every workspace that is started.

Set the CHE_INFRA_OPENSHIFT_OAUTH__IDENTITY__PROVIDER variable to the alias of the OpenShift identity provider specified in step 1 of its registration in Keycloak. The default value is openshift-v3.
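Both variables can be set in one step. The sketch below assumes the Che deployment configuration is dc/che and uses oc set env as one possible way to apply the settings; the configure_personal_namespaces function name is hypothetical. The provider alias defaults to openshift-v3, the default value mentioned above.

```shell
# Hypothetical helper: configure Che to create workspace objects in the
# namespace of the logged-in user. Assumes the deployment is dc/che.
configure_personal_namespaces() {
  provider_alias="${1:-openshift-v3}"   # alias of the identity provider in Keycloak
  oc set env dc/che \
    "CHE_INFRA_OPENSHIFT_PROJECT=NULL" \
    "CHE_INFRA_OPENSHIFT_OAUTH__IDENTITY__PROVIDER=${provider_alias}"
}
```

Pass a different alias as the first argument if the identity provider was registered in Keycloak under another name.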

Providing the OpenShift certificate to Keycloak

By default, Keycloak cannot contact the OpenShift console to retrieve linked tokens when the certificate used by the OpenShift console is self-signed or is not trusted.

When the certificate is self-signed or is not trusted, use the OPENSHIFT_IDENTITY_PROVIDER_CERTIFICATE variable to pass the OpenShift console certificate to the Keycloak deployment. This will enable the Keycloak server to add the certificate to the list of trusted certificates. The environment variable refers to a secret that contains the certificate.