Overview

As an alternative to performing an
automated upgrade, you can manually upgrade your OKD cluster. To manually
upgrade without disruption, it is important to upgrade each component as
documented in this topic.

Preparing for a Manual Upgrade

Before attempting the upgrade, follow the steps in
Verifying the Upgrade to verify the
cluster's health. This confirms that nodes are in the Ready state and running
the expected starting version, and ensures that there are no diagnostic
errors or warnings.

To prepare for a manual upgrade, follow these steps:

Install or update to the latest available version of the
atomic-openshift-utils package on each RHEL 7 system, which provides files
that will be used in later sections:

# yum install atomic-openshift-utils

Install or update to the latest available *-excluder packages on each RHEL 7
system. These packages help ensure that your systems stay on the correct
versions of the atomic-openshift and docker packages when you are not trying
to upgrade, according to the OKD version:

For any upgrade path, ensure that you are running the latest kernel on
each RHEL 7 system:

# yum update kernel

There is a small set of configurations that are possible in authorization policy
resources in OKD 3.6 that are not supported by RBAC in
OKD 3.7. Such configurations require manual migration based on your
use case.

If you are upgrading from OKD 3.6 to 3.7, to guarantee that all
authorization policy objects are in sync with RBAC, run:

$ oc adm migrate authorization

This read-only command emulates the migration controller logic and reports if
any resource is out of sync.

During a rolling upgrade, avoid actions that require changes to OKD
authorization policy resources such as the creation of new projects. If a
project is created against a new master, the RBAC resources it creates will be
deleted by the migration controller since they will be seen as out of sync from
the authorization policy resources.

If a project is created against an old master and the migration controller is no
longer present because an OKD 3.7 controller process is the leader,
then its policy objects will not be synced and it will have no RBAC resources.

Upgrading Master Components

Before upgrading any stand-alone nodes, upgrade the master components (which
provide the control plane for the cluster).

Run the following command on each master to remove the atomic-openshift
packages from the list of yum excludes on the host:

# atomic-openshift-excluder unexclude

Upgrade etcd on all master hosts and any external etcd hosts.

For RHEL 7 systems using the RPM-based method:

Upgrade the etcd package:

# yum update etcd

Restart the etcd service and review the logs to ensure it restarts
successfully:
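# systemctl restart etcd
# journalctl -r -u etcd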

If you are performing a cluster upgrade that requires updating Docker to version
1.12 and you are not already on Docker 1.12, you must also perform the
following steps:

The node component on masters is set by default to unschedulable status during
initial installation, so that pods are not deployed to them. However, it is
possible to set them schedulable during the initial installation or manually
thereafter. If any of your masters are also configured as a schedulable node,
skip the following Docker upgrade steps for those masters; instead, run all of
the steps described in Upgrading Nodes on those hosts when you get to that
section.

Upgrade the docker package.

For RHEL 7 systems:

# yum update docker

Then, restart the docker service and review the logs to ensure it restarts
successfully:

# systemctl restart docker
# journalctl -r -u docker

For RHEL Atomic Host 7 systems, upgrade to the latest Atomic tree if one is
available:

If upgrading to RHEL Atomic Host 7.4.2, this upgrades Docker to version 1.12.

# atomic host upgrade

After the upgrade is completed and prepared for the next boot, reboot the host
and ensure the docker service starts successfully:

# systemctl reboot
# journalctl -r -u docker

Remove the following file, which is no longer required:

# rm /etc/systemd/system/docker.service.d/docker-sdn-ovs.conf

Run the following command on each master to add the atomic-openshift packages
back to the list of yum excludes on the host:

# atomic-openshift-excluder exclude

During the cluster upgrade, it can sometimes be useful to take a master out of
rotation, because some DNS client libraries do not properly fail over to the
other masters for cluster DNS. In addition to stopping the master and controller
services, you can remove the master's endpoint from the Kubernetes service's
subsets.addresses field.

$ oc edit ep/kubernetes -n default
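A sketch of the edit, with illustrative IP addresses (remove only the address
of the master you are taking out of rotation; the ports section is elided and
should be left unchanged):

apiVersion: v1
kind: Endpoints
metadata:
  name: kubernetes
  namespace: default
subsets:
- addresses:
  - ip: 192.168.122.2
  - ip: 192.168.122.3
  ports:
  ...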

When the master is restarted, the Kubernetes service will be automatically
updated.

Updating Policy Definitions

During a cluster upgrade, and on every restart of any master, the
default
cluster roles are automatically reconciled to restore any missing permissions.

If you customized default cluster roles and want to ensure a role reconciliation
does not modify them, protect each role from reconciliation:
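For example, to protect a role using the autoupdate annotation:

$ oc annotate clusterrole.rbac <role_name> --overwrite rbac.authorization.kubernetes.io/autoupdate=false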

Upgrading Nodes

After upgrading your masters, you can upgrade your nodes. When restarting the
origin-node service, there will be a brief disruption of outbound network
connectivity from running pods to services while the
service
proxy is restarted. The length of this disruption should be very short and
scales based on the number of services in the entire cluster.

You can alternatively use the
blue-green
deployment method at this point to create a parallel environment for new nodes
instead of upgrading them in place.

One at a time for each node that is not also a master, you must disable
scheduling and evacuate its pods to other nodes, then upgrade packages and
restart services.

Run the following command on each node to remove the atomic-openshift
packages from the list of yum excludes on the host:

# atomic-openshift-excluder unexclude

As a user with cluster-admin privileges, disable scheduling for the node:

# oc adm manage-node <node> --schedulable=false

Evacuate pods on the node to other nodes:

The --force option deletes any pods that are not backed by a replication
controller.

# oc adm drain <node> --force --delete-local-data --ignore-daemonsets

On the node host, upgrade all origin packages:

# yum upgrade origin\*

If you are upgrading from OpenShift Origin 1.0 to 1.1, enable the following
renamed service on the node host:

# systemctl enable origin-node

Restart the origin-node and openvswitch services and review the logs to
ensure they restart successfully:
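# systemctl restart openvswitch
# systemctl restart origin-node
# journalctl -r -u openvswitch
# journalctl -r -u origin-node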

Upgrading the Router

If you have previously
deployed a router, the
router deployment configuration must be upgraded to apply updates contained in
the router image. To upgrade your router without disrupting services, you must
have previously deployed a
highly available routing service.

If you are upgrading to OpenShift Origin 1.0.4 or 1.0.5, first see the
Additional Manual Instructions per
Release section for important steps specific to your upgrade, then continue
with the router upgrade as described in this section.

Edit your router’s deployment configuration. For example, if it has the default
router name:
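$ oc edit dc/router

Then update the image line in the container spec. A minimal sketch, assuming
the default origin-haproxy-router image name:

image: openshift/origin-haproxy-router:<tag>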

Adjust <tag> to match the version you are upgrading to (use v3.7.64
for the latest version).

You should see one router pod updated and then the next.

Upgrading the Registry

The registry must also be upgraded for changes to take effect in the registry
image. If you have used a PersistentVolumeClaim or a host mount point, you
can restart the registry without losing its contents.
Storage for the Registry details how to configure persistent storage for the registry.
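To upgrade the registry image, edit the registry's deployment configuration as
you did with the router. A minimal sketch, assuming the default docker-registry
deployment and origin-docker-registry image names:

$ oc edit dc/docker-registry

image: openshift/origin-docker-registry:<tag>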

Alternatively, use the REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA
environment variable, which is set to true for the new registry deployments
by default. Existing deployments need to be modified using:
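$ oc set env -n default dc/docker-registry REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA=true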

Updating the Default Image Streams and Templates

By default, the advanced
installation method automatically creates default image streams, InstantApp
templates, and database service templates in the openshift project, which is a
default project to which all users have view access. These objects were created
during installation from the JSON files located under
/usr/share/openshift/examples.

To update these objects:

Ensure that you have the latest openshift-ansible code checked out, which
provides the example JSON files:
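A hedged example, assuming you track the upstream repository with git and that
a release-3.7 branch exists:

$ git clone https://github.com/openshift/openshift-ansible.git
$ cd openshift-ansible
$ git checkout release-3.7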

After a manual upgrade, get the latest templates from
openshift-ansible-roles:

# rpm -ql openshift-ansible-roles | grep examples | grep v3.7

In this example,
/usr/share/ansible/openshift-ansible/roles/openshift_examples/files/examples/v3.7/image-streams/image-streams-rhel7.json
is the latest file that you want in the latest openshift-ansible-roles package.

/usr/share/openshift/examples/image-streams/image-streams-rhel7.json is not
owned by a package, but is updated by Ansible. If you are upgrading outside of
Ansible, you need to get the latest .json files on the system where you are
running oc, which can run anywhere that has access to the master.

Install atomic-openshift-utils and its dependencies to install the new content
into
/usr/share/ansible/openshift-ansible/roles/openshift_examples/files/examples/v3.7/:
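# yum install atomic-openshift-utils

Then, to update the objects, a minimal sketch using the image-streams-rhel7.json
file noted above (oc apply is one way to do this; adjust the file path to the
example files you want to update):

$ oc apply -n openshift -f /usr/share/ansible/openshift-ansible/roles/openshift_examples/files/examples/v3.7/image-streams/image-streams-rhel7.json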

To update your S2I-based applications, you must manually trigger a new build of
those applications by running oc start-build <app-name> after importing the new
images.
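For example, assuming a hypothetical application named myapp that builds from
the nodejs image stream (both names are illustrative):

$ oc import-image -n openshift nodejs
$ oc start-build myapp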

Updating Master and Node Certificates

The following steps may be required for any OpenShift cluster that was
originally installed prior to the
OpenShift Origin 1.0.8 release.
This includes clusters that have since been upgraded from that version.

Node Certificates

With the 1.0.8 release, certificates for each of the kubelet nodes were updated
to include the IP address of the node. Any node certificates generated before
the 1.0.8 release may not contain the IP address of the node.

If a node is missing the IP address as part of its certificate, clients may
refuse to connect to the kubelet endpoint. Usually this will result in errors
regarding the certificate not containing an IP SAN.

In order to remedy this situation, you may need to manually update the
certificates for your node.

Checking the Node’s Certificate

The following command can be used to determine which Subject Alternative Names
(SANs) are present in the node’s serving certificate. In this example, the
Subject Alternative Names are mynode, mynode.mydomain.com, and 1.2.3.4:
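Assuming the default node certificate location:

# openssl x509 -in /etc/origin/node/server.crt -text | \
  grep -A 1 "Subject Alternative Name"

X509v3 Subject Alternative Name:
    DNS:mynode, DNS:mynode.mydomain.com, IP Address:1.2.3.4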

Ensure that the nodeIP value set in the
/etc/origin/node/node-config.yaml file is present among the IP values in the
Subject Alternative Names listed in the node's serving certificate. If the
nodeIP is not present, then it must be added to the node's
certificate.

If the nodeIP value is already contained within the Subject Alternative
Names, then no further steps are required.

You will need to know the Subject Alternative Names and nodeIP value for the
following steps.

Generating a New Node Certificate

If your current node certificate does not contain the proper IP address, then
you must generate a new certificate for your node.

Node certificates will be regenerated on the master (or first master) and are
then copied into place on node systems.

Create a temporary directory in which to perform the following steps. Run them
on the first master listed in the Ansible host inventory file
(by default, /etc/ansible/hosts):
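# mkdir /tmp/node_certificate_update
# cd /tmp/node_certificate_update

The directory name is illustrative. To regenerate the node certificate with the
missing IP included, a minimal sketch using oc adm ca create-server-cert, where
the hostnames list contains the Subject Alternative Names you collected plus
the nodeIP value:

$ oc adm ca create-server-cert --cert=server.crt --key=server.key \
    --hostnames=<existing_SANs>,<node_IP> \
    --signer-cert=/etc/origin/master/ca.crt \
    --signer-key=/etc/origin/master/ca.key \
    --signer-serial=/etc/origin/master/ca.serial.txt

Copy the resulting server.crt and server.key into place in the node's
certificate directory (/etc/origin/node by default).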

After you have replaced the node’s certificate, restart the node service:

# systemctl restart origin-node

Master Certificates

With the 1.0.8 release, certificates for each of the masters were updated to
include all names that pods may use to communicate with masters. Any master
certificates generated before the 1.0.8 release may not contain these additional
service names.

Checking the Master’s Certificate

The following command can be used to determine which Subject Alternative Names
(SANs) are present in the master’s serving certificate. In this example, the
Subject Alternative Names are mymaster, mymaster.mydomain.com, and
1.2.3.4:
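# openssl x509 -in /etc/origin/master/master.server.crt -text | \
  grep -A 1 "Subject Alternative Name"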

Ensure that the following entries are present in the Subject Alternative Names
for the master’s serving certificate:

Entry                                                Example
Kubernetes service IP address                        172.30.0.1
All master host names                                master1.example.com
All master IP addresses                              192.168.122.1
Public master host name in clustered environments    public-master.example.com
kubernetes
kubernetes.default
kubernetes.default.svc
kubernetes.default.svc.cluster.local
openshift
openshift.default
openshift.default.svc
openshift.default.svc.cluster.local

If these names are already contained within the Subject Alternative Names, then
no further steps are required.

Generating a New Master Certificate

If your current master certificate does not contain all names from the list
above, then you must generate a new certificate for your master. Perform the
following steps on the first master listed in the Ansible host inventory file
(by default, /etc/ansible/hosts):

Back up the existing /etc/origin/master/master.server.crt and
/etc/origin/master/master.server.key files for your master:
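For example (the backup file names are illustrative):

# cp /etc/origin/master/master.server.crt /etc/origin/master/master.server.crt.backup
# cp /etc/origin/master/master.server.key /etc/origin/master/master.server.key.backup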

You will need the first IP in the services
subnet (the kubernetes service IP) as well as the values of masterIP,
masterURL, and masterPublicURL contained in the
/etc/origin/master/master-config.yaml file for the following steps.
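The original command listing is not reproduced here; a minimal sketch using
oc adm ca create-server-cert, with placeholders matching the adjustments below
(the hostnames list must also include all of the service names from the table
above, such as kubernetes.default and openshift.default.svc.cluster.local):

$ oc adm ca create-server-cert \
    --cert=/etc/origin/master/master.server.crt \
    --key=/etc/origin/master/master.server.key \
    --hostnames=<master_host_names>,<master_IP_addresses>,<kubernetes_service_IP>,<internal_master_address>,<public_master_address>,kubernetes,openshift \
    --signer-cert=/etc/origin/master/ca.crt \
    --signer-key=/etc/origin/master/ca.key \
    --signer-serial=/etc/origin/master/ca.serial.txt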

Adjust <master_IP_addresses> to match the value of masterIP. In a
clustered environment, add all master IP addresses.

Adjust <kubernetes_service_IP> to the first IP in the kubernetes
services subnet.

Adjust <internal_master_address> to match the value of masterURL.

Adjust <public_master_address> to match the value of masterPublicURL.

Restart master services. For single master deployments:

# systemctl restart origin-master-api origin-master-controllers

After the service restarts, the certificate update is complete.

Upgrading the Service Catalog

Manual upgrade steps for the service catalog and service brokers are not available.

Starting with OKD 3.7, the service catalog, OpenShift Ansible
broker, and template service broker are enabled and deployed by default for new
cluster installations. However, they are not deployed by default during the
upgrade from OKD 3.6 to 3.7, so you must run the individual component
playbooks separately post-upgrade.

Upgrading from the OKD 3.6 Technology Preview version of the service
catalog and service brokers is not supported.

To upgrade to these features:

See the following three sections in the
Advanced Installation topic and update your inventory file accordingly:
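A hedged example of running one of the component playbooks after updating your
inventory, assuming the service catalog playbook lives alongside the logging
and metrics playbooks referenced later in this topic:

# ansible-playbook -i </path/to/inventory/file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.yml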

Upgrading the EFK Logging Stack

Manual upgrade steps for logging deployments are no longer available starting in
OKD
1.5.

To upgrade an existing EFK logging stack deployment, you must use the provided
/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml
Ansible playbook. This is the playbook to use if you were deploying logging for
the first time on an existing cluster, but is also used to upgrade existing
logging deployments.

When you have finished updating your inventory file, follow the instructions in
Deploying the EFK Stack to run the openshift-logging.yml playbook and complete the
logging deployment upgrade.
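For example (the inventory file path is environment-specific):

# ansible-playbook -i </path/to/inventory/file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml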

If your Fluentd DeploymentConfig and DaemonSet for the EFK components are
already set with:

image: <image_name>:<vX.Y>
imagePullPolicy: IfNotPresent

The latest version of <image_name> might not be pulled if an image with the
same <image_name>:<vX.Y> is already stored locally on the node where the pod is
being re-deployed. If so, manually change the DeploymentConfig and DaemonSet to
imagePullPolicy: Always to make sure the image is re-pulled.
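A hedged example of making that change for the Fluentd DaemonSet with oc patch;
the logging namespace and the fluentd-elasticsearch container name are
assumptions based on a default logging deployment:

$ oc patch -n logging ds/logging-fluentd \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"fluentd-elasticsearch","imagePullPolicy":"Always"}]}}}}'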

Upgrading Cluster Metrics

Manual upgrade steps for metrics deployments are no longer available starting in
OKD
1.5.

To upgrade an existing cluster metrics deployment, you must use the provided
/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml
Ansible playbook. This is the playbook to use if you were deploying metrics for
the first time on an existing cluster, but is also used to upgrade existing
metrics deployments.
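For example (the inventory file path is environment-specific):

# ansible-playbook -i </path/to/inventory/file> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml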