Overview

OKD supports the Kubernetes
Container Network
Interface (CNI) as the interface between OKD and Kubernetes.
Software-defined network (SDN) plug-ins are a powerful and flexible way to match
network capabilities to your networking needs. Additional plug-ins that support
the CNI interface can be added as needed.

OpenShift SDN

OKD uses a software-defined networking (SDN) approach for
connecting pods in an OKD cluster. The OpenShift SDN connects all
pods across all node hosts, providing a unified cluster network.

OpenShift SDN is installed and configured by default as part of the
Ansible-based installation procedure. See the
OpenShift SDN section
for more information.

Flannel SDN

flannel is a virtual networking layer designed specifically for containers.
OKD can use it for networking containers instead of the default
software-defined networking (SDN) components. This is useful if you are running
OKD within a cloud provider platform that also relies on SDN,
such as OpenStack, and you want to avoid encapsulating packets twice through
both platforms.
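If you are using the Ansible-based installation, the flannel plug-in is typically
selected through an inventory variable. A minimal sketch, assuming the
openshift-ansible installer (verify the variable name against your installer
version):

    # Ansible inventory excerpt ([OSEv3:vars])
    openshift_use_flannel=true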

Architecture

OKD runs flannel in host-gw mode, which maps routes from
container to container. Each host within the network runs an agent called
flanneld, which is responsible for:

Managing a unique subnet on each host

Distributing IP addresses to each container on its host

Mapping routes from one container to another, even if on different hosts

Each flanneld agent provides this information to a centralized etcd store so
other agents on hosts can route packets to other containers within the
flannel network.
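As an illustration of what host-gw mode produces, the network configuration that
flanneld reads from etcd and the per-host route it programs might look like the
following; all CIDRs and addresses here are invented for the example:

    # flannel network configuration stored in etcd (illustrative values)
    {
      "Network": "10.128.0.0/14",
      "SubnetLen": 23,
      "Backend": {
        "Type": "host-gw"
      }
    }

    # resulting route on host A toward the subnet owned by host B (192.168.0.11)
    10.128.2.0/23 via 192.168.0.11 dev eth0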

The following diagram illustrates the architecture and data flow from one
container to another using a flannel network:

Contiv SDN

Contiv is an open-source networking plug-in module
for container infrastructure. Contiv provides an infrastructure for
application-oriented network policies and supports a range of
networking modes. These include:

A configurable set of overlay networking modes.

Physical networking modes.

Support for industry-leading hardware.

OKD can use Contiv for networking containers instead of the default
OpenShift
SDN.

Contiv configuration instructions are forthcoming.

Architecture

Each node within the cluster runs a Contiv agent called netplugin while the
master hosts run the Contiv controller (called netmaster), along with
supporting control plane components (such as etcd).

Together the components of Contiv (netmaster and netplugin) handle key
networking functions for OKD including:

Assigning IP addresses to each container pod on each cluster node.

Creating and managing multiple separate container network instances for
different groups of containers.

Configuring the network forwarding layer components for layer two or layer three
forwarding.

Configuring and enforcing a range of network policies.

Providing management interfaces (including both CLI and GUI) to configure and
manage Contiv-specific features and configurations.

Contiv uses the
Container Network
Interface (CNI) to interface with OKD and Kubernetes. A key-value
store based on etcd is used to store Contiv-specific state information. This
store is in addition to, and separate from, the instance of etcd used by other
components in the system, including OKD itself.

Nuage SDN for OKD

Nuage
Networks' SDN solution delivers highly scalable, policy-based overlay
networking for pods in an OKD cluster. Nuage SDN can be installed
and configured as a part of the Ansible-based installation procedure. See the
Advanced
Installation section for information on how to install and deploy
OKD with Nuage SDN.

Nuage Networks provides a highly scalable,
policy-based SDN platform called Virtualized Services Platform (VSP). Nuage VSP
uses an SDN Controller, along with the open source Open vSwitch for the data
plane.

Nuage VSP integrates with OKD to allow business applications to be
quickly turned up and updated by removing the network lag faced by DevOps teams.

Figure 1. Nuage VSP Integration with OKD

There are two specific components responsible for the integration.

The nuage-openshift-monitor service, which runs as a separate service on the
OKD master node.

The vsp-openshift plug-in, which is invoked by the OKD runtime on each of the nodes of the cluster.

Nuage Virtual Routing and Switching software (VRS) is based on open source Open
vSwitch and is responsible for the datapath forwarding. The VRS runs on each
node and gets policy configuration from the controller.

Nuage VSP Terminology

Figure 2. Nuage VSP Building Blocks

Domains: An organization contains one or more domains. A domain is a single "Layer 3" space. In standard networking terminology, a domain maps to a VRF instance.

Zones: Zones are defined under a domain. A zone does not map to anything on the network directly, but instead acts as an object with which policies are associated such that all endpoints in the zone adhere to the same set of policies.

Subnets: Subnets are defined under a zone. A subnet is a specific Layer 2 subnet within the domain instance. A subnet is unique and distinct within a domain; that is, subnets within a domain are not allowed to overlap or to contain other subnets, in accordance with standard IP subnet definitions.

VPorts: A VPort is a new level in the domain hierarchy, intended to provide more granular configuration. In addition to containers and VMs, VPorts are also used to attach Host and Bridge Interfaces, which provide connectivity to Bare Metal servers, Appliances, and Legacy VLANs.

A Nuage subnet is not mapped to an OKD node, but a subnet for a
particular project can span multiple nodes in OKD.

A pod spawning in OKD translates to a virtual port being created in
VSP. The vsp-openshift plug-in interacts with the VRS and gets a policy for
that virtual port from the VSD via the VSC. Policy Groups are supported to group
multiple pods together that must have the same set of policies applied to them.
Currently, pods can only be assigned to policy groups using the
operations
workflow, where a policy group is created by the administrative user in VSD. A
pod's membership in a policy group is specified by means of the
nuage.io/policy-group label in the pod's specification.
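For illustration, a pod specification carrying that label might look like the
following; the pod name, image, and policy group name are placeholders, and the
policy group itself must already have been created by the administrator in VSD:

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-pod
      labels:
        # must match a policy group pre-created in VSD
        nuage.io/policy-group: db-tier
    spec:
      containers:
      - name: app
        image: registry.example.com/app:latest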

Integration Components

Nuage VSP integrates with OKD using two main components:

nuage-openshift-monitor

vsp-openshift plugin

nuage-openshift-monitor

nuage-openshift-monitor is a service that monitors the OKD API
server for creation of projects, services, users, user-groups, etc.

In the case of a Highly Available (HA) OKD cluster with multiple
masters, the nuage-openshift-monitor process runs on all the masters independently,
without any change in functionality.

For the developer workflow, nuage-openshift-monitor also auto-creates VSD
objects by exercising the VSD REST API to map OKD constructs to VSP
constructs. Each cluster instance maps to a single domain in Nuage VSP. This
allows a given enterprise to potentially have multiple cluster installations,
one per domain instance for that enterprise in Nuage. Each OKD
project is mapped to a zone in the domain of the cluster on the Nuage VSP.
Whenever nuage-openshift-monitor sees the addition or deletion of a project,
it instantiates a zone corresponding to that project using the VSDK APIs and
allocates a block of subnets for that zone. Additionally,
nuage-openshift-monitor also creates a network macro group for the project.
Likewise, whenever nuage-openshift-monitor sees the addition or deletion of a
service, it creates a network macro corresponding to the service IP and assigns
that network macro to the network macro group for that project (a user-provided
network macro group specified using labels is also supported) to enable
communication to that service.

For the developer workflow, all pods that are created within the zone get IPs
from that subnet pool. Subnet pool allocation and management are handled by
nuage-openshift-monitor based on a couple of plug-in-specific parameters in
the master-config file. However, the actual IP address resolution and vport
policy resolution is still done by VSD based on the domain/zone that gets
instantiated when the project is created. If the initial subnet pool is
exhausted, nuage-openshift-monitor carves out an additional subnet from the
cluster CIDR to assign to a given project.
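The exact plug-in parameters depend on your release, but the subnet pool is
derived from the cluster network settings in the master configuration. A
representative networkConfig stanza, with illustrative values, is shown below:

    # /etc/origin/master/master-config.yaml (excerpt; values are examples only)
    networkConfig:
      clusterNetworkCIDR: 10.128.0.0/14
      hostSubnetLength: 9
      serviceNetworkCIDR: 172.30.0.0/16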

For the operations workflow, users specify Nuage-recognized labels on their
application or pod specification to resolve the pods into specific user-defined
zones and subnets. However, this cannot be used to resolve pods in the zones or
subnets created via the developer workflow by nuage-openshift-monitor.

In the operations workflow, the administrator is responsible for pre-creating
the VSD constructs to map the pods into a specific zone/subnet as well as to allow
communication between OpenShift entities (ACL rules, policy groups, network
macros, and network macro groups). A detailed description of how to use Nuage
labels is provided in the Nuage VSP
OpenShift Integration Guide.

vsp-openshift Plug-in

The vsp-openshift networking plug-in is called by the OKD runtime on
each OKD node. It implements the network plug-in init and pod setup,
teardown, and status hooks. The vsp-openshift plug-in is also responsible for
allocating the IP address for the pods. In particular, it communicates with the
VRS (the forwarding engine) and configures the IP information onto the pod.

F5 BIG-IP® Router Plug-in

A router is one way to get traffic into the cluster. The F5 BIG-IP® Router plug-in is one of the available
router plug-ins.

The F5 router has feature parity with the
HAProxy
template router, and has additional features over the F5 BIG-IP® support in
OpenShift v2.
Compared with the routing-daemon used in earlier
versions, the F5 router additionally supports:

path-based routing (using policy rules),

re-encryption (implemented using client and server SSL profiles), and

passthrough of encrypted connections (implemented using an iRule that parses
the SNI protocol and uses a data group that is maintained by the F5 router for
the servername lookup).

Passthrough routes are a special case: path-based routing is technically
impossible with passthrough routes because F5 BIG-IP® itself does not see the
HTTP request, so it cannot examine the path. The same restriction applies to the
template router; it is a technical limitation of passthrough encryption, not a
technical limitation of OKD.
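As a concrete sketch using the oc CLI (service and hostnames are placeholders):
a passthrough route is created with a hostname only, while an edge route can
carry a path because the router terminates TLS and can inspect the HTTP request:

    # passthrough: TLS is not terminated by the router, so no path can be matched
    $ oc create route passthrough myapp-secure --service=myapp --hostname=myapp.example.com

    # edge: the router sees the HTTP request, so path-based routing is possible
    $ oc create route edge myapp-api --service=myapp --hostname=myapp.example.com --path=/api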

Routing Traffic to Pods Through the SDN

Because F5 BIG-IP® is external to the
OpenShift SDN, a
cluster administrator must create a peer-to-peer tunnel between F5 BIG-IP® and
a host that is on the SDN, typically an OKD node host.
This
ramp
node can be configured as
unschedulable
for pods so that it does nothing except act as a gateway for the
F5 BIG-IP® host.
It is also possible to configure multiple such hosts and use
the OKD ipfailover feature for redundancy; the F5 BIG-IP® host would
then need to be configured to use the ipfailover VIP for its tunnel’s remote
endpoint.
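For example, the ramp node can be marked unschedulable with the administrator
CLI; the node name is a placeholder and the exact subcommand can vary across
OKD 3.x releases:

    # prevent ordinary pods from being scheduled onto the ramp node
    $ oc adm manage-node ramp-node.example.com --schedulable=false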

F5 Integration Details

The operation of the F5 router is similar to that of the OKD
routing-daemon used in earlier versions. Both use REST API calls to:

create and delete pools,

add endpoints to and delete them from those pools, and

configure policy rules to route to pools based on vhost.

Both also use scp and ssh commands to upload custom TLS/SSL certificates to
F5 BIG-IP®.
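The router's actual requests are internal to the plug-in, but a pool-creation
call of this general shape against the F5 iControl REST API illustrates the kind
of call involved; the host, credentials, and pool name below are placeholders,
not the plug-in's literal request:

    # create a pool via iControl REST (illustrative only)
    $ curl -sk -u admin:password \
        -H "Content-Type: application/json" \
        -X POST https://bigip.example.com/mgmt/tm/ltm/pool \
        -d '{"name": "okd_route_example", "loadBalancingMode": "round-robin"}'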

The F5 router configures pools and policy rules on virtual servers as follows:

When a user creates or deletes a route on OKD, the router creates a
pool on F5 BIG-IP® for the route (if no pool already exists) and adds a rule to, or
deletes a rule from, the policy of the appropriate vserver: the HTTP vserver for
non-TLS routes, or the HTTPS vserver for edge or re-encrypt routes. In the case
of edge and re-encrypt routes, the router also uploads and configures the TLS
certificate and key. The router supports host- and path-based routes.

Passthrough routes are a special case: to support those, it is necessary to
write an iRule that parses the SNI ClientHello handshake record and looks up the
servername in an F5 data-group. The router creates this iRule, associates the
iRule with the vserver, and updates the F5 data-group as passthrough routes are
created and deleted. Other than this implementation detail, passthrough routes
work the same way as other routes.

When a user creates a service on OKD, the router adds a pool to F5
BIG-IP® (if no pool already exists). As endpoints on that service are created
and deleted, the router adds and removes corresponding pool members.

When a user deletes the route and all endpoints associated with a particular
pool, the router deletes that pool.

F5 Native Integration

With
native
integration of F5 with OKD, you do not need to configure a ramp
node for F5 to be able to reach the pods on the overlay network as created by
OpenShift SDN.

Also, only F5 BIG-IP® appliance version 12.x and above works with the native integration
presented in this section. You also need the sdn-services add-on license for the
integration to work properly.
For version 11.x, set up a ramp
node.

Connection

The F5 appliance can connect to the OKD cluster via an L3
connection. L2 switch connectivity is not required between OKD
nodes. On the appliance, you can use multiple interfaces to manage the
integration:

Internal interface - Programs the appliance and reaches out to the pods.

An F5 controller pod has admin access to the appliance. The F5 image is
launched within the OKD cluster (it can be scheduled on any node) and uses
iControl REST APIs to program the virtual servers with policies and to configure
the VxLAN device.
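The controller pod is typically created with the router tooling. A minimal
sketch, assuming the 3.x oc adm router command; the F5-specific flag names can
vary by release, so verify them with oc adm router --help:

    $ oc adm router f5-router --type=f5-router \
        --external-host=bigip.example.com \
        --external-host-username=admin \
        --external-host-password=password \
        --external-host-http-vserver=ose-vserver \
        --external-host-https-vserver=https-ose-vserver \
        --service-account=router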

Data Flow: Packets to Pods

This section explains how the packets reach the pods, and vice versa. These
actions are performed by the F5 controller pod and the F5 appliance, not the
user.

When natively integrated, the F5 appliance reaches out to the pods directly
using VxLAN encapsulation. This integration works only when OKD is
using openshift-sdn as the network plug-in. The openshift-sdn plug-in
employs VxLAN encapsulation for the overlay network that it creates.

To make a successful data path between a pod and the F5 appliance:

F5 needs to encapsulate the VxLAN packet meant for the pods. This requires the
sdn-services license add-on. A VxLAN device needs to be created and the pod
overlay network needs to be routed through this device.

F5 needs to know the VTEP IP address of the pod, which is the IP address of the
node where the pod is located.

F5 needs to know which source-ip to use for the overlay network when
encapsulating the packets meant for the pods. This is known as the gateway address.

OKD nodes need to know where the F5 gateway address is (the VTEP
address for the return traffic). This needs to be the internal interface’s
address. All nodes of the cluster must learn this automatically.

Since the overlay network is multi-tenant aware, F5 must use a VxLAN ID that is
representative of an admin domain, ensuring that all tenants are reachable by
the F5. Ensure that F5 encapsulates all packets with a vnid of 0 (the
default vnid for the admin namespace in OKD) by putting an
annotation on the manually created hostsubnet -
pod.network.openshift.io/fixed-vnid-host: 0.

A ghost hostsubnet is manually created as part of the setup, which fulfills
the third and fourth listed requirements. When the F5 controller pod is launched,
this new ghost hostsubnet is provided so that the F5 appliance can be
programmed suitably.

The term ghost hostsubnet is used because it suggests that a subnet has been
given to a node of the cluster. However, in reality, there is no real node of
the cluster behind it; the subnet is hijacked by an external appliance.
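A hedged sketch of such a ghost hostsubnet, carrying the annotation described
above, is shown below; the name, addresses, and subnet are placeholders, and the
field names follow the 3.x HostSubnet resource:

    # ghost hostsubnet representing the F5 appliance (illustrative)
    apiVersion: v1
    kind: HostSubnet
    metadata:
      name: f5-ghost-node
      annotations:
        # encapsulate all packets with vnid 0 (the admin namespace default)
        pod.network.openshift.io/fixed-vnid-host: "0"
    host: f5-ghost-node
    hostIP: 10.3.89.213      # F5 internal interface address (VTEP for return traffic)
    subnet: 10.131.0.0/23    # subnet assigned to the "ghost" node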

The first requirement is fulfilled by the F5 controller pod once it is launched.
The second requirement is also fulfilled by the F5 controller pod, but it is an
ongoing process. For each new node that is added to the cluster, the controller
pod creates an entry in the VxLAN device’s VTEP FDB. The controller pod needs
access to the nodes resource in the cluster, which you can accomplish by
giving the service account appropriate privileges. Use the following command:
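A likely form of that command, assuming the controller pod runs under the
default router service account (adjust the cluster role and service account
names to your deployment):

    # grant the router service account read access to nodes and related SDN resources
    $ oc adm policy add-cluster-role-to-user system:sdn-reader system:serviceaccount:default:router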