Cluster Installation

Overview

This guide describes a simple procedure for installing a single-node Kubernetes cluster and joining it to the SLATE federation. There are many other ways to install Kubernetes; this is just one easy way to get started quickly.

SLATE Reference Hardware

SLATE works with a variety of cluster configurations, from single-node SLATELite virtual machines to large clusters of dedicated hardware. The following reference configurations are known to work as SLATE edge servers:

Prerequisites

This guide assumes a freshly installed CentOS 7 system. All techniques should generalize to other suitably modern Linux systems, but specific commands can differ.

This guide also assumes that your Kubernetes head node (or control plane) has a publicly accessible IP address with port 6443 open, so that the SLATE API server can communicate with your cluster.

Finally, you will need at least one additional publicly accessible IP address that is not currently assigned to any specific machine. This is needed in order to install a Kubernetes load balancer, which will in turn allocate an address to an Ingress Controller that provides convenient access to users’ services.

Obtain a SLATE token

Every cluster must be administered by a SLATE group. If there is already a group which should be responsible for this cluster, and you are not a member, you should request to join it. You can also create a new group to administer your cluster (go to the ‘My Groups’ page and click ‘Register New Group’). When you create a group you are automatically its first member. If you create a new group whose primary purpose will be to administer this cluster (and possibly others) you should select ‘Resource Provider’ as the field of science for it.

A group can administer multiple clusters, so if you are already a member of a suitable group you do not need to create another. A group can also administer both clusters and applications, which may run both on clusters which it administers and on clusters which it does not.

Install a copy of the SLATE CLI client on the machine which you are setting up:

Once the script has finished, you should have a working single-node Kubernetes cluster registered with SLATE. You can then jump to allowing groups on your cluster.

System Configuration Tweaks

Docker and Kubernetes can be picky about the state of the system on which they run. In particular it is possible to use these together with SELinux, but doing so can be tricky, and requires expertise. Likewise, while Kubernetes can run on systems with swap memory, this is not recommended by the developers. Therefore, you should disable SELinux and swap (from this point, until otherwise noted, commands must be run as superuser):
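A minimal sketch of these two steps, assuming a stock CentOS 7 layout (SELinux settings in /etc/selinux/config, swap entries in /etc/fstab). The helper function names are hypothetical and chosen only for illustration; each takes the file to edit as an argument, so it can be tried on a copy first.

```shell
# Hypothetical helpers; the function names are illustrative, not part of any tool.

# Switch an SELinux config file from enforcing/permissive to disabled.
disable_selinux_in_config() {
    sed -i -E 's/^SELINUX=(enforcing|permissive)[[:space:]]*$/SELINUX=disabled/' "$1"
}

# Comment out every active swap entry in an fstab-style file.
disable_swap_in_fstab() {
    sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# On the real system, as root (assumed default CentOS 7 paths):
#   setenforce 0                                     # stop enforcing immediately
#   disable_selinux_in_config /etc/selinux/config    # persist across reboots
#   swapoff -a                                       # turn swap off now
#   disable_swap_in_fstab /etc/fstab                 # keep it off after reboot
```

The in-place sed edits rely on GNU sed, which is what CentOS ships.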

Next, enable the kubelet (the per-node Kubernetes agent) to run automatically:

systemctl enable --now kubelet

Some Kubernetes networking plugins rely on iptables to filter traffic on the bridge network, but CentOS has this turned off by default to better support virtual machine use cases. So, we turn it back on:
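One way to do this is sketched below, assuming the settings go in a sysctl drop-in file. The function name and the k8s.conf file name are arbitrary choices for illustration.

```shell
# Write the two bridge-netfilter settings to a sysctl drop-in file.
write_bridge_sysctl() {
    cat > "$1" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}

# On the real system, as root (the file name k8s.conf is an arbitrary choice):
#   modprobe br_netfilter        # the sysctl keys exist only once this module is loaded
#   write_bridge_sysctl /etc/sysctl.d/k8s.conf
#   sysctl --system              # reload settings from all sysctl configuration files
```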

At this point it is time to initialize the Kubernetes cluster. The pod networking CIDR range must be configured to match the expectations of the networking plugin which we will install later. In this case, we will use Calico, so we use its preferred setting of 192.168.0.0/16:

kubeadm init --pod-network-cidr=192.168.0.0/16

After kubeadm completes you can copy the resulting Kubernetes config file to the home directory of whichever user you used to begin the installation process, which need not be privileged (from this point on, commands are assumed not to be run as superuser):
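A sketch of that copy step, assuming kubeadm wrote its admin config to the usual /etc/kubernetes/admin.conf and that kubectl should read it from ~/.kube/config. The function name is hypothetical.

```shell
# Copy the cluster admin kubeconfig into a user's home directory.
# Both arguments are optional; the defaults are the usual kubeadm locations.
install_kubeconfig() {
    src="${1:-/etc/kubernetes/admin.conf}"
    dest="${2:-$HOME/.kube/config}"
    mkdir -p "$(dirname "$dest")"
    if [ -r "$src" ]; then
        cp "$src" "$dest"
    else
        # admin.conf is normally readable only by root
        sudo cp "$src" "$dest"
        sudo chown "$(id -u):$(id -g)" "$dest"
    fi
    # The file holds cluster admin credentials, so keep it private
    chmod 600 "$dest"
}

# Typical use, as the unprivileged user:
#   install_kubeconfig
#   kubectl get nodes
```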

This will install the components of the load balancer itself (to its own namespace, metallb-system), but it will not yet be active, as it is not configured with any IP addresses it can allocate. MetalLB supports using Layer 2 protocols or BGP to advertise the addresses it assigns; Layer 2 is usually easier to set up and does not require interacting with networking hardware. Create a YAML config file like the following:

Fill in the range of addresses you have available to use either as a range (e.g. 192.168.1.240-192.168.1.250) or as a CIDR prefix (e.g. 192.168.10.0/24). Note that you can use a single IP address in your pool, but doing so requires writing it as a range (like 192.168.1.240-192.168.1.240).
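As an illustration, the helpers below generate such a file. This is a sketch that assumes the ConfigMap-based configuration used by MetalLB releases before 0.13 (later releases are configured through custom resources instead). The function names, pool name and output file name are hypothetical.

```shell
# Normalise one address specification: ranges and CIDR prefixes pass through,
# a bare single address is expanded to the "A-A" range form.
pool_entry() {
    case "$1" in
        *-*|*/*) printf '%s\n' "$1" ;;
        *)       printf '%s-%s\n' "$1" "$1" ;;
    esac
}

# Write a Layer 2 ConfigMap for MetalLB to the given file.
write_metallb_config() {
    cat > "$2" <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - $(pool_entry "$1")
EOF
}

# Example (addresses and file name are placeholders):
#   write_metallb_config 192.168.1.240-192.168.1.250 metallb-config.yaml
```

Passing a single address such as 192.168.1.240 produces the range 192.168.1.240-192.168.1.240 in the generated file.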

After you have prepared your configuration file, install it for use by MetalLB by running:

MetalLB on OpenStack

If your Kubernetes cluster is installed on one or more virtual machines run by OpenStack, there is one small, extra step required to enable MetalLB to route traffic properly. See the MetalLB documentation for details; in short, OpenStack must be informed that traffic sent to IP addresses controlled by MetalLB has a valid reason to be going to the VMs which make up the Kubernetes cluster.

Join the cluster to the SLATE federation

To join your cluster to the SLATE federation, you will need:

The name of the group you created in the first section (or the existing group to which you are adding this cluster).

The name of the organization which formally owns the cluster (typically the name of your institution or lab). Note that if this contains spaces you will need to quote it in the command below.

The name you want to use for the cluster in SLATE, which must contain only lowercase letters, numbers and dashes, and should ideally be short but descriptive.

Put the appropriate names into the following command, which when run will install supporting components into your cluster, then contact the SLATE federation to register the cluster with it.
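Before running the registration, it can help to check the cluster name against the rule above. The sketch below does that and then prints, without running, a registration command. The function names are hypothetical, and the `slate cluster create --group ... --org ...` form shown is an assumption about the SLATE client's syntax; check it against the client's own help output before use.

```shell
# Succeed only if the name is non-empty and made of lowercase letters, digits and dashes.
valid_cluster_name() {
    printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9-]+$'
}

# Print (without running) a registration command for the given group,
# organization and cluster name. The slate syntax here is an assumption.
print_create_command() {
    if ! valid_cluster_name "$3"; then
        echo "invalid cluster name: $3" >&2
        return 1
    fi
    # The organization is double-quoted so that names containing spaces survive.
    printf 'slate cluster create --group %s --org "%s" %s\n' "$1" "$2" "$3"
}

# Example (all three names are placeholders):
#   print_create_command my-group "Example University" edge-1
```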

After this command completes your cluster should be joined to the federation. You can see this by rerunning slate cluster list, which should now show it.

As a service to users who are curious where your cluster is, it is helpful to also run

slate cluster update <cluster name> --location <latitude>,<longitude>

Allowing groups to run on your SLATE cluster

At this point your cluster is a fully working member of the SLATE platform. However, only your group has access to deploy applications to it. You can leave it in this state as long as you wish, for example to do testing and evaluation. If you want to grant other groups access, you can use

slate cluster allow-group <cluster name> '*'

to grant access to all groups participating in the platform, or replace '*' with the name of a particular group to grant access to just that group.

Joining additional nodes

If you have more worker nodes which you wish to add to the cluster, run the following on the master to generate a command for joining them:

sudo kubeadm token create --print-join-command

Install Docker and Kubernetes on the worker nodes, but stop at the point where kubeadm init was run on the master. Instead, run:

sudo kubeadm join <master IP address>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

substituting in the IP address of your master node, and the hash and token provided by kubeadm token create. Note that the token remains valid for 24 hours, so if you wait longer than that to join a worker you will have to regenerate it.

In case of problems

If setting up your Kubernetes cluster does not work properly, kubeadm reset can be used to revert the effects of kubeadm init (and eliminate anything which had been installed inside Kubernetes). This is obviously a fairly destructive operation if you have gotten to the point of using the cluster for anything.

If you have a problem with SLATE, specifically, you can remove SLATE’s access to your cluster by deleting the ‘cluster’ custom resource which defines its main namespace (called slate-system unless you picked a different name):

kubectl delete cluster slate-system

Please note that this leaves SLATE in a somewhat confused state of expecting to be able to use the cluster but being unable to. If possible, it is nicer to first inform SLATE that it should stop using the cluster:

slate cluster delete <cluster name>

This material is based upon work supported by the National Science Foundation under Grant No. 1724821. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.