Part 1 of this series covered the creation of the virtualized infrastructure on which to build a Kubernetes cluster. There are a variety of tools for building clusters, including Kops, Kubespray and Kubeadm. Kubeadm is perhaps the most popular tool for bootstrapping clusters, followed by Kubespray.

Kubespray

In essence, Kubespray is a collection of Ansible playbooks: YAML files that specify what actions should take place against one or more machines listed in a hosts.ini file, which resides in what is known as an inventory. Of all the infrastructure-as-code tools available at the time of writing, Ansible is the most popular and has the greatest traction. Examples of playbooks produced by Microsoft can be found on GitHub for automating tasks in Azure and deploying SQL Server availability groups on Linux. The good news for anyone into PowerShell is that PowerShell modules can be installed and PowerShell commands executed via Ansible. Also, there are people already using PowerShell desired state configuration with Ansible. Ansible's popularity is down to the fact that it is easy to pick up and agent-less, because it relies on ssh; hence one of the steps in this post includes the creation of keys for ssh. This free tutorial is highly recommended for anyone wishing to pick up Ansible.
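As a flavour of what a playbook looks like, below is a minimal, purely illustrative example (it is not part of Kubespray); it uses Ansible's built-in ping module, which performs an ssh round trip rather than an ICMP ping, to confirm that every host in the inventory is reachable:

# ping.yml - confirm Ansible can reach every machine in hosts.ini
- hosts: all              # run against every host in the inventory
  gather_facts: false     # skip fact gathering, this is just a connectivity check
  tasks:
    - name: Check ssh connectivity
      ping:               # Ansible's built-in ping module (ssh, not ICMP)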

Cluster Topology Recap

The cluster this blog post covers the creation of comprises the following nodes and etcd instances. Note that any alphabetic characters used have to be lower case; you can use whatever naming convention you like, but below is the one I have elected to go with:

two master nodes:

ca-k8s-m01

ca-k8s-m02

three worker nodes:

ca-k8s-w01

ca-k8s-w02

ca-k8s-w03

three etcd instances, residing on:

ca-k8s-m01

ca-k8s-m02

ca-k8s-w01

The cluster will be deployed and administered from ca-k8s-boot.

Deploying Big Data Clusters On VMware

The process outlined in this blog post has been used to create Kubernetes clusters on VMware, ESXi 6.7 to be exact, onto which big data clusters have then been successfully deployed.

Cluster Creation

Add entries for the two master and three worker nodes to the /etc/hosts file on the bootstrap virtual machine.
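With hypothetical addresses (substitute those assigned to your own virtual machines), the entries look like this:

192.168.1.101 ca-k8s-m01
192.168.1.102 ca-k8s-m02
192.168.1.103 ca-k8s-w01
192.168.1.104 ca-k8s-w02
192.168.1.105 ca-k8s-w03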

Check that each machine can be pinged from the bootstrap machine in order to verify that the networking setup is sane.
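A simple bash loop covers all five nodes in one pass, sending a single ping to each:

for host in ca-k8s-m01 ca-k8s-m02 ca-k8s-w01 ca-k8s-w02 ca-k8s-w03; do ping -c 1 $host; done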

Create a key pair on the bootstrap machine, which I have called ca-k8s-boot in my example:

ssh-keygen

This will result in prompts for the location of the key file and for a passphrase; hit enter at each prompt to accept the defaults.
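The public key then needs to be copied to each node so that Ansible can log on without being prompted for a password; assuming the same user exists on every node (cadkin in my example), ssh-copy-id does this in one loop:

for host in ca-k8s-m01 ca-k8s-m02 ca-k8s-w01 ca-k8s-w02 ca-k8s-w03; do ssh-copy-id cadkin@$host; done

The steps that follow assume the Kubespray repository has been cloned onto the bootstrap machine; at the time of writing the upstream repository lives under kubernetes-sigs on GitHub:

git clone https://github.com/kubernetes-sigs/kubespray.git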

Addendum 1st January 2019: In the working directory that the git clone was performed from, change directory to the kubernetes directory and install the Python packages specified in the requirements.txt file. This pulls in all the Python packages and software required for the rest of the Kubernetes cluster creation process (including Ansible, which is used in the next step):

pip3 install -r requirements.txt

With the kubernetes directory as your working directory, create a copy of the sample inventory directory for your cluster:

cp -rfp inventory/sample inventory/<cluster_name>

In the directory you created as part of the previous step under inventory, edit the hosts.ini file. For the cluster described in this post it should look like this:
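The IP addresses below are hypothetical placeholders, and the group names (kube-master, etcd, kube-node and k8s-cluster) are those used by Kubespray at the time of writing:

[all]
ca-k8s-m01 ansible_host=192.168.1.101 ip=192.168.1.101
ca-k8s-m02 ansible_host=192.168.1.102 ip=192.168.1.102
ca-k8s-w01 ansible_host=192.168.1.103 ip=192.168.1.103
ca-k8s-w02 ansible_host=192.168.1.104 ip=192.168.1.104
ca-k8s-w03 ansible_host=192.168.1.105 ip=192.168.1.105

[kube-master]
ca-k8s-m01
ca-k8s-m02

[etcd]
ca-k8s-m01
ca-k8s-m02
ca-k8s-w01

[kube-node]
ca-k8s-w01
ca-k8s-w02
ca-k8s-w03

[k8s-cluster:children]
kube-master
kube-node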

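With the inventory populated, build the cluster by running Kubespray's cluster.yml playbook from the same directory. The invocation below follows the Kubespray documentation (the exact flags can vary between versions), with the output piped through tee into a log file; cluster_build.log is simply a name of my choosing:

ansible-playbook -i inventory/<cluster_name>/hosts.ini --become --become-user=root cluster.yml | tee cluster_build.log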
Issuing the command below displays the end of the log file without line wrapping. If the playbook has run successfully, the values associated with unreachable and failed in the play recap section of the log should all be zero:
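Assuming the output was captured in cluster_build.log as above, less with -S chops long lines instead of wrapping them, and +G jumps straight to the end of the file:

less -S +G cluster_build.log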

Kubectl Installation and Configuration

Kubectl is the primary tool for administering a Kubernetes cluster and deploying applications to it. This section covers installing and configuring the tool:
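If kubectl is not already present on the bootstrap machine, it can be installed in a number of ways; on Ubuntu, for example, via a snap:

sudo snap install kubectl --classic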

Under the home directory of the Kubernetes administration user (I've gone with cadkin), create a directory to hold the kubectl config file:

mkdir ~/.kube

Log onto one of the master node virtual machines, ca-k8s-m01 for example, and change permissions on the Kubernetes admin.conf file as follows:

sudo chmod 775 /etc/kubernetes/admin.conf

On the boot server, ca-k8s-boot, copy the admin.conf file to the .kube directory:

sudo scp cadkin@ca-k8s-m01:/etc/kubernetes/admin.conf ~/.kube/config
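Because the copy was performed with sudo, the config file will be owned by root; handing ownership of the .kube directory back to the administration user (cadkin in my example) avoids permission problems when running kubectl:

sudo chown -R cadkin:cadkin ~/.kube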

Check that kubectl has picked up the context of the cluster in the configuration file:

kubectl config get-contexts

The output from this command should look something like this:
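For a cluster built by Kubespray, something along these lines would be expected, assuming Kubespray's default cluster name of cluster.local (this output is illustrative rather than verbatim):

CURRENT   NAME                             CLUSTER         AUTHINFO           NAMESPACE
*         kubernetes-admin@cluster.local   cluster.local   kubernetes-admin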

Let's take a look at the system pods:

kubectl get po --all-namespaces
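On a healthy cluster every pod should report a STATUS of Running; abbreviated, illustrative output is shown below (pod names, counts and the network plugin in play will vary):

NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   calico-node-x7kp9                   1/1     Running   0          10m
kube-system   coredns-58687784f9-qv9t4            1/1     Running   0          9m
kube-system   kube-apiserver-ca-k8s-m01           1/1     Running   0          11m
...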

Coming Up In Part 3

At present all communication between the cluster and the outside world, colloquially referred to as "north-south traffic", is handled by something known as a NodePort. By default we have to make our own provision for load balancing this traffic; luckily, there is an incredibly simple way to achieve this, which will be covered in the next post in this series.