Building a distributed Kubernetes platform from scratch

While I was building the VMware Photon 1.2 platform on my Intel NUC, I ran into some challenges running the Worker node VMs successfully (hence the reason I couldn't continue on Part 2). While trying to debug the problem, I soon realized that I needed to understand the Kubernetes platform a lot better.

So in this article, I am venturing into building a scalable, distributed Kubernetes platform from scratch. My approach is always the same: to understand what it takes to build any platform in a distributed manner, so that we gain a better understanding of all the moving parts and the interactions among them.

The scope of this article is to help you build a fully distributed Kubernetes platform all by yourself, so you have it at your disposal on your own machine, can start playing with it, and thereby learn the platform better. This article doesn't get into the basics of the Kubernetes platform per se. There are many articles online that do a fantastic job of that, which I point to under References.

For now, I am focusing only on building the platform from scratch (you can stop right here if this is not of interest to you :-)). Once I am done with that part, I point to some good links/resources that do a very good job of teaching how to run workloads with their YAML files and, in the process, explain all the features the platform as a whole provides.

My physical setup is as follows:

I use an Intel NUC running the ESXi 6 hypervisor. This is where all the virtual machines that make up the Kubernetes platform run.

I then built the entire Kubernetes platform on my NUC and access it from my laptop.

Once the Kubernetes platform is completely built, you will have multiple networks, each for a different purpose:

10.1.1.0/24 network – This is the network defined between my MacBook Pro and my Intel NUC

172.16.10.0/24 network – This is the private network on which all the VMs that make up the Kubernetes platform reside

10.20.0.0/16 network – This is the overlay network created by flanneld on top of the 172.16.10.0/24 network

10.20.xx.yy/24 networks – These are the per-node networks created for the docker daemon on each NODE, so that the overlay address space is spread across multiple NODEs. The most important thing to understand is that each NODE running docker gets a subnet assigned to it by flanneld, and docker then uses that subnet range to assign IP addresses to the containers on its node (this will become clear in Section 4)

10.240.0.0/24 network – This is the cluster-level network whose scope resides within the cluster.

To have a distributed Kubernetes platform, I built a dozen virtual machines in total. Apart from the EDGE VM, the rest of them run CentOS7:

1 x EDGE VM – running the EFW distribution, configured as the network gateway

1 x DNS VM – CentOS7 VM running FreeIPA for DNS service

1 x JUMPBOX VM – CentOS7 VM that you log in to, which has visibility to all the other VMs

3 x ETCD VMs – Three CentOS7 VMs that run the ETCD service, a distributed key/value store. This is where all the information about the Kubernetes platform is stored and shared among all the components

1 x MASTER VM – CentOS7 VM where kube-apiserver, kube-controller-manager, kube-scheduler, flanneld (overlay networking) and docker will be running. It is worth noting that it is possible to architect the platform without running docker on the Master node. In a production system, it is recommended to have a secondary Master as well, fronted by a load balancer. I will skip the secondary Master and load balancer for now and try to revisit them at a later time

5 x NODE VMs – Five CentOS7 VMs where kubelet, kube-proxy, flanneld and docker will be installed. These are the Nodes where the Kubernetes PODs will be running. Each POD holds one or more docker containers that run the workloads

On my Intel NUC running the ESXi 6 hypervisor, I created two standard switches – vSwitch0 (WAN) and vSwitch1 (LAN). Except for the EDGE VM, which acts as a gateway straddling vSwitch0 and vSwitch1, the rest of the VMs are connected to vSwitch1, thereby residing on a separate network (172.16.10.0/24).

The configuration of the virtual machines is as follows:

I have broken down the steps in building the distributed Kubernetes platform into multiple sections:

Configuration of ETCD VMs

Configuration of JUMPBOX VM (specific to my setup)

Configuration of the MASTER VM

Configuration of NODE VMs

Installation of Kubernetes Dashboard

Running a simple NGINX server as a POD

The real learning of the Kubernetes platform starts now

Section 1: Configuration of ETCD VMs

I have seen articles that explain how to build the ETCD distributed store as part of the Master VM. But here I am building ETCD as a separate cluster by itself, made up of 3 VMs (you can go up to 5 VMs). This way, the key/value store is kept truly independent, as it should be in a production-grade setup.

I started with the first VM of the ETCD cluster, namely the ETCD-01 VM. Except for its IP address, the steps are exactly the same for the other two ETCD VMs as well (ETCD-02 and ETCD-03).

To begin with, disable SELinux by editing the file /etc/selinux/config, followed by stopping/disabling the firewall service, installing the open-vm-tools and net-tools packages, and finally updating the entire system with the yum update command.

ETCD-01# vi /etc/selinux/config

(set SELINUX=disabled)

ETCD-01# setenforce 0

ETCD-01# systemctl stop firewalld.service

ETCD-01# systemctl disable firewalld.service

ETCD-01# yum -y install open-vm-tools net-tools

ETCD-01# yum -y update ; sync ; reboot

Create a user and group account named etcd, under which the etcd service will run, then install etcd and set up its configuration file (a sketch of these steps follows).
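Here is a minimal sketch of those steps, assuming etcd comes from the CentOS7 repositories and that ETCD-01 sits at 172.16.10.21 (inferred from ETCD-02 and ETCD-03 being .22 and .23; adjust to your addressing):

ETCD-01# groupadd etcd
ETCD-01# useradd -s /sbin/nologin -g etcd etcd
ETCD-01# yum -y install etcd
ETCD-01# vi /etc/etcd/etcd.conf

# /etc/etcd/etcd.conf (illustrative values for ETCD-01)
ETCD_NAME=etcd-01
ETCD_DATA_DIR="/var/lib/etcd/etcd-01"
ETCD_LISTEN_PEER_URLS="http://172.16.10.21:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.16.10.21:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.10.21:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.16.10.21:2379"
ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.10.21:2380,etcd-02=http://172.16.10.22:2380,etcd-03=http://172.16.10.23:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"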

Now that the configuration is done on the ETCD-01 VM, enable and start the ETCD distributed key/value store service with the following commands:

ETCD-01# systemctl enable etcd.service

ETCD-01# systemctl start etcd.service

ETCD-01# systemctl status -l etcd.service

Finally, repeat the above commands on the other two ETCD VMs, namely ETCD-02 (172.16.10.22) and ETCD-03 (172.16.10.23), with their appropriate IP addresses in their /etc/etcd/etcd.conf files. A quick health check is sketched below.
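To confirm that all three members have formed a healthy cluster, the standard v2 etcdctl checks can be run from any of the ETCD VMs (output will vary with your member IDs):

ETCD-01# etcdctl cluster-health

ETCD-01# etcdctl member list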

Section 2: Configuration of JUMPBOX VM

This section is specific to the way I designed this lab. Since the JUMPBOX VM is my primary VM where I will be working for the most part, I have created a user account named student, and I am setting up that account's environment in such a way that I can manage both the ETCD distributed store and the Kubernetes platform remotely from this JUMPBOX VM. Here is the list of things I am doing on the JUMPBOX VM:

As root, copy the etcdctl command from any of the ETCD VMs into the /usr/local/bin directory

As root, download the kubectl command from the Kubernetes repository and place it in the /usr/local/bin directory

Make changes to the .bashrc file in the home directory (of the student user account)

Make a specific entry in the ETCD distributed store, which is needed later by flanneld to create the overlay network (a sketch follows)
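That entry is the key flanneld will read, /overlay/network/config, carrying the 10.20.0.0/16 overlay address space (see Section 3). A minimal sketch with the v2 etcdctl, assuming ETCD-01's client endpoint at 172.16.10.21:2379:

JUMPBOX$ etcdctl --endpoints http://172.16.10.21:2379 set /overlay/network/config '{ "Network": "10.20.0.0/16" }'

JUMPBOX$ etcdctl --endpoints http://172.16.10.21:2379 get /overlay/network/config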

Section 3: Configuration of MASTER-01 VM

If you recall, /overlay/network/config is exactly what we added into the ETCD distributed store in Section 2. And it is that network address space (10.20.0.0/16) that flannel uses to create subnets and assign them to the docker daemon running on each NODE VM. Again, I would strongly recommend Peng Xiao's website for a thorough explanation and better understanding.
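On the CentOS7 flannel package, the pointer to the ETCD cluster and to this key prefix goes into /etc/sysconfig/flanneld. An illustrative sketch (older flannel packages name these variables FLANNEL_ETCD and FLANNEL_ETCD_KEY instead):

# /etc/sysconfig/flanneld (illustrative)
FLANNEL_ETCD_ENDPOINTS="http://172.16.10.21:2379,http://172.16.10.22:2379,http://172.16.10.23:2379"
FLANNEL_ETCD_PREFIX="/overlay/network"

Note that the prefix is /overlay/network; flanneld appends /config when it looks up the network definition.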

Now that all the requirements are met, enable and start each service in the following order:

MASTER-01# systemctl enable kube-apiserver

MASTER-01# systemctl start kube-apiserver

MASTER-01# systemctl status -l kube-apiserver

MASTER-01# systemctl enable kube-controller-manager

MASTER-01# systemctl start kube-controller-manager

MASTER-01# systemctl status -l kube-controller-manager

MASTER-01# systemctl enable kube-scheduler

MASTER-01# systemctl start kube-scheduler

MASTER-01# systemctl status -l kube-scheduler

MASTER-01# systemctl enable flanneld

MASTER-01# systemctl start flanneld

MASTER-01# systemctl status -l flanneld

MASTER-01# systemctl restart docker

MASTER-01# systemctl status -l docker
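As a quick sanity check that the control plane components are up and can reach the ETCD cluster, ask for the component statuses from the JUMPBOX (this assumes the .bashrc setup from Section 2 points kubectl at the master's API server):

JUMPBOX$ kubectl get componentstatuses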

Section 4: Configuration of NODE-01 VM

There are a total of 5 NODE VMs in my setup. These are the VMs that will eventually run the PODs/docker containers/application workloads.

The following components will be installed and configured on all five NODE VMs:

kubelet

kube-proxy

flanneld

docker

Again, I have recreated the diagram that shows the components running in the NODE VMs and their interactions with the other components.

The setup shown below is in the context of the NODE-01 VM. Except for the IP address, everything else is pretty much the same and needs to be repeated on the other 4 NODE VMs. Define the repository first (a sketch follows).
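As one possibility matching the Kubernetes-on-CentOS7 guides of this era, the CentOS community build repo can be defined and the node packages pulled from it (treat the URL and package names below as assumptions to verify for your environment):

NODE-01# cat > /etc/yum.repos.d/virt7-docker-common-release.repo <<EOF
[virt7-docker-common-release]
name=virt7-docker-common-release
baseurl=http://cbs.centos.org/repos/virt7-docker-common-release/x86_64/os/
gpgcheck=0
EOF

NODE-01# yum -y install kubernetes-node flannel docker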

Note that the above IP address for docker (10.20.18.1/24) is obtained from the subnet range provided by the flanneld service, which in turn gets it from the ETCD service, where the 10.20.0.0/16 network address space was already defined by us earlier (in Section 2). Flanneld consults the ETCD service to take care of assigning different subnet ranges to the docker daemons running on different NODEs.
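You can see the subnet lease a node received in the environment file that flanneld writes out, from which docker's startup options are derived. Illustrative contents, using the 10.20.18.1/24 lease on my NODE-01 (your subnet and MTU will differ):

NODE-01# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.20.0.0/16
FLANNEL_SUBNET=10.20.18.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false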

Initially, when you run the ifconfig command, you will notice that docker0 (a Linux bridge) was assigned a different IP address when the docker package was installed and the docker daemon started for the first time.

NODE-01# ifconfig

But once you restart the docker service after starting flanneld, you will notice that it now takes the very first IP address from the flanneld-assigned subnet range (10.20.x.x) and no longer uses the 172.17.x network.

NODE-01# systemctl restart docker

NODE-01# systemctl status -l docker

NODE-01# ifconfig

Docker will use the very first IP address from flanneld's subnet (docker0 = 10.20.18.1/24). From here onwards, all the containers will be auto-assigned addresses from this network address space, with docker0 (10.20.18.1 in my case) as their gateway.

This is probably a good place to pause, take a step back, and try to assimilate all the details to understand the mechanics of how overlay networking is done by flanneld.

I would strongly recommend Peng Xiao's website (see the References section); he has done an excellent job of explaining the networking fundamentals needed for the Kubernetes platform.

To verify that the NODE-01 VM is talking to MASTER-01 and that all the communications are happening as expected, go to the JUMPBOX and issue the following command:

JUMPBOX$ kubectl get nodes -o wide

Section 5: Installation of Kubernetes dashboard

I have to admit that by the time I reached this far, I desperately wanted to see how the Kubernetes dashboard UI was going to show up. So without debugging the error messages that I faced, I just passed the parameter "--validate=false" to suppress them so that I could get on with the dashboard UI (I know I am cheating here).
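For reference, the command takes this shape; kubernetes-dashboard.yaml is a placeholder here, since the manifest name/URL depends on the dashboard release you grab:

JUMPBOX$ kubectl create -f kubernetes-dashboard.yaml --validate=false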

If you have come this far, you'll realize that what we have done so far is just the beginning :-) Next, we need to learn how to use, consume and administer the platform, for which I have a couple of suggestions:

Download the lab exercises provided by Edward Viaene and Peng Xiao and follow their documentation:

Now that you have your own Kubernetes platform at your disposal running on your own server, having the lab exercises and video lessons will certainly save you a lot of time and keep you focused on your learning.
