Kubernetes Application Management: Stateful Services

This article describes how to deploy and maintain a set of highly available MySQL services through the native k8s resource object StatefulSet and the MySQL Operator.

By Wu Bo (Bruce Wu)

Background

With Deployments and ReplicationControllers, users can conveniently deploy highly available and scalable distributed stateless services in Kubernetes. These types of applications do not store data locally, so requests can be distributed among their replicas with simple load-balancing policies. With the popularization of k8s and the rise of cloud-native architectures, more and more people want to use k8s to orchestrate stateful services such as databases. However, this is not easy because stateful services are complex. This article uses MySQL, the most popular open-source database, as an example to describe how to deploy and maintain stateful services on k8s. The content of this article is based on k8s 1.13.

Use StatefulSets to Deploy MySQL

StatefulSet Overview

Deployments and ReplicationControllers are designed for stateless services. The Pod names, hostnames, and storage they provide are not stable, and their Pods are started and destroyed in random order. Therefore, they are not suitable for stateful applications such as databases. To manage stateful services, k8s provides the StatefulSet workload. The Pods that a StatefulSet manages have the following features:

1. Uniqueness: For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned a unique integer ordinal, from 0 up through N-1.
2. Sequence: By default, Pods in a StatefulSet are started, updated and destroyed sequentially.
3. Stable network identity: The hostname and DNS of a Pod will not change after the Pod is rescheduled.
4. Stable persistent storage: When a Pod is rescheduled, it can still mount the original PersistentVolume to ensure data integrity and consistency.

Service Deployment

In this example, the highly available MySQL service consists of one master node and multiple slave nodes that asynchronously replicate data from the master node (that is, the one-master-multiple-slave replication model). The master node can process read/write requests from users, while the slave nodes can only process read requests from users.

To deploy such a service, in addition to StatefulSets, many other k8s resource objects are required, including ConfigMaps, Headless Services, and ClusterIP Services. The collaboration among these objects allows stateful services like MySQL to run stably on k8s.

ConfigMap

To make application configuration easy to maintain, large systems and distributed applications usually adopt centralized configuration management policies. In k8s, users can separate configuration from Pods by using a ConfigMap, which keeps the workload portable and simplifies configuration change and management.

The sample contains a ConfigMap called mysql. When a Pod in the StatefulSet is started, it will read proper configuration from the ConfigMap based on its own role.

Headless Service

A Headless Service provides each associated Pod with a corresponding DNS address of the form <pod-name>.<service-name>. This allows a client to access any specific application instance and solves the problem of identifying individual instances in a distributed environment.

The sample contains a Headless Service called mysql, which is associated with Pods. These Pods are assigned the following DNS addresses: mysql-0.mysql, mysql-1.mysql, and mysql-2.mysql. By doing this, the client can access the master node through mysql-0.mysql and the slave nodes through mysql-1.mysql or mysql-2.mysql.
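The address scheme above can be illustrated with a minimal Python sketch. The helper name pod_dns_names is hypothetical and only mirrors the <pod-name>.<service-name> form described in this section; it does not query a real cluster.

```python
from typing import List


def pod_dns_names(statefulset: str, service: str, replicas: int) -> List[str]:
    """Build the short DNS name <pod-name>.<service-name> for each replica.

    Hypothetical helper for illustration; Pod names follow the
    <statefulset-name>-<ordinal-index> convention.
    """
    return [f"{statefulset}-{ordinal}.{service}" for ordinal in range(replicas)]


names = pod_dns_names("mysql", "mysql", 3)
master, slaves = names[0], names[1:]  # ordinal 0 is the master in this sample
print(master)  # mysql-0.mysql
print(slaves)  # ['mysql-1.mysql', 'mysql-2.mysql']
```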

ClusterIP Service

To simplify access in read-only scenarios, the sample provides an ordinary service called mysql-read. This service has its own cluster IP and sends requests to associated Pods (including the master and the slaves) to hide Pod access details from users.

StatefulSet

A StatefulSet is a critical part of service deployment. Each Pod that a StatefulSet manages is assigned a unique name of the form <statefulset-name>-<ordinal-index>. In this example, the name of the StatefulSet is mysql. Therefore, Pods in the StatefulSet are named mysql-0, mysql-1, and mysql-2 respectively. By default, they are created sequentially and destroyed in reverse sequential order.
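As a rough illustration of the default ordering, the following Python sketch lists the order in which Pods would be created and destroyed. The function names are hypothetical, and the sketch only models the ordering rule, not the readiness checks that the StatefulSet controller performs between steps.

```python
from typing import List


def creation_order(statefulset: str, replicas: int) -> List[str]:
    """Pods are created one by one, from ordinal 0 up to replicas - 1."""
    return [f"{statefulset}-{ordinal}" for ordinal in range(replicas)]


def deletion_order(statefulset: str, replicas: int) -> List[str]:
    """Pods are destroyed in reverse order, from replicas - 1 down to 0."""
    return list(reversed(creation_order(statefulset, replicas)))


print(creation_order("mysql", 3))  # ['mysql-0', 'mysql-1', 'mysql-2']
print(deletion_order("mysql", 3))  # ['mysql-2', 'mysql-1', 'mysql-0']
```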

As shown in the following figure, a Pod contains two init containers and two app containers, and is bound to the PersistentVolume provided by the volume vendor through the unique PersistentVolumeClaim.

The functions of Pod-related components are as follows:

The init-mysql container generates configuration files. It extracts the Pod ordinal from the hostname and writes it into the /mnt/conf.d/server-id.cnf file. It also applies either master.cnf or slave.cnf (depending on the node type) from the ConfigMap by copying the file into /mnt/conf.d.
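The logic of the init-mysql container can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used by the sample: the function names are hypothetical, ordinal 0 is assumed to be the master, and the offset of 100 added to the ordinal is an assumption made so that the generated server-id is never 0.

```python
import re

# Assumption: add an offset so that Pod 0 does not end up with server-id 0.
SERVER_ID_OFFSET = 100


def pod_ordinal(hostname: str) -> int:
    """Extract the trailing ordinal from a hostname such as 'mysql-2'."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"hostname {hostname!r} does not end with an ordinal")
    return int(match.group(1))


def server_id_cnf(hostname: str) -> str:
    """Render the content that would go into /mnt/conf.d/server-id.cnf."""
    return f"[mysqld]\nserver-id={SERVER_ID_OFFSET + pod_ordinal(hostname)}\n"


def role_config(hostname: str) -> str:
    """Pick the ConfigMap file for the node type (ordinal 0 is the master)."""
    return "master.cnf" if pod_ordinal(hostname) == 0 else "slave.cnf"


print(role_config("mysql-0"))  # master.cnf
print(role_config("mysql-2"))  # slave.cnf
```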

The clone-mysql container clones data. The clone-mysql container in Pod N+1 clones data from Pod N into the PersistentVolume bound to its own Pod.
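The choice of clone source can be expressed as a small Python sketch. The function name clone_source is hypothetical, and the rule that Pod 0 skips cloning (because it is the master and has no predecessor) is an assumption that follows from the ordinal scheme rather than something stated by the sample.

```python
from typing import Optional


def clone_source(hostname: str, service: str) -> Optional[str]:
    """Return the DNS name of the Pod to clone from, or None for Pod 0."""
    name, _, ordinal_text = hostname.rpartition("-")
    ordinal = int(ordinal_text)
    if ordinal == 0:
        # Assumption: the first Pod is the master and has nothing to clone.
        return None
    # Pod N+1 clones from Pod N, addressed through the Headless Service.
    return f"{name}-{ordinal - 1}.{service}"


print(clone_source("mysql-0", "mysql"))  # None
print(clone_source("mysql-2", "mysql"))  # mysql-1.mysql
```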

The xtrabackup container acts as a sidecar. It waits for mysqld in the mysql container to be ready and then runs the START SLAVE command to initialize data replication on the slave. The xtrabackup container also listens for connections from other Pods requesting a data clone.

The StatefulSet associates a unique PVC with each Pod by using volumeClaimTemplates. In this sample, Pod N is associated with a PVC named data-mysql-N, which is in turn bound to the PV provided by the storage system. This mechanism ensures that a rescheduled Pod can still mount the original data.
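The PVC naming rule can be sketched as follows. The helper pvc_name is hypothetical, and the template name data is inferred from the data-mysql-N names in this sample. Because the name depends only on the template name and the Pod name, a rescheduled Pod resolves to the same claim regardless of the node it lands on.

```python
def pvc_name(template: str, statefulset: str, ordinal: int) -> str:
    """Compose a PVC name as <template-name>-<statefulset-name>-<ordinal>."""
    return f"{template}-{statefulset}-{ordinal}"


# The claim name does not involve the node, so it survives rescheduling.
print(pvc_name("data", "mysql", 0))  # data-mysql-0
print(pvc_name("data", "mysql", 2))  # data-mysql-2
```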

Service Maintenance

To ensure service performance and improve system reliability, proper maintenance is required after the deployment completes successfully. Common maintenance work related to database services includes service fault recovery, service scaling, service status monitoring, and data backup and recovery.

Service Fault Recovery

Whether a service can recover itself in the case of a fault is one of the key metrics that indicate the automation level of a system. In the current architecture, the MySQL service can be automatically restored when the host experiences downtime or when the master or slave nodes fail to respond. In these cases, k8s reschedules and restarts the affected Pods. The StatefulSet ensures that the names, hostnames, and volumes of these Pods remain the same as before.

Service Scaling

In the one-master-multiple-slave MySQL replication model, scaling means adjusting the number of slaves. Thanks to the Pod startup and destruction ordering guarantee provided by the StatefulSet, the number of slaves can be scaled simply by using the following command.

kubectl scale statefulset mysql --replicas=<NumOfReplicas>
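To show what the ordering guarantee implies for scaling, here is a minimal Python sketch of which Pods would be created or removed when the replica count changes. The function scale_plan is hypothetical and only models the ordering; it does not call kubectl or the k8s API.

```python
from typing import List, Tuple


def scale_plan(statefulset: str, current: int, desired: int) -> Tuple[str, List[str]]:
    """Return the action and the affected Pods, in the order they are handled."""
    if current < 0 or desired < 0:
        raise ValueError("replica counts must not be negative")
    if desired > current:
        # Scale out: new Pods are created in ascending ordinal order.
        return "create", [f"{statefulset}-{i}" for i in range(current, desired)]
    if desired < current:
        # Scale in: Pods with the highest ordinals are removed first.
        return "delete", [f"{statefulset}-{i}" for i in range(current - 1, desired - 1, -1)]
    return "noop", []


print(scale_plan("mysql", 3, 5))  # ('create', ['mysql-3', 'mysql-4'])
print(scale_plan("mysql", 5, 3))  # ('delete', ['mysql-4', 'mysql-3'])
```

Because mysql-0 has the lowest ordinal, it is the last Pod to be removed, so scaling in only ever removes slaves as long as at least one replica remains.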

Service Status Monitoring

Monitoring service status is an essential part of ensuring service stability. In addition to readiness probes and liveness probes, more fine-grained monitoring metrics are often required to detect service health. Users can expose the key metrics in MySQL to Prometheus by using mysqld-exporter and implement monitoring and alerting based on Prometheus. We recommend that users deploy mysqld-exporter in the sidecar mode together with the mysqld container in the same Pod.

Data Backup and Recovery

Data backup and recovery is an effective means to ensure data security. Users can implement data backup and recovery by using either volume interfaces or VolumeSnapshots. The following part describes the two methods.

Use Volume Interfaces

Many volume vendors provide the features to save data snapshots and recover data based on snapshots. These features are usually exposed to users in the form of interfaces. This requires users to be familiar with operation interfaces provided by the corresponding volume vendors. For example, if a service uses Alibaba Cloud disks as external volumes, users need to understand the snapshot interface provided for disks.

Use VolumeSnapshots

Three snapshot-related resource objects were introduced in k8s v1.12: VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass. These objects provide a standard way to perform snapshot operations. Users can create snapshots of the volumes that store MySQL data, or recover data from snapshots, without needing to know the details of the underlying external volumes.

Using VolumeSnapshots is obviously a better method than directly using underlying volume interfaces. However, the VolumeSnapshot is still in the Alpha stage, and only a limited number of external volumes support standard snapshot operations. These factors limit the application scenarios of VolumeSnapshot. For more information about VolumeSnapshots, see the Volume Snapshots document.

Deploy MySQL by Using Operators

Although users can deploy and maintain a set of highly available MySQL services in k8s based on StatefulSets, the process is relatively complex. It requires users to familiarize themselves with various k8s resource objects, learn many MySQL operation details, and maintain a set of complex management scripts. Kubernetes Operators are designed to lower the barrier to deploying complex applications on k8s.

Operator Introduction

An Operator is a method introduced by CoreOS to package, deploy and manage a complex application running on Kubernetes. Operators express the maintainers' knowledge of software operations in the form of code and comprehensively use various k8s resource objects to deploy and maintain complex applications.

An Operator defines new resource objects for a service by using a CustomResourceDefinition and ensures that applications are in the expected state by using custom controllers.

The workflow of an Operator can be divided into three steps:

1. Observe: Observe the current status of the target object by using the k8s API.
2. Analyze: Find the differences between the desired state and current state.
3. Act: Take the necessary steps to make the running state of the application match its expected state.

2. Deploy an instance of the Operator in k8s. The Operator constantly monitors CRUD operations on these resource objects and observes the object state.

3. When a user performs an operation (for example, creating a MySQL cluster), a new MySQLCluster resource object is created. When the Operator detects the MySQLCluster creation event, it creates a cluster that matches the user's configuration. This example creates a highly available MySQL cluster based on Group Replication and uses native k8s resource objects like StatefulSets and Headless Services.

4. When the Operator finds differences between the desired state and the current state, it performs the appropriate orchestration operations to bring the two into a consistent state.
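The Observe, Analyze, and Act cycle can be sketched as a simple reconcile function in Python. This is an in-memory illustration only: the ClusterState class, the action names, and the functions are hypothetical, and a real Operator would read and change state through the k8s API instead of plain objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    members: int  # number of MySQL instances in the cluster


def analyze(desired: ClusterState, current: ClusterState) -> List[str]:
    """Analyze: compute the actions needed to move current toward desired."""
    diff = desired.members - current.members
    if diff > 0:
        return ["add-member"] * diff
    if diff < 0:
        return ["remove-member"] * (-diff)
    return []


def act(current: ClusterState, actions: List[str]) -> ClusterState:
    """Act: apply each action and return the resulting state."""
    members = current.members
    for action in actions:
        members += 1 if action == "add-member" else -1
    return ClusterState(members=members)


def reconcile(desired: ClusterState, observed: ClusterState) -> ClusterState:
    """One pass of the loop; `observed` stands in for the Observe step."""
    return act(observed, analyze(desired, observed))


result = reconcile(ClusterState(members=3), ClusterState(members=1))
print(result.members)  # 3
```

An Operator runs this kind of pass repeatedly, so a later change to the desired state, or an unexpected change to the running state, is picked up and corrected on the next pass.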

Service Deployment

Because the Operator encapsulates complex deployment details, it is now very easy to create a cluster. For example, a user can easily create a multi-primary MySQL cluster consisting of three nodes by using the following configuration.

Service Maintenance

When Operators are used, maintenance is also necessary, including service fault recovery, service scaling, service status monitoring, and data backup and recovery.

Service Fault Recovery

Due to the existence of the StatefulSet, k8s will reschedule a MySQL service instance when it fails to respond. In addition, if a StatefulSet is accidentally deleted, the Operator will recreate one.

Service Scaling

Users can easily scale services by changing the spec.members field of the MySQLCluster resource object. Only the MySQLCluster is exposed to users and underlying k8s resource objects are hidden.

Service Status Monitoring

Prometheus can be deployed on k8s to monitor the state of Operators and individual MySQL clusters. For more information, see Monitoring

Data Backup and Recovery

MySQLBackups and MySQLRestores can be used to back up and recover data, eliminating differences in operations on different volumes. MySQLBackupSchedules can also be used to create scheduled backup tasks.

For example, the following configuration performs a backup on the test database in the mysql-cluster MySQL cluster every 30 minutes.

Summary

This article describes how to deploy and maintain a set of highly available MySQL services through the native k8s resource object StatefulSet and the MySQL Operator. We can see that the Operator hides the orchestration details of complex applications and greatly lowers the barrier to running them in k8s. If you need to deploy other complex applications, we recommend that you use an Operator.