Best practices for building Kubernetes Operators and stateful apps

Palak Bhatia

Product Manager

Jun Xiang Tee

Software Engineer at Google

October 19, 2018

Recently, the Kubernetes community has started to add support for running large stateful applications such as databases, analytics and machine learning. For example, you can use the StatefulSet workload controller to maintain a stable identity for each pod, and use PersistentVolumes to persist data so it can survive a service restart. If your workload depends on local storage, you can use PersistentVolumes with Local SSDs, and you can also use an SSD persistent disk as the boot disk to improve performance for different kinds of workloads.

However, for many advanced use cases such as backup, restore, and high availability, these core Kubernetes primitives may not be sufficient. That’s where Kubernetes Operators come in. They provide a way to extend Kubernetes functionality with application-specific logic using custom resources and custom controllers. With the Operator pattern, you can encode domain knowledge of specific applications into a Kubernetes API extension. Using this, you can create, access and manage applications with kubectl, just as you do for built-in resources such as Pods.

At Google Cloud, we’ve used Operators to better support different applications on Kubernetes. For example, we have Operators for running and managing Spark and Airflow applications in a Kubernetes-native way. We’ve also made these Operators available on the GCP Marketplace for an easy click-to-deploy experience. The Spark Operator automatically runs spark-submit on behalf of users, provides cron support for running Spark jobs on a schedule, supports automatic application restarts and retries, and enables mounting data from local Hadoop configuration as well as Google Cloud Storage. The Airflow Operator creates and manages the necessary Kubernetes resources for an Airflow deployment and supports the creation of Airflow schedulers with different Executors.

As developers, we learned a lot building these Operators. If you’re writing your own operator to manage a Kubernetes application, here are some best practices we recommend.

1. Develop one Operator per application

An Operator can automate various features of an application, but it should be specific to a single application. For example, Airflow is normally used with MySQL and Redis. You should develop an operator for each application (i.e., three operators) rather than a single operator that covers all three of them. This provides better separation of concerns with respect to domain expertise of each application.

2. Use an SDK such as Kubebuilder

Kubebuilder is a comprehensive development kit for building and publishing Kubernetes APIs and controllers using CRDs. With Kubebuilder, you can write Operators without having to learn all the low-level details of how the Kubernetes libraries are implemented. To learn more, check out the Kubebuilder book.

3. Use declarative APIs

Design declarative APIs for operators, not imperative APIs. This aligns well with Kubernetes APIs that are declarative in nature. With declarative APIs, users only need to express their desired cluster state, while letting the operator perform all necessary steps to achieve it. With imperative APIs, in contrast, users must specify clearly and in order what steps to perform to achieve the desired state.
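
To make the contrast concrete, here is a minimal sketch in Python (chosen only for brevity; it is not tied to any operator SDK). The DesiredState and ObservedState types and their fields are hypothetical. The user supplies only the desired state, and the operator derives the steps itself:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DesiredState:
    """What the user declares (hypothetical fields, for illustration only)."""
    replicas: int
    version: str


@dataclass(frozen=True)
class ObservedState:
    """What the operator currently observes in the cluster."""
    replicas: int
    version: str


def plan(desired: DesiredState, observed: ObservedState) -> List[str]:
    """Derive the steps needed to move from the observed to the desired state."""
    steps: List[str] = []
    if observed.version != desired.version:
        steps.append(f"upgrade {observed.version} -> {desired.version}")
    if observed.replicas < desired.replicas:
        steps.append(f"scale up by {desired.replicas - observed.replicas}")
    elif observed.replicas > desired.replicas:
        steps.append(f"scale down by {observed.replicas - desired.replicas}")
    return steps
```

With an imperative API, the user would have to issue the equivalent of these steps by hand and in the right order. Here the ordering decision (upgrade before scaling) lives in the operator, and it is only one possible choice.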

4. Compartmentalize features via multiple controllers

An application may have different features such as scaling, backup, restore, and monitoring. An operator should be made up of multiple controllers, each of which handles one of those features. For example, the operator can have a main controller to spawn and manage application instances, a backup controller to handle backup operations, and a restore controller to handle restore operations. This simplifies the development process via better abstraction and simpler sync loops. Note that each controller should correspond to a specific CRD so that the domain of each controller's responsibility is clear.
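
The one-controller-per-CRD idea can be sketched as a small dispatch table. This is an illustrative Python sketch, not how any particular SDK wires controllers. The kinds DatabaseCluster and DatabaseBackup, the resource layout, and the sync functions are all hypothetical:

```python
from typing import Callable, Dict


class Operator:
    """Routes each custom resource to the single controller registered for its kind."""

    def __init__(self) -> None:
        self._controllers: Dict[str, Callable[[dict], str]] = {}

    def register(self, kind: str, sync: Callable[[dict], str]) -> None:
        # Exactly one controller per kind keeps responsibilities unambiguous.
        if kind in self._controllers:
            raise ValueError(f"kind {kind!r} already has a controller")
        self._controllers[kind] = sync

    def dispatch(self, resource: dict) -> str:
        return self._controllers[resource["kind"]](resource)


def sync_cluster(resource: dict) -> str:
    """Main controller: spawns and manages application instances."""
    return f"ensure {resource['spec']['replicas']} instances of {resource['name']}"


def sync_backup(resource: dict) -> str:
    """Backup controller: handles backup operations only."""
    return f"back up {resource['spec']['cluster']} to {resource['spec']['destination']}"
```

Each sync function stays small because it only has to understand one kind of resource.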

5. Use asynchronous sync loops

If an operator detects an error (e.g., failed pod creation) when reconciling the current cluster state to the desired state, it should immediately terminate the current sync call and return the error. The work queue should then schedule a resync at a later time; the sync call should not block the application by continuing to poll the cluster state until the error is resolved. Similarly, controllers that initiate and monitor long-running operations should not synchronously wait for the operations. Instead, the controllers should go back to sleep and check again later.
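
The requeue-instead-of-block behavior can be sketched with a tiny delayed work queue. This is a simplified Python model that uses a logical clock rather than real timers. WorkQueue, run_once, and the retry_delay parameter are hypothetical names, not part of any Kubernetes client library:

```python
import heapq
from typing import Callable, List, Tuple


class WorkQueue:
    """Minimal delayed work queue driven by a logical clock (no real sleeping)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._seq = 0  # tie-breaker so keys with equal due times stay FIFO

    def add_after(self, key: str, due: float) -> None:
        heapq.heappush(self._heap, (due, self._seq, key))
        self._seq += 1

    def pop_due(self, now: float) -> List[str]:
        """Return every key whose due time has been reached."""
        ready: List[str] = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


def run_once(queue: WorkQueue, sync: Callable[[str], None],
             now: float, retry_delay: float = 5.0) -> List[str]:
    """Process each due key once; on error, requeue for later and move on."""
    failed: List[str] = []
    for key in queue.pop_due(now):
        try:
            sync(key)
        except Exception:
            # Do not poll until the error clears; schedule a resync instead.
            queue.add_after(key, now + retry_delay)
            failed.append(key)
    return failed
```

A failing key never holds up the loop: it is simply picked up again once its retry time has passed. A production queue would typically also add backoff and rate limiting, which this sketch omits.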

Monitoring and logging your applications

Once you have written your own operator, you will need to enable logging and monitoring for your applications. This can be complicated for newcomers. Below are some best practices you can follow.

1. Perform application-, node- and cluster-level log aggregation

Kubernetes clusters can get big, especially ones with stateful applications. If you keep a log for every container, you will likely end up with an unmanageable amount of logs. To remedy this, you can aggregate your logs. You can perform application-level logging by aggregating container logs and keeping only the log messages that meet certain severity and verbosity levels. Application-level aggregation requires the ability to tell which application a log belongs to. For this, you may need to add application-specific details to the log messages, such as a prefix with the application name.
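
As a rough illustration, the following Python sketch groups log lines by application and drops those below a severity threshold. The line format "[app-name] SEVERITY message" and the severity ranking are assumptions made for this example. Real container logs will need their own parser:

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed severity ranking for this sketch.
SEVERITY = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40}

# Assumed line format: "[app-name] SEVERITY message"
LINE = re.compile(r"^\[(?P<app>[^\]]+)\]\s+(?P<sev>[A-Z]+)\s+(?P<msg>.*)$")


def aggregate(lines: Iterable[str], min_severity: str = "WARNING") -> Dict[str, List[str]]:
    """Group messages by application prefix, keeping only severe enough ones."""
    threshold = SEVERITY[min_severity]
    by_app: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        m = LINE.match(line)
        # Skip lines that do not parse or that fall below the threshold.
        if m is None or SEVERITY.get(m["sev"], 0) < threshold:
            continue
        by_app[m["app"]].append(f"{m['sev']} {m['msg']}")
    return dict(by_app)
```

The application prefix is what makes the grouping possible, which is why it is worth adding to every log message.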

Similarly, for node-level and cluster-level logging, you can aggregate all application-level logs within a node or a cluster. Kubernetes doesn’t support this natively, so you may have to use external logging tools such as Google Stackdriver, Elasticsearch, Fluentd, or Kibana to perform the aggregations.

2. Label your metrics to facilitate aggregation and analysis

We recommend adding labels to metrics to facilitate aggregation and analysis by monitoring systems. For example, if you use Prometheus to analyze Prometheus-style metrics, the added labels make it much easier for the system to query and aggregate them.
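
To show why labels help, here is a small Python sketch of a labeled counter that can be summed along any label. LabeledCounter is a hypothetical stand-in written for this example, not the Prometheus client API, and the label names app and status are assumptions:

```python
from collections import defaultdict
from typing import Dict, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


class LabeledCounter:
    """A counter that keeps one value per unique combination of labels."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._values: Dict[LabelSet, float] = defaultdict(float)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Sort the labels so the same set always maps to the same key.
        self._values[tuple(sorted(labels.items()))] += amount

    def sum_by(self, label: str) -> Dict[str, float]:
        """Aggregate all recorded values along a single label."""
        totals: Dict[str, float] = defaultdict(float)
        for labelset, value in self._values.items():
            totals[dict(labelset).get(label, "")] += value
        return dict(totals)
```

Because each sample carries its labels, the same data can answer both "how many backups per application?" and "how many backups failed overall?" without extra instrumentation.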

3. Expose application metrics via pod endpoints for scraping purposes

Instead of writing application metrics to logs, files, or other storage mediums, a more viable option is for application pods to expose a metrics HTTP endpoint for monitoring tools to scrape. This provides better discoverability, uniformity and integration with metric analysis tools such as Google Stackdriver. A good way to achieve this is to use open-source application-specific exporters for exposing Prometheus-style metrics.
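
A scrape endpoint can be as small as the following Python sketch, which uses only the standard library. The metric names, the /metrics path, and port 8000 are assumptions for illustration. In practice you would more likely use an existing exporter or client library than hand-render the text format:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict

# Hypothetical metric values; a real application would update these as it runs.
METRICS: Dict[str, float] = {"app_requests_total": 0.0, "app_backups_total": 0.0}


def render_metrics(metrics: Dict[str, float]) -> str:
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name in sorted(metrics):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {metrics[name]}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving metrics on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() inside the application pod would let a monitoring tool scrape http://<pod-ip>:8000/metrics on its own schedule.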

There’s more work to be done in making running stateful applications on Kubernetes as easy as it is in a virtual machine, but with the ability to write custom controllers with Kubernetes Operators, we’ve come a long way.

For more insights around the developer experience on Kubernetes and Google Kubernetes Engine (GKE), check out these recent posts. For developers with small environments, see how we made it easier and more affordable to get started. For those looking to learn from developers directly, see our curated list of must-watch talks covering a variety of important topics. Over the next couple of weeks we’ll publish more around the Kubernetes developer experience, so watch for our series and follow us on @GCPcloud for the latest.