One-off Kubernetes jobs

So far, all the examples in my Docker Swarm Mode and Kubernetes blog posts have been built around some sort of a service: a web server, a message queue, a message bus. After all, “service” is the main concept in Swarm Mode, and even the whole micro-service application thing has, well, a “service” in it. But what about one-off jobs: maintenance tasks, scheduled events, or anything else that we need to run only occasionally, not as a service?

For instance, take unit tests. If a test suite takes 60 minutes to run, I could take 60 containers, distribute the tests among them, and throw the whole thing at a cluster. A valid use case, and the end result is definitely not a service.

Doing so in Swarm would be possible, but tricky. We’d still have to create a service, and at the bare minimum we’d have to tell it not to restart containers that have finished. It’s much simpler with Kubernetes, though. Not only can we schedule individual pods without a service, k8s also has Job and Cron Job workloads that make executing and controlling such pods much simpler. Let’s see how we can use these three.

Setup

To follow along you’ll need VirtualBox, minikube and kubectl. Access to Google Container Engine will also do. I covered local setup details before, so let’s skip that part and jump straight into the cluster.

Pods

Assume you need to run a task. For instance, for some strange reason you decided to find all prime numbers between 1 and 70 using a bash script and Kubernetes. Such things happen, you know. All it takes is a one-liner that does the math (and it does, I checked), wrapped into a pod manifest:
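Here’s a minimal sketch of what such a pod.yml could look like. The base image and the exact trial-division script are assumptions, not the post’s original code; note that the naive check below also lets 1 through, which matches the sample output further down:

```yaml
# pod.yml – a reconstructed sketch, not the original manifest
apiVersion: v1
kind: Pod
metadata:
  name: primes
spec:
  restartPolicy: Never        # don't restart the container once the script exits
  containers:
  - name: primes
    image: ubuntu:16.04       # assumed base image
    command: ["bash", "-c"]
    args:
    - |
      for i in $(seq 1 70); do
        p=1
        for j in $(seq 2 $((i - 1))); do
          if [ $((i % j)) -eq 0 ]; then p=0; break; fi
        done
        if [ "$p" -eq 1 ]; then echo "$i"; fi
      done
```

The important line is restartPolicy: Never – the default is Always, which would make kubelet restart the container as soon as the script exits.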

There’s nothing special about this pod except that we explicitly tell it to stay down once it’s finished. Now we can deploy the pod with kubectl create -f pod.yml, watch it start, and then monitor its STDOUT with kubectl logs -f primes, where the pod will echo its findings.

Get pod state

```shell
kubectl get pod
#NAME      READY     STATUS    RESTARTS   AGE
#primes    1/1       Running   0          3s

kubectl logs -f primes
#1
#2
#3
#...
#61
#67
```

After it’s done, we can still see the pod with kubectl get pod, but this time we need to pass the --show-all parameter to it. After all, the pod is no longer running.

Pod finished

```shell
kubectl get pod --show-all
#NAME      READY     STATUS      RESTARTS   AGE
#primes    0/1       Completed   0          1m
```

However, this approach to running tasks lacks a few important features. Firstly, what if the node this worker pod was running on suddenly shuts down? The pod dies with it, right? Wouldn’t it be cool if something rescheduled it elsewhere?

Moreover, finding prime numbers in bash is slow. We could make it faster by splitting the 1..70 range into, let’s say, three smaller ranges (1–30, 31–50, 51–70) and distributing them between multiple pods. But this means we’d need to create those pods manually. Not a big deal for three pods, but quite a problem for a hundred.
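To make the splitting concrete, the script could take its slice of the range from environment variables. RANGE_START and RANGE_END are invented names for this sketch, not something from the original script:

```shell
# Hypothetical parameterized version of the prime search: each pod gets
# its own sub-range through environment variables.
primes_in_range() {
  local i j p
  for i in $(seq "$1" "$2"); do
    p=1
    for j in $(seq 2 $(( i - 1 ))); do
      if [ $(( i % j )) -eq 0 ]; then p=0; break; fi
    done
    if [ "$p" -eq 1 ]; then echo "$i"; fi
  done
}

# Each pod would then run its own slice, e.g. RANGE_START=31 RANGE_END=50:
primes_in_range "${RANGE_START:-1}" "${RANGE_END:-70}"
```

Three pods with the ranges 1–30, 31–50 and 51–70 would together cover the whole search space.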

Fortunately, that’s exactly what Job workloads are for.

Jobs

A Job is a special kind of controller that creates and manages a set of pods that are meant to do some finite piece of work. Like a Deployment, a Job will recreate its pods in case of a node failure. It also has a parallelism property that specifies how many pods should be doing the work at once, and a completions property for how many of them should succeed before the whole job becomes “finished”.

Here’s how simple it is to convert the Pod from the previous example into a Job workload:
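A minimal sketch of the corresponding job.yml, reusing the same (assumed) container spec wrapped into the Job’s pod template:

```yaml
# job.yml – a reconstructed sketch, not the original manifest
apiVersion: batch/v1
kind: Job
metadata:
  name: primes
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: primes
        image: ubuntu:16.04   # assumed image
        command: ["bash", "-c"]
        args:
        - |
          for i in $(seq 1 70); do
            p=1
            for j in $(seq 2 $((i - 1))); do
              if [ $((i % j)) -eq 0 ]; then p=0; break; fi
            done
            if [ "$p" -eq 1 ]; then echo "$i"; fi
          done
```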

It’s basically a copy-paste exercise. However, the Job gets much more interesting if we tell it to run up to 4 workers in parallel for as long as it takes to get 8 successful completions:

Parallel job

```yaml
#...
spec:
  completions: 8
  parallelism: 4
  template:
  #...
```

Of course, those 4 parallel pods will all be doing the same thing – finding prime numbers – but in real life each of them could be picking a task from an MQ, a database, or anywhere else.

Now, create the job, give it some time, and see what’s happening in the pods department:

Parallel jobs

```shell
kubectl create -f job.yml
#job "primes" created

# several seconds later
kubectl get pods --show-all
#NAME            READY     STATUS              RESTARTS   AGE
#primes-5g2xp    0/1       Completed           0          31s
#primes-9l9tf    0/1       ContainerCreating   0          4s
#primes-d2jwk    0/1       Completed           0          14s
#primes-pxhqx    0/1       ContainerCreating   0          4s
#primes-rvq5x    0/1       Completed           0          31s
#primes-rxdrw    1/1       Running             0          4s
#primes-sw2lq    0/1       Completed           0          31s
#primes-v5bv8    0/1       Completed           0          31s
```

Yup, they are definitely doing something in parallel.

We could also try other parallelism and completions combinations. For example, skipping parallelism (it defaults to 1) would make the 8 pods run one after another until the completions count is reached. Alternatively, skipping completions would schedule 4 parallel pods, and as soon as one of them succeeds and the rest finish, the Job would be done as well.
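Sketched as spec fragments, those two variants would look something like this (everything outside spec stays the same as in the parallel job above):

```yaml
# Sequential: parallelism defaults to 1, so pods run one at a time
spec:
  completions: 8
---
# Batch of four: with completions unset, the Job finishes once its
# 4 pods have run to completion
spec:
  parallelism: 4
```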

Cron Jobs

Just as a Job starts worker pods, a Cron Job starts Jobs on a schedule. The schedule uses the crontab format, so if you’ve ever created a cron job on Linux, you already know how to configure a Cron Job in Kubernetes.

Let’s assume that calculating prime numbers has become so important for our company that we need to recalculate them once per minute. Absolutely not a problem: I’ll just copy the previous job configuration into a Cron Job’s jobTemplate and set up a schedule:
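A sketch of the resulting cron.yml – batch/v1beta1 was the CronJob API version around the time of writing (current clusters use batch/v1), and the rest is a reconstruction consistent with the outputs below:

```yaml
# cron.yml – a reconstructed sketch
apiVersion: batch/v1beta1   # batch/v1 on Kubernetes 1.21 and later
kind: CronJob
metadata:
  name: primes
spec:
  schedule: "*/1 * * * *"   # every minute, standard crontab syntax
  jobTemplate:
    spec:
      completions: 8
      parallelism: 4
      template:
        spec:
          restartPolicy: Never
          containers:       # the same (assumed) prime-finding container
          - name: primes
            image: ubuntu:16.04
            command: ["bash", "-c"]
            args:
            - |
              for i in $(seq 1 70); do
                p=1
                for j in $(seq 2 $((i - 1))); do
                  if [ $((i % j)) -eq 0 ]; then p=0; break; fi
                done
                if [ "$p" -eq 1 ]; then echo "$i"; fi
              done
```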

We can create and monitor it the same way we did for regular jobs.

Create cron job

```shell
kubectl create -f cron.yml
#cronjob "primes" created

kubectl get cronjobs
#NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
#primes    */1 * * * *   False     0         <none>

kubectl get cronjobs
#NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE                     AGE
#primes    */1 * * * *   False     1         Tue, 28 Nov 2017 00:24:00 -0500
```

Absolutely no surprises.

View cron pods

```shell
kubectl get pods --show-all
#NAME                      READY     STATUS      RESTARTS   AGE
#primes-1511846640-5m9tz   1/1       Running     0          5s
#primes-1511846640-5xvqj   0/1       Completed   0          32s
#primes-1511846640-dn8mq   1/1       Running     0          5s
#primes-1511846640-g98qb   0/1       Completed   0          32s
#primes-1511846640-hk7rl   0/1       Completed   0          32s
#primes-1511846640-kkcks   1/1       Running     0          5s
#primes-1511846640-vv5zm   0/1       Completed   0          32s
#primes-1511846640-xlf4r   1/1       Running     0          5s
```

Summary

So as you can see, there isn’t just one way to run one-off Kubernetes jobs – there are actually three. For small ad-hoc tasks, running them directly in pods should be enough. When a task can be parallelized and we care about giving failed pods a second chance, a Job sounds like a reasonable choice to make. Finally, if we’re talking about a task that should run on a schedule, Cron Job is the way to go.