We have to surmise that. Kubernetes could have been halted, the server rebooted, or something else. A detail line stating ... a user-invoked suspend happened here ... would have made it clear what happened.

2) startingDeadlineSeconds

Summary: if a job misses its scheduled time by more than startingDeadlineSeconds it gets skipped.

It will attempt to run again at the next scheduled time.

Below we have a cron job that should run every minute.

The work of this cron job is to sleep for 80 seconds.

We have concurrencyPolicy: Forbid specified, so two or more jobs may not run simultaneously.

startingDeadlineSeconds: 10 means each job must start within 10 seconds of its scheduled time every minute.

The Pod sleeps for 80 seconds, which means it will still be running a minute later. One minute later the next job cannot start ( concurrencyPolicy: Forbid ) because the previous job still has 20 seconds of running time left. This second job will be skipped. This is what we attempt to observe below.
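For reference, a manifest matching the description above could look roughly like the sketch below. Only the every-minute schedule, the 80 second sleep, concurrencyPolicy: Forbid, startingDeadlineSeconds: 10 and the cron job name come from this tutorial. The busybox image, the container name, the echo text and the restartPolicy are assumptions made for illustration.

```yaml
# Sketch only. Image, container name, echo text and restartPolicy are assumptions.
apiVersion: batch/v1          # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: mycronjob
spec:
  schedule: "*/1 * * * *"       # every minute
  startingDeadlineSeconds: 10   # must start within 10 seconds of the scheduled time
  concurrencyPolicy: Forbid     # never run two jobs at the same time
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: mycron-container   # hypothetical name
            image: busybox           # assumed image
            command: ['sh', '-c', 'echo Job Pod is Running ; sleep 80']
          restartPolicy: OnFailure   # assumed
```

If you save this as myCronJob.yaml you can create it with kubectl create -f myCronJob.yaml and watch the jobs with kubectl get cronjobs and kubectl get jobs.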

The previously running job is deleted every time a new job starts. This is concurrencyPolicy: Replace in action.

IMPORTANT: NO job completes. NO job ever runs for its full 120 sleep seconds.

Use concurrencyPolicy: Replace if you understand how it works and you need its feature: if it is time for a new job run and the previous job run hasn't finished yet, the cron job replaces the currently running job run with a new job run. ( Replace deletes the previous job, its Pod and its logs )

IMPORTANT: In this tutorial, for this specific case, NO job ever completes successfully. It ALWAYS gets replaced. However, if you use Replace in your environment it will only replace a job when a previous one is still running. That will probably be nearly never in your production environment. Understand Replace and its implications. It may be wrong or perfectly right for your case.
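A manifest that reproduces the Replace behavior described above might look like the following sketch. The 120 second sleep and concurrencyPolicy: Replace come from this section. The every-minute schedule is assumed to match the other examples, and the image, container name and restartPolicy are assumptions as well.

```yaml
# Sketch only. Schedule, image, container name and restartPolicy are assumptions.
apiVersion: batch/v1          # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: mycronjob
spec:
  schedule: "*/1 * * * *"       # assumed: every minute
  concurrencyPolicy: Replace    # a new run deletes the run that is still in progress
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: mycron-container   # hypothetical name
            image: busybox           # assumed image
            # 120 seconds is longer than the schedule interval,
            # so every run is still active when the next one is due.
            command: ['sh', '-c', 'echo Job Pod is Running ; sleep 120']
          restartPolicy: OnFailure   # assumed
```

Because the sleep is longer than the interval between runs, each job is replaced before it can finish.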

From
https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#pod-backoff-failure-policy
> Pod Backoff failure policy
> There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6.
> Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s …) capped at six minutes. The back-off count is reset if no new failed Pods appear before the Job's next status check.

The cron job below should run every minute.

It exits immediately upon start with error exit code 1.

It has a backoffLimit of 2. It will be retried up to 2 times if it fails.
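As a rough sketch, such a cron job could be written as shown below. The every-minute schedule, the exit code 1 and backoffLimit: 2 come from the text above. The image, the container name and restartPolicy: OnFailure are assumptions.

```yaml
# Sketch only. Image, container name and restartPolicy are assumptions.
apiVersion: batch/v1          # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: mycronjob
spec:
  schedule: "*/1 * * * *"       # every minute
  jobTemplate:
    spec:
      backoffLimit: 2           # retries before the Job is considered failed
      template:
        spec:
          containers:
          - name: mycron-container   # hypothetical name
            image: busybox           # assumed image
            command: ['sh', '-c', 'echo Job Pod is Running ; exit 1']
          restartPolicy: OnFailure   # assumed
```

Note that backoffLimit belongs to the Job spec, so it sits under jobTemplate.spec and not directly under the cron job spec.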

The describe output above SEEMS to show all is well, but it is not: there is no indication of CrashLoopBackOff status.

If you have an hourly cron job that has such problems, you are left with little historic evidence of what happened.

I do not enjoy troubleshooting cron jobs that use backoffLimit and show these behaviors.

Tip: write such cron job log information to a persistent volume and use that as your primary research source. Everything in one place and it is persistent. Plus you are in control of what information to write to the logs.
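One way to follow this tip is sketched below. Everything specific in it is hypothetical: the claim name cronjob-logs, the mount path /logs, the log file name and the busybox image. It also assumes a PersistentVolumeClaim with that name already exists in the same namespace.

```yaml
# Sketch only. The PVC name, mount path, log file name and image are hypothetical.
apiVersion: batch/v1          # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: mycronjob
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          containers:
          - name: mycron-container
            image: busybox
            command:
            - sh
            - -c
            - |
              # Append one line per attempt; the file survives Pod and Job deletion.
              echo "$(date) attempt started on $(hostname)" >> /logs/mycronjob.log
              exit 1
            volumeMounts:
            - name: log-volume
              mountPath: /logs
          restartPolicy: OnFailure
          volumes:
          - name: log-volume
            persistentVolumeClaim:
              claimName: cronjob-logs   # hypothetical, must exist beforehand
```

Each attempt, including retries, then leaves a timestamped line in one file that you control.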

kubectl delete -f myCronJob.yaml
cronjob.batch "mycronjob" deleted

9) Your Turn

These spec fields enable useful configurations:

schedule frequency

startingDeadlineSeconds

concurrencyPolicy

parallelism

completions

restartPolicy

On its own, each field is easy to understand. Combining all these YAML spec fields leads to complex interactions and reactions, especially when combined with unexpectedly long running cron jobs.

Design your own simple tests to learn these features. Then design some complex tests where you test and learn their interactions.
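As a possible starting point for your own tests, the sketch below combines several of the fields listed above. All values, the image and the names are arbitrary choices for experimenting, not recommendations.

```yaml
# Sketch only. All values, the image and the names are arbitrary test choices.
apiVersion: batch/v1          # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: mycronjob
spec:
  schedule: "*/2 * * * *"       # every 2 minutes
  startingDeadlineSeconds: 30
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      parallelism: 2            # run up to 2 Pods at the same time
      completions: 4            # the Job is done after 4 successful Pods
      template:
        spec:
          containers:
          - name: mycron-container
            image: busybox
            command: ['sh', '-c', 'echo Job Pod is Running ; sleep 20']
          restartPolicy: Never
```

Change one field at a time, for example the sleep duration or concurrencyPolicy, and compare what kubectl get jobs and kubectl get pods show.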

The moment you are running 2 or more different cron jobs simultaneously in 2 or more terminal windows you will know ... you are becoming an expert.
