Getting Started with Kubernetes CronJob

Introduction

In the container orchestration & Kubernetes technology, managing recurring tasks efficiently is crucial for maintaining a healthy and automated system. One powerful tool in the Kubernetes for handling scheduled tasks is the CronJob.

In this article, I will explain what the CronJobs are, their utility in Kubernetes clusters, explore some common use cases, and walk through the process of creating couple of CronJob examples.

What is CronJob?

In Kubernetes CronJob is a resource type used in Kubernetes to automate the execution of tasks on a recurring schedule. It is similar to the traditional cron utility used in Unix-like operating systems, but it operates within the Kubernetes ecosystem.

CronJobs allow users to define jobs, which are tasks or pods that run to completion, and specify a schedule in Cron format (minute, hour, day of month, month, day of week) for when these jobs should be executed. Kubernetes CronJobs ensure that these jobs are run at the specified intervals, providing a convenient way to automate repetitive tasks within Kubernetes clusters.

Kubernetes CronJobs simplify the management of scheduled tasks within Kubernetes clusters, enabling users to automate operations, backups, data processing, perform routine maintenance, and execute batch processes efficiently.

Use Cases

1. Running Scheduled PostgreSQL Queries

Imagine you have a PostgreSQL database running for your system, and you need to run specific queries at regular intervals to generate reports or perform data cleanup. CronJobs can be configured to execute psql queries against the database periodically, automating this process.

2. Microservices Scenarios

In a microservices architecture, various components may require periodic tasks such as log rotation, database backups, or cache refreshing. CronJobs can be employed to schedule these tasks across different microservices, ensuring smooth operation of the entire system.

Creating your First CronJob

Let's walk through the process of creating a simple CronJob using Kubernetes YAML configuration.

Step 1: Define the CronJob

Create a YAML file (e.g., cronjob.yaml) with the following content:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cronjob
  namespace: sample # update your namespace here$$
spec:
  schedule: "*/1 * * * *"  # Runs every minute
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: my-container
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo "Hello, Kubernetes!"
          restartPolicy: OnFailure

Step 2: Apply the Configuration

Apply the YAML configuration using

# kubectl apply -f cronjob-1.yaml

Step 3: Verify CronJob

Check the status of the CronJob using kubectl get cronjobs and kubectl get jobs.

kubectl get cronjobs -n sample

output

NAME         SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
my-cronjob   */1 * * * *   False     0        <none>          38s

kubectl get jobs -n sample

NAME                  COMPLETIONS   DURATION   AGE
my-cronjob-28523782   1/1           4s         11s

Check the logs

kubectl logs jobs/my-cronjob-28523783 -n sample

output

Tue Mar 26 04:23:01 UTC 2024
Hello, Kubernetes!

Explanation of key fields in the CronJob YAML

schedule: Specifies the schedule in Cron format (minute, hour, day of month, month, day of week) when the job should run.
jobTemplate: Defines the template for the Job created by the CronJob, including pod specifications like containers, volumes, and restart policies.

Creating your Second CronJob

Let's create a CronJob that demonstrates a real-world use case: performing daily backups of a PostgreSQL database running in a Kubernetes cluster.

Here's the YAML configuration for the CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 0 * * *"  # Run at midnight every day
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres:latest  # You can use a custom image with backup tools installed
            command: ["sh", "-c"]
            args:
            - pg_dump -U <username> -h <host> <database_name> > /backup/$(date +"%Y%m%d").sql
            volumeMounts:
            - name: backup-volume
              mountPath: /backup
          restartPolicy: OnFailure
          volumes:
          - name: backup-volume
            persistentVolumeClaim:
              claimName: postgres-pvc  # Name of the PersistentVolumeClaim for PostgreSQL data

Make sure to replace <username>, <host>, and <database_name> with appropriate values for your PostgreSQL database. also, ensure that you have a PersistentVolumeClaim named postgres-pvc associated with your PostgreSQL deployment.

With this CronJob configuration, Kubernetes will automatically execute the backup command at midnight every day, ensuring that your PostgreSQL database is backed up regularly.

Explanation of key fields in the CronJob YAML

concurrencyPolicy: Determines how to handle multiple executions of the job concurrently. Options include Allow (default), Forbid, and Replace. Here's a breakdown of the possible values for the concurrencyPolicy field:
- Allow: Allows concurrent executions of the job. This means that if a new job is scheduled to run while a previous instance of the job is still running, both jobs will run concurrently.
- Forbid: Disallows concurrent executions of the job. If a new job is scheduled to run while a previous instance of the job is still running, the new job will not start until the previous one completes.
- Replace: Replaces the existing job with the new one if a new job is scheduled to run while the previous instance of the job is still running. This effectively terminates the running job and starts the new one.
successfulJobsHistoryLimit: The successfulJobsHistoryLimit field specifies the number of successfully completed jobs that should be retained in the history of the CronJob. In this case, successfulJobsHistoryLimit: 1 indicates that only the latest successful job will be kept in the history.

Creating your Third CronJob

Below is an example of a CronJob YAML configuration that schedules the execution of a kubectl command:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: kubectl-command
spec:
  schedule: "*/5 * * * *"  # Run every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - kubectl
            args:
            - <kubectl_command>  # Replace <kubectl_command> with your desired kubectl command and arguments
          restartPolicy: OnFailure

Example:

Also, keep in mind that you will need to set up proper Role-Based Access Control (RBAC) permissions, as you may encounter errors such as 'Error from server (Forbidden): services is forbidden' if your service account lacks the necessary permissions.

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: jp-test
  name: jp-runner
rules:
- apiGroups:
  - extensions
  - apps
  resources:
  - pods
  verbs:
  - 'get'

---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: jp-runner
  namespace: jp-test
subjects:
- kind: ServiceAccount
  name: sa-jp-runner
  namespace: jp-test
roleRef:
  kind: Role
  name: jp-runner
  apiGroup: ""

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-jp-runner
  namespace: jp-test

---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-jp-runner
          containers:
          - name: hello
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - kubectl patch deployment runners -p '{"spec":{"template":{"spec":{"containers":[{"name":"jp-runner","env":[{"name":"START_TIME","value":"'$(date +%s)'"}]}]}}}}' -n jp-test
          restartPolicy: OnFailure

# kubectl apply -f cronjob-3.yaml

Conclusion

In this guide, we've explored the fundamentals of Kubernetes CronJobs, their significance in scheduling recurring tasks within Kubernetes clusters, and provided practical insights into creating and managing CronJobs. By leveraging CronJobs effectively, you can automate routine tasks, streamline operations, and enhance the efficiency of your Kubernetes environment.

References

Running Automated Tasks with a CronJob