Understanding Kubernetes Pod Termination: Exit Code 137

Kubernetes, the open-source platform for automating deployment, scaling, and management of containerized applications, is a powerful tool for data scientists. However, it can sometimes be a bit cryptic, especially when things go wrong. One such instance is when a Kubernetes pod gets terminated with an exit code 137. This blog post aims to demystify this issue and provide solutions to prevent it from happening.

Table of Contents

  1. What is Exit Code 137?
  2. Why does Exit Code 137 Occur?
  3. How to Identify the Issue?
  4. How to Prevent Exit Code 137?
  5. Conclusion

What is Exit Code 137?

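Exit code 137 means that a container within your Kubernetes pod was terminated by the SIGKILL signal. The code is the result of 128 + 9, where 9 is the signal number of SIGKILL and 128 is the offset used to report that a process died from a signal rather than exiting on its own. In Kubernetes, the most common cause is an Out Of Memory (OOM) kill: the Linux kernel forcefully kills a process to free up memory, either because the container exceeded its memory limit or because the node itself ran out of memory. A container that ignores SIGTERM and is force-killed at the end of its termination grace period also exits with 137, so confirm the cause before changing memory settings.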

Why does Exit Code 137 Occur?

The primary reason for an exit code 137 is that a container in your pod is exceeding the memory limit set in its configuration. Kubernetes uses these limits to ensure that a single pod doesn’t consume all the available resources on a node. If a container tries to use more memory than its limit, the kernel's OOM killer terminates it to prevent it from affecting other pods.
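As a minimal sketch of where that limit lives, the manifest below sets a memory request and a memory limit on a single container. The Deployment name, labels, image, and sizes are placeholders chosen for illustration, not values from a real workload.

```yaml
# Hypothetical Deployment; name, labels, image, and sizes are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest      # placeholder image
          resources:
            requests:
              memory: "256Mi"       # amount the scheduler reserves on the node
            limits:
              memory: "512Mi"       # exceeding this gets the container OOM killed (exit code 137)
```

Limits are set per container, so a pod with several containers needs a limit on each one.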

How to Identify the Issue?

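The first step in diagnosing an exit code 137 is to check the logs. You can use the kubectl logs command to view the logs of a terminated pod; if the container has already restarted, add the --previous flag to see the output of the killed instance. Because SIGKILL cannot be caught, the application often logs nothing about its own termination, but some runtimes report a failed allocation beforehand. A Go program, for example, might print a message like this: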

fatal error: runtime: out of memory

You can also describe the pod using the kubectl describe pod command. In the container's status section of the output, you might see a message like this:

Last State:     Terminated
Reason:         OOMKilled
Exit Code:      137
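The same information is available in structured form from kubectl get pod with the -o yaml flag. The excerpt below is an illustrative sketch of the relevant part of the pod status, not captured output; the container name and restart count are placeholders.

```yaml
# Illustrative excerpt of a pod's status; values are placeholders.
status:
  containerStatuses:
    - name: my-app                # placeholder container name
      restartCount: 3             # placeholder; climbs with each kill and restart
      lastState:
        terminated:
          exitCode: 137           # 128 + 9 (SIGKILL)
          reason: OOMKilled       # set when the kernel OOM killer ended the container
```

If the reason is something other than OOMKilled, the SIGKILL probably came from another source and raising memory limits is unlikely to help.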

How to Prevent Exit Code 137?

There are several ways to prevent your Kubernetes pods from being terminated with an exit code 137:

  1. Increase Memory Limits: The most straightforward solution is to increase the memory limit of your pod. However, be careful not to set the limit too high, as it could lead to resource starvation for other pods.

  2. Optimize Your Application: If increasing the memory limit is not an option, you might need to optimize your application to use less memory. This could involve code optimization or using more memory-efficient data structures and algorithms.

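  3. Use Horizontal Pod Autoscaling: Kubernetes supports horizontal pod autoscaling, which automatically scales the number of pods in a replication controller, deployment, replica set, or stateful set based on observed metrics such as CPU utilization. Spreading the load across more pods helps only when each pod’s memory use grows with the traffic it handles; it does not fix a memory leak or a large fixed footprint. The example below scales on CPU utilization: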

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
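
  4. Use Vertical Pod Autoscaling: Vertical Pod Autoscaling (VPA) automatically adjusts the CPU and memory requests for your pods based on observed usage, helping prevent OOM errors. VPA is an add-on that has to be installed in the cluster before this manifest will work.
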
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
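
The HorizontalPodAutoscaler manifest shown earlier scales on CPU, which may not react when memory is the resource under pressure. A hedged variant, assuming your cluster serves the autoscaling/v2 API and has a metrics source such as metrics-server installed, scales on average memory utilization instead. The 70 percent target is an arbitrary placeholder, and the target name my-app is carried over from the examples above.

```yaml
# Sketch of a memory-based autoscaler; the 70% target is a placeholder.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70   # percent of the containers' memory requests
```

Utilization is measured against the memory request, so the target pods need requests set for this to work. Avoid pairing it with a VerticalPodAutoscaler that also manages memory for the same workload, since the two controllers would act on the same signal.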

Conclusion

In conclusion, exit code 137 in Kubernetes usually indicates that a container was OOM killed. By understanding its cause and knowing how to prevent it, you can ensure that your applications run smoothly and efficiently on Kubernetes. Remember, Kubernetes is a powerful tool, but like any tool, it requires understanding and proper usage to get the most out of it.

