Optimizing Resource Requests and Limits for Kubernetes Pods

Sep 1, 2024

In Kubernetes, correctly defining CPU and memory resources for your pods is crucial for maintaining application performance and ensuring efficient use of cluster resources. Misconfiguration can lead to resource contention, suboptimal performance, or even application crashes. This article provides a detailed guide on how to properly set resource requests and limits in Kubernetes. It is based on my more than five years of managing Kubernetes clusters in production.

Understanding Resource Requests and Limits

In Kubernetes, each pod’s container can be assigned specific resource requests and limits:

Resource Requests: The amount of CPU and memory that a container is guaranteed to have. The Kubernetes scheduler uses these values to decide on which node to place the pod. If a node doesn’t have enough resources to meet a pod’s requests, the pod won’t be scheduled on that node.
Resource Limits: The maximum amount of CPU and memory a container can use. A container that tries to exceed its CPU limit is throttled; a container that exceeds its memory limit is terminated.

CPU and Memory Units

CPU: Measured in CPU units. One CPU unit in Kubernetes corresponds to one vCPU/core on cloud providers, or one hyperthread on bare-metal Intel processors. CPU can be specified as a fraction, e.g., 0.5 for half a CPU core, or in millicores, e.g., 500m for the same amount.
Memory: Measured in bytes, though Kubernetes allows shorthand like Mi (mebibytes) and Gi (gibibytes). For example, 512Mi or 2Gi.
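To make the unit notations concrete, here is a small sketch of a resources fragment in which the requests and limits express the same quantities in two different ways. The values are chosen only for illustration.

```yaml
resources:
  requests:
    cpu: "0.25"          # a quarter of a core...
    memory: "128Mi"      # 128 * 1024 * 1024 = 134217728 bytes
  limits:
    cpu: "250m"          # ...is the same quantity as 250 millicores
    memory: "134217728"  # plain bytes, the same quantity as 128Mi
```

Note that the binary suffix matters: 128Mi is 134,217,728 bytes, whereas 128M (decimal megabytes) is 128,000,000 bytes.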

Example Configuration

Here’s an example of how you might define resource requests and limits in a pod specification:

apiVersion: v1
kind: Pod
metadata:
  name: gisalind
spec:
  containers:
  - name: gisalind
    image: gisalind:v1
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"
        cpu: "1"

In the example above:

The container requests 256 MiB of memory and 0.5 CPU (500 millicores).
The container has a limit of 512 MiB of memory and 1 full CPU core.

Before scheduling the pod, Kubernetes checks that a node has enough available CPU and memory to satisfy the requests. If no node does, the pod stays in the Pending state until the requested resources become available.

Below are some guidelines, based on my experience, for properly setting CPU and memory requests and limits.

Best Practices for Setting CPU and Memory

1. Understand Application Resource Needs

Before setting resource requests and limits, analyze your application’s resource usage under normal and peak conditions. Use monitoring tools such as Prometheus, Grafana, CloudWatch, or the Kubernetes Metrics Server to collect data on CPU and memory usage.

2. Start with Conservative Requests

If you’re unsure about the exact resource needs, start with conservative requests based on observed data. Set the requests slightly below the average usage and limits above the peak usage but within a reasonable margin.
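As a rough illustration of this rule of thumb, suppose monitoring showed a container averaging about 200m CPU and 300Mi of memory, with peaks around 450m and 420Mi. These figures are invented for the example and are not measurements; the resulting values are one possible starting point, not a recommendation.

```yaml
# Hypothetical observations (invented for illustration):
#   CPU:    average ~200m, peak ~450m
#   Memory: average ~300Mi, peak ~420Mi
resources:
  requests:
    cpu: "180m"      # slightly below the average CPU usage
    memory: "280Mi"  # slightly below the average memory usage
  limits:
    cpu: "600m"      # above the observed CPU peak, with some margin
    memory: "512Mi"  # above the observed memory peak, with some margin
```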

3. Use Horizontal Pod Autoscaling

If your application load varies significantly, consider using Horizontal Pod Autoscaling (HPA). HPA automatically adjusts the number of pods based on CPU utilization or other select metrics, helping you scale resources dynamically.
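A minimal sketch of such an autoscaler is shown below. It assumes the application runs as a Deployment named gisalind (the earlier example used a bare Pod, which an HPA cannot scale), and the replica bounds and the 70% target are illustrative values only.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gisalind
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gisalind          # assumed Deployment, not part of the original example
  minReplicas: 2            # illustrative lower bound
  maxReplicas: 10           # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percentage of the pods' CPU request
```

Because CPU utilization here is computed relative to the CPU request, the containers need a CPU request for this metric to work, and the value you choose for the request directly affects when scaling kicks in.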

4. Avoid Overallocation

Overallocating resources can lead to inefficiency in your Kubernetes cluster. If every pod requests more resources than necessary, your cluster could become underutilized, leading to wasted computational power and higher costs. Prefer to start small and add more as needed.

5. Tune Requests and Limits Iteratively

After deployment, continue monitoring the application’s performance and resource usage. Adjust the requests and limits iteratively to optimize for both performance and resource utilization.

6. Consider Bursting Workloads

For workloads with occasional spikes in resource usage, you might set low resource requests with higher limits. This allows the application to burst beyond its usual needs without permanently occupying those resources.
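A bursting setup along these lines might look like the following sketch. The pod name and all numbers are hypothetical; only the shape (small requests, larger limits) matters.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gisalind-burst     # hypothetical name
spec:
  containers:
  - name: gisalind
    image: gisalind:v1
    resources:
      requests:
        cpu: "100m"        # small reservation used for scheduling
        memory: "128Mi"
      limits:
        cpu: "1"           # may burst up to one full core when the node has spare CPU
        memory: "256Mi"
```

With requests set below limits, Kubernetes assigns the pod the Burstable QoS class. Keep in mind that the extra capacity is only available when the node actually has spare resources at that moment.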

7. Avoid Memory Overcommitment

Be cautious with setting memory limits. If a container exceeds its memory limit, Kubernetes will terminate it (OOMKilled). This can disrupt application availability. It’s safer to set memory requests closer to expected usage and limits slightly above it, without overcommitting.
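For instance, if a container was expected to use roughly 400Mi (a made-up figure for illustration), a memory configuration following this advice could be sketched as:

```yaml
resources:
  requests:
    memory: "400Mi"  # close to the expected usage (hypothetical figure)
  limits:
    memory: "480Mi"  # slightly above; exceeding it gets the container OOMKilled
```

Keeping the gap between request and limit small means the scheduler reserves close to what the container can actually consume, which reduces the chance of a node running out of memory.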

Conclusion

Properly defining CPU and memory for your Kubernetes pods is a balancing act between resource efficiency and application performance. Start with data-driven estimates, and refine them over time with iterative adjustments and monitoring. By following best practices, you can ensure that your applications run smoothly without wasting cluster resources.

Et voilà! You are now equipped to effectively optimize resource requests and limits for your Kubernetes pods.

---

I’m Hervé-Gaël KOUAMO, Founder and CTO at HK-TECH, a French tech company specializing in designing, building, and optimizing applications. We also assist businesses throughout their cloud migration journey, ensuring seamless transitions and maximizing their digital potential. You can follow me here or on LinkedIn (posts in French) to receive my latest blog post every Sunday: https://www.linkedin.com/in/herv%C3%A9-ga%C3%ABl-kouamo-157633197/