Understanding Kubernetes Requests and Limits

Author: Quinn Bast

Date: 8/12/2023, 2:42:00 PM

Kubernetes requests and limits determine how much of a machine's resources, such as CPU and memory, a container asks for and is allowed to use. Requests indicate the amount that a container is guaranteed to have, while limits indicate the absolute maximum it can use before getting throttled or stopped. Setting these values is VERY IMPORTANT and you should ALWAYS set your resource requests and limits. The reason is that Kubernetes assigns every pod a QoS class based on how its requests and limits are set. QoS classes determine which pods keep their resources and which pods are evicted first when a node runs short.

| | Requests | Limits |
| --- | --- | --- |
| Definition | The minimum amount required for your application to run. Kubernetes guarantees that your app will have at least the requested amount of resources (with the possibility of getting more depending on the QoS class). If no machine can provide the requested amount of resources, your pod will not get scheduled. | The maximum amount your application can use. If your application tries to use more than the limit: for CPU, it will get throttled; for memory, the container will get OOMKilled and restarted. |
| Recommendations | Configure this to be the regular amount of resources used during nominal operation. Don't lowball this number. See the Quality of Service (QoS) Classes below. | Set this to the MAXIMUM you want your app to use. See the Quality of Service (QoS) Classes below. |
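
As a minimal sketch of where these values live, the snippet below builds a pod manifest as a Python dictionary and prints it as JSON. The pod name, container name, image, and every resource value are hypothetical and chosen only for illustration; pick numbers based on what your application actually uses.

```python
import json

# Hypothetical names and values, chosen only for illustration.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx:1.25",
                "resources": {
                    # Regular usage during nominal operation; used for scheduling.
                    "requests": {"cpu": "250m", "memory": "128Mi"},
                    # Hard ceiling: CPU above this is throttled,
                    # memory above this gets the container OOMKilled.
                    "limits": {"cpu": "500m", "memory": "256Mi"},
                },
            }
        ]
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with kubectl apply -f.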

# Kubernetes QoS Classes

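If a request is set but not a limit, the container has no ceiling of its own and can use whatever is free on the node, up to the node's maximum (unless a LimitRange in the namespace supplies a default limit). If a limit is set but not a request, the request defaults to the limit.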

| | Guaranteed | Burstable | Best Effort |
| --- | --- | --- | --- |
| Condition | Requests == limits | Limits > requests | Requests and limits are unset |
| Definition | The pod is guaranteed to have the desired resources. Never killed unless it exceeds its limit. Lower-QoS pods will be evicted, where possible, so that it can keep operating. | Application is guaranteed the requested amount. Application can use up to the limit amount. May be evicted to make room for Guaranteed QoS applications (if needed). | Can use any resources that remain available on the node. Lowest priority: Guaranteed and Burstable pods are given resources first. Will get evicted to make room for other pods if they have a higher QoS and need the resources. First to get killed if the node runs out of memory/CPU. |
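
The conditions above can be sketched as a small classifier. This is a simplified illustration, not the Kubernetes implementation: the function name `qos_class` is hypothetical, only CPU and memory are considered, and quantities are compared as plain strings (so "1" and "1000m" would be treated as different values).

```python
def qos_class(containers):
    """Return the QoS class for a list of container `resources` dicts."""
    tracked = ("cpu", "memory")
    any_set = False      # does any container set any request or limit?
    guaranteed = True    # does every container have request == limit?
    for res in containers:
        requests = res.get("requests", {})
        limits = res.get("limits", {})
        for name in tracked:
            limit = limits.get(name)
            # When only a limit is given, it is used as the request too.
            request = requests.get(name, limit)
            if request is not None or limit is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "250m", "memory": "128Mi"}
print(qos_class([{"requests": same, "limits": same}]))  # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}}]))       # Burstable
print(qos_class([{}]))                                  # BestEffort
```

Note that the class is decided for the whole pod: a single container without matching requests and limits is enough to move the pod out of Guaranteed.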

# Finding your Pod’s resource usage

Resource usage is only reported if you are running a metrics server. I highly recommend running this service in your cluster.

While using k9s you will see columns that explain your pod resource utilization (kubectl top pods reports the same raw CPU and memory usage, but not the percentages):

  • CPU - CPU usage in millicores
  • MEM - Memory usage in MiB
  • %CPU/R - Percentage of the CPU request used
  • %MEM/R - Percentage of the memory request used
  • %CPU/L - Percentage of the CPU limit used
    • If this is at or close to 100, the pod is likely getting CPU throttled
  • %MEM/L - Percentage of the memory limit used
    • If this is high, it is very likely that the pod will get OOMKilled soon. Consider increasing the memory limit.
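
To make the percentage columns concrete, here is a rough sketch of the arithmetic behind them, assuming each column is simply usage divided by the request or limit. The helper names and all sample numbers are hypothetical, only a few common unit suffixes are handled, and k9s may round differently.

```python
def parse_cpu(quantity):
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return float(quantity[:-1])
    return float(quantity) * 1000


def parse_memory(quantity):
    """Convert a memory quantity such as "128Mi" or "1Gi" to MiB."""
    units = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[:-2]) * factor
    raise ValueError(f"unsupported memory unit: {quantity}")


def percent_of(used, allowed):
    """Usage as a whole-number percentage of a request or limit."""
    return round(100 * used / allowed)


# Hypothetical pod: using 200 millicores and 230 MiB.
cpu_used, mem_used = 200, 230
print("%CPU/R", percent_of(cpu_used, parse_cpu("250m")))     # 80
print("%CPU/L", percent_of(cpu_used, parse_cpu("500m")))     # 40
print("%MEM/R", percent_of(mem_used, parse_memory("128Mi")))  # 180
print("%MEM/L", percent_of(mem_used, parse_memory("256Mi")))  # 90
```

In this made-up case %MEM/R is well above 100 and %MEM/L is at 90, which suggests the memory request is set too low and the pod is getting close to being OOMKilled.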