Quotas and Resource Limits in Kubernetes

Capacity management is good practice on any system, especially when resources are shared. This is particularly important for memory and CPU, as a rogue application can otherwise impact other users. Thankfully, containerised solutions build on underlying kernel technology such as cgroups, which allows the resources available on the underlying Host to be ring-fenced.

The two main settings when allocating resources to a container are its requests and its limits.

The request is used by the scheduler when deciding where to place the POD. This tells the system how much CPU or memory the application needs to run. If no Node has sufficient resources available then the POD will not be scheduled and will sit in a Pending state.

The limit is a value that can be set to allow an intervention to take place should there be an issue. The container will be terminated (OOMKilled) if it consumes more memory than the limit that has been set.

In the case of a CPU burst the POD can grab some extra CPU if the limit is set higher than the request, but its CPU will be throttled at the limit rather than the container being killed off.

For predictable behaviour the limit and request values for a container can be made the same. A higher limit gives a bit more headroom should there be occasional peaks in memory or CPU use.
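
As a minimal sketch (the values here are purely illustrative), a container spec with matching requests and limits for both CPU and memory would look like this:

      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: "250m"       # Reserved share of CPU used by the scheduler
            memory: "256Mi"   # Memory the scheduler reserves on the Node
          limits:
            cpu: "250m"       # CPU is throttled at this value
            memory: "256Mi"   # The container is terminated if it exceeds this value

Setting requests equal to limits in this way also places the POD in the Guaranteed QoS class, making it the last candidate for eviction when a Node comes under memory pressure.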

Setting Limit Ranges

By default containers run with unbounded resources, which may be fine if the application is guaranteed never to start grabbing extra CPU or memory. However, most people responsible for a production system are not going to rely on the application never running amok.

This is where setting limits within the manifest file is useful.

We'll start with a simple Deployment but add some settings within the resources section of the YAML.

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: nginx-set-limits
  name: nginx-set-limits
  namespace: default-limits   # Deployed to default-limits namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-set-limits
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx-set-limits
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          limits:
            memory: "512Mi"   # This will delete the container if it goes over 512Mi of memory
          requests:
            memory: "384Mi"  # This will tell the scheduler to only run the container on a suitable Node

This can be deployed by applying the manifest file.

kubectl apply -f nginx-set-limits.yaml
kubectl describe deployment nginx-set-limits

This shows that the containers created as part of the POD within the Deployment have a limit applied to their memory use, as well as a request that the scheduler uses to ensure the chosen Node can support the POD.

Adding the resource limits and requests within the YAML for each POD is fine, but it is also possible to set default values within a Namespace that will be applied to all deployed PODs, even if nothing is explicitly stated in the manifest.

Create Default Limits within a Namespace

We'll now create a Limit Range and apply it to the namespace.

apiVersion: v1
kind: LimitRange
metadata:
  labels:
    app: my-nginx
  name: my-limit-range
  namespace: default-limits
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    type: Container
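
Assuming the manifest has been saved as my-limit-range.yaml (the filename is just for illustration), it can be applied and then checked with standard kubectl commands:

kubectl apply -f my-limit-range.yaml
kubectl get limitrange my-limit-range --namespace=default-limits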

The created LimitRange will add a default request of 256Mi and a default limit of 512Mi to all PODs created from now on.

We can test this by creating another Deployment that does not specify any resources.

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: my-nginx
  name: my-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-nginx
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: my-nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources: {}

This Deployment will create 3 replicas. We'll apply it with --namespace=default-limits to ensure they are created in our namespace, where the default values will be enforced.

kubectl apply -f my-nginx.yaml --namespace=default-limits
vagrant@k8s-master:~/limitranges/default-limits$ kubectl get pods --namespace=default-limits 
NAME                               READY   STATUS    RESTARTS   AGE
my-nginx-9b596c8c4-g2pjw           1/1     Running   0          2m23s
my-nginx-9b596c8c4-kc6s9           1/1     Running   0          2m23s
my-nginx-9b596c8c4-w6cxj           1/1     Running   0          2m23s
nginx-set-limits-bc785b8c9-2hwxj   1/1     Running   0          38m

We can then describe one of the PODs running as part of the my-nginx deployment to confirm the resource values that have been assigned.

kubectl describe pod my-nginx-9b596c8c4-g2pjw 
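
Alternatively, just the injected values can be queried directly from the POD spec (substituting the POD name from the listing above):

kubectl get pod my-nginx-9b596c8c4-g2pjw --namespace=default-limits -o jsonpath='{.spec.containers[0].resources}'

This should show the 256Mi request and 512Mi limit that the LimitRange added, even though the Deployment manifest left the resources section empty.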

We can also confirm that these values are being applied within the namespace.

vagrant@k8s-master:~/limitranges/default-limits$ kubectl describe namespaces default-limits 
Name:         default-limits
Labels:       <none>
Annotations:  <none>
Status:       Active

No resource quota.

Resource Limits
 Type       Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
 ----       --------  ---  ---  ---------------  -------------  -----------------------
 Container  memory    -    -    256Mi            512Mi          -

Setting a Maximum and Minimum Value

As well as setting a default value it is also possible to set a minimum and maximum value that will stop individual containers from consuming too much resource.

The manifest file to create this is very similar to the one that was used to create the default values.

apiVersion: v1
kind: LimitRange
metadata:
  labels:
    app: my-nginx
  name: my-limit-range
  namespace: default-limits
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    max:
      memory: 768Mi
    min:
      memory: 256Mi
    type: Container

Once this is applied to the namespace we can see that the minimum and maximum values for memory use will also be set for all future containers.

This will prevent any container created within the default-limits namespace from being allocated more than 768Mi or less than 256Mi of memory. This is combined with the default values, which ensure a default request of 256Mi and a default limit of 512Mi if nothing is explicitly set within the manifest file.

It is possible to override the default resource values within a POD manifest, but the Resource Limits attached to this namespace will not allow more than 768Mi to be allocated.
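
As a sketch of what would be refused (the value is purely illustrative), a container asking for more than the maximum would be rejected at admission rather than scheduled:

        resources:
          limits:
            memory: "1Gi"   # Above the 768Mi maximum in the LimitRange, so the POD is rejected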

Creating a Quota

The next step that can be taken to control the resources consumed by running PODs is to apply a Quota to the Namespace. This places hard limits on the Namespace as a whole, covering such things as CPU, memory and even the number of running PODs.

We'll create the following manifest file and apply it within the Namespace.

apiVersion: v1
kind: ResourceQuota
metadata:
  creationTimestamp: null
  name: my-quota
  namespace: default-limits
spec:
  hard:
    memory: 1500M
    pods: "8"

This gives a simple limit of 1.5G of memory and a maximum of 8 PODs that can be run. There are quite a few other things that can be added to the Quota but for our simple example we will just stick with memory and POD numbers.

Note
If a resource is constrained by a Quota then every container in the Namespace must specify a corresponding request or limit for that resource. This is another reason for having default values in place, as without them the POD would simply be rejected rather than scheduled.
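
Assuming the Quota manifest has been saved as my-quota.yaml (again, the filename is just for illustration), it is applied like any other resource:

kubectl apply -f my-quota.yaml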

We can then take a look at our Namespace and running PODs which will now show our Quota and how close we are getting to how much we can schedule within this particular Namespace.
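
The current usage against the Quota can be checked at any point with standard kubectl commands:

kubectl describe resourcequota my-quota --namespace=default-limits
kubectl describe namespace default-limits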

To prove the Quota is operational we will scale the my-nginx Deployment from 3 PODs up to 5. This is easily done on the fly:

kubectl scale deployment my-nginx --replicas=5 --namespace=default-limits --record

However when we check the Deployment we see that only 4 PODs have been deployed. The Quota shows that the number of PODs is still within range, but the memory Quota would be exceeded if the final POD were deployed (it has a request value of 256Mi).
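
A rough calculation shows why the fifth POD does not fit: the nginx-set-limits POD requests 384Mi and each my-nginx replica picks up the 256Mi default request, so four replicas give 384Mi + 4 x 256Mi = 1408Mi (roughly 1476M), which sits just under the 1500M Quota, whereas a fifth replica would take the total to 1664Mi (roughly 1745M) and exceed it.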

To allow the final POD to run, either the Quota would need to be increased or something else would have to be deleted to free up the memory allocation.

Note
The Quota counts the memory that PODs request, while the Resource Limits dictate the memory they can actually use, so it is possible for the memory used on the Node to peak above the requested values. To prevent this the limit and request values can be made the same, which gives more predictable behaviour.

Conclusions

The use of Resource Limits gives control over how much compute resource containers will use in the cluster. It is perfectly possible to explicitly configure these values for every container, and they should be based on what the application is designed to do.

It is best to set default values that will be used in case a Deployment is created without them configured. This ensures that a mis-configuration does not cause a container to grab all the resources on a Node (particularly in a fault scenario), and that all PODs created within the Namespace have a sensible amount of resources allocated, neither too big nor too small.

The use of Resource Limits can also be combined with a Quota within the Namespace. This is what gives protection against Deployments that would otherwise take up too much resource.

Of course the Quota is applied over a Namespace, whose PODs may be spread across multiple Nodes. The combination of Quotas, Resource Limits and Namespaces gives a means of controlling compute resources and helps to prevent overloading of the Cluster. As with any system design, careful capacity planning is still needed to ensure that resources are not consumed without being managed.