Overview

Using quotas and limit ranges, cluster
administrators can set constraints to limit the number of objects or amount of
compute resources that are used in your project. This helps cluster
administrators better manage and allocate resources across all projects, and
ensure that no projects are using more than is appropriate for the cluster size.

The following sections help you understand how to check on your quota and limit
range settings, what sorts of things they can constrain, and how you can request
or limit compute resources in your own pods and containers.

Quotas

A resource quota, defined by a ResourceQuota object, provides constraints
that limit aggregate resource consumption per project. It can limit the quantity
of objects that can be created in a project by type, as well as the total amount
of compute resources and storage that may be consumed by resources in that project.

Quotas are set by cluster administrators and are scoped to a given project.
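As an illustration, a ResourceQuota that caps both object counts and compute resources might be sketched as follows; the name and all values are hypothetical:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources   # hypothetical name
spec:
  hard:
    pods: "4"               # at most 4 pods in a non-terminal state
    requests.cpu: "1"       # sum of CPU requests across all pods
    requests.memory: 1Gi    # sum of memory requests across all pods
    limits.cpu: "2"         # sum of CPU limits across all pods
    limits.memory: 2Gi      # sum of memory limits across all pods
```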

Viewing Quotas

You can view usage statistics related to any hard limits defined in a project’s
quota by navigating in the web console to the project’s Quota page.

You can also use the CLI to view quota details:

First, get the list of quotas defined in the project. For example, for a project
called demoproject:

$ oc get quota -n demoproject
For example, a quota can specify that:

Across all pods in a non-terminal state, the sum of CPU limits cannot exceed a specified value.

Across all pods in a non-terminal state, the sum of memory limits cannot exceed a specified value.

A quota with the Terminating scope applies only to matching pods where spec.activeDeadlineSeconds >= 0. For example, such a quota would charge for build or deployer pods, but not long-running pods like a web server or database.

A quota can also cap the total number of persistent volume claims with a matching storage class that can exist in the project.

Table 3. Object Counts Managed by Quota

pods - The total number of pods in a non-terminal state that can exist in the project.

replicationcontrollers - The total number of replication controllers that can exist in the project.

resourcequotas - The total number of resource quotas that can exist in the project.

services - The total number of services that can exist in the project.

secrets - The total number of secrets that can exist in the project.

configmaps - The total number of ConfigMap objects that can exist in the project.

persistentvolumeclaims - The total number of persistent volume claims that can exist in the project.

openshift.io/imagestreams - The total number of image streams that can exist in the project.

Quota Scopes

Each quota can have an associated set of scopes. A quota will only
measure usage for a resource if it matches the intersection of enumerated
scopes.

Adding a scope to a quota restricts the set of resources to which that quota can
apply. Specifying a resource outside of the allowed set results in a validation
error.

Terminating - Match pods where spec.activeDeadlineSeconds >= 0.

NotTerminating - Match pods where spec.activeDeadlineSeconds is nil.

BestEffort - Match pods that have best effort quality of service for either cpu
or memory. See the Quality of Service Classes for more on committing compute
resources.

NotBestEffort - Match pods that do not have best effort quality of service for
cpu and memory.

A BestEffort scope restricts a quota to limiting the following resources:

pods

A Terminating, NotTerminating, and NotBestEffort scope restricts a quota
to tracking the following resources:

pods

memory

requests.memory

limits.memory

cpu

requests.cpu

limits.cpu
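For instance, a quota carrying the BestEffort scope can limit only the pod count, and counts only best-effort pods against it; the name and value below are hypothetical:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort   # hypothetical name
spec:
  hard:
    pods: "1"        # pods is the only resource a BestEffort-scoped quota can limit
  scopes:
  - BestEffort       # measure only pods with best effort quality of service
```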

Quota Enforcement

After a resource quota for a project is first created, the project restricts the
ability to create any new resources that may violate a quota constraint until it
has calculated updated usage statistics.

After a quota is created and usage statistics are updated, the project accepts
the creation of new content. When you create or modify resources, your quota
usage is incremented immediately upon the request to create or modify the
resource.

When you delete a resource, your quota use is decremented during the next full
recalculation of quota statistics for the project.

If project modifications exceed a quota usage limit, the server denies the
action. An appropriate error message is returned explaining the quota
constraint violated and the currently observed usage statistics in the system.

Requests vs Limits

When allocating compute resources, each container can specify a request and a
limit value for both CPU and memory. Quotas can restrict any of these values.

If the quota has a value specified for requests.cpu or requests.memory,
then it requires that every incoming container make an explicit request for
those resources. If the quota has a value specified for limits.cpu or
limits.memory, then it requires that every incoming container specify an
explicit limit for those resources.

See Compute Resources for more on setting requests
and limits in pods and containers.
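A container spec that satisfies a quota requiring explicit requests and limits might look like the following fragment; the container name, image, and values are illustrative only:

```yaml
spec:
  containers:
  - name: app                        # hypothetical container name
    image: openshift/hello-openshift # hypothetical image
    resources:
      requests:
        cpu: 100m                    # counted against requests.cpu
        memory: 128Mi                # counted against requests.memory
      limits:
        cpu: 200m                    # counted against limits.cpu
        memory: 256Mi                # counted against limits.memory
```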

Limit Ranges

A limit range, defined by a LimitRange object, enumerates compute resource
constraints in a project at the pod, container, image, image stream, and
persistent volume claim level, and specifies the amount of resources that each
of those objects can consume.

All resource create and modification requests are evaluated against each
LimitRange object in the project. If the resource violates any of the
enumerated constraints, then the resource is rejected. If the resource does not
set an explicit value, and if the constraint supports a default value, then the
default value is applied to the resource.

Limit ranges are set by cluster administrators and are scoped to a given
project.
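A LimitRange enumerating container-level constraints might be sketched as follows; the name and all values are hypothetical:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits   # hypothetical name
spec:
  limits:
  - type: Container
    min:
      cpu: 100m           # smallest CPU request a container may make
      memory: 4Mi
    max:
      cpu: "2"            # largest CPU limit a container may set
      memory: 1Gi
    default:
      cpu: 300m           # applied when a container sets no limit
      memory: 200Mi
    defaultRequest:
      cpu: 200m           # applied when a container sets no request
      memory: 100Mi
```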

Viewing Limit Ranges

You can view any limit ranges defined in a project by navigating in the web
console to the project’s Quota page.

You can also use the CLI to view limit range details:

First, get the list of limit ranges defined in the project. For example, for a
project called demoproject:

$ oc get limits -n demoproject
NAME              AGE
resource-limits   6d

Then, describe the limit range you are interested in, for example the
resource-limits limit range:

$ oc describe limits resource-limits -n demoproject

Container Limits

Min

Min[resource] less than or equal to container.resources.requests[resource]
(required) less than or equal to container.resources.limits[resource]
(optional)

If the configuration defines a min CPU, then the request value must be greater
than or equal to that min CPU value. A limit value does not need to be
specified.

Max

container.resources.limits[resource] (required) less than or equal to
Max[resource]

If the configuration defines a max CPU, then you do not need to define a
request value, but a limit value does need to be set that satisfies the maximum
CPU constraint.

MaxLimitRequestRatio

MaxLimitRequestRatio[resource] less than or equal to (
container.resources.limits[resource] /
container.resources.requests[resource])

If a configuration defines a maxLimitRequestRatio value, then any new
containers must have both a request and limit value. Additionally,
OpenShift Container Platform calculates a limit to request ratio by dividing
the limit by the request. The maxLimitRequestRatio value must be greater than
or equal to 1, because a container's limit can never be less than its request.

For example, if a container has cpu: 500 in the limit value, and
cpu: 100 in the request value, then its limit to request ratio for cpu is
5. This ratio must be less than or equal to the maxLimitRequestRatio.
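Expressed as a container resources fragment (assuming the values above are millicores), the example looks like this:

```yaml
resources:
  requests:
    cpu: 100m   # request
  limits:
    cpu: 500m   # limit; limit-to-request ratio = 500 / 100 = 5
```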

Pod Limits

Across all containers in a pod, the following must hold true:

Min

Min[resource] less than or equal to container.resources.requests[resource]
(required) less than or equal to container.resources.limits[resource]
(optional)

Max

container.resources.limits[resource] (required) less than or equal to
Max[resource]

MaxLimitRequestRatio

MaxLimitRequestRatio[resource] less than or equal to (
container.resources.limits[resource] /
container.resources.requests[resource])

Compute Resources

Each container running on a node consumes compute resources, which are
measurable quantities that can be requested, allocated, and consumed.

When authoring a pod configuration file, you can optionally specify how much CPU
and memory (RAM) each container needs in order to better schedule pods in the
cluster and ensure satisfactory performance.

CPU is measured in units called millicores. Each node in a cluster inspects the
operating system to determine the amount of CPU cores on the node, then
multiplies that value by 1000 to express its total capacity. For example, if a
node has 2 cores, the node’s CPU capacity would be represented as 2000m. If you
wanted to use 1/10 of a single core, it would be represented as 100m.

Memory is measured in bytes, and can be specified with SI suffixes (E, P,
T, G, M, K) or their power-of-two equivalents (Ei, Pi, Ti, Gi, Mi, Ki).

CPU Requests

Each container in a pod can specify the amount of CPU it requests on a node. The
scheduler uses CPU requests to find a node with an appropriate fit for a
container.

The CPU request represents a minimum amount of CPU that your container may
consume, but if there is no contention for CPU, it can use all available CPU on
the node. If there is CPU contention on the node, CPU requests provide a
relative weight across all containers on the system for how much CPU time the
container may use.

On the node, CPU requests map to kernel CFS shares to enforce this behavior.

CPU Limits

Each container in a pod can specify the amount of CPU it is limited to use on a node. CPU limits control the maximum amount of CPU that your container may use independent of contention on the node. If a container attempts to exceed the specified limit, the system will throttle the container. This allows the container to have a consistent level of service independent of the number of pods scheduled to the node.

Memory Requests

By default, a container is able to consume as much memory on the node as possible. In order to improve placement of pods in the cluster, specify the amount of memory required for a container to run. The scheduler will then take available node memory capacity into account prior to binding your pod to a node. A container is still able to consume as much memory on the node as possible even when specifying a request.

Memory Limits

If you specify a memory limit, you can constrain the amount of memory the container can use. For example, if you specify a limit of 200Mi, a container will be limited to using that amount of memory on the node. If the container exceeds the specified memory limit, it will be terminated and potentially restarted dependent upon the container restart policy.

Quality of Service Tiers

A compute resource is classified with a quality of service (QoS) based on the
specified request and limit value.

BestEffort - Provided when a request and limit are not specified.

Burstable - Provided when a request is specified that is less than an
optionally specified limit.

Guaranteed - Provided when a limit is specified that is equal to an optionally
specified request.

If a container has requests and limits set that would result in a different quality of service for each compute resource, it will be classified as Burstable.

The quality of service has different impacts on different resources, depending
on whether the resource is compressible or not. CPU is a compressible resource,
whereas memory is an incompressible resource.

With CPU Resources:

A BestEffort CPU container is able to consume as much CPU as is available on
a node but runs with the lowest priority.

A Burstable CPU container is guaranteed to get the minimum amount of CPU
requested, but it may or may not get additional CPU time. Excess CPU resources
are distributed based on the amount requested across all containers on the node.

A Guaranteed CPU container is guaranteed to get the amount requested and no
more, even if there are additional CPU cycles available. This provides a
consistent level of performance independent of other activity on the node.

With Memory Resources:

A BestEffort memory container is able to consume as much memory as is
available on the node, but there are no guarantees that the scheduler will place
that container on a node with enough memory to meet its needs. In addition, a
BestEffort container has the greatest chance of being killed if there
is an out of memory event on the node.

A Burstable memory container is scheduled on the node to get the amount of
memory requested, but it may consume more. If there is an out of memory event on
the node, Burstable containers are killed after BestEffort containers when
attempting to recover memory.

A Guaranteed memory container gets the amount of memory requested, but no
more. In the event of an out of memory event, it will only be killed if there
are no more BestEffort or Burstable containers on the system.

Specifying Compute Resources via CLI

Opaque Integer Resources

Opaque integer resources allow cluster operators to provide new node-level
resources that would be otherwise unknown to the system. Users can consume these
resources in pod specifications, similar to CPU and memory. The scheduler performs
resource accounting so that no more than the available amount is
simultaneously allocated to pods.

Opaque integer resources are currently in Alpha, and only resource accounting
is implemented. There is no resource quota or limit range support for these
resources, and they have no impact on QoS.

Opaque integer resources are called opaque because OpenShift Container Platform
does not know what the resource is, but will schedule a pod on a node
only if enough of that resource is available. They are called integer resources
because they must be available, or advertised, in integer amounts. The API server
restricts quantities of these resources to whole numbers. Examples of
valid quantities are 3, 3000m, and 3Ki.

The cluster administrator is usually responsible for creating the resources and making them available.
For more information on creating opaque integer resources, see Opaque Integer Resources in the Administrator Guide.

To consume an opaque integer resource in a pod, edit the pod to
include the name of the opaque resource as a key in the spec.containers[].resources.requests field.

For example: The following pod requests two CPUs and one foo (an opaque resource).
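The pod definition itself was not reproduced in this extract. A sketch consistent with the alpha naming convention for opaque integer resources (pod.alpha.kubernetes.io/opaque-int-resource-<name>, an assumption here) follows; the pod, container, and image names are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod             # hypothetical name
spec:
  containers:
  - name: my-container     # hypothetical name
    image: myimage         # hypothetical image
    resources:
      requests:
        cpu: 2             # two CPUs
        # alpha opaque-integer-resource key for "foo" (assumed convention)
        pod.alpha.kubernetes.io/opaque-int-resource-foo: 1
```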

The pod will be scheduled only if all of the resource requests are satisfied
(including CPU, memory, and any opaque resources). The pod will remain in the
Pending state while the resource request cannot be met by any node.