
Re: About overcommitting nodes


The document is correct as authored.

The --request is the minimum scheduling guarantee for the compute resource.  The --limit is the maximum amount of that compute resource the container may consume on the local node before being throttled (it is not a guarantee).

If a limit is specified, but the request is not, the request will default to the limit for that compute resource - I suspect this is what you are encountering.
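The defaulting rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not the actual platform code; the function name and the use of millicores as plain integers are assumptions for the example.

```python
from typing import Optional, Tuple


def effective_cpu(request_m: Optional[int],
                  limit_m: Optional[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (request, limit) in millicores after applying the defaulting rule.

    Hypothetical helper: if only a limit is given, the request defaults to it.
    """
    if request_m is None and limit_m is not None:
        request_m = limit_m
    return request_m, limit_m


# Only a limit of 500m is set, so the request becomes 500m as well.
print(effective_cpu(None, 500))   # (500, 500)

# Both set: nothing is defaulted.
print(effective_cpu(100, 500))    # (100, 500)

# With request == limit == 500m, an 8 CPU (8000m) node fits only 16 pods,
# and the sum of limits never exceeds capacity.
print(8000 // effective_cpu(None, 500)[0])   # 16
```

With only limits set, requests and limits are identical, so the node fills up exactly at its capacity and no overcommit is visible.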

The ratio of limit/request defines how far any single container may burst for the specified compute resource.  If the limit is unspecified, a container may actually burst up to the available node capacity.
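As a rough sketch of the burst ratio, assuming values in millicores and hypothetical helper names (this mirrors the description above, not any real API):

```python
from typing import Optional


def burst_ceiling_m(limit_m: Optional[int], node_capacity_m: int) -> int:
    """Most CPU (in millicores) a single container could use on the node.

    With no limit, the ceiling is the node capacity itself.
    """
    if limit_m is None:
        return node_capacity_m
    return min(limit_m, node_capacity_m)


def burst_ratio(request_m: int, limit_m: Optional[int], node_capacity_m: int) -> float:
    """Ratio of the burst ceiling to the guaranteed request."""
    return burst_ceiling_m(limit_m, node_capacity_m) / request_m


# request=100m, limit=500m on an 8 CPU node: may burst to 5x its request.
print(burst_ratio(100, 500, 8000))    # 5.0

# request=100m, no limit: may burst up to the whole node, 80x its request.
print(burst_ratio(100, None, 8000))   # 80.0
```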

If you do this with memory, you risk inducing an out-of-memory (OOM) condition, and the OOMKiller will target offenders based on their QoS guarantee.  For CPU, time is allotted based on the request and is throttled at the limit.  In the future, we will look to more proactively evict containers before inducing OOM, rather than depending on the OS OOMKiller.

In your example 8 CPU node scenario, if each container sets --requests=cpu=100m and --limits=cpu=500m, the scheduler will schedule up to 80 containers before CPU is exhausted (8000m / 100m = 80).

Each container is guaranteed to get a minimum of 100m CPU, but absent contention, containers may burst up to 500m.

In this scenario, the node is scheduled to its capacity of 8 CPUs by requests, but the sum of its limits is 40 CPUs (80 x 500m).  This means in practice you have a 5:1 overcommit ratio.
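The arithmetic for this scenario can be checked with a short Python sketch. The constant and function names are made up for illustration, and it assumes the full 8 CPUs are available to containers (no capacity held back for the system).

```python
NODE_CAPACITY_M = 8000   # 8 CPUs, expressed in millicores
REQUEST_M = 100          # --requests=cpu=100m
LIMIT_M = 500            # --limits=cpu=500m


def max_schedulable(capacity_m: int, request_m: int) -> int:
    """Containers the scheduler can place before requests exhaust capacity."""
    return capacity_m // request_m


def overcommit_ratio(capacity_m: int, request_m: int, limit_m: int) -> float:
    """Sum of limits on a fully scheduled node divided by its capacity."""
    containers = max_schedulable(capacity_m, request_m)
    return (containers * limit_m) / capacity_m


containers = max_schedulable(NODE_CAPACITY_M, REQUEST_M)
print(containers)                                             # 80
print(containers * LIMIT_M / 1000)                            # 40.0 (CPUs of limits)
print(overcommit_ratio(NODE_CAPACITY_M, REQUEST_M, LIMIT_M))  # 5.0
```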

You may find this example I wrote helpful as well:

Hope this helps,

On Fri, Nov 20, 2015 at 6:28 AM, v <vekt0r7 gmx net> wrote:

this is from the docs:
"A node is overcommitted when it has a pod scheduled that makes no request, or when the sum of limits across all pods on that node exceeds available machine capacity"

What I find interesting is the second part of the sentence with the available machine capacity.
I didn't know it was possible to overcommit nodes like that. What I experienced was that when machine capacity (in our example 8 CPUs) was exhausted, it wasn't possible to schedule more stuff onto that node. It wouldn't make much sense any other way because you can't guarantee a pod that it will have access to 500m CPU when that CPU is overcommitted - it wouldn't be a "guarantee" any more.

I think that this is an error in the docs and the only way to overcommit a node is by scheduling pods that make no requests. Is that correct?


users mailing list
users lists openshift redhat com
