
preferredDuringSchedulingIgnoredDuringExecution rule seems to be ignored



Hello,

I have a question regarding preferred anti-affinity pod placement rules.

I have the following situation: an OpenShift 3.11 cluster with 3 masters and 6 nodes (each with 62GB of memory and 32 cores, except node-932 with 86GB/44 cores), with allocated resources as follows (requests and limits, from the 'Allocated resources' section of 'oc describe node'):

node-910 (62GB):
  cpu       8465m (26%)    9225m (28%)
  memory    14042Mi (21%)  13762Mi (21%)

node-911 (62GB):
  cpu       13580m (42%)       15440m (48%)
  memory    21917944928 (32%)  27898900096 (41%)  [~20903Mi / ~26607Mi]

node-912 (62GB):
  cpu       8210m (25%)    8220m (25%)
  memory    12808Mi (19%)  12728Mi (19%)

node-913 (62GB):
  cpu       4210m (13%)   4220m (13%)
  memory    8712Mi (13%)  8632Mi (13%)

node-914 (62GB):
  cpu       4250m (13%)        4280m (13%)
  memory    10460252672 (15%)  11208594432 (16%)  [~9976Mi / ~10689Mi]

node-915 (62GB):
  cpu       8430m (26%)    9240m (28%)
  memory    12086Mi (18%)  11806Mi (18%)

node-930 (62GB):
  cpu       8310m (25%)    8220m (25%)
  memory    13064Mi (20%)  12728Mi (19%)

node-932 (86GB):
  cpu       210m (0%)   220m (0%)
  memory    520Mi (0%)  440Mi (0%)
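
For reference, the figures above come from the 'Allocated resources' section of the describe output, gathered along these lines (node name just as an example):

  oc describe node node-910 | grep -A 5 'Allocated resources'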


There's a 3-node Mongo cluster running, with pods placed as follows:

node-913:
  mongo-2-5-zl7gh

node-914:
  mongo-1-5-crz4l

node-915:
  mongo-0-5-w7mbr

There are other pods on the nodes, but they aren't referenced in my anti-affinity rule, so I've omitted them for simplicity.
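
The placement can be double-checked with something like:

  oc get pods -o wide | grep mongo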

The mongo-[012] pods carry a 'service' label as follows:

  labels:
[..]
    service: mongo
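
which is exactly what the labelSelector in the anti-affinity term below matches. The equivalent selector can be tried directly on the command line:

  oc get pods -l 'service in (mongo)' -o wide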

The config for our app-a deployment contains the following specs and affinity rules:

[..]
  template:
    metadata:
[..]
      labels:
        name: app-a
        service: app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: service
                      operator: In
                      values:
                        - mongo
                topologyKey: kubernetes.io/hostname
              weight: 100
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: name
                    operator: In
                    values:
                      - app-a
              topologyKey: kubernetes.io/hostname
      containers:
[..]
          resources:
            limits:
              cpu: '8'
              memory: 12Gi
            requests:
              cpu: '8'
              memory: 12Gi
[..]
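
To rule out the affinity block getting lost on the way from the deployment config to the pods, it can also be checked on a running pod (pod name here is made up):

  oc get pod app-a-1-abcde -o yaml | grep -A 15 podAntiAffinity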



So new app-a pods should fit quite comfortably onto any of the nodes.
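
To make that concrete for node-911, the most heavily allocated node (assuming roughly 32000m CPU and 64Gi memory allocatable, which matches the percentages above):

  cpu:    32000m - 13580m requested  = ~18400m free  (app-a requests 8000m)
  memory: ~64Gi  - ~20.4Gi requested = ~43.6Gi free  (app-a requests 12Gi)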

The requiredDuringSchedulingIgnoredDuringExecution anti-affinity rule appears to be observed as expected: two app-a pods are never deployed to the same node.

However, the preferredDuringSchedulingIgnoredDuringExecution rule seems to be ignored by the scheduler.
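
My (possibly mistaken) understanding is that with the default scheduler policy, a preferred anti-affinity term only feeds into the inter-pod affinity priority function, whose normalized 0-10 score is then summed with the other default priorities, each at weight 1, roughly:

  score(node) = leastRequested + balancedAllocation + selectorSpread
              + interPodAffinity + ...

so a node that wins big on the resource-based priorities could still beat a node favoured by the weight-100 preference.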

If I deploy four app-a pods in the situation above, they end up on nodes 910, 913, 914 and 932, pairing two of them with the mongo pods on nodes 913 and 914, even though nodes 911, 912 and 930 are mongo-free and have ample spare capacity.
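
The node assignment above comes from:

  oc get pods -l name=app-a -o wide

Notably, apart from the almost empty node-932, nodes 913 and 914 are the least-loaded nodes, so perhaps a least-requested score is simply outvoting the anti-affinity penalty.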

Is this the expected behaviour?


Thanks,

Andre

