23. Xiushen (Mastery) Series: Advanced Scheduling with topologyKey
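In the previous chapter we covered the predicate (filtering) and priority (scoring) strategies as well as node affinity and pod affinity, and closed by introducing a key field: topologyKey.
In this chapter, let's take a thorough look at how this field works and how to use it.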
1. What is topologyKey?
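First, what is topologyKey? It is the key of a node label: nodes that carry the same value for that key are treated as belonging to the same topology domain (the same host, zone, region, and so on).
In principle, topologyKey can be any legal label key. For performance and security reasons, however, it is subject to the following constraints:
1> For pod affinity, and for the hard pod anti-affinity condition requiredDuringSchedulingIgnoredDuringExecution, topologyKey must not be empty;
2> For hard pod anti-affinity (requiredDuringSchedulingIgnoredDuringExecution), the admission controller LimitPodHardAntiAffinityTopology exists to restrict topologyKey to kubernetes.io/hostname; to use a custom topology there, you need to modify the admission controller or simply disable it;
3> For soft pod anti-affinity (preferredDuringSchedulingIgnoredDuringExecution), an empty topologyKey is interpreted as "all topologies", which here means the combination of the three built-in keys kubernetes.io/hostname, failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region;
Apart from these cases, topologyKey can be any legal label key.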
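To make the constraints above concrete, here is a minimal sketch of a hard pod anti-affinity rule that stays within them: topologyKey is set explicitly, and it uses kubernetes.io/hostname, the one key that LimitPodHardAntiAffinityTopology permits. The Deployment name and the app=web-spread label are hypothetical, chosen only for illustration.

```yaml
# Hypothetical example: the name and labels are illustrative, not from the article.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-spread
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-spread
  template:
    metadata:
      labels:
        app: web-spread
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never co-locate two app=web-spread pods in one topology domain.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["web-spread"]}
            # Must not be empty for a required rule; with this key,
            # every node is its own topology domain.
            topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: nginx
```

With this rule each replica needs its own node, so a replica count larger than the number of schedulable nodes would leave the extra pods Pending.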
2. How topologyKey works
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: failure-domain.beta.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: k8s.gcr.io/pause:2.0
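Above is a topologyKey demo, together with the scheduling diagram I drew for it. In detail:
1> First, podAffinity is a hard requirement: the pod must be scheduled into a topology domain that already runs a pod labeled security=S1, and the granularity is the zone.
So in the diagram, either zone=foo or zone=bar qualifies.
Note in particular: the pods go either all to zone=foo or all to zone=bar; mixing the two zones does not satisfy the rule.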
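2> Next, podAntiAffinity is a soft preference: the pod should preferably not be scheduled onto a node that runs a pod labeled security=S2 (granularity: host). Note that the manifest above sets this term's topologyKey to failure-domain.beta.kubernetes.io/zone; host-level granularity requires kubernetes.io/hostname.
3> Putting the two together, the pod is expected to land on node-3 or node-7. Because the anti-affinity is only a preference, the scheduler favors these nodes rather than strictly guaranteeing them.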
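The walkthrough above treats the anti-affinity term at host granularity. As a sketch of what that looks like in a manifest, the variant below is the same demo pod with only the anti-affinity topologyKey switched to kubernetes.io/hostname. The pod name with-pod-affinity-host is hypothetical, used only to distinguish it from the demo.

```yaml
# Variant of the demo pod; only the anti-affinity topologyKey differs.
# The name with-pod-affinity-host is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-host
spec:
  affinity:
    podAffinity:
      # Hard rule, zone granularity: must share a zone with a security=S1 pod.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      # Soft rule, host granularity: prefer nodes that run no security=S2 pod.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: kubernetes.io/hostname
  containers:
  - name: with-pod-affinity-host
    image: k8s.gcr.io/pause:2.0
```

Mixing keys this way is allowed: each affinity term carries its own topologyKey, so the hard rule can reason about zones while the soft rule reasons about individual nodes.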
Additional note:
The label-selector operators available for pod affinity and anti-affinity are: In, NotIn, Exists, DoesNotExist
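As an illustration of the four operators, the sketch below combines them in one labelSelector. The pod name and the label keys tier, app and deprecated are hypothetical; multiple matchExpressions entries are ANDed, and Exists and DoesNotExist take no values.

```yaml
# Hypothetical example: pod name and label keys are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: operator-demo
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          # All four expressions must match the same existing pod.
          matchExpressions:
          - {key: security, operator: In, values: ["S1"]}   # label value is one of the listed values
          - {key: tier, operator: NotIn, values: ["batch"]} # label value is none of the listed values
          - {key: app, operator: Exists}                    # label key is present, any value
          - {key: deprecated, operator: DoesNotExist}       # label key is absent
        topologyKey: kubernetes.io/hostname
  containers:
  - name: operator-demo
    image: nginx
```

Because this is a hard rule, the pod would stay Pending until some running pod satisfies all four expressions.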
3. Hands-on: PodAffinity
1) Environment: one pod is running on centos-2.shared, with the label app=ngx-new
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME                     READY   STATUS    RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES   LABELS
ngx-new-cb79d555-2c7qq   1/1     Running   0          44h   10.244.1.7   centos-2.shared   <none>           <none>            app=ngx-new,pod-template-hash=cb79d555
2) Edit deploy-with-required-podAffinity.yaml. We want these pods to be scheduled onto nodes in the same zone as a pod labeled app=ngx-new
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-with-pod-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      name: myapp
      labels:
        app: myapp
    spec:
      affinity:
        podAffinity:
          # Hard affinity: schedule into the same zone as a pod whose app label is In ["ngx-new"]
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["ngx-new"]}
            topologyKey: zone
      containers:
      - name: myapp
        image: nginx
3) Label two nodes and put them in different zones; the new pods are expected to be scheduled only onto centos-2.shared
kubectl label nodes centos-2.shared zone=foo
kubectl label nodes centos-3.shared zone=bar
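Note: if centos-3.shared were also labeled zone=foo, new pods could be scheduled onto it as well,
because centos-3.shared and centos-2.shared would then be in the same location (zone).
4) Apply the deploy-with-required-podAffinity.yaml above; all the new pods are scheduled onto centos-2.shared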
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME                                      READY   STATUS    RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES   LABELS
myapp-with-pod-affinity-778f46bf4-92fxq   1/1     Running   0          16m   10.244.1.2   centos-2.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-gwv6z   1/1     Running   0          16m   10.244.1.3   centos-2.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-lcvz5   1/1     Running   0          16m   10.244.1.4   centos-2.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
ngx-new-cb79d555-2c7qq                    1/1     Running   0          44h   10.244.1.7   centos-2.shared   <none>           <none>            app=ngx-new,pod-template-hash=cb79d555
5) Change the label of centos-3.shared to zone=foo as well, then run kubectl delete -f on the manifest and apply it again.
Both centos-2.shared and centos-3.shared now receive pods, as expected
# Change the label on centos-3.shared
[root@centos-1 dingqishi]# kubectl label nodes centos-3.shared zone=foo --overwrite
node/centos-3.shared labeled
[root@centos-1 dingqishi]# kubectl get node --show-labels
NAME              STATUS   ROLES    AGE   VERSION   LABELS
centos-1.shared   Ready    master   16d   v1.16.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-1.shared,kubernetes.io/os=linux,node-role.kubernetes.io/master=
centos-2.shared   Ready    <none>   16d   v1.16.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-2.shared,kubernetes.io/os=linux,zone=foo
centos-3.shared   Ready    <none>   16d   v1.16.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-3.shared,kubernetes.io/os=linux,zone=foo
# Apply again
[root@centos-1 dingqishi]# kubectl apply -f deploy-with-required-podAffinity.yaml
deployment.apps/myapp-with-pod-affinity created
# Check where the pods landed: both centos-2.shared and centos-3.shared receive pods, as expected
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME                                      READY   STATUS    RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES   LABELS
myapp-with-pod-affinity-778f46bf4-2rqnj   1/1     Running   0          34s   10.244.2.8   centos-3.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-fjfpr   1/1     Running   0          34s   10.244.2.7   centos-3.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-tb8v7   1/1     Running   0          34s   10.244.1.5   centos-2.shared   <none>           <none>            app=myapp,pod-template-hash=778f46bf4
ngx-new-cb79d555-2c7qq                    1/1     Running   0          44h   10.244.1.7   centos-2.shared   <none>           <none>            app=ngx-new,pod-template-hash=cb79d555
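The Deployment in this walkthrough uses a hard rule, so its pods would stay Pending if no zone contained an app=ngx-new pod. For comparison, here is a minimal sketch of a soft variant using preferredDuringSchedulingIgnoredDuringExecution with the same selector and the same custom zone key. The Deployment name, the app=myapp-preferred label and the weight of 80 are hypothetical choices, not values from the walkthrough.

```yaml
# Hypothetical soft-affinity variant; name, label and weight are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-with-preferred-pod-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp-preferred
  template:
    metadata:
      labels:
        app: myapp-preferred
    spec:
      affinity:
        podAffinity:
          # Soft affinity: prefer the zone of an app=ngx-new pod, but do not require it.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80            # valid range is 1-100; higher means stronger preference
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - {key: app, operator: In, values: ["ngx-new"]}
              topologyKey: zone   # same custom node label as in the walkthrough
      containers:
      - name: myapp
        image: nginx
```

With the soft form the scheduler only scores matching zones higher, so pods can still be placed elsewhere when the preferred zone has no room.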
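Recap:
Pod affinity is the affinity of a pod toward other pods: it expresses whether a pod wants to be scheduled into the same area as related pods (the area can be a node, a rack, or a data center)
What counts as "the same area" is defined by the topologyKey field covered in this chapter (exercised here on Kubernetes v1.16.3)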
@Copyright notice: exclusively produced by 51CTO. Reproduction without permission is prohibited and will be subject to legal action