Background
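Application developers needed to change kernel parameters for their pods. The parameters in question are classified as unsafe sysctls, so they can only be used after being added to the kubelet's --allowed-unsafe-sysctls flag, and the pods must also be scheduled onto the nodes whose kubelet has been reconfigured this way.
What happens if you forget to set node affinity or a nodeSelector and modify the Deployment directly? The experiment below reproduces the problem.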
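The setup described above (allowlisted kubelets plus pods pinned to those nodes) could be sketched roughly as follows. This is a minimal sketch, not the original team's procedure: the label key and value are hypothetical, and it assumes the kubelet on the target node already runs with --allowed-unsafe-sysctls=net.core.somaxconn.

```shell
# Sketch only: the label key/value below are hypothetical choices for illustration.
# Assumption: the kubelet on the target node already runs with
#   --allowed-unsafe-sysctls=net.core.somaxconn
SYSCTL_LABEL_KEY="example.com/unsafe-sysctls"
SYSCTL_LABEL_VALUE="somaxconn"

# Label a node whose kubelet allows the unsafe sysctl.
label_sysctl_node() {
  kubectl label node "$1" "${SYSCTL_LABEL_KEY}=${SYSCTL_LABEL_VALUE}" --overwrite
}

# Build the merge patch that adds a matching nodeSelector to a pod template.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}' \
    "$SYSCTL_LABEL_KEY" "$SYSCTL_LABEL_VALUE"
}

# Pin a Deployment to the labeled nodes.
pin_deployment() {
  kubectl patch deployment "$1" --type merge -p "$(node_selector_patch)"
}
```

Usage would look like `label_sysctl_node <node-name>` followed by `pin_deployment nginx`. With the nodeSelector in place, pods that cannot land on an allowlisted node should stay Pending instead of being scheduled onto a node that rejects them.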
Experiment
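As of Kubernetes 1.12, the sysctls feature is beta and enabled by default, which lets users set kernel parameters in a pod's securityContext. The Deployment below requests the unsafe sysctl net.core.somaxconn and does not restrict which nodes its pods can run on: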
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      securityContext:
        sysctls:
        - name: net.core.somaxconn
          value: "1024"
      containers:
      - name: nginx
        image: nginx
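Five minutes after creating the Deployment, the cluster had created more than a thousand pods. Each pod is rejected by the kubelet with SysctlForbidden and marked as failed, so the ReplicaSet controller keeps creating replacements, which are rejected in turn: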
$ kubectl get pods
NAME                     READY   STATUS            RESTARTS   AGE
nginx-7fbbcfcc7d-4gmrg   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-6dfpm   0/1     SysctlForbidden   0          17s
nginx-7fbbcfcc7d-6jkdn   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-6mf6z   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-6p2hs   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-cd759   0/1     SysctlForbidden   0          12s
nginx-7fbbcfcc7d-ckqbl   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-gtvq4   0/1     SysctlForbidden   0          16s
nginx-7fbbcfcc7d-jbv2p   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-jdh84   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-kmd9p   0/1     SysctlForbidden   0          20s
nginx-7fbbcfcc7d-lcp6k   0/1     SysctlForbidden   0          15s
nginx-7fbbcfcc7d-lsdlx   0/1     SysctlForbidden   0          15s
nginx-7fbbcfcc7d-mbd74   0/1     SysctlForbidden   0          19s
nginx-7fbbcfcc7d-mbjnf   0/1     SysctlForbidden   0          18s
nginx-7fbbcfcc7d-mmbj7   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-n2ndn   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-rhjmp   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-rznhl   0/1     SysctlForbidden   0          13s
nginx-7fbbcfcc7d-sfrl9   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-t9bkk   0/1     SysctlForbidden   0          19s
nginx-7fbbcfcc7d-vd6x8   0/1     SysctlForbidden   0          17s
nginx-7fbbcfcc7d-vt2jh   0/1     SysctlForbidden   0          21s
nginx-7fbbcfcc7d-w4l7n   0/1     SysctlForbidden   0          20s
nginx-7fbbcfcc7d-w5sgq   0/1     SysctlForbidden   0          14s
nginx-7fbbcfcc7d-wlf2c   0/1     SysctlForbidden   0          13s
nginx-7fbbcfcc7d-xh22t   0/1     SysctlForbidden   0          21s
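To watch how quickly the rejected pods pile up, one option is to count pods by their STATUS column rather than scrolling through the listing. The helper below is a small sketch; the function name is made up for illustration, and it assumes the default `kubectl get pods` column layout shown above, where STATUS is the third column.

```shell
# Count pods matching a label selector whose STATUS column equals a given value.
# Defaults follow the experiment above: app=nginx and SysctlForbidden.
# Assumes the default `kubectl get pods` layout (STATUS is column 3).
count_pods_by_status() {
  selector="${1:-app=nginx}"
  status="${2:-SysctlForbidden}"
  kubectl get pods -l "$selector" --no-headers 2>/dev/null |
    awk -v s="$status" '$3 == s { n++ } END { print n + 0 }'
}
```

Running `count_pods_by_status` a few times, some seconds apart, should show the number climbing for as long as the Deployment keeps its replica count above zero.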
Remediation
kubectl scale deployment --replicas=0 nginx
kubectl delete pods -l app=nginx
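The two commands above can be wrapped into one step that also narrows the deletion to rejected pods. This is only a sketch: the function name is hypothetical, and it assumes the rejected pods are in the Failed phase, which is what a kubelet admission rejection such as SysctlForbidden normally produces.

```shell
# Stop the runaway loop, then remove the rejected pods.
# Assumption: pods rejected with SysctlForbidden are in phase Failed.
cleanup_rejected_pods() {
  deployment="${1:-nginx}"
  selector="${2:-app=nginx}"
  # Scale down first so the ReplicaSet stops creating replacements.
  kubectl scale deployment "$deployment" --replicas=0 || return 1
  # Delete only the Failed pods that carry the given label.
  kubectl delete pods -l "$selector" --field-selector=status.phase=Failed
}
```

Scaling down before deleting matters: if the pods are deleted while the replica count is still above zero, the controller will simply create more of them.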
Summary
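Before setting kernel parameters on a pod, create a temporary pod to verify that the settings are accepted, and only then modify the Deployment. This avoids creating large numbers of invalid pods.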
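One way to do that verification is with a bare Pod, since no controller will recreate it if the kubelet rejects it. The sketch below only prints a manifest; the pod name sysctl-test and the function name are hypothetical, and the defaults mirror the sysctl used in the experiment.

```shell
# Print a manifest for a throwaway bare Pod that requests one sysctl.
# The pod name "sysctl-test" is a hypothetical choice for illustration.
sysctl_test_pod() {
  name="${1:-net.core.somaxconn}"
  value="${2:-1024}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-test
spec:
  restartPolicy: Never
  securityContext:
    sysctls:
    - name: ${name}
      value: "${value}"
  containers:
  - name: nginx
    image: nginx
EOF
}
```

A possible workflow is `sysctl_test_pod | kubectl apply -f -`, then `kubectl get pod sysctl-test` to see whether it runs or shows SysctlForbidden, and finally `kubectl delete pod sysctl-test`. If the test pod is rejected, at most one failed pod is left behind rather than a growing pile.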