Published: 2022-09-23 17:30
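To read the Kubernetes source effectively, you first need to be able to debug it, so that you can step through the code and trace its logic. This post starts with how to set up a debugging environment for the Kubernetes source.
Kubernetes is written in Go, so a Go toolchain is required. The Kubernetes-to-Go version mapping is as follows: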
Kubernetes     Requires Go
1.0 - 1.2      1.4.2
1.3, 1.4       1.6
1.5, 1.6       1.7 - 1.7.5
1.7            1.8.1
1.8            1.8.3
1.9            1.9.1
1.10           1.9.1
1.11           1.10.2
1.12           1.10.4
1.13           1.11.13
1.14 - 1.16    1.12.9
1.17 - 1.18    1.13.15
1.19 - 1.20    1.15.5
1.21 - 1.22    1.16.7
1.23+          1.17
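The table is easy to misread when switching between branches, so here is a minimal sketch of the same lookup as a shell function. The function name required_go_for_k8s is hypothetical, and the values are simply copied from the table above.

```shell
# Print the Go version listed in the table above for a Kubernetes 1.x version.
# required_go_for_k8s is a hypothetical helper name, not part of Kubernetes.
required_go_for_k8s() {
    minor=${1#1.}          # "1.22.3" -> "22.3"
    minor=${minor%%.*}     # "22.3"   -> "22"
    case "$minor" in
        ''|*[!0-9]*)
            echo "expected a version like 1.22" >&2
            return 1
            ;;
    esac
    case "$minor" in
        0|1|2)    echo "1.4.2" ;;
        3|4)      echo "1.6" ;;
        5|6)      echo "1.7 - 1.7.5" ;;
        7)        echo "1.8.1" ;;
        8)        echo "1.8.3" ;;
        9|10)     echo "1.9.1" ;;
        11)       echo "1.10.2" ;;
        12)       echo "1.10.4" ;;
        13)       echo "1.11.13" ;;
        14|15|16) echo "1.12.9" ;;
        17|18)    echo "1.13.15" ;;
        19|20)    echo "1.15.5" ;;
        21|22)    echo "1.16.7" ;;
        *)        echo "1.17" ;;   # the "1.23+" row
    esac
}
```

For example, required_go_for_k8s 1.22 prints the Go version from the "1.21 - 1.22" row.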
Guides for installing Go are easy to find online, so the steps are not repeated here.
After installation, run go version
to check which version is installed.
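To compare the installed version with a required one, version numbers have to be compared numerically, component by component (as plain strings, 1.9.1 would sort after 1.10.2). A minimal sketch, assuming the usual "go version goX.Y.Z os/arch" output format; both function names are hypothetical:

```shell
# Extract "1.17.8" from a line such as "go version go1.17.8 linux/amd64".
parse_go_version() {
    printf '%s\n' "$1" | sed -n 's/^go version go\([0-9][0-9.]*\).*/\1/p'
}

# Succeed when version $1 is greater than or equal to version $2,
# comparing each dot-separated component as a number.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Usage (requires Go on PATH):
#   installed=$(parse_go_version "$(go version)")
#   version_ge "$installed" 1.17 || echo "Go $installed is too old" >&2
```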
Debugging relies on Delve, which is installed as follows:
go install github.com/go-delve/delve/cmd/dlv@latest
ln -s $GOPATH/bin/dlv $GOROOT/bin/dlv
If $GOPATH or $GOROOT is not set, run go env
to look up the values, then substitute them for the environment variables in the command.
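That fallback can be scripted. The sketch below uses the environment variable when it is set and asks go env otherwise. The function name go_dir is hypothetical, and the sketch assumes GOPATH holds a single directory rather than a list.

```shell
# Print GOPATH or GOROOT: prefer the environment variable, fall back to `go env`.
# go_dir is a hypothetical helper name.
go_dir() {
    case "$1" in
        GOPATH|GOROOT) ;;
        *) echo "unsupported setting: $1" >&2; return 1 ;;
    esac
    eval "value=\${$1:-}"          # read the variable whose name is in $1
    if [ -z "$value" ]; then
        value=$(go env "$1")       # not set in the environment, ask the toolchain
    fi
    printf '%s\n' "$value"
}

# Usage (requires Go, with dlv already installed):
#   ln -s "$(go_dir GOPATH)/bin/dlv" "$(go_dir GOROOT)/bin/dlv"
```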
After installation, run dlv version
to check which version is installed.
First fetch the Kubernetes source. You may want to switch branches so that the Kubernetes version matches the Go version you installed:
git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
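The upstream repository keeps each minor release on a branch named release-X.Y, so switching branches comes down to building that name. A small sketch, with release_branch as a hypothetical helper name:

```shell
# Print the release branch name for a Kubernetes version, e.g. 1.23 -> release-1.23.
# release_branch is a hypothetical helper name.
release_branch() {
    case "$1" in
        [0-9]*.[0-9]*)
            # Keep only major.minor, dropping any patch component.
            printf 'release-%s\n' "$(printf '%s' "$1" | cut -d. -f1,2)"
            ;;
        *)
            echo "expected a version like 1.23" >&2
            return 1
            ;;
    esac
}

# Usage, inside the cloned repository:
#   git checkout "$(release_branch 1.23)"
```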
Next, install etcd. The Kubernetes source tree ships a script for this:
./hack/install-etcd.sh
Log output similar to the following indicates a successful installation:
Downloading https://github.com/coreos/etcd/releases/download/v3.5.1/etcd-v3.5.1-linux-amd64.tar.gz succeed
etcd v3.5.1 installed. To use:
export PATH="/tmp/kubernetes/third_party/etcd:${PATH}"
Follow the instruction printed in the log:
export PATH="/tmp/kubernetes/third_party/etcd:${PATH}"
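Running that export repeatedly in the same shell keeps growing PATH. If that bothers you, a small guard can skip directories that are already present. This is only a sketch, and prepend_path is a hypothetical helper name:

```shell
# Prepend a directory to PATH unless it is already listed.
# prepend_path is a hypothetical helper name.
prepend_path() {
    case ":${PATH}:" in
        *":$1:"*) ;;                  # already present, leave PATH unchanged
        *) PATH="$1:${PATH}" ;;
    esac
    export PATH
}

# Usage with the directory printed by install-etcd.sh:
#   prepend_path /tmp/kubernetes/third_party/etcd
#   command -v etcd    # prints the etcd path when it can be found
```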
At this point, running ./hack/local-up-cluster.sh
would start a local Kubernetes cluster, but the binaries it builds carry no debug information, so the script needs a small change first:
vim hack/local-up-cluster.sh
# Find code similar to the following
if [ "x${GO_OUT}" == "x" ]; then
# make -C "${KUBE_ROOT}" WHAT="cmd/kubectl cmd/kube-apiserver cmd/kube-controller-manager cmd/cloud-controller-manager cmd/kubelet cmd/kube-proxy cmd/kube-scheduler"
# Comment out the line above and add DBG=1 to the make invocation below. The Makefile describes this parameter as follows:
# Note: Specify DBG=1 for building unstripped binaries, which allows you to use code debugging
# tools like delve. When DBG is unspecified, it defaults to "-s -w" which strips debug
# information
make DBG=1 -C "${KUBE_ROOT}" WHAT="cmd/kubectl cmd/kube-apiserver cmd/kube-controller-manager cmd/cloud-controller-manager cmd/kubelet cmd/kube-proxy cmd/kube-scheduler"
else
echo "skipped the build."
fi
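If you would rather not edit the file by hand, the same change can be sketched with sed. This assumes the build line starts with "make -C" after its indentation, as in the excerpt above; lines that are already commented out or already carry DBG=1 are left alone. The function name add_dbg_flag is hypothetical.

```shell
# Print a copy of the script with DBG=1 added to the active `make -C` line.
# add_dbg_flag is a hypothetical helper name.
add_dbg_flag() {
    sed 's/^\([[:space:]]*\)make -C /\1make DBG=1 -C /' "$1"
}

# Usage from the repository root (keeps a backup of the original script):
#   cp hack/local-up-cluster.sh hack/local-up-cluster.sh.orig
#   add_dbg_flag hack/local-up-cluster.sh.orig > hack/local-up-cluster.sh
```

Writing back through a redirect into the existing file keeps its executable bit, and the script stays in hack/ so the paths it derives from its own location still resolve.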
After making the change, run the script:
./hack/local-up-cluster.sh
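If startup fails at this point, check whether your network can reach the external sites the script depends on (a proxy may be needed). The following output indicates that the cluster is up: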
To start using your cluster, you can open up another terminal/tab and run:
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
cluster/kubectl.sh
Alternatively, you can write to the default kubeconfig:
export KUBERNETES_PROVIDER=local
cluster/kubectl.sh config set-cluster local --server=https://localhost:6443 --certificate-authority=/var/run/kubernetes/server-ca.crt
cluster/kubectl.sh config set-credentials myself --client-key=/var/run/kubernetes/client-admin.key --client-certificate=/var/run/kubernetes/client-admin.crt
cluster/kubectl.sh config set-context local --cluster=local --user=myself
cluster/kubectl.sh config use-context local
cluster/kubectl.sh
After exporting the environment variable as instructed, you can use ./cluster/kubectl.sh
to access the Kubernetes cluster:
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
# List pods
./cluster/kubectl.sh get pods
Taking kube-apiserver
as the component to debug, first look up the command line that the kube-apiserver
process was started with:
ps -ef | grep apiserver
# Output similar to the following appears
root 15649 13594 9 19:59 pts/3 00:00:17 /tmp/kubernetes/_output/local/bin/linux/amd64/kube-apiserver --authorization-mode=Node,RBAC --cloud-provider= --cloud-config= --v=3 --vmodule= --audit-policy-file=/tmp/kube-audit-policy-file --audit-log-path=/tmp/kube-apiserver-audit.log --authorization-webhook-config-file= --authentication-token-webhook-config-file= --cert-dir=/var/run/kubernetes --egress-selector-config-file=/tmp/kube_egress_selector_configuration.yaml --client-ca-file=/var/run/kubernetes/client-ca.crt --kubelet-client-certificate=/var/run/kubernetes/client-kube-apiserver.crt --kubelet-client-key=/var/run/kubernetes/client-kube-apiserver.key --service-account-key-file=/tmp/kube-serviceaccount.key --service-account-lookup=true --service-account-issuer=https://kubernetes.default.svc --service-account-jwks-uri=https://kubernetes.default.svc/openid/v1/jwks --service-account-signing-key-file=/tmp/kube-serviceaccount.key --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,Priority,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction --disable-admission-plugins= --admission-control-config-file= --bind-address=0.0.0.0 --secure-port=6443 --tls-cert-file=/var/run/kubernetes/serving-kube-apiserver.crt --tls-private-key-file=/var/run/kubernetes/serving-kube-apiserver.key --storage-backend=etcd3 --storage-media-type=application/vnd.kubernetes.protobuf --etcd-servers=http://127.0.0.1:2379 --service-cluster-ip-range=10.0.0.0/24 --feature-gates=AllAlpha=false --external-hostname=localhost --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-client-ca-file=/var/run/kubernetes/request-header-ca.crt --requestheader-allowed-names=system:auth-proxy --proxy-client-cert-file=/var/run/kubernetes/client-auth-proxy.crt --proxy-client-key-file=/var/run/kubernetes/client-auth-proxy.key --cors-allowed-origins=/127.0.0.1(:[0-9]+)?$,/localhost(:[0-9]+)?$
Debugging the apiserver is now straightforward: kill
the running apiserver process, then start it again under Delve:
kill -9 15649
dlv --listen=:2345 --headless=true --api-version=2 exec /tmp/kubernetes/_output/local/bin/linux/amd64/kube-apiserver -- --authorization-mode=Node,RBAC --cloud-provider= --cloud-config= --v=3 --vmodule= --audit-policy-file=/tmp/kube-audit-policy-file --audit-log-path=/tmp/kube-apiserver-audit.log --authorization-webhook-config-file= --authentication-token-webhook-config-file= --cert-dir=/var/run/kubernetes --egress-selector-config-file=/tmp/kube_egress_selector_configuration.yaml --client-ca-file=/var/run/kubernetes/client-ca.crt --kubelet-client-certificate=/var/run/kubernetes/client-kube-apiserver.crt --kubelet-client-key=/var/run/kubernetes/client-kube-apiserver.key --service-account-key-file=/tmp/kube-serviceaccount.key --service-account-lookup=true --service-account-issuer=https://kubernetes.default.svc --service-account-jwks-uri=https://kubernetes.default.svc/openid/v1/jwks --service-account-signing-key-file=/tmp/kube-serviceaccount.key --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,Priority,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction --disable-admission-plugins= --admission-control-config-file= --bind-address=0.0.0.0 --secure-port=6443 --tls-cert-file=/var/run/kubernetes/serving-kube-apiserver.crt --tls-private-key-file=/var/run/kubernetes/serving-kube-apiserver.key --storage-backend=etcd3 --storage-media-type=application/vnd.kubernetes.protobuf --etcd-servers=http://127.0.0.1:2379 --service-cluster-ip-range=10.0.0.0/24 --feature-gates=AllAlpha=false --external-hostname=localhost --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-client-ca-file=/var/run/kubernetes/request-header-ca.crt --requestheader-allowed-names=system:auth-proxy --proxy-client-cert-file=/var/run/kubernetes/client-auth-proxy.crt --proxy-client-key-file=/var/run/kubernetes/client-auth-proxy.key --cors-allowed-origins="/127.0.0.1(:[0-9]+)?$,/localhost(:[0-9]+)?$"
# The following output indicates a successful start
API server listening at: [::]:2345
2022-04-06T20:05:19+08:00 warning layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
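Copying that many flags from the ps output by hand is error-prone, so the rewrite can be sketched as a small function: take one line of ps -ef output, keep the binary path, insert the "--" separator, and single-quote every flag so that values such as the --cors-allowed-origins pattern survive the shell. The function name is hypothetical. The sketch assumes the usual ps -ef column layout (UID PID PPID C STIME TTY TIME CMD) and that no argument contains whitespace or a single quote. The dlv flags are the same ones used above.

```shell
# Turn one line of `ps -ef` output into a headless dlv command line.
# dlv_command_from_ps_line is a hypothetical helper name.
dlv_command_from_ps_line() {
    printf '%s\n' "$1" | awk '{
        # Field 8 is the binary; fields 9..NF are its arguments.
        out = "dlv --listen=:2345 --headless=true --api-version=2 exec " $8 " --"
        for (i = 9; i <= NF; i++) out = out " \047" $i "\047"   # \047 is a single quote
        print out
    }'
}

# Usage: print the command for the running apiserver, review it, then run it.
#   dlv_command_from_ps_line "$(ps -ef | grep '[k]ube-apiserver')"
```

The function only prints the command rather than executing it, which leaves room to check the result before killing the original process.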
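The remaining step is remote debugging from an IDE; I use GoLand here. First set up a Go remote debug configuration, filling in the address of the remote or local host and the port that dlv
is listening on from the step above: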