Problems caused by a worker node missing the k8s.gcr.io/pause:3.3 image
Source: 8-2 CI/CD in practice (1)

会飞的小白菜
2022-09-06
Today I suddenly noticed that a worker node was unhealthy:
[root@node-1-bak mycluster]# kubectl get nodes
NAME         STATUS     ROLES    AGE   VERSION
node-1-bak   Ready      master   24h   v1.19.7
node-2-bak   Ready      master   24h   v1.19.7
node-3-bak   NotReady   <none>   24h   v1.19.7
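The node description and the status of every pod all looked normal (running). I then went straight to node-3-bak to check kubelet. The kubelet service itself was running, and its log contained the following: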
Sep 06 01:21:18 node-3-bak kubelet[10631]: E0906 01:21:18.048708 10631 kubelet.go:2186] node "node-3-bak" not found
Sep 06 01:21:18 node-3-bak kubelet[10631]: I0906 01:21:18.102505 10631 csi_plugin.go:994] Failed to contact API server when waiting for CSINode publishing: Get "https://l...tion refused
Sep 06 01:38:47 node-3-bak kubelet[10631]: E0906 01:38:47.081227 10631 kubelet.go:2186] node "node-3-bak" not found
Sep 06 01:38:47 node-3-bak kubelet[10631]: I0906 01:38:47.101517 10631 csi_plugin.go:994] Failed to contact API server when waiting for CSINode publishing: Get "https://localhost:6443/apis/storage.k8s.io/v1/csinodes/node-3-bak": dial tcp 127.0.0.1:6443: connect: connection refused
..........
..........
Sep 06 01:38:48 node-3-bak kubelet[10631]: E0906 01:38:48.503435 10631 remote_runtime.go:113] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to get sandbox image "k8s.gcr.io/pause:3.3": failed to pull image "k8s.gcr.io/pause:3.3": failed to pull and unpack image "k8s.gcr.io/pause:3.3": failed to resolve reference "k8s.gcr.io/pause:3.3": failed to do request: Head "https://k8s.gcr.io/v2/pause/manifests/3.3": dial tcp 74.125.204.82:443: i/o timeout
Sep 06 01:38:48 node-3-bak kubelet[10631]: E0906 01:38:48.503495 10631 kuberuntime_sandbox.go:69] CreatePodSandbox for pod "nginx-proxy-node-3-bak_kube-system(4068a5c786f6c77faa6538742baa1d44)" failed: rpc error: code = Unknown desc = failed to get sandbox image "k8s.gcr.io/pause:3.3": failed to pull image "k8s.gcr.io/pause:3.3": failed to pull and unpack image "k8s.gcr.io/pause:3.3": failed to resolve reference "k8s.gcr.io/pause:3.3": failed to do request: Head "https://k8s.gcr.io/v2/pause/manifests/3.3": dial tcp 74.125.204.82:443: i/o timeout
Sep 06 01:38:48 node-3-bak kubelet[10631]: E0906 01:38:48.503519 10631 kuberuntime_manager.go:741] createPodSandbox for pod "nginx-proxy-node-3-bak_kube-system(4068a5c786f6c77faa6538742baa1d44)" failed: rpc error: code = Unknown desc = failed to get sandbox image "k8s.gcr.io/pause:3.3": failed to pull image "k8s.gcr.io/pause:3.3": failed to pull and unpack image "k8s.gcr.io/pause:3.3": failed to resolve reference "k8s.gcr.io/pause:3.3": failed to do request: Head "https://k8s.gcr.io/v2/pause/manifests/3.3": dial tcp 74.125.204.82:443: i/o timeout
Sep 06 01:38:48 node-3-bak kubelet[10631]: E0906 01:38:48.503591 10631 pod_workers.go:191] Error syncing pod 4068a5c786f6c77faa6538742baa1d44 ("nginx-proxy-node-3-bak_kube-system(4068a5c786f6c77faa6538742baa1d44)"), skipping: failed to "CreatePodSandbox" for "nginx-proxy-node-3-bak_kube-system(4068a5c786f6c77faa6538742baa1d44)" with CreatePodSandboxError: "CreatePodSandbox for pod \"nginx-proxy-node-3-bak_kube-system(4068a5c786f6c77faa6538742baa1d44)\" failed: rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.3\": failed to pull image \"k8s.gcr.io/pause:3.3\": failed to pull and unpack image \"k8s.gcr.io/pause:3.3\": failed to resolve reference \"k8s.gcr.io/pause:3.3\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.3\": dial tcp 74.125.204.82:443: i/o timeout
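It turned out that the k8s.gcr.io/pause:3.3 image had been deleted from the node. After I pulled it again, the problem was resolved. At first, though, I was misled for a long time by this error: Failed to contact API server when waiting for CSINode publishing: Get "https://localhost:6443/apis/storage.k8s.io/v1/csinodes/node-3-bak": dial tcp 127.0.0.1:6443: connect: connection refused. Port 6443 is where kube-apiserver listens on the master nodes. This machine is a worker node and never ran that service.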
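My master and worker node configuration is all correct, and node-3-bak is only a worker node. Why could pause:3.3 not be pulled, and why does the node send requests to 127.0.0.1:6443 instead of to a master node?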
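Since the connection-refused lines drowned out the real cause, a small log filter might help surface it faster. The following is only a sketch: `failed_sandbox_images` is a hypothetical helper name, and the usage lines assume a systemd-managed kubelet and a containerd node with `crictl` installed, none of which is confirmed in the post beyond what the log format suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Kubernetes tooling): read kubelet log
# lines on stdin and print each distinct image named in a
# "failed to get sandbox image" error. The optional backslash handles the
# escaped quotes that appear in the pod_workers.go line.
failed_sandbox_images() {
  sed -nE 's/.*failed to get sandbox image \\?"([^"\\]+).*/\1/p' | sort -u
}

# Usage on the affected node (assumption: systemd + containerd + crictl):
#   journalctl -u kubelet --no-pager | failed_sandbox_images
#   crictl images | grep pause          # is the sandbox image present locally?
#   crictl pull k8s.gcr.io/pause:3.3    # pull it again if it is missing
```

The filter only reports which sandbox image kubelet could not obtain. Whether the pull then succeeds still depends on the node being able to reach the registry.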
1 answer
刘果国
2022-09-07
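The local port 6443 is a proxy for the apiserver. It is part of the high-availability setup and is started as a static pod. Think back to the course content.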
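The two errors in the question appear to be linked: the pod named in the sandbox failures, nginx-proxy-node-3-bak, looks like this local proxy, and a pod cannot start without the pause sandbox image, so nothing would be listening on 127.0.0.1:6443. A minimal sketch for confirming which pods failed at the sandbox stage follows. `failed_sandbox_pods` is a hypothetical helper, and the manifest directory in the usage comments is the common default, assumed here rather than taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kubelet log lines on stdin and print
# "namespace/pod" for every pod whose sandbox creation failed.
# Kubelet logs the pod as "<pod>_<namespace>(<uid>)".
failed_sandbox_pods() {
  sed -nE 's#.*[Cc]reatePodSandbox for pod \\?"([^"_\\]+)_([^"(\\]+)\(.*#\2/\1#p' | sort -u
}

# Usage on the worker node (assumptions: systemd, crictl, default manifest path):
#   journalctl -u kubelet --no-pager | failed_sandbox_pods
#   ls /etc/kubernetes/manifests/       # static pod manifests, if the default path is used
#   crictl pods --name nginx-proxy      # is the proxy pod sandbox running?
#   ss -lnt | grep 6443                 # is anything listening on the local port?
```

If the proxy pod shows up in the output and nothing listens on port 6443, the connection-refused messages are a symptom of the missing image rather than a separate misconfiguration.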