[Resolved] calico-node stuck in Init:0/3 for 13 hours, coredns Pending for 11 days

Source: 5-7 Network Plugin - Calico_1

慕移动7138520

2021-10-18

Hello, teacher. Eleven days ago node3 kept having problems, but since node1 and node2 were fine, after asking you about it in this question https://coding.imooc.com/learn/questiondetail/V21046QJG3RPmxQw.html [resolved], I went on with lessons 5-7 and 5-8 and successfully installed Calico and CoreDNS with node2 in the cluster.
Last night, after fixing node3 with the approach you suggested, kubectl get nodes shows that node3 has been picked up again:

[root@k8s-node1 temp]# kubectl get nodes
NAME        STATUS     ROLES    AGE   VERSION
k8s-node2   Ready      <none>   11d   v1.20.2
k8s-node3   NotReady   <none>   13h   v1.20.2
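
(Side note: NotReady at this point looks expected rather than a separate problem; the kubelet keeps a node NotReady until a CNI plugin is installed and configured, which is exactly what the stuck calico-node pod is supposed to do. If needed, the node's exact Ready condition message can be read with a standard command like the one below; the jsonpath expression is just one way to pull it out.)

[root@k8s-node1 temp]# kubectl get node k8s-node3 -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'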

Then kubectl get pods showed that the calico and coredns pods related to node3 had appeared automatically, so I simply kept waiting (see the note after the listing):

[root@k8s-node1 temp]# kubectl get pods -n kube-system -o wide
NAME                                       READY   STATUS              RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
calico-kube-controllers-659bd7879c-4b7wj   1/1     Running             0          11d   10.200.169.129   k8s-node2   <none>           <none>
calico-node-2s45s                          1/1     Running             0          11d   192.168.56.102   k8s-node2   <none>           <none>
calico-node-w5q9w                          0/1     Init:0/3            0          13h   192.168.56.103   k8s-node3   <none>           <none>
coredns-84646c885d-8xmrx                   0/1     Pending             0          11d   <none>           <none>      <none>           <none>
coredns-84646c885d-zmrd2                   1/1     Running             0          11d   10.200.169.130   k8s-node2   <none>           <none>
nginx-proxy-k8s-node3                      1/1     Running             0          13h   192.168.56.103   k8s-node3   <none>           <none>
nodelocaldns-bf7s5                         0/1     ContainerCreating   0          13h   192.168.56.103   k8s-node3   <none>           <none>
nodelocaldns-nr4q2                         1/1     Running             0          11d   192.168.56.102   k8s-node2   <none>           <none>
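
(A note on why these appeared "automatically": calico-node and nodelocaldns are managed by DaemonSets, which create one pod on every node that joins, and nginx-proxy-k8s-node3 is a static pod from a manifest file on the node itself, as its Controlled By and config.source fields below show. coredns, by contrast, is a Deployment, so the Pending coredns-84646c885d-8xmrx replica is not tied to node3 in particular. The controllers can be listed with:)

[root@k8s-node1 temp]# kubectl get daemonset,deployment -n kube-system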

However, more than 12 hours have now passed and calico is still stuck in Init:0/3 (none of its three init containers has completed), while one coredns replica has been Pending for 11 days.

Then, reading questions from other students, I found that kubectl describe pods -n kube-system can be used to inspect pod status:

[root@k8s-node1 temp]# kubectl describe pods -n kube-system
Name:                 calico-kube-controllers-659bd7879c-4b7wj
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-node2/192.168.56.102
Start Time:           Wed, 06 Oct 2021 22:50:44 +0800
Labels:               k8s-app=calico-kube-controllers
                      pod-template-hash=659bd7879c
Annotations:          cni.projectcalico.org/containerID: 0314f9db3a1949c8a058fe1522310d3a9a1694f3935b9e15d3c1b3096e6a0041
                      cni.projectcalico.org/podIP: 10.200.169.129/32
                      cni.projectcalico.org/podIPs: 10.200.169.129/32
Status:               Running
IP:                   10.200.169.129
IPs:
  IP:           10.200.169.129
Controlled By:  ReplicaSet/calico-kube-controllers-659bd7879c
Containers:
  calico-kube-controllers:
    Container ID:   containerd://120129bf292be2f8be99aff3c0098b0d2b6076f49d2182957b45fc4398a94c01
    Image:          docker.io/calico/kube-controllers:v3.20.2
    Image ID:       docker.io/calico/kube-controllers@sha256:985620c7a0cf75ebf7b23960dda7594fe40cf00bc2ef59c935c581b1b4fc4c63
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 06 Oct 2021 23:13:06 +0800
    Ready:          True
    Restart Count:  0
    Liveness:       exec [/usr/bin/check-status -l] delay=10s timeout=10s period=10s #success=1 #failure=6
    Readiness:      exec [/usr/bin/check-status -r] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ENABLED_CONTROLLERS:  node
      DATASTORE_TYPE:       kubernetes
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from calico-kube-controllers-token-bmnqh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  calico-kube-controllers-token-bmnqh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-kube-controllers-token-bmnqh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly op=Exists
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>


Name:                 calico-node-2s45s
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8s-node2/192.168.56.102
Start Time:           Wed, 06 Oct 2021 22:44:57 +0800
Labels:               controller-revision-hash=66b976f488
                      k8s-app=calico-node
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   192.168.56.102
IPs:
  IP:           192.168.56.102
Controlled By:  DaemonSet/calico-node
Init Containers:
  upgrade-ipam:
    Container ID:  containerd://33390191b5c986a94a49064d779b74dc39e44b511a0e07b8fcfa0df5780eea9d
    Image:         docker.io/calico/cni:v3.20.2
    Image ID:      docker.io/calico/cni@sha256:3dae26de7388ff3c124b9abd7e8d12863887391f23549c4aebfbfa20cc0700a5
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/calico-ipam
      -upgrade
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 06 Oct 2021 22:50:30 +0800
      Finished:     Wed, 06 Oct 2021 22:50:30 +0800
    Ready:          True
    Restart Count:  0
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      KUBERNETES_NODE_NAME:        (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:  <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
    Mounts:
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/lib/cni/networks from host-local-net-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
  install-cni:
    Container ID:  containerd://1b709ccb0342fbe6dd3dad0bd9f5907f55cdebeae1100c4230aed3a687836c86
    Image:         docker.io/calico/cni:v3.20.2
    Image ID:      docker.io/calico/cni@sha256:3dae26de7388ff3c124b9abd7e8d12863887391f23549c4aebfbfa20cc0700a5
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/install
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 06 Oct 2021 22:50:31 +0800
      Finished:     Wed, 06 Oct 2021 22:50:31 +0800
    Ready:          True
    Restart Count:  0
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      CNI_CONF_NAME:         10-calico.conflist
      CNI_NETWORK_CONFIG:    <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
      KUBERNETES_NODE_NAME:   (v1:spec.nodeName)
      CNI_MTU:               <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      SLEEP:                 false
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
  flexvol-driver:
    Container ID:   containerd://4ddd39c0a52d8861734d7d35231c3f8cece15d144f6c5c037602aee3e00833d1
    Image:          docker.io/calico/pod2daemon-flexvol:v3.20.2
    Image ID:       docker.io/calico/pod2daemon-flexvol@sha256:4514cc1b2f7536fe1d514fad8a2c46103382243bb9db5ea2d0063fcd2001e8f7
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 06 Oct 2021 22:51:13 +0800
      Finished:     Wed, 06 Oct 2021 22:51:13 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /host/driver from flexvol-driver-host (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
Containers:
  calico-node:
    Container ID:   containerd://4120e37dcf4a7eedcd6d9a4b719c870334f5ad50962bc268d4971d1be34f5072
    Image:          docker.io/calico/node:v3.20.2
    Image ID:       docker.io/calico/node@sha256:1ae8f57edec7c3a84cd48dd94132a1dcfd7c61bfff819c559f9379d4a56fe3b1
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 06 Oct 2021 23:10:57 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      250m
    Liveness:   exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=10s period=10s #success=1 #failure=6
    Readiness:  exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=10s period=10s #success=1 #failure=3
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      DATASTORE_TYPE:                     kubernetes
      WAIT_FOR_DATASTORE:                 true
      NODENAME:                            (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
      CLUSTER_TYPE:                       k8s,bgp
      IP:                                  (v1:status.hostIP)
      CALICO_IPV4POOL_IPIP:               Always
      CALICO_IPV4POOL_VXLAN:              Never
      FELIX_IPINIPMTU:                    <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_VXLANMTU:                     <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_WIREGUARDMTU:                 <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      CALICO_IPV4POOL_CIDR:               10.200.0.0/16
      CALICO_DISABLE_FILE_LOGGING:        true
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      FELIX_IPV6SUPPORT:                  false
      FELIX_HEALTHENABLED:                true
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /sys/fs/ from sysfs (rw)
      /var/lib/calico from var-lib-calico (rw)
      /var/log/calico/cni from cni-log-dir (ro)
      /var/run/calico from var-run-calico (rw)
      /var/run/nodeagent from policysync (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:
  var-lib-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/calico
    HostPathType:
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  sysfs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/
    HostPathType:  DirectoryOrCreate
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  cni-log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/calico/cni
    HostPathType:
  host-local-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/networks
    HostPathType:
  policysync:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/nodeagent
    HostPathType:  DirectoryOrCreate
  flexvol-driver-host:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
    HostPathType:  DirectoryOrCreate
  calico-node-token-jnmzr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-node-token-jnmzr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     :NoSchedule op=Exists
                 :NoExecute op=Exists
                 CriticalAddonsOnly op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>


Name:                 calico-node-w5q9w
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8s-node3/192.168.56.103
Start Time:           Sun, 17 Oct 2021 22:06:20 +0800
Labels:               controller-revision-hash=66b976f488
                      k8s-app=calico-node
                      pod-template-generation=1
Annotations:          <none>
Status:               Pending
IP:                   192.168.56.103
IPs:
  IP:           192.168.56.103
Controlled By:  DaemonSet/calico-node
Init Containers:
  upgrade-ipam:
    Container ID:
    Image:         docker.io/calico/cni:v3.20.2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/calico-ipam
      -upgrade
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      KUBERNETES_NODE_NAME:        (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:  <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
    Mounts:
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/lib/cni/networks from host-local-net-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
  install-cni:
    Container ID:
    Image:         docker.io/calico/cni:v3.20.2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/install
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      CNI_CONF_NAME:         10-calico.conflist
      CNI_NETWORK_CONFIG:    <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
      KUBERNETES_NODE_NAME:   (v1:spec.nodeName)
      CNI_MTU:               <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      SLEEP:                 false
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
  flexvol-driver:
    Container ID:
    Image:          docker.io/calico/pod2daemon-flexvol:v3.20.2
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /host/driver from flexvol-driver-host (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
Containers:
  calico-node:
    Container ID:
    Image:          docker.io/calico/node:v3.20.2
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      250m
    Liveness:   exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=10s period=10s #success=1 #failure=6
    Readiness:  exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=10s period=10s #success=1 #failure=3
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      DATASTORE_TYPE:                     kubernetes
      WAIT_FOR_DATASTORE:                 true
      NODENAME:                            (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
      CLUSTER_TYPE:                       k8s,bgp
      IP:                                  (v1:status.hostIP)
      CALICO_IPV4POOL_IPIP:               Always
      CALICO_IPV4POOL_VXLAN:              Never
      FELIX_IPINIPMTU:                    <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_VXLANMTU:                     <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_WIREGUARDMTU:                 <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      CALICO_IPV4POOL_CIDR:               10.200.0.0/16
      CALICO_DISABLE_FILE_LOGGING:        true
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      FELIX_IPV6SUPPORT:                  false
      FELIX_HEALTHENABLED:                true
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /sys/fs/ from sysfs (rw)
      /var/lib/calico from var-lib-calico (rw)
      /var/log/calico/cni from cni-log-dir (ro)
      /var/run/calico from var-run-calico (rw)
      /var/run/nodeagent from policysync (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-jnmzr (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:
  var-lib-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/calico
    HostPathType:
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  sysfs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/
    HostPathType:  DirectoryOrCreate
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  cni-log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/calico/cni
    HostPathType:
  host-local-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/networks
    HostPathType:
  policysync:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/nodeagent
    HostPathType:  DirectoryOrCreate
  flexvol-driver-host:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
    HostPathType:  DirectoryOrCreate
  calico-node-token-jnmzr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-node-token-jnmzr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     :NoSchedule op=Exists
                 :NoExecute op=Exists
                 CriticalAddonsOnly op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>


Name:                 coredns-84646c885d-8xmrx
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               k8s-app=kube-dns
                      pod-template-hash=84646c885d
Annotations:          seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/coredns-84646c885d
Containers:
  coredns:
    Image:       docker.io/coredns/coredns:1.6.7
    Ports:       53/UDP, 53/TCP, 9153/TCP
    Host Ports:  0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://:8181/ready delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-6qxzr (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-6qxzr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-6qxzr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  13h   default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules, 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.


Name:                 coredns-84646c885d-zmrd2
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-node2/192.168.56.102
Start Time:           Wed, 06 Oct 2021 23:21:03 +0800
Labels:               k8s-app=kube-dns
                      pod-template-hash=84646c885d
Annotations:          cni.projectcalico.org/containerID: e1b3174d8e3e62e8a749750086c54c1fc4f70c2d9d193fbeabfb0bf040ef3a8b
                      cni.projectcalico.org/podIP: 10.200.169.130/32
                      cni.projectcalico.org/podIPs: 10.200.169.130/32
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:               Running
IP:                   10.200.169.130
IPs:
  IP:           10.200.169.130
Controlled By:  ReplicaSet/coredns-84646c885d
Containers:
  coredns:
    Container ID:  containerd://9612bb05614e1370f4f7f93c19170886fb1221535a29fd6a2d3319b608180fdb
    Image:         docker.io/coredns/coredns:1.6.7
    Image ID:      docker.io/coredns/coredns@sha256:2c8d61c46f484d881db43b34d13ca47a269336e576c81cf007ca740fa9ec0800
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Running
      Started:      Wed, 06 Oct 2021 23:22:21 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://:8181/ready delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-6qxzr (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-6qxzr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-6qxzr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>


Name:                 nginx-proxy-k8s-node3
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8s-node3/192.168.56.103
Start Time:           Sun, 17 Oct 2021 22:06:17 +0800
Labels:               addonmanager.kubernetes.io/mode=Reconcile
                      k8s-app=kube-nginx
Annotations:          kubernetes.io/config.hash: 7cf27d45a874e9a25860392cd83c3e6a
                      kubernetes.io/config.mirror: 7cf27d45a874e9a25860392cd83c3e6a
                      kubernetes.io/config.seen: 2021-10-17T22:06:10.801249819+08:00
                      kubernetes.io/config.source: file
Status:               Running
IP:                   192.168.56.103
IPs:
  IP:           192.168.56.103
Controlled By:  Node/k8s-node3
Containers:
  nginx-proxy:
    Container ID:   containerd://90d5fda83f1466a706e580a6bab4da4cd8cf92b1dfd8fc7b5a7af40a94d5b730
    Image:          docker.io/library/nginx:1.19
    Image ID:       docker.io/library/nginx@sha256:df13abe416e37eb3db4722840dd479b00ba193ac6606e7902331dcea50f4f1f2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sun, 17 Oct 2021 22:06:17 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        25m
      memory:     32M
    Liveness:     http-get http://:8081/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get http://:8081/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/nginx from etc-nginx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  etc-nginx:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/nginx
    HostPathType:
QoS Class:         Burstable
Node-Selectors:    kubernetes.io/os=linux
Tolerations:       :NoExecute op=Exists
Events:            <none>


Name:                 nodelocaldns-bf7s5
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-node3/192.168.56.103
Start Time:           Sun, 17 Oct 2021 22:06:20 +0800
Labels:               controller-revision-hash=7fffc798ff
                      k8s-app=nodelocaldns
                      pod-template-generation=1
Annotations:          prometheus.io/port: 9253
                      prometheus.io/scrape: true
Status:               Pending
IP:                   192.168.56.103
IPs:
  IP:           192.168.56.103
Controlled By:  DaemonSet/nodelocaldns
Containers:
  node-cache:
    Container ID:
    Image:         registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/dns_k8s-dns-node-cache:1.16.0
    Image ID:
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.25.10
      -conf
      /etc/coredns/Corefile
      -upstreamsvc
      coredns
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-wm6rg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nodelocaldns
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  nodelocaldns-token-wm6rg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nodelocaldns-token-wm6rg
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoSchedule op=Exists
                 :NoExecute op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>


Name:                 nodelocaldns-nr4q2
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-node2/192.168.56.102
Start Time:           Wed, 06 Oct 2021 23:24:21 +0800
Labels:               controller-revision-hash=7fffc798ff
                      k8s-app=nodelocaldns
                      pod-template-generation=1
Annotations:          prometheus.io/port: 9253
                      prometheus.io/scrape: true
Status:               Running
IP:                   192.168.56.102
IPs:
  IP:           192.168.56.102
Controlled By:  DaemonSet/nodelocaldns
Containers:
  node-cache:
    Container ID:  containerd://604877910b9e532a752287411592963ea78413d29be750831e8a057086f874be
    Image:         registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/dns_k8s-dns-node-cache:1.16.0
    Image ID:      registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/dns_k8s-dns-node-cache@sha256:248c29f0f3106a6f55f7c686521ae6f85966f3c9eed10bf8b68cdc6049b46196
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.25.10
      -conf
      /etc/coredns/Corefile
      -upstreamsvc
      coredns
    State:          Running
      Started:      Wed, 06 Oct 2021 23:24:29 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-wm6rg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nodelocaldns
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  nodelocaldns-token-wm6rg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nodelocaldns-token-wm6rg
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoSchedule op=Exists
                 :NoExecute op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>

The output above shows that calico-node on node3 is stuck in PodInitializing, while the Pending coredns pod reports a taint "that the pod didn't tolerate":

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  13h   default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules, 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
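
(Unpacking that scheduler message, assuming I read it correctly: of the 2 schedulable nodes, node2 already runs a coredns replica and the Deployment's pod anti-affinity forbids a second one there, while node3 still carries the node.kubernetes.io/not-ready taint that this pod does not tolerate. If so, the Pending coredns should resolve itself once calico on node3 comes up and the taint clears. Both halves can be checked with plain kubectl; the grep windows are for readability, and the podAntiAffinity key assumes the manifest spells its anti-affinity that way, as the scheduler message suggests:)

[root@k8s-node1 temp]# kubectl describe node k8s-node3 | grep -i -A2 taints
[root@k8s-node1 temp]# kubectl get deployment coredns -n kube-system -o yaml | grep -B2 -A12 podAntiAffinity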

My own guess is that this came from continuing lessons 5-7 and 5-8 on node1 while node3 was still broken. Teacher, how should this be fixed? Many thanks!

PS: Below is the deployment status shown by kubectl describe deployment -n kube-system.

[root@k8s-node1 temp]# kubectl describe deployment -n kube-system
Name:               calico-kube-controllers
Namespace:          kube-system
CreationTimestamp:  Wed, 06 Oct 2021 22:43:55 +0800
Labels:             k8s-app=calico-kube-controllers
Annotations:        deployment.kubernetes.io/revision: 1
Selector:           k8s-app=calico-kube-controllers
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:           k8s-app=calico-kube-controllers
  Service Account:  calico-kube-controllers
  Containers:
   calico-kube-controllers:
    Image:      docker.io/calico/kube-controllers:v3.20.2
    Port:       <none>
    Host Port:  <none>
    Liveness:   exec [/usr/bin/check-status -l] delay=10s timeout=10s period=10s #success=1 #failure=6
    Readiness:  exec [/usr/bin/check-status -r] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ENABLED_CONTROLLERS:  node
      DATASTORE_TYPE:       kubernetes
    Mounts:                 <none>
  Volumes:                  <none>
  Priority Class Name:      system-cluster-critical
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   calico-kube-controllers-659bd7879c (1/1 replicas created)
Events:          <none>


Name:                   coredns
Namespace:              kube-system
CreationTimestamp:      Wed, 06 Oct 2021 23:21:03 +0800
Labels:                 addonmanager.kubernetes.io/mode=Reconcile
                        k8s-app=kube-dns
                        kubernetes.io/name=coredns
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               k8s-app=kube-dns
Replicas:               2 desired | 2 updated | 2 total | 1 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  0 max unavailable, 10% max surge
Pod Template:
  Labels:           k8s-app=kube-dns
  Annotations:      seccomp.security.alpha.kubernetes.io/pod: runtime/default
  Service Account:  coredns
  Containers:
   coredns:
    Image:       docker.io/coredns/coredns:1.6.7
    Ports:       53/UDP, 53/TCP, 9153/TCP
    Host Ports:  0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://:8181/ready delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
  Volumes:
   config-volume:
    Type:               ConfigMap (a volume populated by a ConfigMap)
    Name:               coredns
    Optional:           false
  Priority Class Name:  system-cluster-critical
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    False   ProgressDeadlineExceeded
OldReplicaSets:  <none>
NewReplicaSet:   coredns-84646c885d (2/2 replicas created)
Events:          <none>
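
(If I read these Conditions right, they restate the pod-level picture rather than point at a second failure: Available=False / MinimumReplicasUnavailable because only 1 of 2 replicas is ready, and Progressing=False / ProgressDeadlineExceeded because the Pending replica never came up within the progress deadline. Something like the following would presumably report the same stall:)

[root@k8s-node1 temp]# kubectl rollout status deployment coredns -n kube-system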

Update: the problem is solved!

[root@k8s-node1 ~]#  kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-659bd7879c-4b7wj   1/1     Running   0          14d
calico-node-2s45s                          1/1     Running   0          14d
calico-node-w5q9w                          1/1     Running   0          3d4h
coredns-84646c885d-8xmrx                   1/1     Running   0          14d
coredns-84646c885d-zmrd2                   1/1     Running   0          14d
nginx-proxy-k8s-node3                      1/1     Running   2          3d4h
nodelocaldns-bf7s5                         1/1     Running   2          3d4h
nodelocaldns-nr4q2                         1/1     Running   0          14d

[root@k8s-node1 ~]#  kubectl get nodes
NAME        STATUS   ROLES    AGE    VERSION
k8s-node2   Ready    <none>   14d    v1.20.2
k8s-node3   Ready    <none>   3d4h   v1.20.2

1 Answer

刘果国

2021-10-20

Before analyzing a problem, first work out the dependency relationships between the components and the order they start in. Here calico is the thing to fix first; leave DNS alone for now. Once you have found the entry point, dig into it. Only node3 has the problem, so with the information you already have, ask why some nodes are fine while this one is not: what is different about them? Start from the differences. You can also attack the problem directly: go to the failing node and read its logs, paying special attention to warnings and errors, and search for anything suspicious. describe is a good tool too, but stay focused and only look at the problematic pod.
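
(A minimal sketch of such a focused pass, assuming SSH access to node3 and that crictl is configured there to talk to containerd, which the containerd:// container IDs above suggest; the grep pattern and time window are illustrative only:)

[root@k8s-node1 ~]# kubectl describe pod calico-node-w5q9w -n kube-system
[root@k8s-node3 ~]# journalctl -u kubelet --since "1 hour ago" | grep -iE "error|fail"
[root@k8s-node3 ~]# crictl ps -a
[root@k8s-node3 ~]# crictl images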

慕移动7138520
Thank you for the approach and the methods, teacher. Using describe and journalctl together, after two hours of repeated attempts and searching, the pod went from Init:0/3 to ErrImagePull to PodInitializing to ContainerCreating and finally to Running. Everything is normal now, thanks! (See the image pre-pull sketch below.)
2021-10-21
1 reply in total
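
(Given the ErrImagePull stage mentioned above, the bottleneck was most likely pulling the Calico images on node3. A hedged sketch of warming them by hand on that node, assuming crictl is set up against containerd there; the image names and tags are taken from the describe output earlier in this thread:)

[root@k8s-node3 ~]# crictl pull docker.io/calico/cni:v3.20.2
[root@k8s-node3 ~]# crictl pull docker.io/calico/pod2daemon-flexvol:v3.20.2
[root@k8s-node3 ~]# crictl pull docker.io/calico/node:v3.20.2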
