After installing the k8s cluster, the machines could no longer reach the internet, and the DNS smoke test failed

Source: 1-1 Course Introduction

慕少8521559

2021-10-14

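Background:

I bought three nodes in a cloud provider's Hong Kong (hk) region. I did not configure a proxy for getting around the firewall, because hk is not behind it.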


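Issue 1: after installing the k8s cluster, I found that the host could no longer reach the internet.


Pinging ordinary websites fails; in fact, pinging any website fails.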


http://img.mukewang.com/szimg/61683de609ba511e10000466.jpg


Did the install script break the machine's network or its DNS resolution?

After I added nameserver 8.8.8.8 to /etc/resolv.conf on the host, it worked again.
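Since adding a nameserver line fixed the host, it can help to check which nameservers /etc/resolv.conf actually defines on each node. The following is a minimal sketch; list_nameservers and has_nameserver are hypothetical helper names, not part of any install script.

```shell
# Print the nameserver addresses configured in a resolv.conf-style file.
# Usage: list_nameservers [path]   (path defaults to /etc/resolv.conf)
list_nameservers() {
    awk '$1 == "nameserver" && NF >= 2 { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if the file defines at least one nameserver.
has_nameserver() {
    [ -n "$(list_nameservers "$1")" ]
}
```

On a node, running has_nameserver || echo "no nameserver configured" gives a quick yes/no answer before looking at anything inside the cluster.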



Issue 2: the DNS smoke test fails

http://img.mukewang.com/szimg/61684c6c09792db310000480.jpg


Issue 3: several of the DNS pods are failing

http://img.mukewang.com/szimg/61684cc909a740db10001000.jpg

The error details for two of them are as follows.

Output of kubectl describe pod/coredns-85967d65-jrnc2 -n kube-system:

Name:                 nodelocaldns-4qp7k
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 node-1/10.7.190.74
Start Time:           Thu, 14 Oct 2021 15:54:58 +0800
Labels:               controller-revision-hash=666697fc9
                      k8s-app=nodelocaldns
                      pod-template-generation=1
Annotations:          prometheus.io/port: 9253
                      prometheus.io/scrape: true
Status:               Running
IP:                   10.7.190.74
IPs:
  IP:           10.7.190.74
Controlled By:  DaemonSet/nodelocaldns
Containers:
  node-cache:
    Container ID:  containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
    Image:         k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
    Image ID:      k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.25.10
      -conf
      /etc/coredns/Corefile
      -upstreamsvc
      coredns
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 14 Oct 2021 23:46:52 +0800
      Finished:     Thu, 14 Oct 2021 23:46:53 +0800
    Ready:          False
    Restart Count:  97
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nodelocaldns
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  nodelocaldns-token-djcd4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nodelocaldns-token-djcd4
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoScheduleop=Exists
                 :NoExecuteop=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason   Age                      From     Message
  ----     ------   ----                     ----     -------
  Warning  BackOff  2m6s (x2231 over 7h52m)  kubelet  Back-off restarting failed container


Output of kubectl describe pod/nodelocaldns-4qp7k -n kube-system:

Name:                 nodelocaldns-4qp7k
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 node-1/10.7.190.74
Start Time:           Thu, 14 Oct 2021 15:54:58 +0800
Labels:               controller-revision-hash=666697fc9
                      k8s-app=nodelocaldns
                      pod-template-generation=1
Annotations:          prometheus.io/port: 9253
                      prometheus.io/scrape: true
Status:               Running
IP:                   10.7.190.74
IPs:
  IP:           10.7.190.74
Controlled By:  DaemonSet/nodelocaldns
Containers:
  node-cache:
    Container ID:  containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
    Image:         k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
    Image ID:      k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.25.10
      -conf
      /etc/coredns/Corefile
      -upstreamsvc
      coredns
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 14 Oct 2021 23:46:52 +0800
      Finished:     Thu, 14 Oct 2021 23:46:53 +0800
    Ready:          False
    Restart Count:  97
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nodelocaldns
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  nodelocaldns-token-djcd4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nodelocaldns-token-djcd4
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoScheduleop=Exists
                 :NoExecuteop=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason   Age                       From     Message
  ----     ------   ----                      ----     -------
  Warning  BackOff  4m53s (x2231 over 7h55m)  kubelet  Back-off restarting failed container


Are these three issues related in some way?



3 Answers

刘果国

2021-10-22

Look at the coredns problem first: nodelocaldns is a node-local DNS cache and depends on coredns. Is that single line all there is in the coredns log?

慕少8521559, replying to 刘果国 (2021-10-23): I deleted it and restarted, and now it's OK! Thanks~

刘果国

2021-10-15

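Hmm, so far I have not seen a case where deploying the cluster affects the host's internet access. It should not.

As for the DNS problem: the pods failed to start, so look at their full startup logs. describe is mainly for analyzing pods stuck in Pending; logs is for pods that crash.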
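The advice above (use logs rather than describe for a crashing pod) can be sketched as a small shell helper. crash_logs is a hypothetical name, and defaulting to the kube-system namespace is an assumption based on the pods in this thread. The --previous flag asks kubectl for the last terminated instance of the container, which is often where the crash message lives when a pod is in CrashLoopBackOff.

```shell
# Hypothetical helper: print logs for a crashing pod.
# Usage: crash_logs <pod-name> [namespace]   (namespace defaults to kube-system)
crash_logs() {
    pod="$1"
    ns="${2:-kube-system}"
    # Logs of the container instance that is currently running or restarting.
    kubectl logs "$pod" -n "$ns"
    # Logs of the previous, already terminated instance.
    kubectl logs "$pod" -n "$ns" --previous
}
```

For example, crash_logs nodelocaldns-4qp7k would query both instances of the pod described in the question.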

慕少8521559 (2021-10-21): To see the nodelocaldns error I ran kubectl logs nodelocaldns-r58k2 -n kube-system. The output was:

2021/10/21 09:28:04 [INFO] Starting node-cache image: 1.16.0
2021/10/21 09:28:04 [INFO] Using Corefile /etc/coredns/Corefile
2021/10/21 09:28:04 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2021/10/21 09:28:04 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
plugin/forward: no nameservers found

[root@node-1 ~]# kubectl logs nodelocaldns-5dd8l -n kube-system
2021/10/21 09:29:45 [INFO] Starting node-cache image: 1.16.0
2021/10/21 09:29:45 [INFO] Using Corefile /etc/coredns/Corefile
2021/10/21 09:29:45 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2021/10/21 09:29:45 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
plugin/forward: no nameservers found
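When the same log is collected from several pods, it can be easier to read with only the failure lines kept. The sketch below is a hypothetical filter (log_errors is not an existing tool) that keeps lines tagged [ERROR] and lines starting with plugin/, the two shapes that appear in the output above. One possible reading of the last line, not confirmed in this thread, is that the forward plugin found no nameserver to forward to, which would fit the empty /etc/resolv.conf on the host.

```shell
# Read pod logs on stdin and print only the lines that look like failures:
# lines tagged [ERROR] and lines emitted by a CoreDNS plugin ("plugin/...").
log_errors() {
    grep -E '\[ERROR\]|^plugin/'
}
```

A typical use would be kubectl logs nodelocaldns-r58k2 -n kube-system | log_errors.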

慕少8521559

Asker

2021-10-14

test


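Course: Kubernetes生产落地全程实践 (Kubernetes in Production: The Complete Practice)

Every detail of how an internet company rolled out Kubernetes, from start to finish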
