After installing the k8s cluster, the machines lost access to the external network and the DNS smoke test fails

Source: 1-1 Course Introduction

慕少8521559

2021-10-14

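Background:

I bought three nodes from a cloud provider in Hong Kong (hk). I did not configure a proxy service for getting around the firewall, because Hong Kong is not behind it.

First: after installing the k8s cluster, I found that the host could no longer reach the external network.

ping to ordinary websites fails; in fact, ping to any website fails.

Did the install script break the machine's network or its DNS resolution service?

I then added nameserver 8.8.8.8 to /etc/resolv.conf on the host, and access worked again.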
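The symptom above (nothing resolves until a nameserver line is added) can be checked directly. The sketch below counts the nameserver entries in a resolv.conf-style file; the function name count_nameservers is hypothetical. It may also be relevant to the nodelocaldns log later in this thread, which ends with "plugin/forward: no nameservers found": CoreDNS's forward plugin reports that error when the file it is told to forward to lists no nameserver. Whether this cluster's Corefile actually forwards to /etc/resolv.conf is an assumption, not something shown in the thread.

```shell
# Count the nameserver entries in a resolv.conf-style file.
# count_nameservers is a hypothetical helper name, not part of any tool.
count_nameservers() {
    # $1: path to the file (defaults to /etc/resolv.conf)
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -cE '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "${1:-/etc/resolv.conf}" || true
}
```

Running count_nameservers with no argument on each node prints how many resolvers the host has configured; 0 would match the behaviour described above.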

Second: the DNS smoke test fails.

Third: I found that several of the DNS pods are failing.

The error information for two of them is shown below.

Output of kubectl describe pod/coredns-85967d65-jrnc2 -n kube-system:

Name: nodelocaldns-4qp7k
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: node-1/10.7.190.74
Start Time: Thu, 14 Oct 2021 15:54:58 +0800
Labels: controller-revision-hash=666697fc9
k8s-app=nodelocaldns
pod-template-generation=1
Annotations: prometheus.io/port: 9253
prometheus.io/scrape: true
Status: Running
IP: 10.7.190.74
IPs:
IP: 10.7.190.74
Controlled By: DaemonSet/nodelocaldns
Containers:
node-cache:
Container ID: containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
Image: k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
Image ID: k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
Ports: 53/UDP, 53/TCP, 9253/TCP
Host Ports: 53/UDP, 53/TCP, 9253/TCP
Args:
-localip
169.254.25.10
-conf
/etc/coredns/Corefile
-upstreamsvc
coredns
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 14 Oct 2021 23:46:52 +0800
Finished: Thu, 14 Oct 2021 23:46:53 +0800
Ready: False
Restart Count: 97
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Readiness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/etc/coredns from config-volume (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: nodelocaldns
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
nodelocaldns-token-djcd4:
Type: Secret (a volume populated by a Secret)
SecretName: nodelocaldns-token-djcd4
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m6s (x2231 over 7h52m) kubelet Back-off restarting failed container

Output of kubectl describe pod/nodelocaldns-4qp7k -n kube-system:

Name: nodelocaldns-4qp7k
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: node-1/10.7.190.74
Start Time: Thu, 14 Oct 2021 15:54:58 +0800
Labels: controller-revision-hash=666697fc9
k8s-app=nodelocaldns
pod-template-generation=1
Annotations: prometheus.io/port: 9253
prometheus.io/scrape: true
Status: Running
IP: 10.7.190.74
IPs:
IP: 10.7.190.74
Controlled By: DaemonSet/nodelocaldns
Containers:
node-cache:
Container ID: containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
Image: k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
Image ID: k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
Ports: 53/UDP, 53/TCP, 9253/TCP
Host Ports: 53/UDP, 53/TCP, 9253/TCP
Args:
-localip
169.254.25.10
-conf
/etc/coredns/Corefile
-upstreamsvc
coredns
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 14 Oct 2021 23:46:52 +0800
Finished: Thu, 14 Oct 2021 23:46:53 +0800
Ready: False
Restart Count: 97
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Readiness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/etc/coredns from config-volume (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: nodelocaldns
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
nodelocaldns-token-djcd4:
Type: Secret (a volume populated by a Secret)
SecretName: nodelocaldns-token-djcd4
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 4m53s (x2231 over 7h55m) kubelet Back-off restarting failed container

Are these three problems related in some way?


3 answers

刘果国

2021-10-22

Look at the coredns problem first. nodelocaldns is a local DNS cache and depends on coredns. Is that one line the only log output from coredns?

慕少8521559

刘果国

Deleting it and restarting fixed it! Thanks~

2021-10-23
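The reply above does not say exactly what was deleted. One common reading is that the crashing pods were deleted so that their controller recreates them. A minimal sketch under that assumption follows; recreate_pods and DRY_RUN are hypothetical names, while the label k8s-app=nodelocaldns comes from the describe output earlier in the thread.

```shell
# Delete all pods matching a label so their controller (here the
# nodelocaldns DaemonSet) recreates them.
# recreate_pods and DRY_RUN are hypothetical names for this sketch.
recreate_pods() {
    # $1: namespace, $2: label selector
    set -- kubectl delete pod -n "$1" -l "$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # only print the command that would run
    else
        "$@"           # actually run kubectl
    fi
}
```

With DRY_RUN=1, recreate_pods kube-system k8s-app=nodelocaldns prints the kubectl command instead of running it, which is a safe way to check the selector first.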


刘果国

2021-10-15

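Hmm, so far I have not seen a case where deploying the cluster affects the host's access to the external network. It should not happen.

As for the DNS problem: the pods failed to start, so look at their full startup logs. describe is mainly useful for analyzing a pod that is stuck in Pending; logs is what to use when a pod is crashing.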
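The rule of thumb in this answer can be written down as a small lookup. This is only a sketch: diagnose_cmd is a hypothetical helper, the status strings are the ones kubectl get pods prints, and the fallback branch for other statuses is an assumption that is not part of the answer.

```shell
# Suggest which kubectl command to try first for a pod, by status.
# diagnose_cmd is a hypothetical helper; it only prints a command.
diagnose_cmd() {
    # $1: STATUS column from "kubectl get pods", $2: pod name, $3: namespace
    case "$1" in
        Pending)
            # Not started yet: scheduling and volume events explain why.
            echo "kubectl describe pod $2 -n $3" ;;
        CrashLoopBackOff|Error)
            # Started and died: the container's own output explains why.
            echo "kubectl logs $2 -n $3" ;;
        *)
            # Assumption: for anything else, begin with a plain listing.
            echo "kubectl get pod $2 -n $3 -o wide" ;;
    esac
}
```

For the pod in the question, diagnose_cmd CrashLoopBackOff nodelocaldns-4qp7k kube-system suggests kubectl logs rather than kubectl describe.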

慕少8521559

The command I used to view the nodelocaldns error is kubectl logs nodelocaldns-r58k2 -n kube-system, and the output is:

2021/10/21 09:28:04 [INFO] Starting node-cache image: 1.16.0
2021/10/21 09:28:04 [INFO] Using Corefile /etc/coredns/Corefile
2021/10/21 09:28:04 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2021/10/21 09:28:04 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
plugin/forward: no nameservers found

[root@node-1 ~]# kubectl logs nodelocaldns-5dd8l -n kube-system
2021/10/21 09:29:45 [INFO] Starting node-cache image: 1.16.0
2021/10/21 09:29:45 [INFO] Using Corefile /etc/coredns/Corefile
2021/10/21 09:29:45 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2021/10/21 09:29:45 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
plugin/forward: no nameservers found
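Both pods stop at the same final line, so it can help to filter the logs down to the lines that matter. The sketch below reads log text on stdin; has_forward_error and fatal_lines are hypothetical helper names, and the patterns are taken from the output quoted above. Treating "plugin/forward: no nameservers found" as the line that makes the container exit is an interpretation of that output, not something the log states.

```shell
# Succeeds if the log text on stdin contains the forward plugin failure.
# Both function names are hypothetical helpers for this sketch.
has_forward_error() {
    grep -q 'plugin/forward: no nameservers found'
}

# Print only the [ERROR] lines and the forward plugin failure from stdin.
fatal_lines() {
    grep -E '\[ERROR\]|plugin/forward:' || true
}
```

Typical use would be piping the command from the thread into a helper, for example kubectl logs nodelocaldns-r58k2 -n kube-system | fatal_lines.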

2021-10-21
