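After installing the k8s cluster, the machines could no longer reach the external network, and the DNS smoke test failed
Source: 1-1 Course Introduction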
慕少8521559
2021-10-14
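Background:
I bought three nodes in Hong Kong (hk) from a cloud vendor. I did not set up any proxy service, because hk is not behind the firewall.
First: after installing the k8s cluster, I found that the host could no longer reach the external network.
ping to ordinary websites fails; in fact ping to any website fails.
Did the install script break the machine's network or its DNS resolution?
After I added nameserver 8.8.8.8 to /etc/resolv.conf on the host, it worked again.
Second: the DNS smoke test fails.
Third: several of the DNS pods are failing.
The error details for two of them are below.
Output of kubectl describe pod/coredns-85967d65-jrnc2 -n kube-system: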
Name: nodelocaldns-4qp7k
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: node-1/10.7.190.74
Start Time: Thu, 14 Oct 2021 15:54:58 +0800
Labels: controller-revision-hash=666697fc9
k8s-app=nodelocaldns
pod-template-generation=1
Annotations: prometheus.io/port: 9253
prometheus.io/scrape: true
Status: Running
IP: 10.7.190.74
IPs:
IP: 10.7.190.74
Controlled By: DaemonSet/nodelocaldns
Containers:
node-cache:
Container ID: containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
Image: k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
Image ID: k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
Ports: 53/UDP, 53/TCP, 9253/TCP
Host Ports: 53/UDP, 53/TCP, 9253/TCP
Args:
-localip
169.254.25.10
-conf
/etc/coredns/Corefile
-upstreamsvc
coredns
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 14 Oct 2021 23:46:52 +0800
Finished: Thu, 14 Oct 2021 23:46:53 +0800
Ready: False
Restart Count: 97
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Readiness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/etc/coredns from config-volume (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: nodelocaldns
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
nodelocaldns-token-djcd4:
Type: Secret (a volume populated by a Secret)
SecretName: nodelocaldns-token-djcd4
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 2m6s (x2231 over 7h52m) kubelet Back-off restarting failed container
Output of kubectl describe pod/nodelocaldns-4qp7k -n kube-system:
Name: nodelocaldns-4qp7k
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: node-1/10.7.190.74
Start Time: Thu, 14 Oct 2021 15:54:58 +0800
Labels: controller-revision-hash=666697fc9
k8s-app=nodelocaldns
pod-template-generation=1
Annotations: prometheus.io/port: 9253
prometheus.io/scrape: true
Status: Running
IP: 10.7.190.74
IPs:
IP: 10.7.190.74
Controlled By: DaemonSet/nodelocaldns
Containers:
node-cache:
Container ID: containerd://614a8ba2ea831dc3a4d648f5f6e3f1c6b49914ff42e68187780b312f3326522f
Image: k8s.gcr.io/dns/k8s-dns-node-cache:1.16.0
Image ID: k8s.gcr.io/dns/k8s-dns-node-cache@sha256:9f78e4cc9ed4c6da3d79d8492d66cde3638be8dbcdab8c72957b1f582e8ce04f
Ports: 53/UDP, 53/TCP, 9253/TCP
Host Ports: 53/UDP, 53/TCP, 9253/TCP
Args:
-localip
169.254.25.10
-conf
/etc/coredns/Corefile
-upstreamsvc
coredns
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 14 Oct 2021 23:46:52 +0800
Finished: Thu, 14 Oct 2021 23:46:53 +0800
Ready: False
Restart Count: 97
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Readiness: http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/etc/coredns from config-volume (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from nodelocaldns-token-djcd4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: nodelocaldns
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
nodelocaldns-token-djcd4:
Type: Secret (a volume populated by a Secret)
SecretName: nodelocaldns-token-djcd4
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 4m53s (x2231 over 7h55m) kubelet Back-off restarting failed container
Are these three problems related?
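Since adding a nameserver line fixed the host, it may help to see what /etc/resolv.conf actually contains after the install. The following is only a minimal sketch: the helper name list_nameservers is made up for illustration, and it simply prints the nameserver entries of a resolv.conf-style file.

```shell
#!/bin/sh
# Hypothetical helper (not part of any install script): list the nameserver
# entries in a resolv.conf-style file.
# Usage: list_nameservers [path]   (path defaults to /etc/resolv.conf)
list_nameservers() {
    file="${1:-/etc/resolv.conf}"
    # Print the second field of every line whose first field is exactly
    # "nameserver"; commented-out lines begin with "#" or ";" and do not match.
    awk '$1 == "nameserver" { print $2 }' "$file"
}
```

Calling list_nameservers with no argument reads /etc/resolv.conf. If it prints nothing, or only an address that no resolver is listening on, name lookups fail, so ping by hostname fails even when the network itself is fine. Running ping -c 1 8.8.8.8 alongside it helps tell a DNS problem from a routing problem.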
3 answers
-
刘果国
2021-10-22
Look at the coredns problem first. nodelocaldns is a local DNS cache and depends on coredns. Is this one line all there is in the coredns log?
-
刘果国
2021-10-15
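Hmm, so far I have not seen a case where deploying the cluster affects the host's access to the external network. It should not.
As for the DNS problem, the pod failed to start, so look at the pod's full startup log. describe is mainly for analyzing a pod stuck in Pending; logs is what you use when a pod crashes.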
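As a rough sketch of the logs approach: for a pod in CrashLoopBackOff, the container that is currently waiting often has no output yet, so the log of the previous (crashed) run is usually the useful one. The function name dump_crash_logs and the KUBECTL override variable are assumptions made for this example; kubectl logs and its --previous flag are standard.

```shell
#!/bin/sh
# Hypothetical helper: print the logs of a crash-looping pod, first from the
# previous (crashed) container run, then from the current one.
# Usage: dump_crash_logs <pod> [namespace]   (namespace defaults to kube-system)
# KUBECTL may be set to another command (e.g. echo) for a dry run.
dump_crash_logs() {
    pod="$1"
    ns="${2:-kube-system}"
    # Log of the last terminated container instance.
    "${KUBECTL:-kubectl}" logs "$pod" -n "$ns" --previous
    # Log of the current container instance, if it has produced any output.
    "${KUBECTL:-kubectl}" logs "$pod" -n "$ns"
}
```

For the pod in the question this would be called as dump_crash_logs nodelocaldns-4qp7k, and likewise for the failing coredns pod.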
-
慕少8521559
Asker
2021-10-14
test