Fixing "connection timed out; no servers could be reached" when running dig @10.96.0.10 service-headliness.dev.svc.cluster.local

Symptom

image-20220524174003520

As the screenshot shows, the Service's domain name cannot be resolved after the Service is created.

Reproducing the problem

Create service-headliness.yaml

apiVersion: v1
kind: Service
metadata:
  name: service-headliness
  namespace: dev
spec:
  selector:
    app: nginx-pod
  clusterIP: None # Setting clusterIP to None creates a headless Service
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 80

# Create the Service
kubectl create -f service-headliness.yaml

# Show the Service details
kubectl describe svc service-headliness -n dev

# Inspect the kube-dns Service
kubectl -n kube-system describe svc kube-dns

The Service details list the Pod IPs, but none appear in the kube-dns output.

image-20220524174553938
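The comparison above can be made without reading the full describe output by hand. The following is a minimal sketch, assuming the output contains `Endpoints:` lines as `kubectl describe svc` normally prints them; the function name `svc_endpoints` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Print the value of every "Endpoints:" line found in `kubectl describe svc`
# output read from stdin. A Service with no ready backends shows "<none>".
# svc_endpoints is a hypothetical helper name, not a kubectl feature.
svc_endpoints() {
  awk '$1 == "Endpoints:" { print $2 }'
}

# Usage against a live cluster (not executed here):
#   kubectl describe svc service-headliness -n dev | svc_endpoints
#   kubectl -n kube-system describe svc kube-dns | svc_endpoints
```

If the first command prints Pod addresses and the second prints only `<none>`, the application Service is fine and the DNS Pods themselves are the likely suspects.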

So this looks like a DNS resolution problem. The next step is to find the Kubernetes DNS Pods.

kubectl get pods -n kube-system

image-20220524175640273

The status is CrashLoopBackOff, which means the container keeps failing and being restarted in a loop. That is clearly abnormal, so check the logs.
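When the Pod list is long, the failing Pods can be picked out mechanically. This is a small sketch that assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the function name `crashlooping_pods` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the names of Pods whose STATUS column is CrashLoopBackOff.
# Reads `kubectl get pods --no-headers` output on stdin.
# crashlooping_pods is a hypothetical helper name used for illustration.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pods -n kube-system --no-headers | crashlooping_pods
```

Each name it prints can then be passed to `kubectl logs <pod> -n kube-system`. Adding `--previous` shows the log of the container instance that crashed before the current restart.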

kubectl logs coredns-6955765f44-524cx -n kube-system

E0524 09:56:29.714290 1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host

image-20220524180201362
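The important part of that log line is the address that could not be reached. As a rough sketch, the address can be extracted from a longer log like this; it assumes the error has the same `dial tcp <ip>:<port>: connect: no route to host` shape as the line above, and `unreachable_targets` is a made-up name.

```shell
#!/usr/bin/env bash
# Print each distinct "ip:port" that failed with "no route to host".
# Reads log text on stdin; lines without that error are ignored.
# unreachable_targets is a hypothetical helper name used for illustration.
unreachable_targets() {
  sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): connect: no route to host.*/\1/p' \
    | sort -u
}

# Usage against a live cluster (not executed here):
#   kubectl logs coredns-6955765f44-524cx -n kube-system 2>&1 | unreachable_targets
```

For the log above this yields 10.96.0.1:443, the address CoreDNS uses to reach the API server.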

Root cause

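The iptables rules on the node were not cleaned up properly or have become inconsistent. The log shows that CoreDNS cannot reach the API server at 10.96.0.1:443 (no route to host), so it fails and keeps restarting.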

Solution

# Stop docker
systemctl stop docker

# Stop kubelet
systemctl stop kubelet

# Flush the iptables rules (filter and nat tables)
iptables --flush
iptables -t nat --flush

# Start kubelet
systemctl start kubelet
# Start docker
systemctl start docker
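Because flushing removes every rule in the filter and nat tables, not only the Kubernetes ones, it can help to preview the sequence before running it. The wrapper below is only a sketch of the steps above: the `DRY_RUN` variable and the function names `run` and `reset_node_iptables` are assumptions introduced here, and by default nothing is executed.

```shell
#!/usr/bin/env bash
# Print each command by default; execute it only when DRY_RUN=0 is set.
# DRY_RUN, run and reset_node_iptables are hypothetical names for this sketch.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Same order as the manual steps: stop services, flush rules, start services.
# Executing for real (DRY_RUN=0) requires root on the affected node.
reset_node_iptables() {
  run systemctl stop docker
  run systemctl stop kubelet
  run iptables --flush
  run iptables -t nat --flush
  run systemctl start kubelet
  run systemctl start docker
}
```

Calling `reset_node_iptables` prints the six commands, and `DRY_RUN=0 reset_node_iptables` runs them. Saving the current rules first with `iptables-save > some-file` keeps a copy that can be inspected later.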

image-20220524181719428

Everything is running normally now.

image-20220524181741342

Reboot the server

reboot

Check DNS again

kubectl -n kube-system describe svc kube-dns

image-20220524182558127

This time it looks normal.

Test domain name resolution

dig @10.96.0.10 service-headliness.dev.svc.cluster.local

image-20220524182634305
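The name passed to dig follows the pattern `<service>.<namespace>.svc.<cluster-domain>`. The sketch below builds that name and counts the addresses returned, assuming the default cluster domain `cluster.local` used in this post; `svc_fqdn` and `count_a_records` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build the in-cluster DNS name of a Service.
# The third argument (cluster domain) is optional and defaults to cluster.local.
svc_fqdn() {
  local svc="$1" ns="$2" domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$svc" "$ns" "$domain"
}

# Count the IPv4 addresses in `dig +short` output read from stdin.
# Error text such as a timeout message is not counted.
count_a_records() {
  grep -cE '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Usage against a live cluster (not executed here):
#   dig +short @10.96.0.10 "$(svc_fqdn service-headliness dev)" | count_a_records
```

For a headless Service the answer should contain one address per ready Pod, so a count of 0 means resolution is still failing.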

posted @ 2022-05-25 15:19  makalo