Fixing "connection timed out; no servers could be reached" from dig @10.96.0.10 service-headliness.dev.svc.cluster.local
Symptom
After creating the Service, its domain name cannot be resolved.
Reproducing the problem
Create service-headliness.yaml
apiVersion: v1
kind: Service
metadata:
  name: service-headliness
  namespace: dev
spec:
  selector:
    app: nginx-pod
  clusterIP: None # Setting clusterIP to None creates a headless Service
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 80
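# Create the Service
kubectl create -f service-headliness.yaml
# Show the Service details
kubectl describe svc service-headliness -n dev
# Inspect the kube-dns Service
kubectl -n kube-system describe svc kube-dns
The Service details list the Pod IPs, but the kube-dns Service shows none.
This points to a DNS resolution problem, so the next step is to look at the Kubernetes DNS pods.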
kubectl get pods -n kube-system
The status is CrashLoopBackOff, meaning the container keeps failing and being restarted in a loop. That is clearly abnormal, so check the logs.
kubectl logs coredns-6955765f44-524cx -n kube-system
E0524 09:56:29.714290 1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
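The important part of this log line is the target 10.96.0.1:443 together with the error "no route to host". As a minimal sketch, assuming the log format shown above, the failing targets can be pulled out of a longer log like this. The function name unreachable_targets is hypothetical and not part of kubectl or CoreDNS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log lines on stdin and print each distinct
# "IP:PORT" target that failed with "no route to host".
unreachable_targets() {
  grep -o 'dial tcp [0-9.]*:[0-9]*: connect: no route to host' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}

# Usage (pod name taken from the command above):
#   kubectl logs coredns-6955765f44-524cx -n kube-system | unreachable_targets
```

If the only target printed is the API server address, the problem is on the network path between the CoreDNS pod and the API server rather than in CoreDNS itself.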
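Root cause
The iptables rules had not been cleaned up properly or had become inconsistent, so CoreDNS could not reach the API server at 10.96.0.1:443 (the "no route to host" error in the log).
Solution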
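Before flushing anything, it can help to look at what the rules actually contain. One common source of "no route to host" is a REJECT rule that answers with ICMP host-prohibited. Whether that was the exact rule involved here is an assumption, since the original post does not show the rule set. A minimal sketch with a hypothetical helper named reject_rules:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `iptables-save` output on stdin and print the
# rules that reject traffic with an ICMP "host prohibited" reply.
# `|| true` keeps the exit status at 0 when no such rule exists.
reject_rules() {
  grep -F -- '--reject-with icmp-host-prohibited' || true
}

# Usage (needs root):
#   iptables-save | reject_rules
```

An empty result means no such rule is present, and the cause has to be looked for elsewhere in the rule set.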
# Stop docker
systemctl stop docker
# Stop kubelet
systemctl stop kubelet
# Flush the iptables rules (filter and nat tables)
iptables --flush
iptables -t nat --flush
# Start kubelet
systemctl start kubelet
# Start docker
systemctl start docker
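Flushing removes every rule in the filter and nat tables, so it may be worth saving the current rules first. The following is a sketch of the same sequence with a backup step and a dry-run mode, not the procedure from the original post. DRY_RUN, BACKUP, run and flush_iptables are hypothetical names introduced for illustration, and the backup path is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: the same steps as above, plus a backup of the current rules.
# DRY_RUN and BACKUP are hypothetical variables introduced for illustration.
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the commands
BACKUP="${BACKUP:-/root/iptables.backup}"  # assumed backup location

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

flush_iptables() {
  run systemctl stop docker
  run systemctl stop kubelet
  run sh -c "iptables-save > '$BACKUP'"    # restorable with iptables-restore
  run iptables --flush
  run iptables -t nat --flush
  run systemctl start kubelet
  run systemctl start docker
}

# Usage: call flush_iptables to preview the commands; to apply them,
# set DRY_RUN=0 before sourcing this file and run it as root.
```

The dry-run default is a deliberate choice: the commands are destructive, so printing them first makes it harder to flush the rules on the wrong host by accident.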
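Everything is running normally again.
Reboot the server
reboot
Check the kube-dns Service again
kubectl -n kube-system describe svc kube-dns
This time it looks normal.
Test DNS resolution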
dig @10.96.0.10 service-headliness.dev.svc.cluster.local
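For a headless Service, a successful lookup should return the Pod IPs directly. As a rough sketch, the result can be checked in a script by testing that dig printed only IPv4 addresses. The helper all_ipv4 is hypothetical and only does a loose pattern check, not full address validation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if stdin is non-empty and every line
# looks like an IPv4 address (a loose pattern check, not full validation).
all_ipv4() {
  awk '
    !/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { bad = 1 }
    END { exit (NR == 0 || bad) ? 1 : 0 }
  '
}

# Usage:
#   dig +short @10.96.0.10 service-headliness.dev.svc.cluster.local | all_ipv4 \
#     && echo "resolved" || echo "not resolved"
```

An empty answer or a timeout message from dig both count as a failure here, which matches the symptom described at the start.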