Troubleshooting Notebook
Problem 1
[root@ceph ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                        ERROR
controller-manager   Unhealthy   Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
scheduler            Healthy     ok
etcd-0               Healthy     {"health":"true","reason":""}
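The connection refused error means nothing accepted the connection on 127.0.0.1:10257, the controller-manager's secure port. A minimal sketch of a first check, assuming ss from iproute2 is available on the master; port_is_listening is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: reads `ss -lnt` output on stdin and succeeds
# if any socket is listening on the given TCP port.
port_is_listening() {
    local port="$1"
    # Column 4 of `ss -lnt` is "Local Address:Port"; NR > 1 skips the header row.
    awk -v suffix=":${port}" '
        NR > 1 && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }
    '
}

# Example (run on the master):
#   ss -lnt | port_is_listening 10257 || echo "nothing is listening on 10257"
```

If nothing is listening, the next place to look is the controller-manager process and its logs. How it is started (systemd unit or static pod) depends on the installation and is not recorded in these notes.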
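Problem 2
kubectl delete ns test left the namespace stuck in Terminating, which usually happens when resources still exist in the namespace at the time it is deleted. Deleting the namespace key directly from etcd forced the removal and worked around the problem for now: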
etcdctl --endpoints=10.4.7.250:2379 \
--cacert="/etc/kubernetes/ssl/ca.pem" \
--cert="/etc/kubernetes/ssl/etcd.pem" \
--key="/etc/kubernetes/ssl/etcd-key.pem" \
del /registry/namespaces/test
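Before removing the key from etcd, it can help to see what is still left in the namespace. A sketch, assuming kubectl can still reach the API server; list_namespace_leftovers is a hypothetical helper name:

```shell
# Hypothetical helper: print every object that still exists in a namespace.
list_namespace_leftovers() {
    local ns="$1" kind
    # Ask the API server for all namespaced resource types that support "list" ...
    kubectl api-resources --verbs=list --namespaced -o name |
    while read -r kind; do
        # ... and list the objects of each type; empty types print nothing.
        kubectl get "$kind" -n "$ns" --ignore-not-found -o name </dev/null
    done
}

# Example:
#   list_namespace_leftovers test
```

Objects listed here are not cleaned up by deleting /registry/namespaces/test; their own keys stay behind in etcd.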
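Problem 3
Problems encountered when adding a node
Issue: after the node was added successfully, some of the pods scheduled onto it could not reach Running.
Symptom: pods scheduled onto the node ran normally if they had hostNetwork enabled, while pods using the pod network stayed in ContainerCreating.
Troubleshooting: pick one of the pods stuck in ContainerCreating and run kubectl describe on it. The events show a message like: network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 172.16.6.1/24
Seeing this message, you will probably suspect the CNI plugin. We use flannel here, and the message suggests the problem lies in the network configuration of cni0 or flannel.1.
Checking with ip a confirmed that cni0 and flannel.1 had been assigned addresses in different subnets: `172.16.4.1` and `172.16.6.0` respectively.
Fix: delete the cni0 interface so that it is recreated with the address flannel currently assigns: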
ip link set cni0 down
ip link del cni0
Then restart the flannel service.
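The mismatch can be checked directly by comparing the address on cni0 with the subnet flannel leased for the node. A sketch, assuming flannel writes its lease to /run/flannel/subnet.env (its usual location; this install may differ); the helper names are hypothetical:

```shell
# Hypothetical helper: print FLANNEL_SUBNET from a flannel subnet.env file.
flannel_subnet() {
    sed -n 's/^FLANNEL_SUBNET=//p' "$1"
}

# Hypothetical helper: read `ip -4 -o addr show dev cni0` output on stdin
# and print the first IPv4 address together with its prefix length.
bridge_addr() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") { print $(i + 1); exit } }'
}

# Example (run on the affected node):
#   expected="$(flannel_subnet /run/flannel/subnet.env)"
#   actual="$(ip -4 -o addr show dev cni0 | bridge_addr)"
#   [ "$expected" = "$actual" ] || echo "cni0 has $actual, flannel expects $expected"
```

Running the same comparison after the restart is a quick way to confirm that cni0 came back with the expected address.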
Problem 4
kubelet cannot find the CNI plugin: the flannel binary is missing from /opt/kube/bin
Jan 20 01:15:16 master01 kubelet[16067]: E0120 01:15:16.837384 16067 kubelet.go:2211] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
Jan 20 01:15:19 master01 kubelet[16067]: I0120 01:15:19.669530 16067 cni.go:204] "Error validating CNI config list" configList="{\n \"name\": \"cbr0\",\n \"cniVersion\": \"0.3.1\",\n \"plugins\": [\n {\n \"type\": \"flannel\",\n \"delegate\": {\n \"hairpinMode\": true,\n \"isDefaultGateway\": true\n }\n },\n {\n \"type\": \"portmap\",\n \"capabilities\": {\n \"portMappings\": true\n }\n }\n ]\n}\n" err="[failed to find plugin \"flannel\" in path [/opt/kube/bin]]"
Jan 20 01:15:19 master01 kubelet[16067]: I0120 01:15:19.669630 16067 cni.go:239] "Unable to update cni config" err="no valid networks found in /etc/cni/net.d"
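The err field in the log names the cause: kubelet searched /opt/kube/bin and found no flannel binary there. A sketch that checks a plugin directory for the plugin types named in the config list; missing_cni_plugins is a hypothetical helper name:

```shell
# Hypothetical helper: print each named CNI plugin that has no
# executable file in the given plugin directory.
missing_cni_plugins() {
    local bin_dir="$1" plugin
    shift
    for plugin in "$@"; do
        [ -x "${bin_dir}/${plugin}" ] || echo "$plugin"
    done
}

# Example, using the directory and plugin types from the log above:
#   missing_cni_plugins /opt/kube/bin flannel portmap
```

The flannel plugin delegates to the bridge plugin by default, so bridge may be worth adding to the list.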
Problem 5
[root@master01 install_k8s]# kubectl logs -f kube-flannel-ds-qvlkg -n kube-flannel
Error from server: Get "https://10.4.7.10:20250/containerLogs/kube-flannel/kube-flannel-ds-qvlkg/kube-flannel?follow=true": x509: cannot validate certificate for 10.4.7.10 because it doesn't contain any IP SA
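The error says the certificate served at 10.4.7.10:20250 lists no IP addresses in its Subject Alternative Name extension, so a client connecting by IP cannot verify it. A sketch for confirming which SANs a certificate carries, assuming openssl is installed; cert_has_ip_san is a hypothetical helper name:

```shell
# Hypothetical helper: reads `openssl x509 -noout -text` output on stdin
# and succeeds if the given IP appears as an "IP Address" SAN entry.
cert_has_ip_san() {
    local ip="$1"
    # SAN entries are comma-separated; put each on its own line, trim
    # surrounding whitespace, then require an exact fixed-string match.
    tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -Fxq "IP Address:${ip}"
}

# Example, fetching the certificate from the endpoint in the error:
#   openssl s_client -connect 10.4.7.10:20250 </dev/null 2>/dev/null |
#       openssl x509 -noout -text | cert_has_ip_san 10.4.7.10 ||
#       echo "10.4.7.10 is not in the certificate's SANs"
```

If the check fails, the certificate has to be reissued with the node IP among its SANs. How certificates are generated in this install is not recorded in these notes.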