部署Kubeadm遇到的哪些问题，并且如何解决

1）设置错误导致kubeadm安装k8s失败

提示：ERROR FileContent–proc-sys-net-bridge-bridge-nf-call-iptables

[root@node01 data]# kubeadm join masterIP地址:6443 --token abcdef.0123456789abcdef 
>     --discovery-token-ca-cert-hash sha256:e7a08a24b68c738cccfcc3ae56b7a433704f3753991c782208b46f20c57cf0c9
[preflight] Running pre-flight checks
    [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.12. Latest validated version: 19.03
error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

解决办法:

echo "1" >/proc/sys/net/bridge/bridge-nf-call-iptables
echo "1" >/proc/sys/net/bridge/bridge-nf-call-ip6tables

2）kubeadm安装k8s1.20.1版本添加node节点时出错

参考这个地址：https://bbs.csdn.net/topics/397412529?page=1

accepts at most 1 arg(s), received 2
To see the stack trace of this error execute with --v=5 or higher

稍等片刻后, 加入成功如下:

W0801 19:27:06.500319   12557 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    [WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

3）部署Kubernetes遇到的问题与解决方法（初始化等）

参考这个地址：https://blog.csdn.net/clareeee/article/details/121100431

4）子节点服务器上运行kubectl get node却发现报错了, 如下

[root@k8s-node02-17:~]# kubectl get node
The connection to the server localhost:8080 was refused - did you specify the right host or port?

可以发现按安装成功后，日志提示的如下步骤操作即可

# 在各个子节点创建.kube目录
[root@master data]# kubectl get nodes
W1116 20:29:22.881159 20594 loader.go:223] Config not found: /etc/kubernetes/admin.conf
The connection to the server localhost:8080 was refused - did you specify the right host or port?

#Master节点执行：
scp /etc/kubernetes/admin.conf root@node01:/etc/kubernetes/admin.conf
scp /etc/kubernetes/admin.conf root@node01:/etc/kubernetes/admin.conf

#Node节点执行：
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile

# 最后运行测试, 发现不报错了
[root@k8s-master01-15 ~]# kubectl get node
NAME               STATUS     ROLES    AGE   VERSION
k8s-master01-15    NotReady   master   20m   v1.18.6
k8s-node01-16      NotReady      19m   v1.18.6
k8s-node02-17      NotReady      19m   v1.18.6

5) kubectl get cs 问题

提示：Get “http://127.0.0.1:10251/healthz”: dial tcp 127.0.0.1:10251: connect: connection refused. 出现这种情况，是/etc/kubernetes/manifests/下的kube-controller-manager.yaml和kube-scheduler.yaml设置的默认端口是0导致的，解决方式是注释掉对应的port即可，操作如下：

[root@k8s-master01-15 manifests]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                            
etcd-0               Healthy     {"health":"true","reason":""} 


[root@k8s-master01-15 /etc/kubernetes/manifests]# vim kube-scheduler.yaml +19
.....
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    #- --port=0      # 注释这行


[root@k8s-master01-15 /etc/kubernetes/manifests]# vim kube-controller-manager.yaml +26
.....
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    #- --port=0      # 注释这行

# 重启kubelet.service 服务
systemctl restart kubelet.service

# 检查cs状态
[root@k8s-master01-15 manifests]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
scheduler            Healthy   ok                              
etcd-0               Healthy   {"health":"true","reason":""}   
controller-manager   Healthy   ok

6) k8s的coredns组件的问题（部署flannel网络插件）

提示：network: open /run/flannel/subnet.env: no such file or directory

[root@k8s-master01-15 core]# kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS              RESTARTS         AGE     IP               NODE     NOMINATED NODE   READINESS GATES
coredns-78fcd69978-6t76m         0/1     ContainerCreating   0                126m               node01              
coredns-78fcd69978-nthg8         0/1     ContainerCreating   0                126m               node01              
.....

[root@k8s-master01-15 core]# kubectl describe pods coredns-78fcd69978-6t76m -n kube-system
.......
Events:
  Type     Reason                  Age                      From     Message
  ----     ------                  ----                     ----     -------
  Normal   SandboxChanged          19m (x4652 over 104m)    kubelet  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  4m26s (x5461 over 104m)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8e2992e19a969235ff30271e317565b48ffe57d8261dc86f92249005a5eaaec5" network for pod "coredns-78fcd69978-6t76m": networkPlugin cni failed to set up pod "coredns-78fcd69978-6t76m_kube-system" network: open /run/flannel/subnet.env: no such file or directory

解决方案：

查看是否有 /run/flannel/subnet.env 这个文件，master 上是存在的，也有内容：

[root@k8s-master01-15:~]# mkdir -p /run/flannel/

cat >> /run/flannel/subnet.env

7）发现 kube-proxy出现异常，状态：CrashLoopBackOff

kube-proxy是作用于service的，作用主要是负责service的实现，实现了内部从pod到service和外部的从node port向service的访问

[root@master core]# kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS              RESTARTS         AGE    IP               NODE     NOMINATED NODE   READINESS GATES
coredns-78fcd69978-dts5t         0/1     ContainerCreating   0                146m              node02              
coredns-78fcd69978-v8g7z         0/1     ContainerCreating   0                146m              node02              
etcd-master                      1/1     Running             0                147m   172.23.199.15   master              
kube-apiserver-master            1/1     Running             0                147m   172.23.199.15   master              
kube-controller-manager-master   1/1     Running             0                147m   172.23.199.15   master              
kube-proxy-9nxhp                 0/1     CrashLoopBackOff    33 (37s ago)     144m   172.23.199.16   node01              
kube-proxy-gqrvl                 0/1     CrashLoopBackOff    33 (86s ago)     145m   172.23.199.17   node02              
kube-proxy-p825v                 0/1     CrashLoopBackOff    33 (2m54s ago)   146m   172.23.199.15   master              
kube-scheduler-master            1/1     Running             0                147m   172.23.199.15   master              

# 使用kubectl describe XXX 排查pod状态的信息
[root@master core]# kubectl describe pod kube-proxy-9nxhp -n kube-system
...
Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  BackOff  8s (x695 over 150m)  kubelet  Back-off restarting failed container

解决方案：

在1.19版本之前,kubeadm部署方式启用ipvs模式时,初始化配置文件需要添加以下内容:

---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs

如果部署是1.20以上版本,又是使用kubeadm进行集群初始化时,虽然可以正常部署,但是查看pod情况的时候可以看到kube-proxy无法运行成功,报错部分内容如下:

[root@k8s-master01-15:~]# kubectl get pod -A|grep kube-proxy
kube-system kube-proxy-7vrbv 0/1 CrashLoopBackOff 9 43m
kube-system kube-proxy-ghs7h 0/1 CrashLoopBackOff 9 43m
kube-system kube-proxy-l9twb 0/1 CrashLoopBackOff 1 7s

查看日志信息

[root@k8s-master01-15:~]# kubectl logs kube-proxy-9qbwp  -n kube-system
E0216 03:00:11.595910       1 run.go:74] "command failed" err="failed complete: unrecognized feature gate: SupportIPVSProxyMode"

通过报错可以看到kube-proxy无法识别SupportIPVSProxyMode这个字段,于是访问官方查看最新版本ipvs开启的正确配置,通过https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/ipvs/README.md#check-ipvs-proxy-rules，可以看到官方说明:

Cluster Created by Kubeadm
If you are using kubeadm with a configuration file, you have to add mode: ipvs in a KubeProxyConfiguration (separated by — that is also passed to kubeadm init).

kubeProxy:
  config:
    mode: ipvs

由于集群已经初始化成功了,所以现在改kubeadm初始化配置文件没有意义,因为我们需要直接修改kube-proxy的启动配置

通过查看kube-pxory的资源清单可以知道, kube-proxy的配置文件是通过configmap方式挂载到容器中的,因此我们只需要对应修改configmap中的配置内容,就可以将无效字段删除

[root@k8s-master01-15:~]# kubectl -o yaml get pod -n kube-system kube-proxy-24tkb
...... # 其他内容省略
   99     volumes:
   100    - configMap:
   101        defaultMode: 420
   102        name: kube-proxy
   103      name: kube-proxy
...... # 其他内容省略

[root@k8s-master01-15:~]# kubectl get cm -n kube-system
NAME                                 DATA   AGE
coredns                              1      19h
extension-apiserver-authentication   6      19h
kube-flannel-cfg                     2      109m
kube-proxy                           2      19h
kube-root-ca.crt                     1      19h
kubeadm-config                       1      19h
kubelet-config-1.23                  1      19h

在编辑模式中找到以下字段,删除后保存退出

[root@k8s-master01-15:~]# kubectl edit cm kube-proxy -n kube-system　　　　
featureGates: 
  SupportIPVSProxyMode: true

然后将删除所有kube-proxy进行重启,查看pod运行情况

[root@k8s-master01-15:~]# watchpod -n kube-system
NAME                               READY   STATUS    RESTARTS       AGE
coredns-64897985d-c9t7s            1/1     Running   0              19h
coredns-64897985d-knvxg            1/1     Running   0              19h
etcd-master01                      1/1     Running   0              19h
kube-apiserver-master01            1/1     Running   0              19h
kube-controller-manager-master01   1/1     Running   0              19h
kube-flannel-ds-6lbmw              1/1     Running   15 (56m ago)   110m
kube-flannel-ds-97mkh              1/1     Running   15 (56m ago)   110m
kube-flannel-ds-fthvm              1/1     Running   15 (56m ago)   110m
kube-proxy-4jj7b                   1/1     Running   0              55m
kube-proxy-ksltf                   1/1     Running   0              55m
kube-proxy-w8dcr                   1/1     Running   0              55m
kube-scheduler-master01            1/1     Running   0              19h

在服务器上安装ipvsadm,查看ipvs模式是否启用成功

[root@k8s-master01-15:~]# yum install ipvsadm -y
[root@k8s-master01-15:~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.96.0.1:443 rr
  -> 172.23.142.233:6443          Masq    1      3          0         
TCP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          0         
  -> 10.244.0.3:53                Masq    1      0          0         
TCP  10.96.0.10:9153 rr
  -> 10.244.0.2:9153              Masq    1      0          0         
  -> 10.244.0.3:9153              Masq    1      0          0         
UDP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          0         
  -> 10.244.0.3:53                Masq    1      0          0

8）解决pod的IP无法ping通的问题

集群安装完成后, 启动一个pod

# 启动pod, 命名为nginx-offi, 里面运行的容器为从官网拉取的Nginx镜像
[root@k8s-master01-15:~]# kubectl run nginx-offi --image=nginx
pod/nginx-offi created

# 查看pod的运行信息, 可以看到状态为 "Running" ,IP为 "10.244.1.7", 运行在了 "k8s-node01-16" 节点上
[root@k8s-master01-15:~]# kubectl get pod -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-offi      1/1     Running   0          55s   10.244.1.7   k8s-node01-16

但是如果在主节点k8s-master01-15 或另一个子节点 k8s-node02-17上访问刚才运行的pod, 却发现访问不到, 可以尝试 ping一下该IP地址：10.244.1.7也ping不通, 尽管前面我们已经安装好了flannel.

UP发现：是iptables 规则的问题, 前面我们在初始化服务器设置的时候清除了iptables的规则, 但主要原因是不是安装了 flannel 还是哪一步的问题, 会导致 iptables 里面又多出了规则

# 查看iptables
(root@k8s-master01-15:~)# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
KUBE-FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
# Warning: iptables-legacy tables present, use iptables-legacy to see them

需要再次清空 iptables 规则

iptables -F &&  iptables -X &&  iptables -F -t nat &&  iptables -X -t nat

再次查看iptables

(root@k8s-master01-15:~)# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
KUBE-FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
# Warning: iptables-legacy tables present, use iptables-legacy to see them

再次ping或者访问pod, 即可成功

(root@k8s-master01-15:~)# curl 10.244.1.7

Welcome to nginx!




If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.


For online documentation and support please refer to
nginx.org.
Commercial support is available at
nginx.com.


Thank you for using nginx.

posted @ 2023-05-04 17:42 珊瑚贝博客阅读(921) 评论(0) 编辑收藏举报

刷新页面返回顶部

部署Kubeadm遇到的哪些问题，并且如何解决

1）设置错误导致kubeadm安装k8s失败

2）kubeadm安装k8s1.20.1版本添加node节点时出错

3）部署Kubernetes遇到的问题与解决方法（初始化等）

4）子节点服务器上运行kubectl get node却发现报错了, 如下

5) kubectl get cs 问题

6) k8s的coredns组件的问题（部署flannel网络插件）

解决方案：

7）发现 kube-proxy出现异常，状态：CrashLoopBackOff

解决方案：

8）解决pod的IP无法ping通的问题

Welcome to nginx!

公告