Possible causes of kubernetes-dashboard showing CrashLoopBackOff or Error status after installation

After setting up a single-Master, 3-Node K8S cluster on CentOS 7, the kubernetes-dashboard pod kept showing CrashLoopBackOff or Error status (as seen in kubectl get pods -A):

kubernetes-dashboard   dashboard-metrics-scraper-6f669b9c9b-btkzj   1/1     Running            0             2m3s
kubernetes-dashboard   kubernetes-dashboard-758765f476-x8rc7        0/1     CrashLoopBackOff   2 (12s ago)

Check the Pod's logs:

[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-x8rc7
2023/01/09 07:53:16 Using namespace: kubernetes-dashboard
2023/01/09 07:53:16 Using in-cluster config to connect to apiserver
2023/01/09 07:53:16 Starting overwatch
2023/01/09 07:53:16 Using secret token for csrf signing
2023/01/09 07:53:16 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": 
dial tcp 10.96.0.1:443: connect: no route to host goroutine 1 [running]: github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004dfae8) /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...) /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66 github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc000096b80) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94 github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32 github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594 main.main() /home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf
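"no route to host" against 10.96.0.1 (the apiserver's ClusterIP) means the pod could not reach the API server over the service network at all. A quick way to test this independently of the dashboard is to probe that address from a throwaway pod — a minimal sketch, assuming the cluster can pull the busybox image:

# Probe the apiserver ClusterIP from inside the pod network.
# Any HTTP response (even 401/403) means TCP connectivity is fine;
# "no route to host" or a timeout points at the pod network itself.
kubectl run nettest --rm -it --image=busybox:1.35 --restart=Never -- \
  wget --no-check-certificate -S -O /dev/null https://10.96.0.1:443/version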

At first I suspected the relevant ports (e.g. 443) were not open in the firewall, so I opened them on every machine and used the following commands to remove kubernetes-dashboard completely:

sudo kubectl delete deployment kubernetes-dashboard --namespace=kubernetes-dashboard 
sudo kubectl delete service kubernetes-dashboard  --namespace=kubernetes-dashboard 
sudo kubectl delete service dashboard-metrics-scraper  --namespace=kubernetes-dashboard 
sudo kubectl delete role.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard 
sudo kubectl delete clusterrole.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete rolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete clusterrolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps dashboard-metrics-scraper --namespace=kubernetes-dashboard
sudo kubectl delete sa kubernetes-dashboard --namespace=kubernetes-dashboard 
sudo kubectl delete secret kubernetes-dashboard-certs --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-csrf --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-key-holder --namespace=kubernetes-dashboard
sudo kubectl delete configmap kubernetes-dashboard-settings --namespace=kubernetes-dashboard
sudo kubectl delete namespace kubernetes-dashboard
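If you still have the manifest you originally applied, a single delete against it is a less error-prone alternative to the per-resource list above (a sketch, assuming the file matches what was installed):

# Removes every object defined in the manifest, including the namespace,
# which in turn garbage-collects anything left inside it.
kubectl delete -f kubernetes-dashboard.yaml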

After reinstalling, the result was exactly the same as before:

[root@master1 ~]# kubectl apply -f kubernetes-dashboard.yaml 

So I simply stopped the firewall on all machines for the time being:

[root@master1 ~]# systemctl stop firewalld
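Note that systemctl stop only lasts until the next reboot; to keep the firewall out of the picture while testing, disable it as well (run on the master and every node):

# Stop firewalld now and keep it from starting on boot.
# Revert later with: systemctl enable --now firewalld
systemctl disable --now firewalld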

Then I removed kubernetes-dashboard cleanly again and reinstalled it. It still crashed, but the error changed slightly:

[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-g7lqs
2023/01/09 08:03:05 Starting overwatch
2023/01/09 08:03:05 Using namespace: kubernetes-dashboard
2023/01/09 08:03:05 Using in-cluster config to connect to apiserver
2023/01/09 08:03:05 Using secret token for csrf signing
2023/01/09 08:03:05 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": 
dial tcp 10.96.0.1:443: i/o timeout goroutine 1 [running]: github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc00049fae8) /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...) /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66 github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc0001c4100) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94 github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32 github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...) /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594 main.main() /home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf

At this point one thing was certain: this was not a firewall/port problem, because firewalld was now stopped on every machine. So the popular online advice — run iptables -L -n --line-numbers | grep dashboard and blame leftover iptables rules — doesn't hold up here either, even though the query did return matching REJECT entries:

[root@master1 ~]# iptables -L -n --line-numbers | grep dashboard
1    REJECT     tcp  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ ADDRTYPE match dst-type LOCAL tcp dpt:32646 reject-with icmp-port-unreachable
1    REJECT     tcp  --  0.0.0.0/0            10.103.89.4          /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ tcp dpt:443 reject-with icmp-port-unreachable
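Those REJECT entries are in fact added by kube-proxy whenever a Service has no ready endpoints, so they are a symptom of the crashing pod rather than an independent cause. This is easy to confirm:

# An empty ENDPOINTS column means kube-proxy has nothing to route to,
# which is exactly when it installs the "has no endpoints" REJECT rules.
kubectl -n kubernetes-dashboard get endpoints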

After a lot of digging, the real cause turned out to be the pod network CIDR chosen at kubeadm init time: the config file's podSubnet: 192.168.0.0/16 (equivalently, the --pod-network-cidr=192.168.0.0/16 flag) overlapped with the Master node's own LAN address (192.168.17.3). With the host network and the pod network sharing the same range, routing between them breaks, so when kubernetes-dashboard was installed the worker Nodes and the Master could not reach each other, and the installation failed.
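A quick way to catch this kind of overlap up front is to compare the node's own addresses with the planned pod CIDR — a sketch, assuming the kubeadm config lives at the path used later in this post:

# The node's own IPv4 addresses:
ip -4 addr show | grep inet                      # e.g. inet 192.168.17.3/24 ...
# The pod CIDR kubeadm will hand out:
grep podSubnet /etc/kubeadm/init.default.yaml    # e.g. podSubnet: 192.168.0.0/16
# 192.168.17.3 falls inside 192.168.0.0/16, so host and pod networks collide.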

The fix: reinstall the K8S cluster with a non-overlapping podSubnet.

The steps:

1. Remove kubernetes-dashboard completely (see the commands above)

2. Drain and delete the worker Nodes

[root@master1 ~]# kubectl get nodes

[root@master1 ~]# kubectl drain node1.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node2.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node3.localk8s --delete-emptydir-data --force --ignore-daemonsets

[root@master1 ~]# kubectl delete nodes node1.localk8s node2.localk8s node3.localk8s


3. Reset the K8S cluster

[root@master1 ~]# kubeadm reset
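kubeadm reset only cleans the machine it runs on. If the three Nodes are going to rejoin the rebuilt cluster, reset them too and clear the CNI/iptables state the old pod network left behind — a sketch of the usual cleanup:

# On node1/node2/node3:
kubeadm reset -f
rm -rf /etc/cni/net.d                  # stale CNI configs for the old podSubnet
iptables -F && iptables -t nat -F      # flush rules kube-proxy/CNI left behind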

4. Remove leftover files

[root@master1 ~]# rm -rf /etc/kubernetes
[root@master1 ~]# rm -rf /var/lib/etcd/
[root@master1 ~]# rm -rf $HOME/.kube

5. Change the podSubnet in the kubeadm init config file and save

For example, change podSubnet: 192.168.0.0/16 to podSubnet: 192.169.0.0/16. (Strictly speaking, 192.169.0.0/16 is public address space rather than an RFC 1918 private range; a conventional choice such as 10.244.0.0/16 avoids the overlap and stays private.)
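In a kubeadm config file the setting sits under networking — a minimal sketch using the kubeadm v1beta3 API (adjust to whatever apiVersion your file already uses; a real file carries more fields):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  serviceSubnet: 10.96.0.0/12    # matches the 10.96.0.1 apiserver ClusterIP seen in the logs
  podSubnet: 192.169.0.0/16      # must not overlap any host network (hosts here are 192.168.17.x)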

6. Re-initialize the K8S master node

[root@master1 ~]# kubeadm init --config=/etc/kubeadm/init.default.yaml --upload-certs
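After init succeeds, the follow-up is the standard sequence: point kubectl at the new cluster, install a CNI plugin configured for the new podSubnet, and rejoin the nodes (the exact kubeadm join command, with token and hash, is printed by kubeadm init):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install the CNI plugin (Calico, Flannel, ...). If it defaults to 192.168.0.0/16
# (Calico does), set its pod CIDR to the new podSubnet as well.
# Then on each node, run the `kubeadm join ...` command printed above.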

After that, repeat the rest of the cluster deployment (join the Nodes, install the network plugin), then reinstall kubernetes-dashboard.

Finally, verify the installation:

[root@master1 ~]# kubectl get svc,pods  -n kubernetes-dashboard
NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.96.254.169   <none>        8000/TCP        67s
service/kubernetes-dashboard        NodePort    10.96.146.223   <none>        443:31529/TCP   67s

NAME                                             READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-6f669b9c9b-lcdnv   1/1     Running   0          67s
pod/kubernetes-dashboard-758765f476-vfjx5        1/1     Running   0          67s

Now the dashboard can be reached through any Node's IP at the NodePort shown above, e.g. https://<node-ip>:31529 (if Chrome refuses to open the page, try Firefox; this mainly happens with older dashboard versions).

Command to fetch the token of the kubernetes-dashboard ServiceAccount:

kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep kubernetes-dashboard | awk '{print $1}')
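Note that from Kubernetes 1.24 onward, ServiceAccounts no longer get a long-lived token Secret created automatically, so the grep above comes back empty. On newer clusters, request a token directly:

kubectl -n kubernetes-dashboard create token kubernetes-dashboard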

If the dashboard's notification bar keeps reporting various "forbidden" errors, first make sure your kubernetes-dashboard version is compatible with your Kubernetes version (the compatibility matrix is at https://github.com/kubernetes/dashboard/releases). If the versions are compatible, grant the account permissions with the binding below (otherwise the console shows nothing, since the account has no view permission; the ServiceAccount and namespace in the command are the ones the token above belongs to):

kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:kubernetes-dashboard
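The binding can be verified without reloading the dashboard (cluster-admin is very broad — fine for a lab, worth narrowing for anything real):

# Should print "yes" once the ClusterRoleBinding is in place.
kubectl auth can-i list pods --all-namespaces \
  --as=system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard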


A final recommendation: pin the dashboard's access port in the kubernetes-dashboard YAML right from the start:

Add nodePort: 31529 to the Service to expose it on a fixed port (NodePorts must fall within the cluster's service-node-port-range, 30000-32767 by default):

spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31529

The following commands are handy for monitoring Pod status across the cluster:

---- live watch ----
watch -n 3 kubectl get pods -A
---- one-off listing (use -n <namespace> to narrow the scope) ----
kubectl get pods -A -o wide

Attached: the full kubernetes-dashboard v2.5.1 manifest (the NodePort settings in the first Service are the custom additions):

Kubernetes version   1.20   1.21   1.22   1.23
Compatibility        ?      ?      ?      ✓


# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31529
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.5.1
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.7
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
          - mountPath: /tmp
            name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}