Possible causes of kubernetes-dashboard Pods stuck in CrashLoopBackOff or Error after installation
After setting up a single-Master, three-Node K8s cluster on CentOS 7, the kubernetes-dashboard Pod went straight into CrashLoopBackOff (or Error) after installation:
kubernetes-dashboard   dashboard-metrics-scraper-6f669b9c9b-btkzj   1/1   Running            0            2m3s
kubernetes-dashboard   kubernetes-dashboard-758765f476-x8rc7        0/1   CrashLoopBackOff   2 (12s ago)
Checking the Pod's logs:
[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-x8rc7
2023/01/09 07:53:16 Using namespace: kubernetes-dashboard
2023/01/09 07:53:16 Using in-cluster config to connect to apiserver
2023/01/09 07:53:16 Starting overwatch
2023/01/09 07:53:16 Using secret token for csrf signing
2023/01/09 07:53:16 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": dial tcp 10.96.0.1:443: connect: no route to host

goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004dfae8)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc000096b80)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594
main.main()
	/home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf
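The panic itself is informative: the container cannot reach 10.96.0.1, the ClusterIP of the built-in kubernetes Service that fronts the apiserver. As a quick sanity check you can reproduce the failure from a throwaway Pod; this is a sketch assuming the curlimages/curl image can be pulled in your cluster:

# Try to hit the apiserver Service IP from inside the cluster network;
# /version is readable without authentication on a default kubeadm cluster.
kubectl run nettest --rm -it --restart=Never --image=curlimages/curl -- \
    curl -k -m 5 https://10.96.0.1:443/version
# Healthy cluster: a JSON version blob. Broken Pod network: the same
# "no route to host" / "i/o timeout" seen in the dashboard logs above.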
At first I suspected the relevant ports (e.g. 443) were simply not open in the firewall, so I opened them on all machines and completely uninstalled kubernetes-dashboard with the following commands:
sudo kubectl delete deployment kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete service kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete service dashboard-metrics-scraper --namespace=kubernetes-dashboard
sudo kubectl delete role.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete clusterrole.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete rolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete clusterrolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps dashboard-metrics-scraper --namespace=kubernetes-dashboard
sudo kubectl delete sa kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-certs --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-csrf --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-key-holder --namespace=kubernetes-dashboard
sudo kubectl delete namespace kubernetes-dashboard
sudo kubectl delete configmap kubernetes-dashboard-settings
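If you still have the manifest you originally applied, a shorter route to the same clean slate is to delete by file; removing by manifest deletes every object the file created, including the namespace:

kubectl delete -f kubernetes-dashboard.yaml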
After reinstalling, it was the same old story:
[root@master1 ~]# kubectl apply -f kubernetes-dashboard.yaml
So I went ahead and temporarily stopped the firewall on every machine:
[root@master1 ~]# systemctl stop firewalld
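If you want it to stay off across reboots while experimenting (assuming a throwaway lab cluster where that is acceptable), disable the unit as well:

# Stop firewalld immediately and keep it from starting at boot:
systemctl disable --now firewalld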
Then I uninstalled kubernetes-dashboard cleanly once more and reinstalled it. It still crashed, but the error changed slightly:
[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-g7lqs
2023/01/09 08:03:05 Starting overwatch
2023/01/09 08:03:05 Using namespace: kubernetes-dashboard
2023/01/09 08:03:05 Using in-cluster config to connect to apiserver
2023/01/09 08:03:05 Using secret token for csrf signing
2023/01/09 08:03:05 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": dial tcp 10.96.0.1:443: i/o timeout

goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc00049fae8)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc0001c4100)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
	/home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594
main.main()
	/home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf
At this point one thing was certain: this was not a firewall or port problem, because the firewall was now stopped on every machine. So the diagnosis that circulates online — run iptables -L -n --line-numbers | grep dashboard and blame the iptables rules — did not hold here either, even though I could indeed see matching REJECT entries. (kube-proxy adds these "has no endpoints" rules automatically for any Service whose Pods are not ready, so they are a symptom of the crashing Pod, not the cause.)
[root@master1 ~]# iptables -L -n --line-numbers | grep dashboard
1    REJECT     tcp  --  0.0.0.0/0    0.0.0.0/0      /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ ADDRTYPE match dst-type LOCAL tcp dpt:32646 reject-with icmp-port-unreachable
1    REJECT     tcp  --  0.0.0.0/0    10.103.89.4    /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ tcp dpt:443 reject-with icmp-port-unreachable
After a lot of digging, the real cause turned out to date back to the initial kubeadm init of the Master: the Pod network CIDR in the init config, podSubnet: 192.168.0.0/16 (or, equivalently, the --pod-network-cidr=192.168.0.0/16 flag), overlapped with the subnet of the Master node's own internal IP (192.168.17.3). The overlap broke Pod network address allocation, so when kubernetes-dashboard was deployed, the worker Nodes and the Master could not reach each other over the Pod network and the installation failed.
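Two quick checks make the overlap visible. A sketch, using the addresses from this cluster; --cluster-cidr is the flag kubeadm passes through to kube-controller-manager:

# The Master's own address sits inside 192.168.0.0/16:
ip -4 addr show | grep 'inet 192.168.'
#   -> inet 192.168.17.3/24 ...
# The Pod CIDR the cluster was initialized with:
kubectl cluster-info dump | grep -m 1 -o -- '--cluster-cidr=[^"]*'
#   -> --cluster-cidr=192.168.0.0/16
# The host subnet 192.168.17.0/24 is a slice of the Pod CIDR, so routes for
# Pod traffic and routes for node traffic collide.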
The fix: reinstall the K8s cluster.
The steps:
1. Completely uninstall kubernetes-dashboard (using the commands shown earlier).
2. Drain and delete the worker Nodes:
[root@master1 ~]# kubectl get nodes
[root@master1 ~]# kubectl drain node1.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node2.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node3.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl delete nodes node1.localk8s node2.localk8s node3.localk8s
3. Reset the K8s cluster (run kubeadm reset on each worker Node as well if they are going to rejoin later):
[root@master1 ~]# kubeadm reset
4. Remove the leftover files:
[root@master1 ~]# rm -rf /etc/kubernetes
[root@master1 ~]# rm -rf /var/lib/etcd/
[root@master1 ~]# rm -rf $HOME/.kube
5. Change podSubnet in the kubeadm init configuration file and save it.
For example, change podSubnet: 192.168.0.0/16 to podSubnet: 192.169.0.0/16 — anything that no longer overlaps the hosts works. (Note that 192.169.0.0/16 is public address space; a private range such as 10.244.0.0/16 is an even safer choice.)
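For orientation, this is roughly where the change lives. The v1beta3 apiVersion below is an assumption (keep whatever your file declares), and serviceSubnet is shown only because it explains the 10.96.0.1 address in the panic:

vi /etc/kubeadm/init.default.yaml
#   apiVersion: kubeadm.k8s.io/v1beta3
#   kind: ClusterConfiguration
#   networking:
#     serviceSubnet: 10.96.0.0/12   # the 10.96.0.1 apiserver VIP lives here
#     podSubnet: 192.169.0.0/16     # was 192.168.0.0/16, which swallowed the 192.168.17.x hosts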
6. Re-initialize the K8s Master node:
[root@master1 ~]# kubeadm init --config=/etc/kubeadm/init.default.yaml --upload-certs
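Because step 4 removed $HOME/.kube, kubectl needs its config set up again; these are the standard follow-up commands kubeadm init prints on success:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config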
After that, just repeat the rest of the cluster deployment — rejoin each worker Node with the kubeadm join command printed by init, reinstall the CNI network plugin, and so on — then install kubernetes-dashboard again.
Finally, check the result:
[root@master1 ~]# kubectl get svc,pods -n kubernetes-dashboard
NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.96.254.169   <none>        8000/TCP        67s
service/kubernetes-dashboard        NodePort    10.96.146.223   <none>        443:31529/TCP   67s

NAME                                             READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-6f669b9c9b-lcdnv   1/1     Running   0          67s
pod/kubernetes-dashboard-758765f476-vfjx5        1/1     Running   0          67s
The dashboard is now reachable through any Node's IP at the NodePort shown above, e.g. https://<node-ip>:31529. (If Chrome refuses to open the page, try Firefox; this mainly happens with older dashboard versions.)
To get the token for the kubernetes-dashboard ServiceAccount:
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep kubernetes-dashboard | awk '{print $1}')
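Note that on Kubernetes 1.24 and later, ServiceAccount token Secrets are no longer created automatically, so the grep above comes back empty; request a short-lived token directly instead:

kubectl -n kubernetes-dashboard create token kubernetes-dashboard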
If the dashboard's notification panel is full of "forbidden" errors, first make sure your kubernetes-dashboard release is compatible with your Kubernetes version — the compatibility matrix is published at https://github.com/kubernetes/dashboard/releases. If the versions match, grant the account permissions with the command below (otherwise the console shows nothing, since the account has no read permissions; the kubernetes-dashboard:kubernetes-dashboard argument is the namespace and ServiceAccount the token above belongs to):
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:kubernetes-dashboard
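You can confirm the binding works without logging in again by using impersonation (and note that cluster-admin grants full control of the cluster — fine for a lab, too broad for production):

# Should print "yes" once the ClusterRoleBinding is in place:
kubectl auth can-i list pods --all-namespaces \
    --as=system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard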
One last suggestion: pin the dashboard's access port in the kubernetes-dashboard YAML from the very beginning.
Add a nodePort: 31529 to the Service to expose it on a fixed port:
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31529
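After applying the change you can confirm the Service kept the pinned port; keep in mind that nodePort values must fall inside the cluster's NodePort range (30000-32767 by default):

kubectl -n kubernetes-dashboard get svc kubernetes-dashboard
# The PORT(S) column should show 443:31529/TCP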
The following commands are useful for keeping an eye on Pod status across the whole cluster:
---- live monitoring ----
watch -n 3 kubectl get pods -A
---- one-shot view (add -n <namespace> to narrow it down) ----
kubectl get pods -A -o wide
Attached below is the full kubernetes-dashboard v2.5.1 manifest; the type: NodePort and nodePort: 31529 lines in the first Service are the custom additions. Its compatibility matrix:
Kubernetes version | 1.20 | 1.21 | 1.22 | 1.23 |
---|---|---|---|---|
Compatibility | ? | ? | ? | ✓ |
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort           # custom addition: expose the dashboard on a fixed port
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31529      # custom addition
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.5.1
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.7
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}
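To use it, save the manifest above as kubernetes-dashboard.yaml (the file name used throughout this post) and apply it:

kubectl apply -f kubernetes-dashboard.yaml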