Handling pods that fail to start in a k8s (v1.20.4) cluster
Check the pod status
Check the logs of the abnormal pods
[root@node01 ~]# kubectl logs -f pods/ks-installer-5d65c99d54-8trlk -n kubesphere-system
2022-09-15T15:08:25+08:00 INFO : shell-operator latest
2022-09-15T15:08:25+08:00 INFO : HTTP SERVER Listening on 0.0.0.0:9115
2022-09-15T15:08:25+08:00 INFO : Use temporary dir: /tmp/shell-operator
2022-09-15T15:08:25+08:00 INFO : Initialize hooks manager ...
2022-09-15T15:08:25+08:00 INFO : Search and load hooks ...
2022-09-15T15:08:25+08:00 INFO : Load hook config from '/hooks/kubesphere/installRunner.py'
2022-09-15T15:08:26+08:00 INFO : Load hook config from '/hooks/kubesphere/schedule.sh'
2022-09-15T15:08:26+08:00 INFO : Initializing schedule manager ...
2022-09-15T15:08:26+08:00 INFO : KUBE Init Kubernetes client
2022-09-15T15:08:26+08:00 INFO : KUBE-INIT Kubernetes client is configured successfully
2022-09-15T15:08:26+08:00 INFO : MAIN: run main loop
2022-09-15T15:08:26+08:00 INFO : MAIN: add onStartup tasks
2022-09-15T15:08:26+08:00 INFO : Running schedule manager ...
2022-09-15T15:08:26+08:00 INFO : QUEUE add all HookRun@OnStartup
2022-09-15T15:08:26+08:00 INFO : MSTOR Create new metric shell_operator_live_ticks
2022-09-15T15:08:26+08:00 INFO : MSTOR Create new metric shell_operator_tasks_queue_length
2022-09-15T15:08:26+08:00 ERROR : error getting GVR for kind 'ClusterConfiguration': unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2022-09-15T15:08:26+08:00 ERROR : Enable kube events for hooks error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2022-09-15T15:08:29+08:00 INFO : TASK_RUN Exit: program halts.
[root@node01 ~]# kubectl logs -f pods/ks-apiserver-5885f8687d-cwlcs -n kubesphere-system
W0915 15:11:21.193117 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W0915 15:11:21.196895 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0915 15:11:21.293454 1 interface.go:60] start helm repo informer
I0915 15:11:21.374315 1 apiserver.go:372] Start cache objects
Error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
2022/09/15 15:11:21 unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Check the k8s apiservice resources
[root@node01 ~]# kubectl get apiservice
NAME SERVICE AVAILABLE AGE
v1. Local True 413d
v1.admissionregistration.k8s.io Local True 413d
v1.apiextensions.k8s.io Local True 413d
v1.apps Local True 413d
v1.authentication.k8s.io Local True 413d
v1.authorization.k8s.io Local True 413d
v1.autoscaling Local True 413d
v1.batch Local True 413d
v1.certificates.k8s.io Local True 413d
v1.coordination.k8s.io Local True 413d
v1.crd.projectcalico.org Local True 47h
v1.events.k8s.io Local True 413d
v1.monitoring.coreos.com Local True 47h
v1.networking.k8s.io Local True 413d
v1.node.k8s.io Local True 413d
v1.rbac.authorization.k8s.io Local True 413d
v1.scheduling.k8s.io Local True 413d
v1.storage.k8s.io Local True 413d
v1.velero.io Local True 47h
v1alpha1.application.kubesphere.io Local True 47h
v1alpha1.cluster.kubesphere.io Local True 47h
v1alpha1.devops.kubesphere.io Local True 47h
v1alpha1.installer.kubesphere.io Local True 47h
v1alpha1.monitoring.kubesphere.io Local True 47h
v1alpha1.network.kubesphere.io Local True 47h
v1alpha1.storage.kubesphere.io Local True 47h
v1alpha1.tenant.kubesphere.io Local True 47h
v1alpha2.iam.kubesphere.io Local True 47h
v1alpha2.quota.kubesphere.io Local True 47h
v1alpha2.servicemesh.kubesphere.io Local True 47h
v1alpha2.tenant.kubesphere.io Local True 47h
v1alpha3.devops.kubesphere.io Local True 47h
v1beta1.admissionregistration.k8s.io Local True 413d
v1beta1.apiextensions.k8s.io Local True 413d
v1beta1.app.k8s.io Local True 47h
v1beta1.authentication.k8s.io Local True 413d
v1beta1.authorization.k8s.io Local True 413d
v1beta1.batch Local True 413d
v1beta1.certificates.k8s.io Local True 413d
v1beta1.coordination.k8s.io Local True 413d
v1beta1.discovery.k8s.io Local True 413d
v1beta1.events.k8s.io Local True 413d
v1beta1.extensions Local True 413d
v1beta1.flowcontrol.apiserver.k8s.io Local True 413d
v1beta1.metrics.k8s.io kube-system/metrics-server False (MissingEndpoints) 339d
v1beta1.networking.k8s.io Local True 413d
v1beta1.node.k8s.io Local True 413d
v1beta1.policy Local True 413d
v1beta1.rbac.authorization.k8s.io Local True 413d
v1beta1.scheduling.k8s.io Local True 413d
v1beta1.snapshot.storage.k8s.io Local True 47h
v1beta1.storage.k8s.io Local True 413d
v2beta1.autoscaling Local True 413d
v2beta1.notification.kubesphere.io Local True 47h
v2beta2.autoscaling Local True 413d
[root@node01 ~]# kubectl api-resources | grep apiservice
apiservices apiregistration.k8s.io/v1 false APIService
error: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
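One entry in the list is not available: v1beta1.metrics.k8s.io (service kube-system/metrics-server) shows AVAILABLE as False with the reason MissingEndpoints. This is the same metrics.k8s.io/v1beta1 API group named in the pod errors above.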
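With this many rows the single False entry is easy to miss. A minimal sketch of a filter over the table, assuming the column layout shown above (NAME, SERVICE, AVAILABLE, AGE); the function name unavailable_apiservices is a hypothetical helper, not part of kubectl:

```shell
# Print the names of APIServices whose AVAILABLE column is not "True".
# Reads the table printed by `kubectl get apiservice` on stdin.
unavailable_apiservices() {
    # NR > 1 skips the header row. Column 3 is AVAILABLE; for a failing
    # entry it is "False", followed by the reason in parentheses.
    awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get apiservice | unavailable_apiservices
```

Run against the output above, this would print only v1beta1.metrics.k8s.io.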
Fix
[root@node01 ~]# kubectl delete apiservice v1beta1.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.metrics.k8s.io" deleted
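After the deletion, v1beta1.metrics.k8s.io no longer appears in the apiservice list, and after a short while the pods returned to normal. This fits the errors above: API discovery was failing on the metrics.k8s.io/v1beta1 registration, whose backing service had no endpoints, so removing the stale registration let discovery complete.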
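Deleting the registration also removes the metrics.k8s.io API, which commands such as kubectl top rely on, so it may be worth checking why the endpoints were missing before deleting. A rough sketch of that check: the Service name kube-system/metrics-server comes from the apiservice output above, while the Deployment name metrics-server is an assumption, and the KUBECTL variable and the function name are hypothetical helpers added for illustration.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo for a dry run);
# it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

inspect_metrics_apiservice() {
    # Full registration, including the status conditions that explain
    # why Available is False.
    $KUBECTL get apiservice v1beta1.metrics.k8s.io -o yaml
    # The APIService points at this Service; an empty ENDPOINTS column
    # corresponds to the MissingEndpoints reason.
    $KUBECTL get endpoints metrics-server -n kube-system
    # Assumption: metrics-server was installed as a Deployment with this name.
    $KUBECTL get deployment metrics-server -n kube-system
}

# Usage against a live cluster (requires kubectl access):
#   inspect_metrics_apiservice
```

If the Deployment is missing or has no ready pods, reinstalling or repairing metrics-server is an alternative to deleting the registration.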
References
https://blog.csdn.net/weixin_40449300/article/details/108348420
https://www.cnblogs.com/jackluo/p/12222335.html