博客园  :: 首页  :: 管理

笔者今天在对k8s,v1.23.6版本的的master节点使用如下命令进行初始化时

[root@k8s-master qq-5201351]# kubeadm init \
> --apiserver-advertise-address 192.18.106.87 \
> --image-repository registry.aliyuncs.com/google_containers \
> --kubernetes-version v1.23.6 \
> --service-cidr=10.96.0.0/12 \
> --pod-network-cidr=10.224.0.0/16

遇到如下报错:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

其核心报错,也就如下几条,最为主要的就是说,kubelet isn't running or healthy.

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifes ts". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248 /healthz": dial tcp [::1]:10248: connect: connection refused.

刚看到这个报错时,还是有一点懵的,但仔细一看,还好下面提出了一些排查方法,和可能的原因 ,其中有一点看起来很是有用

- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

这个与docker的一个配置有关系,即要求将cgroups disabled,于是笔者尝试如下解决方法

到/etc/docker/daemon.json配置文件中,至少需要添加上如下一段内容(如果有其他配置选项也可以添加到花括号之中)

{
"exec-opts":["native.cgroupdriver=systemd"]
}

说明:docker默认使用的Cgroup Driver是cgroupfs,我们上面是将其修改成systemd,这些通过docker info可以看出

然后我们重启docker让配置生效,因为这才刚开始从master节点搭建,于是笔者再使用kubeadm reset重置

最后我们再进行kubadm初始化master节点时,就不会有报错了~

 

 

 

尊重别人的劳动成果 转载请务必注明出处:https://www.cnblogs.com/5201351/p/17378926.html