Deploying a highly available k8s v1.28.2 cluster with kubeadm on openEuler 22.03 LTS SP4

Small talk

When this article was written

Beijing time: September 2024

Why openEuler

Why 22.03 LTS SP4

Because 22.03 LTS SP4 was the latest release as of June 2024, and its lifecycle runs through 2026

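High-availability architecture

  • On public cloud servers you can simply buy the provider's LB service: simple, blunt, and someone else is accountable for it
  • For an on-premises private deployment, I use keepalived + nginx (stream, layer-4 load balancing) to make the apiserver highly available
    • In this walkthrough nginx and keepalived run as containers, mainly so that the deployment method does not vary with differences between environments
    • The rough diagram below illustrates the HA setup
      • keepalived is deployed in BACKUP mode
      • keepalived on the machine holding the VIP health-checks the nginx on that node, and nginx forwards traffic on its listening port to the apiserver instances behind it
        • nginx's stream module (layer 4) is used to save machine resources; proxying through the http module is layer-7 load balancing, which uses more resources by comparison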

[Figure: HA architecture diagram]

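A side note

I originally wanted to run nginx and keepalived as static Pods, but then found that the spec of a static Pod cannot refer to other API objects (ConfigMap, Secret, ServiceAccount, and so on), so I had to drop the idea. For details, see "Create static Pods" in the Kubernetes documentation

  • The deployment below is really only suitable for test environments. For production I do not recommend putting the HA components inside the same k8s cluster they front; deploy them independently outside the cluster, and consider an external etcd as well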

Getting to work

Environment

Component    Version
OS           openEuler 22.03 (LTS-SP4)
containerd   1.6.33
k8s          1.28.2-0
nerdctl      1.7.6
nginx        1.26.0
keepalived   2.3.1

Machine IPs and their services

IP               HOSTNAME                 SERVICE/ROLE
192.168.22.111   manager-k8s-cluster-01   k8s-master+k8s-worker+keepalived+nginx
192.168.22.112   manager-k8s-cluster-02   k8s-master+k8s-worker+keepalived+nginx
192.168.22.113   manager-k8s-cluster-03   k8s-master+k8s-worker+keepalived+nginx
192.168.22.114   manager-k8s-cluster-04   k8s-worker
192.168.22.115   manager-k8s-cluster-05   k8s-worker
192.168.22.200   /                        VIP

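System initialization

  • If the virtual machines are not ready yet, start one machine first, run the initialization on it, and then clone it; that is quicker and easier
  • If the machines are already up, every initialization step below has to be run on each machine
  • The steps below leave out static IP configuration and time synchronization; please set those up yourself
Disable the firewall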
systemctl disable firewalld --now
Disable SELinux
setenforce 0
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/selinux/config
Turn off swap
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
Load kernel modules
# needed when kube-proxy runs in ipvs mode
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
# generally required
modprobe nf_conntrack
modprobe br_netfilter
modprobe overlay
Load the modules automatically at boot
cat > /etc/modules-load.d/k8s-modules.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
overlay
EOF

Enable the service at boot

systemctl enable systemd-modules-load --now
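
After a reboot (or on a cloned machine) it can be useful to confirm that everything listed in the modules file is actually loaded. The following is a minimal sketch; the function name missing_modules is made up for illustration, and it assumes the modules are built as loadable modules (a module compiled into the kernel does not appear in /proc/modules and would be reported as missing).

```shell
#!/usr/bin/env bash
# Print every module named in a modules-load.d style file that does not
# appear in a /proc/modules style listing. Both paths are parameters so
# the function can also be tried against sample files.
missing_modules() {
  local conf=$1 loaded=${2:-/proc/modules}
  local mod
  while read -r mod; do
    # skip blank lines and comments
    [[ -z $mod || $mod == \#* ]] && continue
    # /proc/modules starts each line with "<name> <size> ..."
    grep -q "^${mod} " "$loaded" || echo "$mod"
  done < "$conf"
}

# Usage on a node (prints nothing when everything is loaded):
#   missing_modules /etc/modules-load.d/k8s-modules.conf
```
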
Tune sysctl kernel parameters
cat <<EOF > /etc/sysctl.d/kubernetes.conf
# enable packet forwarding (needed for vxlan)
net.ipv4.ip_forward=1
# let iptables process bridged traffic
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-arptables=1
# do not reuse TIME-WAIT sockets for new TCP connections
net.ipv4.tcp_tw_reuse=0
# upper limit of the listen() backlog for sockets
net.core.somaxconn=32768
# maximum number of tracked connections, default is nf_conntrack_buckets * 4
net.netfilter.nf_conntrack_max=1000000
# avoid swap as far as possible; only swap to keep the system from running out of memory
vm.swappiness=0
# maximum number of memory map areas a process may have
vm.max_map_count=655360
# maximum number of file handles the kernel will allocate
fs.file-max=6553600
# TCP keepalive settings
net.ipv4.tcp_keepalive_time=600
net.ipv4.tcp_keepalive_intvl=30
net.ipv4.tcp_keepalive_probes=10
EOF

Apply immediately

sysctl -p /etc/sysctl.d/kubernetes.conf
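
To double-check that the running kernel really picked up the values, the file can be compared against the live settings. This is only a sketch: sysctl_pairs and verify_sysctl are hypothetical helper names, and the parser assumes simple key=value lines with # comments, as in the file above.

```shell
#!/usr/bin/env bash
# Print "key value" for every setting in a sysctl.d style file,
# ignoring comments and blank lines.
sysctl_pairs() {
  sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' "$1" |
    awk -F= '{ gsub(/[[:space:]]/, "", $1); gsub(/[[:space:]]/, "", $2); print $1, $2 }'
}

# Compare the file against the running kernel and report mismatches.
# Returns non-zero when at least one value differs or cannot be read.
verify_sysctl() {
  local key want got rc=0
  while read -r key want; do
    got=$(sysctl -n "$key" 2>/dev/null) || got="<unavailable>"
    if [[ $got != "$want" ]]; then
      echo "MISMATCH ${key}: want=${want} got=${got}"
      rc=1
    fi
  done < <(sysctl_pairs "$1")
  return "$rc"
}

# Usage on a node:
#   verify_sysctl /etc/sysctl.d/kubernetes.conf && echo "all sysctl values applied"
```

The net.bridge.* keys only exist once br_netfilter is loaded, so a "<unavailable>" result for them usually points back at the module step.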
Flush the iptables rules
iptables -F && \
iptables -X && \
iptables -F -t nat && \
iptables -X -t nat && \
iptables -P FORWARD ACCEPT
Install dependencies and tools
yum install -y vim wget tar net-tools jq bash-completion tree bind-utils telnet unzip nc
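Modify the .bashrc file

See my earlier blog post, "Notes on scp failures on openEuler 22.03-LTS-SP4". It mainly affects the scp command; whether to apply it is up to you

Install kubeadm and kubelet

The Kubernetes project does not provide a repo for openEuler either, but the kubernetes-el7 repo can be used directly. The kubernetes-el7 repo is configured as follows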

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Installing kubeadm automatically pulls in kubelet, kubectl, and a few other dependencies

yum install -y kubeadm-1.28.2-0

Verify the version

kubeadm version

Output like the following means everything is fine

kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
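Shorten the kubectl command

Sometimes I really cannot be bothered to type kubectl and just want to type k

ln -s /usr/bin/kubectl /usr/bin/k
Start kubelet

Enable it at boot (until kubeadm init or kubeadm join gives it a configuration, kubelet keeps restarting; that is expected)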

systemctl enable kubelet --now
Install containerd

openEuler can install from the rpm packages in Docker's CentOS repo, which is quite convenient

cat <<EOF> /etc/yum.repos.d/docker.repo
[docker-ce-centos]
name=Docker CE Stable centos
baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/7.9/x86_64/stable
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-ce/linux/centos/gpg
EOF

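Install containerd

yum install -y containerd.io-1.6.33

Generate the default configuration file

containerd config default > /etc/containerd/config.toml

Adjust the other settings to suit your environment. Two settings deserve attention: the first is optional and mainly matters inside China, the second has to be changed

  • sandbox_image specifies the pause image. The default is registry.k8s.io/pause:3.6; you can prepare that image in advance and retag it to match, or do what I did and switch to the Aliyun mirror
  • SystemdCgroup = false has to be changed to true, because kubelet is configured below to use the systemd cgroup driver as well. The exported default is false, and leaving it unchanged produces an error like this:
    • openat2 /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf2248c8a5ab6855d0410a9f38c37b4a0.slice/cpuset.mems: no such file or directory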
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9"
SystemdCgroup = true
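
If the same change has to be made on several machines, the two settings can be patched with sed instead of by hand. A minimal sketch, assuming the file was produced by containerd config default so that each key appears once as "key = value"; the function name patch_containerd_config is made up for illustration.

```shell
#!/usr/bin/env bash
# Apply the two edits to a containerd config file in place.
#   $1: path to config.toml
#   $2: pause image to use (optional, defaults to the Aliyun mirror used here)
patch_containerd_config() {
  local cfg=$1
  local pause=${2:-registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9}
  sed -i \
    -e "s#^\([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*\).*#\1\"${pause}\"#" \
    -e 's#^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\).*#\1true#' \
    "$cfg"
}

# Usage on a node:
#   patch_containerd_config /etc/containerd/config.toml
#   grep -E 'sandbox_image|SystemdCgroup' /etc/containerd/config.toml
```
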

Start containerd and enable it at boot

systemctl enable containerd --now

Configure crictl (crictl is installed as a dependency of kubeadm; it needs a configuration file that points it at the containerd socket so that it can manage containerd)

The default crictl configuration file is /etc/crictl.yaml. A custom file also works; pass it with --config when running crictl

echo 'runtime-endpoint: unix:///run/containerd/containerd.sock' > /etc/crictl.yaml

Check the crictl and containerd versions

crictl version

If the version information below is shown, deployment and startup both went fine

Version:  0.1.0
RuntimeName:  containerd
RuntimeVersion:  1.6.33
RuntimeApiVersion:  v1
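Prepare images

kubeadm needs images for the deployment. In an intranet environment, prepare the images in advance and import them. The command below lists the images that are required

  • image-repository is the repository that the kubeadm configuration file specifies later; inside China the Aliyun one below works
  • kubernetes-version specifies the k8s version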
kubeadm config images list \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version 1.28.2

Normally it prints the following

registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.2
registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1

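If the environment has Internet access but the connection is poor, you can also pull the images ahead of time with the command below, so that the init phase does not fail on a timeout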

kubeadm config images pull \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--kubernetes-version 1.28.2

Pulling prints output like the following; once coredns appears, all images have been pulled

[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.28.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.9-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1
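
For the intranet case mentioned earlier, one way to carry the images over is to export them on a connected machine and import them on the target nodes. The sketch below is an assumption about how that could be scripted, not part of the original procedure: image_tar_name and export_images are hypothetical helpers, and DRY_RUN=1 only prints the ctr commands.

```shell
#!/usr/bin/env bash
# Turn an image reference into a flat file name, e.g.
#   repo.example/ns/pause:3.9 -> repo.example_ns_pause_3.9.tar
image_tar_name() {
  printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Read image references on stdin and export each one from the k8s.io
# namespace into the directory given as $1.
# With DRY_RUN=1 the commands are only printed, not executed.
export_images() {
  local out_dir=$1 image
  while read -r image; do
    [[ -z $image ]] && continue
    if [[ ${DRY_RUN:-0} == 1 ]]; then
      echo "ctr -n k8s.io images export ${out_dir}/$(image_tar_name "$image") ${image}"
    else
      ctr -n k8s.io images export "${out_dir}/$(image_tar_name "$image")" "$image"
    fi
  done
}

# Usage on the connected machine (after the pull above):
#   mkdir -p /tmp/k8s-images
#   kubeadm config images list \
#     --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
#     --kubernetes-version 1.28.2 | export_images /tmp/k8s-images
# On each offline node, after copying the directory over:
#   for f in /tmp/k8s-images/*.tar; do ctr -n k8s.io images import "$f"; done
```
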

The calico images can be prepared ahead of time as well

ctr -n k8s.io image pull docker.io/calico/cni:v3.28.1
ctr -n k8s.io image pull docker.io/calico/node:v3.28.1
ctr -n k8s.io image pull docker.io/calico/kube-controllers:v3.28.1

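That completes the initialization steps

Deploy the master components

Cluster initialization

Prepare the configuration file for initialization (saved here as kubeadm.yaml, the name used by the init command further down). Reference for the fields is available from the official documentation: Configuration APIs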

# cluster-level configuration
## https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # IP address and port of the apiserver
  advertiseAddress: 192.168.22.111
  bindPort: 6443
nodeRegistration:
  # container runtime socket
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  # node name in k8s, i.e. the name shown by kubectl get nodes
  ## when not set, the machine's hostname is normally used
  ## a matter of personal preference
  name: 192.168.22.111
  # node taints; configure to suit your situation
  taints: null
---
apiServer:
  # the HA address (VIP) is an extra address
  ## it has to be added to the certificate's IP list at init time
  certSANs:
  - 192.168.22.200
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
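# directory for the k8s certificates
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
# address used to reach the apiserver; use the current node's IP for now (it is switched to the VIP later)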
controlPlaneEndpoint: 192.168.22.111:6443
controllerManager: {}
dns: {}
etcd:
  local:
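    # etcd data directory; put it on an SSD if possible, etcd is sensitive to disk I/O
    dataDir: /var/lib/etcd
# image repository; the official default is registry.k8s.io, inside China the Aliyun mirror can be used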
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
  # DNS domain of the cluster
  dnsDomain: cluster.local
  # service subnet
  serviceSubnet: 10.96.0.0/12
  # pod subnet
  ## the class A/B/C address ranges are summarized at the end of the article
  podSubnet: 172.22.0.0/16
scheduler: {}

# kubelet configuration
## https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# use systemd as the cgroup driver
cgroupDriver: systemd
cgroupsPerQOS: true
# container log rotation
## size at which a container log is rotated, default 10Mi
containerLogMaxSize: 100Mi
## maximum number of log files kept per container, default 5
containerLogMaxFiles: 5

# kube-proxy configuration
## https://kubernetes.io/docs/reference/config-api/kube-proxy-config.v1alpha1/
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# proxy mode; only iptables and ipvs are available here (Windows uses kernelspace)
mode: iptables

Initialize the cluster from the configuration file

kubeadm init --config kubeadm.yaml

Output like the following means the cluster has been initialized

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
        --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25
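Install the calico network plugin

The official yaml is on GitHub; clone the repository locally to get the yaml file. A few things in it need changing

  • Uncomment CALICO_IPV4POOL_CIDR and set its value to the podSubnet from the kubeadm init file
  • Add IP_AUTODETECTION_METHOD to pin the network interface; on a host with several NICs, hard-to-diagnose problems can otherwise appear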
            - name: CALICO_IPV4POOL_CIDR
              value: "172.22.0.0/16"
            - name: IP_AUTODETECTION_METHOD
              value: "interface=ens3"
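
The two manifest edits can also be scripted. The sketch below assumes the manifest still contains the commented pair "# - name: CALICO_IPV4POOL_CIDR" followed directly by "#   value: ..." (check your copy first), and the function name patch_calico_manifest is made up for illustration.

```shell
#!/usr/bin/env bash
# Uncomment CALICO_IPV4POOL_CIDR, set it to the pod subnet, and add
# IP_AUTODETECTION_METHOD right after it with the same indentation.
#   $1: manifest path   $2: pod CIDR   $3: network interface name
patch_calico_manifest() {
  local manifest=$1 cidr=$2 iface=$3 tmp
  tmp=$(mktemp)
  awk -v cidr="$cidr" -v iface="$iface" '
    /# - name: CALICO_IPV4POOL_CIDR/ {
      sub(/# /, "")                      # uncomment the name line
      print
      indent = $0
      sub(/- name:.*/, "", indent)       # keep only the leading spaces
      getline                            # drop the commented value line
      print indent "  value: \"" cidr "\""
      print indent "- name: IP_AUTODETECTION_METHOD"
      print indent "  value: \"interface=" iface "\""
      next
    }
    { print }
  ' "$manifest" > "$tmp" && cat "$tmp" > "$manifest"
  rm -f "$tmp"
}

# Usage, with the values from this article:
#   patch_calico_manifest calico.yaml 172.22.0.0/16 ens3
```
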

After applying the yaml, check that the pods all start normally

kubectl get pod -n kube-system

It is fine once the calico pods are all Running

NAME                                      READY   STATUS    RESTARTS        AGE
calico-kube-controllers-97d84d657-bjlkx   1/1     Running   0               29s
calico-node-gppdv                         1/1     Running   0               29s

Verify that the cluster is healthy

kubectl get nodes

All nodes should be Ready

NAME             STATUS   ROLES           AGE     VERSION
192.168.22.111   Ready    control-plane   9m29s   v1.28.2
Join the other master nodes to the cluster

Run this on each of the remaining master nodes

mkdir -p /etc/kubernetes/pki/etcd

Distribute the certificates

# copy to node 192.168.22.112
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} 192.168.22.112:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} 192.168.22.112:/etc/kubernetes/pki/etcd/
# copy to node 192.168.22.113
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} 192.168.22.113:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} 192.168.22.113:/etc/kubernetes/pki/etcd/
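
With more control-plane nodes the same copy is easier to express as a loop. A small sketch: cert_copy_cmds is a hypothetical helper that only prints the scp commands for one host, so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Print the scp commands that copy the shared CA material and the
# service-account key pair to one control-plane node.
cert_copy_cmds() {
  local host=$1 pki=/etc/kubernetes/pki
  local f
  for f in ca.crt ca.key sa.key sa.pub front-proxy-ca.crt front-proxy-ca.key; do
    echo "scp ${pki}/${f} ${host}:${pki}/"
  done
  for f in ca.crt ca.key; do
    echo "scp ${pki}/etcd/${f} ${host}:${pki}/etcd/"
  done
}

# Usage: review the output first, then pipe it to bash to run it
#   for h in 192.168.22.112 192.168.22.113; do cert_copy_cmds "$h" | bash; done
```
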

Join using the command printed when initialization finished. Run each of the commands below on its own master node

# run on node 192.168.22.112
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
        --control-plane \
        --node-name 192.168.22.112
# run on node 192.168.22.113
kubeadm join 192.168.22.111:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
        --control-plane \
        --node-name 192.168.22.113

Output like the following means the node joined the cluster successfully

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
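Install nginx

This nginx load-balances the apiserver. First remove the taint from the nodes, otherwise the master nodes cannot run pods

kubectl taint node --all node-role.kubernetes.io/control-plane-

Create the namespace

k create ns ha

Below is the complete yaml, containing the ConfigMap and the Deployment. Remember to replace the IPs with your own. The Deployment runs three nginx replicas, one on each master node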

---
apiVersion: v1
data:
  nginx.conf: |
    worker_processes 1;
    events {
        worker_connections  1024;
    }
    stream {
        upstream k8s-apiserver {
            hash $remote_addr consistent;
            server 192.168.22.111:6443 max_fails=3 fail_timeout=30s;
            server 192.168.22.112:6443 max_fails=3 fail_timeout=30s;
            server 192.168.22.113:6443 max_fails=3 fail_timeout=30s;
        }

        server {
            listen       *:8443;
            proxy_connect_timeout 120s;
            proxy_pass k8s-apiserver;
        }
    }
kind: ConfigMap
metadata:
  name: nginx-lb-apiserver-cm
  namespace: ha
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-lb-apiserver
  name: nginx-lb-apiserver
  namespace: ha
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-lb-apiserver
  template:
    metadata:
      labels:
        app: nginx-lb-apiserver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - 192.168.22.111
                - 192.168.22.112
                - 192.168.22.113
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx-lb-apiserver
            topologyKey: kubernetes.io/hostname
      containers:
      - image: docker.m.daocloud.io/nginx:1.26.0
        imagePullPolicy: IfNotPresent
        name: nginx-lb-apiserver
        resources:
          limits:
            cpu: 1000m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/nginx
          name: config
      hostNetwork: true
      volumes:
      - configMap:
          name: nginx-lb-apiserver-cm
        name: config
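
Before relying on the proxy, it can be worth confirming that every backend listed in the stream block is reachable from the node. A minimal sketch, assuming bash (for /dev/tcp) and the coreutils timeout command; upstream_servers and port_open are hypothetical helper names.

```shell
#!/usr/bin/env bash
# List the host:port pairs declared as upstream servers in an nginx config.
upstream_servers() {
  awk '$1 == "server" && $2 ~ /^[0-9.]+:[0-9]+$/ { print $2 }' "$1"
}

# Succeed if a TCP connection to host:port can be opened within 2 seconds.
port_open() {
  local host=${1%:*} port=${1##*:}
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage, with the nginx.conf content saved to a local file:
#   for s in $(upstream_servers nginx.conf); do
#     port_open "$s" && echo "$s ok" || echo "$s unreachable"
#   done
```
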
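Install keepalived

The HA components run as k8s pods. Building images with containerd is cumbersome, so build the image in an environment that has docker and then import it

Build the keepalived image
  • Dockerfile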
FROM docker.m.daocloud.io/debian:stable-20240904-slim

ENV LANG="en_US.UTF-8"
ENV LANGUAGE="en_US:en"
ENV LC_ALL="en_US.UTF-8"
ENV KEEPALIVED_VERSION="2.3.1"

RUN sed -i.bak 's/deb.debian.org/mirrors.aliyun.com/g' /etc/apt/sources.list.d/debian.sources && \
    apt-get update && \
    apt-get install -y autoconf \
    make \
    curl \
    gcc \
    ipset \
    iptables \
    musl-dev \
    openssl \
    libssl-dev \
    net-tools \
    ncat \
    && curl -o keepalived.tar.gz -SL https://keepalived.org/software/keepalived-${KEEPALIVED_VERSION}.tar.gz \
    && tar xf keepalived.tar.gz \
    && cd keepalived-${KEEPALIVED_VERSION} \
    && ./configure --disable-dynamic-linking \
    && make \
    && make install \
    && rm -f /keepalived.tar.gz \
    && apt-get remove -y musl-dev \
       libssl-dev \
       make \
       && apt-get clean

Build the image

docker build -t keepalived-2.3.1:debian-20240904-slim .

Export the image

docker save keepalived-2.3.1:debian-20240904-slim > keepalived-2.3.1_debian-20240904-slim.tar

Distribute the image to the master nodes

scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.111:/tmp/
scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.112:/tmp/
scp keepalived-2.3.1_debian-20240904-slim.tar 192.168.22.113:/tmp/

Import the image into the k8s.io namespace (run on each master node)

ctr -n k8s.io image import /tmp/keepalived-2.3.1_debian-20240904-slim.tar

In the yaml below, change the IP addresses and the ens3 interface name to match your environment. The image name is the one built above; if yours differs, change that too

---
apiVersion: v1
data:
  keepalived.conf: |
    global_defs {
    }

    vrrp_script chk_nginx  {
        script "/etc/keepalived/chk_health/chk_nginx.sh"
        interval 2
        fall 3
        rise 2
        timeout 3
    }

    vrrp_instance VI_1 {
        state BACKUP
        interface ens3
        virtual_router_id 100
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass keep@lived
        }

        virtual_ipaddress {
            192.168.22.200
        }

        track_interface {
            ens3
        }

        nopreempt

        track_script {
            chk_nginx
        }
    }
kind: ConfigMap
metadata:
  name: keepalived-ha-apiserver-cm
  namespace: ha
---
apiVersion: v1
data:
  chk_nginx.sh: |
    #!/bin/bash
    exitNum=0

    while true
    do
      if ! nc -z 127.0.0.1 8443; then
        let exitNum++
        sleep 3
        [ ${exitNum} -lt 3 ] || exit 1
      else
        exit 0
      fi
    done
kind: ConfigMap
metadata:
  name: keepalived-ha-apiserver-chk-cm
  namespace: ha
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: keepalived-ha-apiserver
  name: keepalived-ha-apiserver
  namespace: ha
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keepalived-ha-apiserver
  template:
    metadata:
      labels:
        app: keepalived-ha-apiserver
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - 192.168.22.111
                - 192.168.22.112
                - 192.168.22.113
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - keepalived-ha-apiserver
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -c
        - /usr/local/sbin/keepalived -n -l -f /etc/keepalived/keepalived.conf
        command:
        - bash
        image: keepalived-2.3.1:debian-20240904-slim
        imagePullPolicy: IfNotPresent
        name: keepalived-ha-apiserver
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        volumeMounts:
        - mountPath: /etc/keepalived
          name: config
        - mountPath: /etc/keepalived/chk_health
          name: chekscript
      hostNetwork: true
      volumes:
      - configMap:
          name: keepalived-ha-apiserver-cm
        name: config
      - configMap:
          defaultMode: 493
          name: keepalived-ha-apiserver-chk-cm
        name: chekscript

Check that the VIP is reachable, using the VIP plus the nginx port

nc -zv 192.168.22.200 8443

Output like the following means the network path works

Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.22.200:8443.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
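
To see which master currently holds the VIP (useful when testing failover), the address list of each node can be checked. A small sketch; holds_vip is a hypothetical helper that reads the one-line-per-address output of ip -o -4 addr show on stdin.

```shell
#!/usr/bin/env bash
# Succeed if the given IPv4 address appears in "ip -o -4 addr show" style
# input on stdin. Field 4 of that output is "address/prefix".
holds_vip() {
  local vip=$1
  awk -v vip="$vip" '
    { split($4, a, "/"); if (a[1] == vip) found = 1 }
    END { exit !found }
  '
}

# Usage on each master node:
#   ip -o -4 addr show | holds_vip 192.168.22.200 && echo "VIP is on $(hostname)"
```
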

Switch to high-availability access

Change controlPlaneEndpoint

On the init node, run the command below to change the controlPlaneEndpoint address in kubeadm-config

kubectl edit cm -n kube-system kubeadm-config

Change it to the VIP address

controlPlaneEndpoint: 192.168.22.200:8443
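Update the kubeconfig files

Run the following on all three master nodes

Replace the apiserver address in the kubeconfig files listed below (under /etc/kubernetes); you can comment out the old line and add a new one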

  • admin.conf
  • controller-manager.conf
  • kubelet.conf
  • scheduler.conf
    server: https://192.168.22.200:8443
    # server: https://192.168.22.111:6443
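
The same replacement can be scripted so that the four files are edited consistently. A minimal sketch, assuming GNU sed (for \n in the replacement) and one server: line per file; switch_server is a hypothetical helper name. It keeps the old address as a comment, as in the manual edit.

```shell
#!/usr/bin/env bash
# Point a kubeconfig at a new API server URL and keep the previous
# server line as a comment directly below it.
#   $1: kubeconfig path   $2: new server URL
switch_server() {
  local file=$1 new=$2
  sed -i -E "s|^([[:space:]]*)server: (https://.*)$|\1server: ${new}\n\1# server: \2|" "$file"
}

# Usage on each master node (back up the files first):
#   for f in admin controller-manager kubelet scheduler; do
#     cp -p "/etc/kubernetes/${f}.conf" "/etc/kubernetes/${f}.conf.bak"
#     switch_server "/etc/kubernetes/${f}.conf" https://192.168.22.200:8443
#   done
```
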
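Restart the master components

Restart controller-manager, scheduler, and kubelet, one node at a time, to verify

mv /etc/kubernetes/manifests/kube-scheduler.yaml .
# wait a moment, or run crictl ps to check whether the scheduler container is still there
mv kube-scheduler.yaml /etc/kubernetes/manifests/
mv /etc/kubernetes/manifests/kube-controller-manager.yaml .
# wait a moment, or run crictl ps to check whether the controller-manager container is still there
mv kube-controller-manager.yaml /etc/kubernetes/manifests/
systemctl restart kubelet

After the restart, run the command below and check that node information can be fetched; if it can, everything is fine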

kubectl get node --kubeconfig /etc/kubernetes/admin.conf
Update the kube-proxy configuration

Doing this from one node is enough

k edit cm -n kube-system kube-proxy

The main change is the server field

  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://192.168.22.200:8443
        # server: https://192.168.22.111:6443
      name: default

Restart kube-proxy

k get pod -n kube-system | awk '/kube-proxy/ {print $1}' | xargs k delete pod -n kube-system

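Join the worker nodes to the cluster

Generate the join command (--ttl=0 creates a token that never expires; if that is a concern, remove it with kubeadm token delete once the nodes have joined)

kubeadm token create --print-join-command --ttl=0

Normally it returns something like the following, which can then be run on the worker nodes

kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25

Here are the two nodes I joined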

kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
        --node-name 192.168.22.114
kubeadm join 192.168.22.200:8443 --token lkuzqm.0kzboiy72rryp3fb \
        --discovery-token-ca-cert-hash sha256:592a0811d0c53cbafbeedec9899b95b494da2c0456cc3ef65ef2533ddfa26c25 \
        --node-name 192.168.22.115
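
Once the workers have joined, a quick way to spot nodes that have not become Ready yet is to filter the node list. A small sketch; not_ready_nodes is a hypothetical helper that reads kubectl get nodes --no-headers output on stdin. A cordoned node shows a status such as Ready,SchedulingDisabled and is listed too.

```shell
#!/usr/bin/env bash
# Print the name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on a master node:
#   kubectl get nodes --no-headers | not_ready_nodes
```
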

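Extend the certificates to ten years

A script written by a developer on GitHub is used here to renew the certificates, rather than recompiling kubeadm, which is much more convenient

  • update-kube-cert
    • In brief, the script uses openssl to read the relevant fields from the cluster's current k8s certificates and regenerates each component's certificate from the existing CA root certificates (the CA certificates kubeadm generates are valid for 10 years by default; only the component certificates are limited to one year)
    • If GitHub is hard to reach, copy the script content below directly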
#!/usr/bin/env bash

set -o errexit
set -o pipefail
# set -o xtrace

# set output color
NC='\033[0m'
RED='\033[31m'
GREEN='\033[32m'
YELLOW='\033[33m'
BLUE='\033[34m'
# set default cri
CRI="docker"

log::err() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${RED}ERROR${NC}] %b\n" "$@"
}

log::info() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][INFO] %b\n" "$@"
}

log::warning() {
  printf "[$(date +'%Y-%m-%dT%H:%M:%S.%2N%z')][${YELLOW}WARNING${NC}] \033[0m%b\n" "$@"
}

check_file() {
  if [[ ! -r ${1} ]]; then
    log::err "can not find ${1}"
    exit 1
  fi
}

# get x509v3 subject alternative name from the old certificate
cert::get_subject_alt_name() {
  local cert=${1}.crt
  local alt_name

  check_file "${cert}"
  alt_name=$(openssl x509 -text -noout -in "${cert}" | grep -A1 'Alternative' | tail -n1 | sed 's/[[:space:]]*Address//g')
  printf "%s\n" "${alt_name}"
}

# get subject from the old certificate
cert::get_subj() {
  local cert=${1}.crt
  local subj

  check_file "${cert}"
  subj=$(openssl x509 -text -noout -in "${cert}" | grep "Subject:" | sed 's/Subject:/\//g;s/\,/\//;s/[[:space:]]//g')
  printf "%s\n" "${subj}"
}

cert::backup_file() {
  local file=${1}
  if [[ ! -e ${file}.old-$(date +%Y%m%d) ]]; then
    cp -rp "${file}" "${file}.old-$(date +%Y%m%d)"
    log::info "backup ${file} to ${file}.old-$(date +%Y%m%d)"
  else
    log::warning "does not backup, ${file}.old-$(date +%Y%m%d) already exists"
  fi
}

# check certificate expiration
cert::check_cert_expiration() {
  local cert=${1}.crt
  local cert_expires

  cert_expires=$(openssl x509 -text -noout -in "${cert}" | awk -F ": " '/Not After/{print$2}')
  printf "%s\n" "${cert_expires}"
}

# check kubeconfig expiration
cert::check_kubeconfig_expiration() {
  local config=${1}.conf
  local cert
  local cert_expires

  cert=$(grep "client-certificate-data" "${config}" | awk '{print$2}' | base64 -d)
  cert_expires=$(openssl x509 -text -noout -in <(printf "%s" "${cert}") | awk -F ": " '/Not After/{print$2}')
  printf "%s\n" "${cert_expires}"
}

# check etcd certificates expiration
cert::check_etcd_certs_expiration() {
  local cert
  local certs

  certs=(
    "${ETCD_CERT_CA}"
    "${ETCD_CERT_SERVER}"
    "${ETCD_CERT_PEER}"
    "${ETCD_CERT_HEALTHCHECK_CLIENT}"
    "${ETCD_CERT_APISERVER_ETCD_CLIENT}"
  )

  for cert in "${certs[@]}"; do
    if [[ ! -r ${cert} ]]; then
      printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
    fi
  done
}

# check master certificates expiration
cert::check_master_certs_expiration() {
  local certs
  local kubeconfs
  local cert
  local conf

  certs=(
    "${CERT_CA}"
    "${CERT_APISERVER}"
    "${CERT_APISERVER_KUBELET_CLIENT}"
    "${FRONT_PROXY_CA}"
    "${FRONT_PROXY_CLIENT}"
  )

  # add support for super_admin.conf, which was added after k8s v1.30.
  if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
    kubeconfs=(
      "${CONF_CONTROLLER_MANAGER}"
      "${CONF_SCHEDULER}"
      "${CONF_ADMIN}"
      "${CONF_SUPER_ADMIN}"
    )
  else 
    kubeconfs=(
      "${CONF_CONTROLLER_MANAGER}"
      "${CONF_SCHEDULER}"
      "${CONF_ADMIN}"
    )
  fi

  printf "%-50s%-30s\n" "CERTIFICATE" "EXPIRES"

  for conf in "${kubeconfs[@]}"; do
    if [[ ! -r ${conf} ]]; then
      printf "%-50s%-30s\n" "${conf}.config" "$(cert::check_kubeconfig_expiration "${conf}")"
    fi
  done

  for cert in "${certs[@]}"; do
    if [[ ! -r ${cert} ]]; then
      printf "%-50s%-30s\n" "${cert}.crt" "$(cert::check_cert_expiration "${cert}")"
    fi
  done
}

# check all certificates expiration
cert::check_all_expiration() {
  cert::check_master_certs_expiration
  cert::check_etcd_certs_expiration
}

# generate certificate with client, server or peer
# Args:
#   $1 (the name of certificate)
#   $2 (the type of certificate, must be one of client, server, peer)
#   $3 (the subject of certificates)
#   $4 (the validity of certificates) (days)
#   $5 (the name of ca)
#   $6 (the x509v3 subject alternative name of certificate when the type of certificate is server or peer)
cert::gen_cert() {
  local cert_name=${1}
  local cert_type=${2}
  local subj=${3}
  local cert_days=${4}
  local ca_name=${5}
  local alt_name=${6}
  local ca_cert=${ca_name}.crt
  local ca_key=${ca_name}.key
  local cert=${cert_name}.crt
  local key=${cert_name}.key
  local csr=${cert_name}.csr
  local common_csr_conf='distinguished_name = dn\n[dn]\n[v3_ext]\nkeyUsage = critical, digitalSignature, keyEncipherment\n'

  for file in "${ca_cert}" "${ca_key}" "${cert}" "${key}"; do
    check_file "${file}"
  done

  case "${cert_type}" in
  client)
    csr_conf=$(printf "%bextendedKeyUsage = clientAuth\n" "${common_csr_conf}")
    ;;
  server)
    csr_conf=$(printf "%bextendedKeyUsage = serverAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
    ;;
  peer)
    csr_conf=$(printf "%bextendedKeyUsage = serverAuth, clientAuth\nsubjectAltName = %b\n" "${common_csr_conf}" "${alt_name}")
    ;;
  *)
    log::err "unknown, unsupported certs type: ${YELLOW}${cert_type}${NC}, supported type: client, server, peer"
    exit 1
    ;;
  esac

  # gen csr
  openssl req -new -key "${key}" -subj "${subj}" -reqexts v3_ext \
    -config <(printf "%b" "${csr_conf}") \
    -out "${csr}" >/dev/null 2>&1
  # gen cert
  openssl x509 -in "${csr}" -req -CA "${ca_cert}" -CAkey "${ca_key}" -CAcreateserial -extensions v3_ext \
    -extfile <(printf "%b" "${csr_conf}") \
    -days "${cert_days}" -out "${cert}" >/dev/null 2>&1

  rm -f "${csr}"
}

cert::update_kubeconf() {
  local cert_name=${1}
  local kubeconf_file=${cert_name}.conf
  local cert=${cert_name}.crt
  local key=${cert_name}.key
  local subj
  local cert_base64

  check_file "${kubeconf_file}"
  # get the key from the old kubeconf
  grep "client-key-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${key}"
  # get the old certificate from the old kubeconf
  grep "client-certificate-data" "${kubeconf_file}" | awk '{print$2}' | base64 -d >"${cert}"
  # get subject from the old certificate
  subj=$(cert::get_subj "${cert_name}")
  cert::gen_cert "${cert_name}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
  # get certificate base64 code
  cert_base64=$(base64 -w 0 "${cert}")

  # set certificate base64 code to kubeconf
  sed -i 's/client-certificate-data:.*/client-certificate-data: '"${cert_base64}"'/g' "${kubeconf_file}"

  rm -f "${cert}"
  rm -f "${key}"
}

cert::update_etcd_cert() {
  local subj
  local subject_alt_name
  local cert

  # generate etcd server,peer certificate
  # /etc/kubernetes/pki/etcd/server
  # /etc/kubernetes/pki/etcd/peer
  for cert in ${ETCD_CERT_SERVER} ${ETCD_CERT_PEER}; do
    subj=$(cert::get_subj "${cert}")
    subject_alt_name=$(cert::get_subject_alt_name "${cert}")
    cert::gen_cert "${cert}" "peer" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}" "${subject_alt_name}"
    log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
  done

  # generate etcd healthcheck-client,apiserver-etcd-client certificate
  # /etc/kubernetes/pki/etcd/healthcheck-client
  # /etc/kubernetes/pki/apiserver-etcd-client
  for cert in ${ETCD_CERT_HEALTHCHECK_CLIENT} ${ETCD_CERT_APISERVER_ETCD_CLIENT}; do
    subj=$(cert::get_subj "${cert}")
    cert::gen_cert "${cert}" "client" "${subj}" "${CERT_DAYS}" "${ETCD_CERT_CA}"
    log::info "${GREEN}updated ${BLUE}${cert}.conf${NC}"
  done

  # restart etcd
  case $CRI in
    "docker")
      docker ps | awk '/k8s_etcd/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
      ;;
    "containerd")
      crictl ps | awk '/etcd-/{print$(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
      ;;
  esac
  log::info "restarted etcd with ${CRI}"
}

cert::update_master_cert() {
  local subj
  local subject_alt_name
  local conf

  # generate apiserver server certificate
  # /etc/kubernetes/pki/apiserver
  subj=$(cert::get_subj "${CERT_APISERVER}")
  subject_alt_name=$(cert::get_subject_alt_name "${CERT_APISERVER}")
  cert::gen_cert "${CERT_APISERVER}" "server" "${subj}" "${CERT_DAYS}" "${CERT_CA}" "${subject_alt_name}"
  log::info "${GREEN}updated ${BLUE}${CERT_APISERVER}.crt${NC}"

  # generate apiserver-kubelet-client certificate
  # /etc/kubernetes/pki/apiserver-kubelet-client
  subj=$(cert::get_subj "${CERT_APISERVER_KUBELET_CLIENT}")
  cert::gen_cert "${CERT_APISERVER_KUBELET_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${CERT_CA}"
  log::info "${GREEN}updated ${BLUE}${CERT_APISERVER_KUBELET_CLIENT}.crt${NC}"

  # generate kubeconf for controller-manager,scheduler and kubelet
  # /etc/kubernetes/controller-manager,scheduler,admin,kubelet.conf,super_admin(added after k8s v1.30.)

  if [ -f "${CONF_SUPER_ADMIN}.conf" ]; then
    conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET} ${CONF_SUPER_ADMIN}"
  else 
    conf_list="${CONF_CONTROLLER_MANAGER} ${CONF_SCHEDULER} ${CONF_ADMIN} ${CONF_KUBELET}"
  fi
  
  for conf in ${conf_list}; do
    if [[ ${conf##*/} == "kubelet" ]]; then
      # https://github.com/kubernetes/kubeadm/issues/1753
      set +e
      grep kubelet-client-current.pem /etc/kubernetes/kubelet.conf >/dev/null 2>&1
      kubelet_cert_auto_update=$?
      set -e
      if [[ "$kubelet_cert_auto_update" == "0" ]]; then
        log::info "does not need to update kubelet.conf"
        continue
      fi
    fi

    # update kubeconf
    cert::update_kubeconf "${conf}"
    log::info "${GREEN}updated ${BLUE}${conf}.conf${NC}"

    # copy admin.conf to ${HOME}/.kube/config
    if [[ ${conf##*/} == "admin" ]]; then
      mkdir -p "${HOME}/.kube"
      local config=${HOME}/.kube/config
      local config_backup
      config_backup=${HOME}/.kube/config.old-$(date +%Y%m%d)
      if [[ -f ${config} ]] && [[ ! -f ${config_backup} ]]; then
        cp -fp "${config}" "${config_backup}"
        log::info "backup ${config} to ${config_backup}"
      fi
      cp -fp "${conf}.conf" "${HOME}/.kube/config"
      log::info "copy the admin.conf to ${HOME}/.kube/config"
    fi
  done

  # generate front-proxy-client certificate
  # /etc/kubernetes/pki/front-proxy-client
  subj=$(cert::get_subj "${FRONT_PROXY_CLIENT}")
  cert::gen_cert "${FRONT_PROXY_CLIENT}" "client" "${subj}" "${CERT_DAYS}" "${FRONT_PROXY_CA}"
  log::info "${GREEN}updated ${BLUE}${FRONT_PROXY_CLIENT}.crt${NC}"

  # restart apiserver, controller-manager, scheduler and kubelet
  for item in "apiserver" "controller-manager" "scheduler"; do
    case $CRI in
      "docker")
        docker ps | awk '/k8s_kube-'${item}'/{print$1}' | xargs -r -I '{}' docker restart {} >/dev/null 2>&1 || true
        ;;
      "containerd")
        crictl ps | awk '/kube-'${item}'-/{print $(NF-1)}' | xargs -r -I '{}' crictl stopp {} >/dev/null 2>&1 || true
        ;;
    esac
    log::info "restarted ${item} with ${CRI}"
  done
  systemctl restart kubelet || true
  log::info "restarted kubelet"
}

main() {
  local node_type=$1

  # read the options
  ARGS=$(getopt -o c: --long cri: -- "$@")
  eval set -- "$ARGS"
  # extract options and their arguments into variables.
  while true
  do
    case "$1" in
      -c|--cri)
        case "$2" in
          "docker"|"containerd")
            CRI=$2
            shift 2
            ;;
          *)
            echo 'Unsupported cri. Valid options are "docker", "containerd".'
            exit 1
            ;;
        esac
        ;;
      --)
        shift
        break
        ;;
      *)
        echo "Invalid arguments."
        exit 1
        ;;
    esac
  done

  CERT_DAYS=3650

  KUBE_PATH=/etc/kubernetes
  PKI_PATH=${KUBE_PATH}/pki

  # master certificates path
  # apiserver
  CERT_CA=${PKI_PATH}/ca
  CERT_APISERVER=${PKI_PATH}/apiserver
  CERT_APISERVER_KUBELET_CLIENT=${PKI_PATH}/apiserver-kubelet-client
  CONF_CONTROLLER_MANAGER=${KUBE_PATH}/controller-manager
  CONF_SCHEDULER=${KUBE_PATH}/scheduler
  CONF_ADMIN=${KUBE_PATH}/admin
  CONF_SUPER_ADMIN=${KUBE_PATH}/super-admin
  CONF_KUBELET=${KUBE_PATH}/kubelet
  # front-proxy
  FRONT_PROXY_CA=${PKI_PATH}/front-proxy-ca
  FRONT_PROXY_CLIENT=${PKI_PATH}/front-proxy-client

  # etcd certificates path
  ETCD_CERT_CA=${PKI_PATH}/etcd/ca
  ETCD_CERT_SERVER=${PKI_PATH}/etcd/server
  ETCD_CERT_PEER=${PKI_PATH}/etcd/peer
  ETCD_CERT_HEALTHCHECK_CLIENT=${PKI_PATH}/etcd/healthcheck-client
  ETCD_CERT_APISERVER_ETCD_CLIENT=${PKI_PATH}/apiserver-etcd-client

  case ${node_type} in
  # etcd)
  # # update etcd certificates
  #   cert::update_etcd_cert
  # ;;
  master)
    # check certificates expiration
    cert::check_master_certs_expiration
    # backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
    cert::backup_file "${KUBE_PATH}"
    # update master certificates and kubeconf
    log::info "${GREEN}updating...${NC}"
    cert::update_master_cert
    log::info "${GREEN}done!!!${NC}"
    # check certificates expiration after certificates updated
    cert::check_master_certs_expiration
    ;;
  all)
    # check certificates expiration
    cert::check_all_expiration
    # backup $KUBE_PATH to $KUBE_PATH.old-$(date +%Y%m%d)
    cert::backup_file "${KUBE_PATH}"
    # update etcd certificates
    log::info "${GREEN}updating...${NC}"
    cert::update_etcd_cert
    # update master certificates and kubeconf
    cert::update_master_cert
    log::info "${GREEN}done!!!${NC}"
    # check certificates expiration after certificates updated
    cert::check_all_expiration
    ;;
  check)
    # check certificates expiration
    cert::check_all_expiration
    ;;
  *)
    log::err "unknown, unsupported cert type: ${node_type}, supported type: \"all\", \"master\", \"check\""
    printf "Documentation: https://github.com/yuyicai/update-kube-cert
  example:
    '\033[32m./update-kubeadm-cert.sh all\033[0m' update all etcd certificates, master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── super-admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-etcd-client.crt
          ├── apiserver-kubelet-client.crt
          ├── front-proxy-client.crt
          └── etcd
              ├── healthcheck-client.crt
              ├── peer.crt
              └── server.crt

    '\033[32m./update-kubeadm-cert.sh master\033[0m' update only master certificates and kubeconf
      /etc/kubernetes
      ├── admin.conf
      ├── super-admin.conf
      ├── controller-manager.conf
      ├── scheduler.conf
      ├── kubelet.conf
      └── pki
          ├── apiserver.crt
          ├── apiserver-kubelet-client.crt
          └── front-proxy-client.crt
"
    exit 1
    ;;
  esac
}

main "$@"

The script must be run on every master node

bash update-kubeadm-cert.sh all --cri containerd
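
Running the script by hand on three masters is easy to get wrong, so here is a minimal sketch that loops over them one at a time. It assumes passwordless root SSH to the master IPs from the environment table and that the script has already been copied to /root on each node; neither is set up earlier in this article, and the helper name renew_on_masters is made up for illustration. With DRY_RUN=1 (the default here) it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the renewal script on each master in turn.
# Assumptions (not from the article): root SSH access and the script path below.
MASTERS="192.168.22.111 192.168.22.112 192.168.22.113"
SCRIPT_PATH="/root/update-kubeadm-cert.sh"

renew_on_masters() {
  local host
  for host in ${MASTERS}; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: only print what would be executed.
      echo "ssh root@${host} bash ${SCRIPT_PATH} all --cri containerd"
    else
      # -n keeps ssh from consuming the loop's stdin.
      ssh -n "root@${host}" "bash ${SCRIPT_PATH} all --cri containerd"
    fi
  done
}

renew_on_masters
```

Set DRY_RUN=0 to execute for real. The loop is sequential on purpose: the script restarts the control plane components, so only one master is affected at a time.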

You can then check the certificate expiration dates

kubeadm certs check-expiration

As the output shows, everything now expires about ten years from now

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Sep 13, 2034 13:01 UTC   9y              ca                      no
apiserver                  Sep 13, 2034 13:01 UTC   9y              ca                      no
apiserver-etcd-client      Sep 13, 2034 13:01 UTC   9y              etcd-ca                 no
apiserver-kubelet-client   Sep 13, 2034 13:01 UTC   9y              ca                      no
controller-manager.conf    Sep 13, 2034 13:01 UTC   9y              ca                      no
etcd-healthcheck-client    Sep 13, 2034 13:01 UTC   9y              etcd-ca                 no
etcd-peer                  Sep 13, 2034 13:01 UTC   9y              etcd-ca                 no
etcd-server                Sep 13, 2034 13:01 UTC   9y              etcd-ca                 no
front-proxy-client         Sep 13, 2034 13:01 UTC   9y              front-proxy-ca          no
scheduler.conf             Sep 13, 2034 13:01 UTC   9y              ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Sep 13, 2034 11:08 UTC   9y              no
etcd-ca                 Sep 13, 2034 11:08 UTC   9y              no
front-proxy-ca          Sep 13, 2034 11:08 UTC   9y              no
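
If you want to cross-check a single certificate file without kubeadm, openssl can read the expiry date directly. The following is a rough sketch, assuming GNU date (as shipped with openEuler) and the default kubeadm path for the apiserver certificate; the function names days_left and cert_days_left are made up for this example.

```shell
#!/usr/bin/env bash
# Whole days between two epoch timestamps (pure arithmetic, no external tools).
days_left() {
  local not_after_epoch=$1 now_epoch=$2
  echo $(( (not_after_epoch - now_epoch) / 86400 ))
}

# Days until the given certificate expires.
# "openssl x509 -enddate" prints a line such as: notAfter=Sep 13 13:01:00 2034 GMT
cert_days_left() {
  local crt=$1 end
  end=$(openssl x509 -noout -enddate -in "${crt}" | cut -d= -f2)
  # Assumes GNU date, which can parse the openssl date string with -d.
  days_left "$(date -d "${end}" +%s)" "$(date +%s)"
}

CRT=/etc/kubernetes/pki/apiserver.crt
# Only runs on a node where the certificate actually exists.
if [ -f "${CRT}" ]; then
  echo "${CRT}: $(cert_days_left "${CRT}") days left"
fi
```

The same call works for any of the .crt files under /etc/kubernetes/pki.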

At this point, the whole highly available k8s cluster is deployed

Simulating a node failure

Use the ip a command to see which node currently holds the VIP

2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:08:f7:18 brd ff:ff:ff:ff:ff:ff
    inet 192.168.22.113/24 brd 192.168.22.255 scope global noprefixroute ens3
       valid_lft forever preferred_lft forever
    inet 192.168.22.200/32 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe08:f718/64 scope link
       valid_lft forever preferred_lft forever

I simply power this node off to simulate a node failure, then go to another node and check whether the VIP has moved there

2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:88:e7:0d brd ff:ff:ff:ff:ff:ff
    inet 192.168.22.112/24 brd 192.168.22.255 scope global noprefixroute ens3
       valid_lft forever preferred_lft forever
    inet 192.168.22.200/32 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe88:e70d/64 scope link
       valid_lft forever preferred_lft forever
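
Rather than reading the ip a output by eye on each master, the check can be scripted. This is a minimal sketch, assuming the VIP 192.168.22.200 from the environment table; has_vip is a made-up helper name, and it reads the ip output from stdin so it can also be fed saved output.

```shell
#!/usr/bin/env bash
VIP="192.168.22.200"

# Reads "ip addr" style text on stdin; succeeds if the VIP is bound.
# -F treats the pattern as a fixed string, so the dots are not regex wildcards.
has_vip() {
  grep -qF "inet ${VIP}/"
}

# Only runs where the ip command is available.
if command -v ip >/dev/null 2>&1; then
  if ip -4 addr show | has_vip; then
    echo "VIP ${VIP} is bound on this node"
  else
    echo "VIP ${VIP} is not on this node"
  fi
fi
```

Running it on every master before and after the shutdown shows the VIP leaving 113 and appearing on another node.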

Check the node status: node 113 is now NotReady because I shut it down, and the node list can still be retrieved normally

NAME             STATUS     ROLES           AGE     VERSION
192.168.22.111   Ready      control-plane   115m    v1.28.2
192.168.22.112   Ready      control-plane   110m    v1.28.2
192.168.22.113   NotReady   control-plane   108m    v1.28.2
192.168.22.114   Ready      <none>          3m49s   v1.28.2
192.168.22.115   Ready      <none>          3m38s   v1.28.2
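
To pick out the failed node automatically instead of scanning the table, the kubectl output can be filtered. A small sketch, assuming kubectl is already configured on the node where it runs; not_ready_nodes is a hypothetical helper that reads the output of kubectl get nodes --no-headers from stdin.

```shell
#!/usr/bin/env bash
# Prints the names of nodes whose STATUS column does not start with "Ready".
# Note: "Ready,SchedulingDisabled" (a cordoned node) is treated as Ready here.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Only runs where kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes --no-headers | not_ready_nodes
fi
```

With the node list shown above, this would print only 192.168.22.113.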

Summary of class A, B, and C addresses

Class   Address range                 Default subnet mask   Network / host bits   Usable IPs   Private address range
A       1.0.0.0 - 126.255.255.255     255.0.0.0             8 / 24                16,777,214   10.0.0.0 - 10.255.255.255
B       128.0.0.0 - 191.255.255.255   255.255.0.0           16 / 16               65,534       172.16.0.0 - 172.31.255.255
C       192.0.0.0 - 223.255.255.255   255.255.255.0         24 / 8                254          192.168.0.0 - 192.168.255.255
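
The table boils down to two small rules: the class follows from the first octet, and the usable host count is 2^(host bits) - 2. Below is a minimal sketch of both; ip_class and usable_hosts are made-up helper names, and the input is assumed to be a valid dotted-quad IPv4 address.

```shell
#!/usr/bin/env bash
# Classify an IPv4 address by its first octet (classful addressing).
ip_class() {
  local first=${1%%.*}   # everything before the first dot
  if [ "${first}" -ge 1 ] && [ "${first}" -le 126 ]; then
    echo A
  elif [ "${first}" -ge 128 ] && [ "${first}" -le 191 ]; then
    echo B
  elif [ "${first}" -ge 192 ] && [ "${first}" -le 223 ]; then
    echo C
  else
    echo other           # 0, 127 (loopback), and 224+ fall outside classes A-C
  fi
}

# Usable hosts for a prefix length: 2^(32 - prefix) minus network and broadcast.
usable_hosts() {
  local prefix=$1
  echo $(( (1 << (32 - prefix)) - 2 ))
}

for addr in 10.0.0.1 172.16.0.1 192.168.22.200; do
  echo "${addr} -> class $(ip_class "${addr}")"
done
```

For example, the VIP used in this article, 192.168.22.200, falls in class C, and usable_hosts 24 reproduces the 254 from the table.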
posted @ 2024-09-17 09:28  月巴左耳东