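k8s High-Availability Cluster
The single-master, two-node k8s cluster built earlier with kubeadm has one weakness: if the master node goes down, the cluster loses its control plane and can no longer be managed. This post briefly looks at how to set up multiple masters.
1. Overview
The high-availability components used here are keepalived and haproxy.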
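1. keepalived
Keepalived achieves high availability mainly through virtual router redundancy.
Keepalived is a high-availability solution for LVS services built on VRRP (Virtual Router Redundancy Protocol), and it can be used to remove single points of failure. An LVS service typically has two servers running Keepalived, one as the primary (MASTER) and one as the backup (BACKUP), which together present a single virtual IP to the outside. The primary periodically sends advertisement messages to the backup. When the backup stops receiving them, that is, when the primary has gone down, the backup takes over the virtual IP and keeps serving traffic, which is what provides the high availability.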
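The takeover rule described above can be sketched as a small shell function. This is a simplified model under stated assumptions, not keepalived itself: the `name:priority:state` input format and the function name `elect_vip_holder` are made up for illustration, and real VRRP also involves advertisement timers and preemption settings.

```shell
#!/bin/sh
# elect_vip_holder NAME:PRIORITY:STATE...
# Simplified model: the VIP goes to the highest-priority node whose state
# is "up". The argument format is hypothetical, not keepalived syntax.
elect_vip_holder() {
    printf '%s\n' "$@" |
        awk -F: '$3 == "up" && $2 + 0 > best + 0 { best = $2; holder = $1 }
                 END { if (holder != "") print holder; else print "none" }'
}

# Both nodes alive: the higher priority wins.
elect_vip_holder k8smaster01:250:up k8smaster02:200:up
# Primary down: the backup takes over.
elect_vip_holder k8smaster01:250:down k8smaster02:200:up
```

The priorities 250 and 200 match the keepalived configuration used later in this post.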
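2. haproxy
haproxy, like nginx, is load-balancing and reverse-proxy software. nginx uses a master-workers process model in which each worker process is single-threaded, so it uses multiple CPU cores by running multiple workers. haproxy is multithreaded, so a single process can deliver high performance, although haproxy also supports a multi-process mode.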
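2. Building the cluster
Following the steps below in order is essentially all that is needed.
1. Requirements
The machines used for the Kubernetes cluster must meet the following conditions:
(1) One or more machines running CentOS 7.x (x86_64)
(2) Hardware: at least 2 GB of RAM, at least 2 CPUs, and at least 30 GB of disk
(3) Internet access for pulling images; if a server cannot reach the internet, download the images in advance and import them onto the node
(4) Swap disabled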
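A quick way to check a host against these requirements is sketched below. The function name `meets_minimum` and the 1700 MB memory threshold are assumptions of this sketch: a nominal 2 GB machine reports somewhat less than 2048 MB as MemTotal, so the threshold is set lower. The disk check is left out because partition layouts vary.

```shell
#!/bin/sh
# meets_minimum CPUS MEM_MB SWAP_KB
# Succeeds only with at least 2 CPUs, roughly 2 GB of RAM (1700 MB is the
# assumed threshold), and no active swap.
meets_minimum() {
    [ "$1" -ge 2 ] && [ "$2" -ge 1700 ] && [ "$3" -eq 0 ]
}

# Gather the real values on a Linux host (assumes a standard /proc layout).
if [ -r /proc/meminfo ]; then
    cpus=$(getconf _NPROCESSORS_ONLN)
    mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    swap_kb=$(awk '/^SwapTotal:/ { print $2 }' /proc/meminfo)
    if meets_minimum "$cpus" "$mem_mb" "$swap_kb"; then
        echo "host meets the minimum requirements"
    else
        echo "host does NOT meet the minimum requirements"
    fi
fi
```

Run it after the preparation steps; before swap is turned off, the check is expected to fail on hosts that have a swap partition.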
2. Preparing the environment
Run the following on the machines:
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config  # permanent
setenforce 0  # temporary

# Disable swap
swapoff -a  # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab  # permanent

# Set the hostname according to the plan
hostnamectl set-hostname <hostname>

# Add hosts entries on the masters (worker nodes must also be able to resolve master.k8s.io in order to join)
cat >> /etc/hosts << EOF
192.168.13.110 master.k8s.io k8s-vip
192.168.13.107 master01.k8s.io k8smaster01
192.168.13.108 master02.k8s.io k8smaster02
192.168.13.109 node01.k8s.io k8snode01
EOF

# Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system  # apply

# Synchronize the clock
yum install ntpdate -y
ntpdate time.windows.com
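To see what the permanent swap change does before touching /etc/fstab, the same substitution can be run on sample text. This is a minimal sketch: the fstab entries are invented for illustration, and the function name `comment_out_swap` is hypothetical.

```shell
#!/bin/sh
# comment_out_swap: read fstab-style text on stdin and comment out every
# line that mentions swap. Same substitution as the sed command above,
# but without -i, so nothing on disk is modified.
comment_out_swap() {
    sed -E 's/.*swap.*/#&/'
}

# Sample fstab content (made up for illustration).
printf '%s\n' \
    '/dev/mapper/centos-root /     xfs  defaults 0 0' \
    '/dev/mapper/centos-swap swap  swap defaults 0 0' |
    comment_out_swap
```

Because the pattern matches any line containing "swap", running it a second time on the same file adds another `#` to lines that are already commented out. That is harmless, but worth knowing when re-running the preparation script.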
3. Deploy keepalived on all master nodes
1. Install keepalived and its dependencies
yum install -y conntrack-tools libseccomp libtool-ltdl
yum install -y keepalived
2. Configure the master nodes
k8smaster01
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived

global_defs {
    router_id k8s
}

vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 3
    weight -2
    fall 10
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 250
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ceb1b3ec013d66163d6ab
    }
    virtual_ipaddress {
        192.168.13.110
    }
    track_script {
        check_haproxy
    }
}
EOF
k8smaster02
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived

global_defs {
    router_id k8s
}

vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 3
    weight -2
    fall 10
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 200
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ceb1b3ec013d66163d6ab
    }
    virtual_ipaddress {
        192.168.13.110
    }
    track_script {
        check_haproxy
    }
}
EOF
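It helps to work out what the `weight -2` setting means for the two priorities. With a negative weight, keepalived lowers a node's priority by that amount while the tracked script is failing. The sketch below does only that arithmetic; the function name `effective_priority` is hypothetical.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT CHECK_OK
# Models a negative track_script weight: while the check fails
# (CHECK_OK=0) the weight is added to the base priority.
effective_priority() {
    if [ "$3" -eq 1 ]; then
        echo "$1"
    else
        echo $(( $1 + $2 ))
    fi
}

master=$(effective_priority 250 -2 0)   # haproxy down on k8smaster01
backup=$(effective_priority 200 -2 1)   # haproxy healthy on k8smaster02
echo "k8smaster01=$master k8smaster02=$backup"
```

With the values from the configuration above, k8smaster01 drops only to 248, which is still higher than 200. So a failed haproxy check alone would not move the VIP; the VIP moves when keepalived or the host itself goes down. If failover on haproxy failure is wanted, the weight's magnitude would need to exceed the priority gap of 50 (for example a hypothetical `weight -60`).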
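3. Start and verify
# Start keepalived
systemctl start keepalived.service
# Enable start on boot
systemctl enable keepalived.service
# Check the service status
systemctl status keepalived.service
After startup, inspect the network interface on k8smaster01; the output should include the VIP:
ip a s ens33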
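The same check can be scripted, for example to find out which master currently holds the VIP. This is a minimal sketch: the function name `has_vip` is hypothetical, and the sample `ip` output is illustrative rather than captured from this cluster.

```shell
#!/bin/sh
# has_vip VIP: succeeds when `ip` output on stdin lists VIP as an address.
# -F treats the pattern as a fixed string, so the dots are matched literally.
has_vip() {
    grep -qF "inet $1/"
}

# On a live node: ip -4 addr show ens33 | has_vip 192.168.13.110
# Sample output (illustrative only):
sample='2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP
    inet 192.168.13.107/24 brd 192.168.13.255 scope global ens33
    inet 192.168.13.110/32 scope global ens33'

if printf '%s\n' "$sample" | has_vip 192.168.13.110; then
    echo "this node holds the VIP"
fi
```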
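4. Deploy haproxy
1. Install
yum install -y haproxy
2. Configure
The configuration is identical on both master nodes. It declares the API servers of the two masters as backends and has haproxy listen on port 16443, so port 16443 is the entry point to the cluster.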
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
    mode                 tcp
    bind                 *:16443
    option               tcplog
    default_backend      kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
    mode        tcp
    balance     roundrobin
    server      master01.k8s.io   192.168.13.107:6443 check
    server      master02.k8s.io   192.168.13.108:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
    bind                 *:1080
    stats auth           admin:awesomePassword
    stats refresh        5s
    stats realm          HAProxy\ Statistics
    stats uri            /admin?stats
EOF
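Before starting the service it can be useful to confirm which backends the file declares and that haproxy accepts it. The sketch below assumes the file path used above; the function name `list_backends` is hypothetical. `haproxy -c -f FILE` only parses and validates the configuration.

```shell
#!/bin/sh
# list_backends FILE: print "name address" for every `server` line in an
# haproxy configuration file.
list_backends() {
    awk '$1 == "server" { print $2, $3 }' "$1"
}

cfg=/etc/haproxy/haproxy.cfg
if [ -r "$cfg" ]; then
    list_backends "$cfg"
    # -c checks the configuration and exits without starting the proxy.
    if command -v haproxy >/dev/null 2>&1; then
        haproxy -c -f "$cfg"
    fi
fi
```

For the configuration above, the listing should show master01.k8s.io and master02.k8s.io, both on port 6443.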
3. Start and verify
Start haproxy on both masters:
# Enable start on boot
systemctl enable haproxy
# Start haproxy
systemctl start haproxy
# Check the service status
systemctl status haproxy
Check the listening ports:
netstat -lntup|grep haproxy
5. Install Docker, kubeadm, and kubelet on all nodes
For the Kubernetes version used here (v1.21.3), the default container runtime is Docker, so install Docker first.
1. Install Docker
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce-18.06.1.ce-3.el7
systemctl enable docker && systemctl start docker
docker --version
Configure a Docker registry mirror:
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"]
}
EOF
# Restart Docker so the new daemon.json takes effect
systemctl restart docker
2. Add the Kubernetes yum repository (Aliyun mirror)
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
3. Install kubelet, kubeadm, and kubectl
yum install -y kubelet-1.21.3 kubeadm-1.21.3 kubectl-1.21.3
systemctl enable kubelet
6. Deploy the Kubernetes masters
1. Create the kubeadm configuration file
Do this on the master that currently holds the VIP, here k8smaster01.
mkdir /usr/local/kubernetes/manifests -p
cd /usr/local/kubernetes/manifests/
vi kubeadm-config.yaml

apiServer:
  certSANs:
    - k8smaster01
    - k8smaster02
    - master.k8s.io
    - 192.168.13.110
    - 192.168.13.107
    - 192.168.13.108
    - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "master.k8s.io:16443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.21.3
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.1.0.0/16
scheduler: {}
2. Run kubeadm init on k8smaster01
kubeadm init --config kubeadm-config.yaml
The command prints the following:
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join master.k8s.io:16443 --token piealo.bck99wdpdv14rlo6 \
    --discovery-token-ca-cert-hash sha256:e0579b642a2b62219627f9f19af5227dadb539f9db11992585644d4f126046e5 \
    --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join master.k8s.io:16443 --token piealo.bck99wdpdv14rlo6 \
    --discovery-token-ca-cert-hash sha256:e0579b642a2b62219627f9f19af5227dadb539f9db11992585644d4f126046e5
Follow the prompt to set up the kubeconfig so that kubectl can be used:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check:
[root@k8smaster01 manifests]# kubectl get nodes
NAME          STATUS     ROLES                  AGE    VERSION
k8smaster01   NotReady   control-plane,master   144m   v1.21.3
[root@k8smaster01 manifests]# kubectl get pods -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
coredns-59d64cd4d4-62bhq              0/1     Pending   0          144m
coredns-59d64cd4d4-95dl5              0/1     Pending   0          144m
etcd-k8smaster01                      1/1     Running   0          144m
kube-apiserver-k8smaster01            1/1     Running   0          144m
kube-controller-manager-k8smaster01   1/1     Running   0          144m
kube-proxy-df8c8                      1/1     Running   0          144m
kube-scheduler-k8smaster01            1/1     Running   0          145m
3. Install the cluster network
mkdir flannel
cd flannel
wget -c https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Apply it and check the pods:
kubectl apply -f kube-flannel.yml
kubectl get pods -n kube-system
4. Join k8smaster02 to the cluster
(1) Copy the keys and related files from k8smaster01 to k8smaster02
ssh root@192.168.13.108 mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/admin.conf root@192.168.13.108:/etc/kubernetes
scp /etc/kubernetes/pki/{ca.*,sa.*,front-proxy-ca.*} root@192.168.13.108:/etc/kubernetes/pki
scp /etc/kubernetes/pki/etcd/ca.* root@192.168.13.108:/etc/kubernetes/pki/etcd
(2) Join k8smaster02 to the cluster
On k8smaster02, run the join command that kubeadm init printed on k8smaster01. It must include the `--control-plane` flag, which joins the node as a control-plane (master) node.
kubeadm join master.k8s.io:16443 --token piealo.bck99wdpdv14rlo6 \
    --discovery-token-ca-cert-hash sha256:e0579b642a2b62219627f9f19af5227dadb539f9db11992585644d4f126046e5 --control-plane
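The join command has the same three parts for masters and workers, with `--control-plane` added for masters. The token printed by kubeadm init expires after 24 hours by default; `kubeadm token create --print-join-command`, run on an existing master, prints a fresh worker join command. The sketch below only assembles the command line from its parts. The function name `join_command` is hypothetical, and the token and hash are placeholders.

```shell
#!/bin/sh
# join_command ENDPOINT TOKEN CA_HASH [control-plane]
# Assemble the kubeadm join command line. Pass "control-plane" as the
# fourth argument when the node should join as a master.
join_command() {
    cmd="kubeadm join $1 --token $2 --discovery-token-ca-cert-hash $3"
    if [ "${4:-}" = "control-plane" ]; then
        cmd="$cmd --control-plane"
    fi
    echo "$cmd"
}

# Placeholder token and hash; substitute the values printed for your cluster.
join_command master.k8s.io:16443 abcdef.0123456789abcdef sha256:HASH control-plane
join_command master.k8s.io:16443 abcdef.0123456789abcdef sha256:HASH
```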
When it finishes, the command prints:
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config
Then run the commands from that output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the status:
[root@k8smaster02 ~]# kubectl get node
NAME          STATUS     ROLES                  AGE     VERSION
k8smaster01   Ready      control-plane,master   17h     v1.21.3
k8smaster02   NotReady   control-plane,master   6m21s   v1.21.3
[root@k8smaster02 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS                  RESTARTS   AGE
kube-system   coredns-59d64cd4d4-62bhq              1/1     Running                 0          17h
kube-system   coredns-59d64cd4d4-95dl5              1/1     Running                 0          17h
kube-system   etcd-k8smaster01                      1/1     Running                 0          17h
kube-system   etcd-k8smaster02                      1/1     Running                 0          6m22s
kube-system   kube-apiserver-k8smaster01            1/1     Running                 0          17h
kube-system   kube-apiserver-k8smaster02            1/1     Running                 0          6m25s
kube-system   kube-controller-manager-k8smaster01   1/1     Running                 1          17h
kube-system   kube-controller-manager-k8smaster02   1/1     Running                 0          6m26s
kube-system   kube-flannel-ds-p2std                 1/1     Running                 0          15h
kube-system   kube-flannel-ds-vc2w2                 0/1     Init:ImagePullBackOff   0          6m27s
kube-system   kube-proxy-df8c8                      1/1     Running                 0          17h
kube-system   kube-proxy-nx8dg                      1/1     Running                 0          6m27s
kube-system   kube-scheduler-k8smaster01            1/1     Running                 1          17h
kube-system   kube-scheduler-k8smaster02            1/1     Running                 0          6m26s
3. Join k8snode01 to the cluster
(1) On k8snode01, run the worker join command printed earlier:
kubeadm join master.k8s.io:16443 --token piealo.bck99wdpdv14rlo6 \
    --discovery-token-ca-cert-hash sha256:e0579b642a2b62219627f9f19af5227dadb539f9db11992585644d4f126046e5
When it finishes, the command prints:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
(2) Check from a master node
[root@k8smaster01 manifests]# kubectl get nodes
NAME          STATUS     ROLES                  AGE     VERSION
k8smaster01   Ready      control-plane,master   17h     v1.21.3
k8smaster02   NotReady   control-plane,master   11m     v1.21.3
k8snode01     NotReady   <none>                 2m26s   v1.21.3
(3) Re-apply the network add-on
[root@k8smaster01 flannel]# pwd
/root/flannel
[root@k8smaster01 flannel]# kubectl apply -f kube-flannel.yml
(4) Check the cluster status again
[root@k8smaster01 flannel]# kubectl cluster-info
Kubernetes control plane is running at https://master.k8s.io:16443
CoreDNS is running at https://master.k8s.io:16443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@k8smaster01 flannel]# kubectl get nodes -o wide
NAME          STATUS   ROLES                  AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8smaster01   Ready    control-plane,master   18h   v1.21.3   192.168.13.107   <none>        CentOS Linux 7 (Core)   3.10.0-1160.49.1.el7.x86_64   docker://18.6.1
k8smaster02   Ready    control-plane,master   38m   v1.21.3   192.168.13.108   <none>        CentOS Linux 7 (Core)   3.10.0-1160.49.1.el7.x86_64   docker://18.6.1
k8snode01     Ready    <none>                 28m   v1.21.3   192.168.13.109   <none>        CentOS Linux 7 (Core)   3.10.0-1160.49.1.el7.x86_64   docker://18.6.1
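Nodes can take a while to move from NotReady to Ready after the network add-on is applied. A small filter over the `kubectl get nodes` output makes it easy to see which ones are still pending. This is a sketch: the function name `not_ready_nodes` is hypothetical, and the input here is the intermediate node listing shown earlier in this section.

```shell
#!/bin/sh
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# On a live cluster: kubectl get nodes | not_ready_nodes
printf '%s\n' \
    'NAME          STATUS     ROLES                  AGE     VERSION' \
    'k8smaster01   Ready      control-plane,master   17h     v1.21.3' \
    'k8smaster02   NotReady   control-plane,master   11m     v1.21.3' \
    'k8snode01     NotReady   <none>                 2m26s   v1.21.3' |
    not_ready_nodes
```

An empty result means every node reports Ready.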
4. Test the cluster
[root@k8smaster01 flannel]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
[root@k8smaster01 flannel]# kubectl get pods -o wide
NAME                     READY   STATUS              RESTARTS   AGE   IP       NODE        NOMINATED NODE   READINESS GATES
nginx-6799fc88d8-g7vcg   0/1     ContainerCreating   0          11s   <none>   k8snode01   <none>           <none>
[root@k8smaster01 flannel]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
[root@k8smaster01 flannel]# kubectl get pod,svc
NAME                         READY   STATUS    RESTARTS   AGE
pod/nginx-6799fc88d8-g7vcg   1/1     Running   0          98s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.1.0.1     <none>        443/TCP        18h
service/nginx        NodePort    10.1.49.27   <none>        80:32367/TCP   60s
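nginx can then be reached from outside the cluster on port 32367 of any node. All four addresses respond: 192.168.13.107, 192.168.13.108, 192.168.13.109, and the VIP 192.168.13.110.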
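The reachability check can be scripted as well. This is a sketch under assumptions: the function names `nodeport_urls` and `check_nodeport` are hypothetical, and `check_nodeport` needs curl and network access to the cluster, so it is defined here but not called.

```shell
#!/bin/sh
# nodeport_urls PORT IP...: print one http URL per address.
nodeport_urls() {
    port=$1
    shift
    for ip in "$@"; do
        echo "http://$ip:$port/"
    done
}

# check_nodeport PORT IP...: probe each URL with a 5-second timeout.
check_nodeport() {
    nodeport_urls "$@" | while read -r url; do
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "$url reachable"
        else
            echo "$url unreachable"
        fi
    done
}

nodeport_urls 32367 192.168.13.107 192.168.13.108 192.168.13.109 192.168.13.110
```

From a machine that can reach the nodes, `check_nodeport 32367 192.168.13.107 192.168.13.108 192.168.13.109 192.168.13.110` probes all four addresses in turn.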