Upgrading a binary-installed cluster
Rolling upgrade of a binary-installed k8s cluster from 1.15.4 to 1.16.4
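This upgrade builds on the Kubernetes cluster from the earlier binary installation.
1. Because this is a rolling upgrade, we must not upgrade every node at once; we work through the cluster one node at a time. Take one node out of the cluster first; its pods are recreated on the remaining nodes, where kube-scheduler places them. The worker nodes are upgraded first.
2. To upgrade a master, first remove that master from the nginx load balancer so that no traffic reaches it.
3. Upgrade the services.
4. Switch traffic back and check that the cluster state is healthy.
Upgrading the node
1. Download the 1.16.4 binary package
Download the required binary package from GitHub: https://github.com/kubernetes/kubernetes/releases/tag/v1.16.14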
mkdir -p /opt/src/kubernetes-v1.16.4
wget https://dl.k8s.io/v1.16.15/kubernetes-server-linux-amd64.tar.gz -O /opt/src/kubernetes-v1.16.4/kubernetes-server-linux-amd64.tar.gz
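The download step can be wrapped in a small helper so that the release tag and the target directory are passed in explicitly. This is only a sketch: the function names k8s_server_url and k8s_download are hypothetical, and the URL pattern is simply the one used in the wget command above.

```shell
#!/bin/sh
# Sketch: build the download URL and fetch the server tarball into a directory.
# k8s_server_url and k8s_download are hypothetical helper names.

k8s_server_url() {
    # $1: release tag, for example v1.16.15
    printf 'https://dl.k8s.io/%s/kubernetes-server-linux-amd64.tar.gz\n' "$1"
}

k8s_download() {
    # $1: release tag, $2: destination directory
    mkdir -p "$2" || return 1
    # wget -O expects a file path, so the file name is appended to the directory
    wget "$(k8s_server_url "$1")" -O "$2/kubernetes-server-linux-amd64.tar.gz"
}

# Usage (needs network access):
#   k8s_download v1.16.15 /opt/src/kubernetes-v1.16.4
```

Keeping the URL construction in its own function makes it easy to check which release is about to be fetched before anything is downloaded.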
2. Remove a node
[root@jx4-74 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jx4-74.host.com Ready master,node 33d v1.15.4
jx4-75.host.com Ready master,node 33d v1.15.4
jx4-76.host.com Ready node 33d v1.15.4
[root@jx4-74 ~]# kubectl delete node jx4-76.host.com
node "jx4-76.host.com" deleted
[root@jx4-74 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jx4-74.host.com Ready master,node 33d v1.15.4
jx4-75.host.com Ready master,node 33d v1.15.4
Mark the node as unschedulable:
[root@jx4-74 ~]# kubectl cordon jx4-76.host.com
node/jx4-76.host.com cordoned
[root@jx4-74 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jx4-74.host.com Ready master,node 33d v1.15.4
jx4-75.host.com Ready master,node 33d v1.15.4
jx4-76.host.com Ready,SchedulingDisabled node 33d v1.15.4
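Once the node is marked unschedulable, newly started pods are no longer scheduled onto it, but pods already running on the node are not affected.
Evict the pods on the node: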
[root@jx4-74 ~]# kubectl drain jx4-76.host.com --delete-local-data --ignore-daemonsets --force
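- --delete-local-data: evict pods even if they use emptyDir volumes (the data in those volumes is lost).
- --ignore-daemonsets: skip pods managed by a DaemonSet. Evicting them is pointless, because the DaemonSet controller would immediately start them on this node again and the drain would loop forever; without this flag, drain refuses to proceed when such pods are present.
- --force: without it, drain only deletes pods on the node that were created by a ReplicationController, ReplicaSet, DaemonSet, StatefulSet, or Job; with it, "bare" pods that are not managed by any controller are deleted as well.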
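The cordon and drain steps can be combined into one function. The sketch below uses the same kubectl flags as the drain command above; the helper names run and drain_node and the DRY_RUN variable are hypothetical and are only there so the commands can be previewed without touching the cluster.

```shell
#!/bin/sh
# Sketch: cordon a node and then drain it, with an optional dry-run mode.
# run, drain_node and DRY_RUN are hypothetical names.

run() {
    # Print the command instead of executing it when DRY_RUN=1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

drain_node() {
    # $1: node name as shown by kubectl get nodes
    node="$1"
    # Stop new pods from landing on the node; give up if that fails
    run kubectl cordon "$node" || return 1
    # Evict the pods that are already running there
    run kubectl drain "$node" --delete-local-data --ignore-daemonsets --force
}

# Usage:
#   DRY_RUN=1 drain_node jx4-76.host.com   # preview only
#   drain_node jx4-76.host.com             # actually cordon and drain
```

Running once with DRY_RUN=1 is a cheap way to confirm the node name before any pods are evicted.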
[root@jx4-74 ~]# kubectl get pods -o wide --all-namespaces | grep jx4-76.host.com
default test-nginx-r7wcs 1/1 Running 0 29h 10.103.75.4 jx4-76.host.com <none> <none>
kube-system traefik-ingress-controller-mflr8 1/1 Running 0 29h 10.103.75.3 jx4-76.host.com <none> <none>
monitoring log-loki-c8d2v 1/1 Running 0 29h 10.103.75.6 jx4-76.host.com <none> <none>
monitoring node-exporter-qdh7g 2/2 Running 0 29h 192.168.4.75 jx4-76.host.com <none> <none>
[root@jx4-74 ~]# kubectl get ds --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
default test-nginx 3 3 3 3 3 <none> 22d
kube-system traefik-ingress-controller 3 3 3 3 3 <none> 29d
monitoring log-loki 3 3 3 3 3 <none> 5d2h
monitoring node-exporter 3 3 3 3 3 <none> 19d
As shown, every pod except those managed by a DaemonSet controller has been evicted.
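Filtering with grep matches the node name anywhere on the line and treats each dot as a wildcard. A slightly stricter check compares the NODE column exactly. The sketch below assumes the column layout shown in the output above, where NODE is the eighth whitespace-separated field; pods_on_node is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list namespace/name of every pod whose NODE column equals the given node.
# Assumes NODE is the 8th field of "kubectl get pods -o wide --all-namespaces".

pods_on_node() {
    # $1: node name; stdin: output of kubectl get pods -o wide --all-namespaces
    awk -v node="$1" '$8 == node { print $1 "/" $2 }'
}

# Usage:
#   kubectl get pods -o wide --all-namespaces | pods_on_node jx4-76.host.com
```

After a drain, the remaining entries should correspond to the DaemonSets listed by kubectl get ds --all-namespaces.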
Remove the node from the cluster:
[root@jx4-74 ~]# kubectl delete node jx4-76.host.com
3. Upgrade the node
[root@jx4-76 kubernetes-v1.16.4]# tar xvf kubernetes-server-linux-amd64.tar.gz -C /opt/kubernetes-1.16.4/
[root@jx4-76 kubernetes-v1.16.4]# cd /opt/kubernetes-1.16.4/
[root@jx4-76 kubernetes-1.16.4]# mv kubernetes/* ./
[root@jx4-76 kubernetes-1.16.4]# rm -rf kubernetes kubernetes-src.tar.gz LICENSES
[root@jx4-76 kubernetes-1.16.4]# cd server/bin/
[root@jx4-76 bin]# ls
apiextensions-apiserver kube-apiserver kube-controller-manager kubectl kube-proxy.docker_tag kube-scheduler.docker_tag
hyperkube kube-apiserver.docker_tag kube-controller-manager.docker_tag kubelet kube-proxy.tar kube-scheduler.tar
kubeadm kube-apiserver.tar kube-controller-manager.tar kube-proxy kube-scheduler mounter
[root@jx4-76 bin]# rm -f *_tag *.tar
[root@jx4-76 bin]# ls
apiextensions-apiserver hyperkube kubeadm kube-apiserver kube-controller-manager kubectl kubelet kube-proxy kube-scheduler mounter
[root@jx4-76 bin]# cp -r /opt/kubernetes-1.15.4/server/bin/conf/ ./
[root@jx4-76 bin]# cp -r /opt/kubernetes-1.15.4/server/bin/certs/ ./
[root@jx4-76 bin]# cp /opt/kubernetes-1.15.4/server/bin/*.sh ./
[root@jx4-76 opt]# rm -rf kubernetes
[root@jx4-76 opt]# ln -s /opt/kubernetes-1.16.4/ kubernetes
[root@jx4-76 opt]# supervisorctl restart kube-kubelet-4-76 kube-proxy-4-76
[root@jx4-76 opt]# supervisorctl status
flanneld-4-76 RUNNING pid 1154, uptime 29 days, 18:05:52
kube-kubelet-4-76 RUNNING pid 15550, uptime 0:01:12
kube-proxy-4-76 RUNNING pid 16173, uptime 0:00:42
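The rm and ln pair above leaves a short window in which /opt/kubernetes does not exist. As a sketch of an alternative, ln -sfn replaces the existing link in a single command and refuses nothing, so the helper below first checks that the target directory exists. switch_release is a hypothetical name, and the layout (a kubernetes symlink next to versioned directories) is the one used in this walkthrough.

```shell
#!/bin/sh
# Sketch: point <base>/kubernetes at a versioned directory.
# switch_release is a hypothetical helper name.

switch_release() {
    # $1: base directory (for example /opt), $2: versioned directory name
    base="$1"
    release="$2"
    if [ ! -d "$base/$release" ]; then
        echo "missing directory: $base/$release" >&2
        return 1
    fi
    # -n replaces the link itself instead of creating a link inside the old target
    ln -sfn "$base/$release" "$base/kubernetes"
}

# Usage:
#   switch_release /opt kubernetes-1.16.4
#   supervisorctl restart kube-kubelet-4-76 kube-proxy-4-76
```

Because the old versioned directory is left in place, rolling back is the same call with the previous directory name followed by another restart.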
Then check the node version:
[root@jx4-74 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jx4-74.host.com Ready master,node 33d v1.15.4
jx4-75.host.com Ready master,node 33d v1.15.4
jx4-76.host.com Ready <none> 73s v1.16.15
As shown, this node has been upgraded successfully.
Pods are also running on node 76 again:
[root@jx4-74 ~]# kubectl get pods --all-namespaces -o wide | grep jx4-76.host.com
default test-nginx-szlnq 1/1 Running 0 2m58s 10.103.76.2 jx4-76.host.com <none> <none>
kube-system traefik-ingress-controller-r7dtx 1/1 Running 0 2m58s 10.103.76.4 jx4-76.host.com <none> <none>
monitoring log-loki-2rtwt 1/1 Running 0 2m58s 10.103.76.3 jx4-76.host.com <none> <none>
monitoring node-exporter-8zpd7 2/2 Running 0 3m8s 192.168.4.76 jx4-76.host.com <none> <none>
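Upgrading the master
Upgrading a master is much the same as upgrading a node: upgrade one master at a time, remove the one being upgraded from the nginx load balancer first, and then upgrade its services one by one. The details are not repeated here.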
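Taking a master out of the nginx load balancer usually means commenting out its server line in the upstream block. The sketch below does this as a text filter; everything about the configuration is an assumption, since the nginx setup is not shown here: the upstream name, the address 192.168.4.74, port 6443, and the helper name disable_upstream are all hypothetical.

```shell
#!/bin/sh
# Sketch: comment out the upstream "server" line of one backend.
# disable_upstream is a hypothetical helper; the config layout is assumed.

disable_upstream() {
    # $1: backend IP; stdin: nginx config text; stdout: filtered config
    awk -v ip="$1" '
        # Match "server <ip>:<port> ..." lines for exactly this backend
        $1 == "server" && index($2, ip ":") == 1 { print "# " $0; next }
        { print }
    '
}

# Usage (paths are placeholders for the real nginx config file):
#   disable_upstream 192.168.4.74 < nginx.conf > nginx.conf.new
#   review nginx.conf.new, move it into place, then run:
#   nginx -t && nginx -s reload
```

Writing the result to a new file and validating it with nginx -t before reloading keeps a broken edit from taking the load balancer down. Once the master is upgraded, restoring the original file and reloading puts it back into rotation.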
After the upgrade is complete, check the node information:
[root@jx4-74 tmp]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jx4-74.host.com Ready master,node 33d v1.16.15
jx4-75.host.com Ready master,node 80m v1.16.15
jx4-76.host.com Ready node 3h40m v1.16.15
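With more than a handful of nodes, reading the VERSION column by eye gets error-prone. As a sketch, the filter below prints every node whose last column differs from the expected version, so empty output means the whole cluster is on the target release. nodes_not_at_version is a hypothetical helper name, and it assumes VERSION is the last column of the default kubectl get nodes output, as in the listings above.

```shell
#!/bin/sh
# Sketch: list nodes whose VERSION column differs from the expected version.
# nodes_not_at_version is a hypothetical helper name.

nodes_not_at_version() {
    # $1: expected version, for example v1.16.15
    # stdin: output of kubectl get nodes (the first line is the header)
    awk -v want="$1" 'NR > 1 && $NF != want { print $1 }'
}

# Usage:
#   kubectl get nodes | nodes_not_at_version v1.16.15
```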