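Understanding Kubernetes Deployments in One Article: Rolling Updates, Rollbacks, and Update Strategies (Part 2)
Follow-up content will be published on my personal site: https.malusspectabilis.top
The detailed description of a Deployment includes its update strategy configuration, shown in the StrategyType and RollingUpdateStrategy fields of the kubectl describe output: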
root@kubernetes-master01:~# kubectl describe deploy sleep
Name: sleep
Namespace: default
CreationTimestamp: Wed, 25 May 2022 15:03:14 +0800
Labels: <none>
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=sleep
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
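# The default update strategy is RollingUpdate, with maxUnavailable and maxSurge both defaulting to 25%.
1. Deployment update strategies:
A Deployment supports two update strategies: RollingUpdate and Recreate (also known as a single-batch update).
1. Recreate: when the strategy is set to Recreate, an image update first deletes all currently running Pods; only after they have been fully terminated does the controller create a new ReplicaSet (RS) and start the corresponding Pods. The service is unavailable for a period during the update.
2. RollingUpdate: the default strategy. It updates one batch of Pods at a time and moves on to the next batch once the updated Pods are ready, until all Pods are updated. This keeps the service running without interruption, but during the update different clients may receive responses from different versions of the application, because old and new versions coexist.
2. Recreate in practice
Recreate proceeds in three steps:
1. Kill all Pods of the old version; at this point the service cannot be reached.
2. Create the new RS and start the new Pods.
3. Wait for the Pods to become ready and serve traffic again.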
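The three steps can be sketched as a toy timeline. This is only an illustration of why Recreate causes downtime, not how the Deployment controller is implemented; the function names recreate_timeline and has_downtime are made up for this example.

```python
def recreate_timeline(replicas):
    """Return (phase, old_pods, new_pods) tuples for a Recreate update.

    Toy model: real Pods terminate and start asynchronously.
    """
    return [
        ("before update", replicas, 0),
        ("old Pods killed", 0, 0),        # step 1: nothing serves traffic
        ("new Pods ready", 0, replicas),  # steps 2-3: the new RS is up
    ]


def has_downtime(timeline):
    """True if any phase has no Pods at all."""
    return any(old + new == 0 for _, old, new in timeline)


timeline = recreate_timeline(2)
for phase, old, new in timeline:
    print(f"{phase}: old={old} new={new}")
print("downtime:", has_downtime(timeline))  # downtime: True
```

The middle phase, with zero Pods of either version, corresponds to the window in which requests are refused.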
2.1 Example application manifest
# The update strategy and its type must be set explicitly under spec.strategy
root@kubernetes-master01:~# cat nginx-deployment-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nginx-test
  namespace: default
spec:
  strategy:          # update strategy
    type: Recreate   # Recreate deletes all old Pods before creating new ones (single-batch update)
  replicas: 2
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      containers:
      - name: nginx
        image: nginx:1.16
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
root@kubernetes-master01:~# kubectl apply -f nginx-deployment-test.yaml
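2.2 Access test: request the front-end Service, check the Nginx version, and see whether the service is interrupted.
# Requesting any non-existent path returns the Nginx error page, which reveals the version; it is currently 1.16.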
root@kubernetes-master01:~# curl 10.107.246.117/v
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
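2.3 The current version is 1.16; now update the image to test the Recreate strategy. There are three ways to change the image:
1. Edit the YAML manifest, change the spec.template.spec.containers[].image field to the new Nginx version, and re-apply it.
2. Use kubectl set image.
3. Use kubectl edit deployment <deployment-name>.
Here I use kubectl set image.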
root@kubernetes-master01:~# kubectl set image deployment/deployment-nginx-test nginx=nginx:latest
deployment.apps/deployment-nginx-test image updated
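2.4 Access test: there is indeed an interruption in service, so this approach is not recommended in production. During the update the old Pods are in the Terminating state. Recreate is usually used only when the old and new versions of the application are incompatible (for example, they depend on back-end data formats that cannot coexist).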
root@kubernetes-master01:~# while sleep 0.5; do curl http://10.107.246.117/version;done
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
curl: (7) Failed to connect to 10.107.246.117 port 80: Connection refused
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
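3. RollingUpdate in practice
During a rolling update, the number of available Pods must not fall below a threshold so that client requests continue to be served. How the update proceeds and the allowed range of Pod counts are jointly defined by spec.strategy.rollingUpdate.maxSurge and spec.strategy.rollingUpdate.maxUnavailable.
A rolling update replaces one batch of Pods at a time, moving on once the updated Pods are ready, until every Pod is updated. The service stays up, but different application versions may coexist and serve traffic at the same time. The process is:
1. Create the new RS and run new Pods from the new image.
2. Delete old Pods and start new ones; once the new Pods are ready, continue deleting old Pods and starting new ones.
3. Repeat step 2 until all Pods have been updated.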
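The loop in steps 1 to 3 can be sketched as a small simulation. It is a simplified model, not the controller's actual algorithm: it assumes new Pods become ready immediately, and it takes maxSurge and maxUnavailable as absolute Pod counts. The function name simulate_rolling_update is made up for this example.

```python
def simulate_rolling_update(replicas, max_surge, max_unavailable):
    """Return the (old_pods, new_pods) counts after each round."""
    if max_surge == 0 and max_unavailable == 0:
        # With both at 0 no Pod could ever be added or removed.
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = replicas, 0
    history = [(old, new)]
    while old > 0 or new < replicas:
        # Start new Pods without exceeding replicas + max_surge in total.
        new += min(replicas - new, replicas + max_surge - (old + new))
        # Model assumption: the new Pods are ready right away.
        # Delete old Pods while keeping at least
        # replicas - max_unavailable Pods running.
        old -= min(old, old + new - (replicas - max_unavailable))
        history.append((old, new))
    return history


# 2 replicas with the default 25%/25%: maxSurge rounds up to 1,
# maxUnavailable rounds down to 0.
print(simulate_rolling_update(2, max_surge=1, max_unavailable=0))
# [(2, 0), (1, 1), (0, 2)]
```

The middle state, one old Pod and one new Pod, is the window in which clients can see both versions.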
3.1 Prepare the application manifest
# The type field must be set to RollingUpdate
spec:
  strategy:
    type: RollingUpdate
# Application manifest: Nginx is currently 1.16 and will be rolled to 1.21.5.
root@kubernetes-master01:~# cat nginx-deployment-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nginx-test
  namespace: default
spec:
  strategy:
    type: RollingUpdate
  replicas: 2
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      containers:
      - name: nginx
        image: nginx:1.16
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
root@kubernetes-master01:~# kubectl apply -f nginx-deployment-test.yaml
deployment.apps/deployment-nginx-test created
# The Pods are running normally.
root@kubernetes-master01:~# kubectl get pods -l app=nginx-deployment
NAME READY STATUS RESTARTS AGE
deployment-nginx-test-65b579f8d5-gdpd5 1/1 Running 0 3m14s
deployment-nginx-test-65b579f8d5-spq9w 1/1 Running 0 3m14s
3.2 Now access the front-end Service; the version is consistently 1.16.
root@kubernetes-master01:~# while sleep 0.5; do curl http://10.103.162.78/v;done
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
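3.3 Modify the YAML manifest to perform a rolling upgrade, and watch whether service is interrupted.
# Pod status: the 1.16 Pods belonged to ReplicaSet 65b579f8d5; the new ones belong to 76f7b87b7c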
root@kubernetes-master01:~# kubectl get pods
deployment-nginx-test-76f7b87b7c-n2jmm 1/1 Running 0 35s
deployment-nginx-test-76f7b87b7c-qvxhg 0/1 ContainerCreating 0 11s
# Rolling upgrade status.
# I changed the image in the manifest rather than using set image.
root@kubernetes-master01:~# kubectl apply -f nginx-deployment-test.yaml && kubectl rollout status deploy deployment-nginx-test
deployment.apps/deployment-nginx-test configured
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 old replicas are pending termination...
deployment "deployment-nginx-test" successfully rolled out
# Check the ReplicaSets: the 65b579f8d5 ReplicaSet has been scaled down to 0 Pods.
root@kubernetes-master01:~# kubectl get replicaset
deployment-nginx-test-577977f4b6 2 2 2 7m
deployment-nginx-test-65b579f8d5 0 0 0 17m
3.4 Observe access
# Service stays reachable with no interruption, but old and new versions coexist for a while.
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
^C
4. Rolling back an application
4.1. List the revision history with the following command:
root@kubernetes-master01:~# kubectl rollout history deployment deployment-nginx-test
deployment.apps/deployment-nginx-test
REVISION CHANGE-CAUSE
1 <none>
2 <none>
4.2. Pass a revision number to view the details of that revision, including its image:
root@kubernetes-master01:~# kubectl rollout history deploy deployment-nginx-test --revision=2
deployment.apps/deployment-nginx-test with revision #2
Pod Template:
Labels: app=nginx-deployment
pod-template-hash=76f7b87b7c
Containers:
nginx:
Image: nginx:1.16
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
4.3. Now roll back to the 1.16 version.
# Use kubectl rollout undo to perform the rollback.
root@kubernetes-master01:~# kubectl rollout undo deploy deployment-nginx-test --to-revision=2 && kubectl rollout status deploy deployment-nginx-test
deployment.apps/deployment-nginx-test rolled back
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "deployment-nginx-test" rollout to finish: 1 old replicas are pending termination...
deployment "deployment-nginx-test" successfully rolled out
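One detail worth knowing when reading the history afterwards: a rollback does not rewind the revision counter. The target revision's Pod template is re-applied under a new, higher revision number, and the old number disappears from the history. The sketch below models only that renumbering; the function name rollback and the sample history (revision numbers and images) are illustrative assumptions, not taken from the transcript above.

```python
def rollback(history, to_revision):
    """Return the revision history after rolling back to to_revision.

    history maps revision number -> image; the highest number is current.
    """
    if to_revision not in history:
        raise KeyError(f"revision {to_revision} not found")
    current = max(history)
    if to_revision == current:
        return dict(history)  # already at this revision; nothing changes
    updated = dict(history)
    template = updated.pop(to_revision)  # the old number is removed...
    updated[current + 1] = template      # ...and reused as a new revision
    return updated


history = {1: "nginx:1.16", 2: "nginx:1.21.5"}
print(rollback(history, 1))  # {2: 'nginx:1.21.5', 3: 'nginx:1.16'}
```

This is why running kubectl rollout history after a rollback usually shows a gap in the revision numbers.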
4.4. Observe access: old and new versions again alternate during the rollback.
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.21.5</center>
</body>
</html>
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
^C
4.5. Check the Pods and ReplicaSets
root@kubernetes-master01:~# kubectl get pods -l app=nginx-deployment
NAME READY STATUS RESTARTS AGE
deployment-nginx-test-76f7b87b7c-5tx7z 1/1 Running 0 2m34s
deployment-nginx-test-76f7b87b7c-t9ld2 1/1 Running 0 2m57s
# The ReplicaSets are as expected too.
root@kubernetes-master01:~# kubectl get replicaset -l app=nginx-deployment
NAME DESIRED CURRENT READY AGE
deployment-nginx-test-577977f4b6 0 0 0 20m
deployment-nginx-test-65b579f8d5 0 0 0 31m
deployment-nginx-test-76f7b87b7c 2 2 2 24m
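5. RollingUpdate in practice: maxSurge and other parameters
5.1 maxSurge
Specifies how many Pods above the desired count may exist during the update. It can be an absolute number (including 0) or a percentage of the desired count, which is rounded up; the default is 25%.
For example, with 5 desired replicas and maxSurge set to 20%, the total number of Pods may not exceed 6: 5 + (5 × 20%) = 6.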
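5.2 maxUnavailable
Specifies the maximum number of Pods (counting both old and new versions) that may be unavailable during the update, relative to the desired count. It can be an absolute number or a percentage, which is rounded down; the default is 25%.
For example, with 5 desired replicas and maxUnavailable set to 20%, at least 4 Pods must remain available: 5 - (5 × 20%) = 4.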
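The two formulas can be checked with a short calculation. This is a minimal sketch, assuming percentages are resolved the way described above (maxSurge rounded up, maxUnavailable rounded down); it ignores edge cases such as both values resolving to 0. The helper names resolve and pod_bounds are made up for this example.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "20%" into a Pod count."""
    if isinstance(value, int):
        return value
    product = int(value.rstrip("%")) * replicas
    # Integer arithmetic avoids floating-point rounding surprises.
    return -(-product // 100) if round_up else product // 100


def pod_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (max_total_pods, min_available_pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


print(pod_bounds(5, "20%", "20%"))  # (6, 4)
```

The rounding direction matters as soon as the percentage does not divide evenly: pod_bounds(2) with the default 25% gives (3, 2), so one extra Pod may be created but none may become unavailable.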
5.3 The configuration is as follows
Set replicas to 7
Set maxSurge to 20%
Set maxUnavailable to 20%
root@kubernetes-master01:~# cat app1-rollingUpdate.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment-surge
  namespace: default
spec:
  progressDeadlineSeconds: 600
  revisionHistoryLimit: 15
  minReadySeconds: 10
  strategy:
    rollingUpdate:
      maxSurge: 20%
      maxUnavailable: 20%
    type: RollingUpdate
  replicas: 7
  selector:
    matchLabels:
      app: app-deploy
  template:
    metadata:
      labels:
        app: app-deploy
    spec:
      containers:
      - name: app-deploy
        image: nginx:1.16
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
root@kubernetes-master01:~# kubectl apply -f app1-rollingUpdate.yaml
5.4. The Pods are running normally.
root@kubernetes-master01:~# kubectl get pods -l app=app-deploy
NAME READY STATUS RESTARTS AGE
app-deployment-surge-5865df56c9-44j6h 1/1 Running 0 28s
app-deployment-surge-5865df56c9-4hfm9 1/1 Running 0 28s
app-deployment-surge-5865df56c9-6l6hh 1/1 Running 0 28s
app-deployment-surge-5865df56c9-c2w5x 1/1 Running 0 28s
app-deployment-surge-5865df56c9-fm4xd 1/1 Running 0 28s
app-deployment-surge-5865df56c9-jhxkb 1/1 Running 0 28s
app-deployment-surge-5865df56c9-q4tqf 1/1 Running 0 28s
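5.5 Accessing through the Service shows Nginx 1.16. The focus here is the rolling update strategy rather than the version itself, so the version can be ignored.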
root@kubernetes-master01:~# curl 10.109.231.242/v
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>
5.6 Prepare the update and inspect the ReplicaSet
Edit the image in the YAML manifest:
image: nginx:latest
root@kubernetes-master01:~# kubectl describe replicaset app-deployment-surge-5c5cdb76b4
Name: app-deployment-surge-5c5cdb76b4
Namespace: default
Selector: app=app-deploy,pod-template-hash=5c5cdb76b4
Labels: app=app-deploy
pod-template-hash=5c5cdb76b4
Annotations: deployment.kubernetes.io/desired-replicas: 7 # the desired replica count is 7
deployment.kubernetes.io/max-replicas: 9 # at most 9 replicas may exist during the update
deployment.kubernetes.io/revision: 2 # 2 is the revision we are currently on
Controlled By: Deployment/app-deployment-surge
Replicas: 7 current / 7 desired
Pods Status: 7 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=app-deploy
pod-template-hash=5c5cdb76b4
Containers:
app-deploy:
Image: nginx:latest
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 3m14s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-c8cfd
Normal SuccessfulCreate 3m14s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-t78hv
Normal SuccessfulCreate 3m14s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-dkl2r
Normal SuccessfulCreate 3m1s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-tmrr8
Normal SuccessfulCreate 3m1s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-qn94k
Normal SuccessfulCreate 3m1s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-c7ht8
Normal SuccessfulCreate 2m48s replicaset-controller Created pod: app-deployment-surge-5c5cdb76b4-97t8n
5.7 Verify with the rollout history command
root@kubernetes-master01:~# kubectl rollout history deploy app-deployment-surge --revision=2
deployment.apps/app-deployment-surge with revision #2
Pod Template:
Labels: app=app-deploy
pod-template-hash=5c5cdb76b4
Containers:
app-deploy:
Image: nginx:latest
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
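5.8 Verification
With 7 replicas defined and maxSurge at 20%, at most 9 Pods may exist during the update (7 + 2, since 7 × 20% = 1.4 rounds up to 2), which matches the max-replicas annotation above. With maxUnavailable at 20% (1.4 rounds down to 1), at least 6 Pods must stay available.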
5.minReadySeconds
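minReadySeconds: the spec.minReadySeconds field controls the pace of an update. By default, a newly created Pod counts as available as soon as its readiness probe succeeds, and the next round of replacement starts immediately. spec.minReadySeconds defines how long a new Pod must stay ready before it is treated as available; the update is blocked during that window, so Kubernetes waits for a period after each batch of Pods before starting the next round. A sensible value therefore not only slows the update down but also lets the Deployment catch failures caused by bugs that surface shortly after startup.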
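The availability rule can be sketched as a single comparison. This is a simplified model that looks only at how long a Pod has been ready; the function name is_available and the timestamps are made up for this example, and the value 10 mirrors the minReadySeconds in the manifest from 5.3.

```python
from datetime import datetime, timedelta


def is_available(ready_since, now, min_ready_seconds=0):
    """A Pod is available once it has been ready for min_ready_seconds.

    ready_since is None while the Pod has not passed its readiness probe.
    """
    if ready_since is None:
        return False
    return now - ready_since >= timedelta(seconds=min_ready_seconds)


ready_at = datetime(2022, 5, 25, 15, 0, 0)
print(is_available(ready_at, ready_at + timedelta(seconds=5), 10))   # False
print(is_available(ready_at, ready_at + timedelta(seconds=10), 10))  # True
```

With the default of 0, a Pod is available the moment it becomes ready, which is why rollouts proceed without any pause unless the field is set.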
6.RevisionHistoryLimit
The Deployment controller retains the ReplicaSet objects of some earlier revisions; a rollback simply reuses the old ReplicaSet. The number of retained revisions is set by spec.revisionHistoryLimit, which defaults to 10.
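The effect of the limit can be pictured as keeping only the newest old revisions. The sketch below is a toy model of that pruning, not the controller's code; the function name prune_old_revisions and the sample revision numbers are made up for this example.

```python
def prune_old_revisions(old_revisions, limit=10):
    """Return the revision numbers of the old ReplicaSets that are kept.

    old_revisions lists the revisions of ReplicaSets that are no longer
    current; the newest `limit` of them survive.
    """
    if limit < 0:
        raise ValueError("limit must not be negative")
    ordered = sorted(old_revisions)
    # Slice from the front so that limit=0 keeps nothing.
    return ordered[max(len(ordered) - limit, 0):]


print(prune_old_revisions([1, 2, 3, 4], limit=2))  # [3, 4]
print(prune_old_revisions([1, 2, 3, 4], limit=0))  # []
```

A revision that has been pruned can no longer be a rollback target, so setting the limit to 0 trades away the ability to roll back.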
7.progressDeadlineSeconds
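The timeout for a stalled rollout, 600 seconds by default. A rollout can get stuck for many reasons without having definitively failed, for example an image that cannot be pulled or insufficient permissions. If the rollout has made no progress when progressDeadlineSeconds elapses, Kubernetes reports the condition: the Deployment's Progressing condition is set to False with a reason attached (ProgressDeadlineExceeded). This only reports the problem; the Deployment controller keeps retrying the stalled rollout and takes no other action.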
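The check behind that condition can be sketched as a comparison against the deadline. This is a simplified model: the function name progressing_condition and the idea of passing in the seconds since the last progress are assumptions for illustration, and the reason is left as None while the rollout is still within its deadline.

```python
def progressing_condition(seconds_since_progress, deadline_seconds=600):
    """Return a simplified Progressing condition for a rollout."""
    if seconds_since_progress > deadline_seconds:
        return {
            "type": "Progressing",
            "status": "False",
            "reason": "ProgressDeadlineExceeded",
        }
    # Simplification: the real condition also carries a reason here.
    return {"type": "Progressing", "status": "True", "reason": None}


print(progressing_condition(30)["status"])   # True
print(progressing_condition(601)["reason"])  # ProgressDeadlineExceeded
```

Because the condition is only a report, something outside the Deployment (an operator or a pipeline) has to read it and decide whether to roll back.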