Switching from OpenShift SDN to OVN-Kubernetes

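OVN-Kubernetes, the new network plugin, has been GA in OpenShift 4 since version 4.6. Its implementation differs from the original OpenShift SDN as follows:

[Figure: implementation differences between OpenShift SDN and OVN-Kubernetes]

The overall OVN architecture is shown in the following figure:

[Figure: OVN architecture]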

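This article records the main steps for switching a cluster from the traditional OpenShift SDN network to OVN-Kubernetes.

1. First, confirm that the cluster is currently healthy.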

[root@bastion cluster-fe55]# oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-129-219.us-east-2.compute.internal   Ready    master   24m   v1.23.5+9ce5071
ip-10-0-152-235.us-east-2.compute.internal   Ready    worker   18m   v1.23.5+9ce5071
ip-10-0-166-169.us-east-2.compute.internal   Ready    master   24m   v1.23.5+9ce5071
ip-10-0-190-233.us-east-2.compute.internal   Ready    worker   11m   v1.23.5+9ce5071
ip-10-0-193-179.us-east-2.compute.internal   Ready    worker   18m   v1.23.5+9ce5071
ip-10-0-199-160.us-east-2.compute.internal   Ready    master   24m   v1.23.5+9ce5071

 

[root@bastion cluster-fe55]# oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.11   True        False         False      4m33s   
baremetal                                  4.10.11   True        False         False      22m     
cloud-controller-manager                   4.10.11   True        False         False      24m     
cloud-credential                           4.10.11   True        False         False      25m     
cluster-autoscaler                         4.10.11   True        False         False      22m     
config-operator                            4.10.11   True        False         False      23m     
console                                    4.10.11   True        False         False      2m15s   
csi-snapshot-controller                    4.10.11   True        False         False      23m     
dns                                        4.10.11   True        False         False      22m     
etcd                                       4.10.11   True        False         False      21m     
image-registry                             4.10.11   True        False         False      11m     
ingress                                    4.10.11   True        False         False      13m     
insights                                   4.10.11   True        False         False      10m     
kube-apiserver                             4.10.11   True        False         False      7m27s   
kube-controller-manager                    4.10.11   True        False         False      20m     
kube-scheduler                             4.10.11   True        False         False      20m     
kube-storage-version-migrator              4.10.11   True        False         False      23m     
machine-api                                4.10.11   True        False         False      19m     
machine-approver                           4.10.11   True        False         False      23m     
machine-config                             4.10.11   True        False         False      22m     
marketplace                                4.10.11   True        False         False      22m     
monitoring                                 4.10.11   True        False         False      10m     
network                                    4.10.11   True        False         False      24m 
node-tuning                                4.10.11   True        False         False      23m     
openshift-apiserver                        4.10.11   True        False         False      7m28s   
openshift-controller-manager               4.10.11   True        False         False      21m     
openshift-samples                          4.10.11   True        False         False      14m     
operator-lifecycle-manager                 4.10.11   True        False         False      23m     
operator-lifecycle-manager-catalog         4.10.11   True        False         False      23m     
operator-lifecycle-manager-packageserver   4.10.11   True        False         False      15m     
service-ca                                 4.10.11   True        False         False      23m     
storage                                    4.10.11   True        False         False      23m  
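With some thirty operators, the table above is easy to misread. As a small sketch, the helper below prints any operator that is not AVAILABLE=True, PROGRESSING=False, DEGRADED=False. The function name `unhealthy_operators` is made up for this example, and it assumes the default column order of `oc get co`.

```shell
# Reads `oc get co --no-headers` on stdin and prints the NAME of each
# operator that is not Available=True, Progressing=False, Degraded=False.
# Assumes the default column order: NAME VERSION AVAILABLE PROGRESSING DEGRADED ...
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" { print $1 }'
}

# Intended use (prints nothing on a healthy cluster):
#   oc get co --no-headers | unhealthy_operators
```

An empty result means every operator is in the expected state. The same check is useful again after the migration.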

Pods in the openshift-sdn namespace:

[root@bastion cluster-fe55]# oc get pods -n openshift-sdn 
NAME                   READY   STATUS    RESTARTS   AGE
sdn-bjnld              2/2     Running   0          20m
sdn-chgzt              2/2     Running   0          19m
sdn-controller-5lw7p   2/2     Running   0          25m
sdn-controller-bsqf9   2/2     Running   0          25m
sdn-controller-lgfcw   2/2     Running   0          25m
sdn-jjskf              2/2     Running   0          25m
sdn-k5ff9              2/2     Running   0          25m
sdn-mtf6h              2/2     Running   0          13m
sdn-vn9lg              2/2     Running   0          25m

Back up the current network configuration:

oc get Network.config.openshift.io cluster -o yaml > cluster-openshift-sdn.yaml

2. Preparation: put the Cluster Network Operator into migration state

[root@bastion cluster-fe55]# oc patch Network.operator.openshift.io cluster --type='merge' --patch '{ "spec": { "migration": {"networkType": "OVNKubernetes" } } }'
network.operator.openshift.io/cluster patched

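Watch the MachineConfigPools (mcp) and the machineconfig annotations on each node until the update finishes. The nodes are rebooted during the update.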

[root@bastion cluster-fe55]# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-716bbe8d388b0c03c7220b92261ce3b7   False     True       False      3              0                   0                     0                      31m
worker   rendered-worker-19e997749a8169203d7633a41779dc70   False     True       False      3              0                   0                     0                      31m
[root@bastion cluster-fe55]# oc describe node | egrep "hostname|machineconfig"
                    kubernetes.io/hostname=ip-10-0-129-219.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-master-716bbe8d388b0c03c7220b92261ce3b7
                    machineconfiguration.openshift.io/desiredConfig: rendered-master-716bbe8d388b0c03c7220b92261ce3b7
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    kubernetes.io/hostname=ip-10-0-152-235.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-19e997749a8169203d7633a41779dc70
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-19e997749a8169203d7633a41779dc70
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    kubernetes.io/hostname=ip-10-0-166-169.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-master-716bbe8d388b0c03c7220b92261ce3b7
                    machineconfiguration.openshift.io/desiredConfig: rendered-master-716bbe8d388b0c03c7220b92261ce3b7
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    kubernetes.io/hostname=ip-10-0-190-233.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-19e997749a8169203d7633a41779dc70
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-19e997749a8169203d7633a41779dc70
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    kubernetes.io/hostname=ip-10-0-193-179.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-8a02c6921084c925662477a94e36d6d2
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-8a02c6921084c925662477a94e36d6d2
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    kubernetes.io/hostname=ip-10-0-199-160.us-east-2.compute.internal
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-master-716bbe8d388b0c03c7220b92261ce3b7
                    machineconfiguration.openshift.io/desiredConfig: rendered-master-53edd0e339dd68dc86dbcd4d60b244a6
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Working
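The filtered output above is long, and the one node that is mid-update is easy to miss. As a rough sketch, the helper below reads that same output and prints only the nodes whose currentConfig and desiredConfig differ. The name `nodes_still_updating` is hypothetical, and it assumes the annotations appear in the order shown above (hostname, then currentConfig, then desiredConfig).

```shell
# Reads the output of
#   oc describe node | egrep "hostname|machineconfig"
# on stdin and prints each node whose desiredConfig differs from its
# currentConfig, i.e. a node that is in the middle of an update.
nodes_still_updating() {
  awk '
    /kubernetes\.io\/hostname=/ { node = $0; sub(/.*hostname=/, "", node) }
    /currentConfig:/            { current = $NF }
    /desiredConfig:/            { if ($NF != current) print node }
  '
}

# Intended use:
#   oc describe node | egrep "hostname|machineconfig" | nodes_still_updating
```

Note that this only flags nodes in progress. In the output above, nodes that have not started yet still show the old rendered config for both values, so the pool-level view from `oc get mcp` is still what tells you the update is complete.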

Wait until every pool reports UPDATED=True:

[root@bastion cluster-fe55]# oc get mcp 
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-53edd0e339dd68dc86dbcd4d60b244a6   True      False      False      3              3                   3                     0                      43m
worker   rendered-worker-8a02c6921084c925662477a94e36d6d2   True      False      False      3              3                   3                     0                      43m
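Rather than re-running `oc get mcp` by hand, the wait can be scripted. The following is a minimal polling sketch, not an official procedure. The function names, the default of 90 tries, and the 20-second interval are arbitrary choices for illustration. It treats a pool as done when UPDATED is True, UPDATING is False, and UPDATEDMACHINECOUNT equals MACHINECOUNT.

```shell
# Reads `oc get mcp --no-headers` on stdin and prints every pool that has
# not finished updating. Columns: NAME CONFIG UPDATED UPDATING DEGRADED
# MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT ...
pending_pools() {
  awk '$3 != "True" || $4 != "False" || $6 != $8 { print $1 }'
}

# Polls until no pool is pending. Returns 1 after max_tries polls.
# Usage: wait_for_mcp [max_tries] [interval_seconds]
wait_for_mcp() {
  max_tries="${1:-90}"
  interval="${2:-20}"
  tries=0
  while [ "$tries" -lt "$max_tries" ]; do
    # The API can be briefly unreachable while masters reboot, so a
    # failed or empty query counts as "not done yet", never as success.
    if out=$(oc get mcp --no-headers 2>/dev/null) && [ -n "$out" ]; then
      pending=$(printf '%s\n' "$out" | pending_pools)
      if [ -z "$pending" ]; then
        return 0
      fi
      echo "still updating: $pending"
    else
      echo "API not reachable, retrying"
    fi
    sleep "$interval"
    tries=$((tries + 1))
  done
  return 1
}
```

One caveat: if this runs immediately after the patch, the pools may still report UPDATED=True for the old rendered config. Checking that the CONFIG column has changed from its earlier value guards against that.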

 

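Confirm that the preparation is complete: the new rendered machineconfig of each pool should start configure-ovs.sh in OVNKubernetes mode.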

[root@bastion cluster-fe55]# oc get machineconfig rendered-master-53edd0e339dd68dc86dbcd4d60b244a6  -o yaml | grep ExecStart | grep OVNKubernetes
          ExecStart=/usr/local/bin/configure-ovs.sh OVNKubernetes
[root@bastion cluster-fe55]# oc get machineconfig rendered-worker-8a02c6921084c925662477a94e36d6d2  -o yaml | grep ExecStart | grep OVNKubernetes
          ExecStart=/usr/local/bin/configure-ovs.sh OVNKubernetes
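The rendered config names above were copied by hand from the `oc get mcp` output. As a sketch, the same check can look the name up from the pool itself. The function name `pool_prepared_for_ovn` is hypothetical, and it assumes the pool's current rendered config is exposed at `.status.configuration.name`.

```shell
# Succeeds when the rendered MachineConfig currently applied to the given
# pool starts configure-ovs.sh in OVNKubernetes mode.
# Usage: pool_prepared_for_ovn master
pool_prepared_for_ovn() {
  pool="$1"
  # Assumption: the pool reports its rendered config name in this field.
  rendered=$(oc get mcp "$pool" -o jsonpath='{.status.configuration.name}') || return 1
  oc get machineconfig "$rendered" -o yaml | grep ExecStart | grep -q OVNKubernetes
}

# Intended use:
#   for pool in master worker; do
#     pool_prepared_for_ovn "$pool" && echo "$pool: ready" || echo "$pool: NOT ready"
#   done
```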

3. Start the actual switch

[root@bastion cluster-fe55]# oc patch Network.config.openshift.io cluster --type='merge' --patch '{ "spec": { "networkType": "OVNKubernetes" } }'
network.config.openshift.io/cluster patched

The multus daemon set is rolled out to the new version first. Check its progress with:

[root@bastion cluster-fe55]# oc -n openshift-multus rollout status daemonset/multus
daemon set "multus" successfully rolled out

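4. Once the multus daemon set has rolled out, reboot every node in the cluster

If the cluster is not in a cloud environment, or you can ssh to every node, the following creates a script, ~/reboot-nodes.sh, that reboots them: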

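# Quote the heredoc delimiter so that $(...) and $ip are written literally
# and expand when the script runs, not when it is created.
cat << 'EOF' > ~/reboot-nodes.sh
#!/bin/bash

# Reboot every node over ssh, using its InternalIP address.
for ip in $(oc get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
do
  echo "reboot node $ip"
  ssh -o StrictHostKeyChecking=no core@$ip sudo shutdown -r -t 3
done
EOF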

On AWS, where the helper (bastion) node cannot ssh to the cluster nodes, reboot each node through a debug pod instead:

[root@bastion ~]# oc debug node/ip-10-0-199-160.us-east-2.compute.internal 
Starting pod/ip-10-0-199-160us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.199.160
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# systemctl reboot
sh-4.4# 
Removing debug pod ...
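Repeating the debug session by hand for all six nodes is tedious. The loop below is one possible way to script it, under two assumptions that may not hold everywhere: `oc debug` can run a command non-interactively after `--`, and the API stays reachable while the nodes go down one after another. The function name `reboot_nodes_via_debug` is made up for this sketch.

```shell
# Reboots every node through a debug pod instead of ssh.
reboot_nodes_via_debug() {
  # `oc get nodes -o name` prints entries of the form node/<name>.
  for node in $(oc get nodes -o name); do
    echo "reboot $node"
    # The debug session is cut off when the host goes down, so a
    # non-zero exit status here is expected and ignored.
    oc debug "$node" -- chroot /host systemctl reboot || true
  done
}

# Intended use:
#   reboot_nodes_via_debug
```

If a reboot fails because the API is temporarily unavailable, re-run the command for the remaining nodes once `oc get nodes` responds again.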

 

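After the reboot completes, confirm that all nodes are Ready and that the cluster reports the new network type: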

[root@bastion ~]# oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-129-219.us-east-2.compute.internal   Ready    master   78m   v1.23.5+9ce5071
ip-10-0-152-235.us-east-2.compute.internal   Ready    worker   73m   v1.23.5+9ce5071
ip-10-0-166-169.us-east-2.compute.internal   Ready    master   78m   v1.23.5+9ce5071
ip-10-0-190-233.us-east-2.compute.internal   Ready    worker   66m   v1.23.5+9ce5071
ip-10-0-193-179.us-east-2.compute.internal   Ready    worker   73m   v1.23.5+9ce5071
ip-10-0-199-160.us-east-2.compute.internal   Ready    master   78m   v1.23.5+9ce5071

[root@bastion ~]# oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
OVNKubernetes
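For use in a script, the same query can be wrapped in a test that fails when the reported type is not the expected one. This is a small sketch, and the function name `network_type_is` is hypothetical.

```shell
# Succeeds only when the cluster reports the expected network type.
# Usage: network_type_is OVNKubernetes
network_type_is() {
  expected="$1"
  actual=$(oc get network.config/cluster -o jsonpath='{.status.networkType}') || return 1
  [ "$actual" = "$expected" ]
}

# Intended use:
#   network_type_is OVNKubernetes || echo "migration has not taken effect yet"
```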

Check that all cluster operators are healthy (oc get co, as in step 1).

Check the status of the OVN-Kubernetes pods:

[root@bastion ~]# oc get  pods -n openshift-ovn-kubernetes  -o wide
NAME                   READY   STATUS    RESTARTS       AGE   IP             NODE                                         NOMINATED NODE   READINESS GATES
ovnkube-master-2s82h   6/6     Running   15 (19m ago)   38m   10.0.166.169   ip-10-0-166-169.us-east-2.compute.internal   <none>           <none>
ovnkube-master-hm5wj   6/6     Running   9              38m   10.0.129.219   ip-10-0-129-219.us-east-2.compute.internal   <none>           <none>
ovnkube-master-jnll2   6/6     Running   15 (19m ago)   38m   10.0.199.160   ip-10-0-199-160.us-east-2.compute.internal   <none>           <none>
ovnkube-node-bhnsg     5/5     Running   13 (19m ago)   38m   10.0.199.160   ip-10-0-199-160.us-east-2.compute.internal   <none>           <none>
ovnkube-node-lzprb     5/5     Running   8              10m   10.0.129.219   ip-10-0-129-219.us-east-2.compute.internal   <none>           <none>
ovnkube-node-m4jr5     5/5     Running   13             38m   10.0.190.233   ip-10-0-190-233.us-east-2.compute.internal   <none>           <none>
ovnkube-node-m7pww     5/5     Running   13             38m   10.0.193.179   ip-10-0-193-179.us-east-2.compute.internal   <none>           <none>
ovnkube-node-w4prc     5/5     Running   13 (19m ago)   38m   10.0.166.169   ip-10-0-166-169.us-east-2.compute.internal   <none>           <none>
ovnkube-node-zn9fx     5/5     Running   13             38m   10.0.152.235   ip-10-0-152-235.us-east-2.compute.internal   <none>           <none>

The ovnkube-master pods all run on master nodes, while every node runs an ovnkube-node pod.

[root@bastion ~]# oc get nodes | grep master
ip-10-0-129-219.us-east-2.compute.internal   Ready    master   88m   v1.23.5+9ce5071
ip-10-0-166-169.us-east-2.compute.internal   Ready    master   87m   v1.23.5+9ce5071
ip-10-0-199-160.us-east-2.compute.internal   Ready    master   87m   v1.23.5+9ce5071

Containers in an ovnkube-master pod:

[Figure: containers in ovnkube-master]

Containers in an ovnkube-node pod:

[Figure: containers in ovnkube-node]

[root@bastion ~]# oc get ds -n openshift-ovn-kubernetes
NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                 AGE
ovnkube-master   3         3         3       3            3           beta.kubernetes.io/os=linux,node-role.kubernetes.io/master=   56m
ovnkube-node     6         6         6       6            6           beta.kubernetes.io/os=linux                                   56m
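Both daemon sets above show READY equal to DESIRED. A short sketch of that comparison as a filter is shown below. The name `daemonsets_not_ready` is made up, and it assumes the default column order of `oc get ds`.

```shell
# Reads `oc get ds --no-headers` on stdin and prints each daemon set whose
# READY count differs from DESIRED.
# Columns: NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE ...
daemonsets_not_ready() {
  awk '$2 != $4 { print $1 }'
}

# Intended use (prints nothing when everything is ready):
#   oc get ds -n openshift-ovn-kubernetes --no-headers | daemonsets_not_ready
```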

 

5. Clear the migration state on the operator

[root@bastion ~]# oc patch Network.operator.openshift.io cluster --type='merge' --patch '{ "spec": { "migration": null } }'
network.operator.openshift.io/cluster patched

Cleanup: remove the OpenShift SDN configuration and its namespace

oc patch Network.operator.openshift.io cluster --type='merge' \
  --patch '{ "spec": { "defaultNetwork": { "openshiftSDNConfig": null } } }'

oc delete namespace openshift-sdn
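Deleting the openshift-sdn namespace is destructive, so it can be worth guarding. The following sketch runs the two cleanup commands only when the cluster already reports OVNKubernetes as its network type. The function name `cleanup_openshift_sdn` is an assumption for illustration, not part of any tool.

```shell
# Runs the OpenShift SDN cleanup only after the migration has taken effect.
cleanup_openshift_sdn() {
  current=$(oc get network.config/cluster -o jsonpath='{.status.networkType}') || return 1
  if [ "$current" != "OVNKubernetes" ]; then
    echo "network type is '$current', skipping cleanup" >&2
    return 1
  fi
  # Same two commands as above: drop the SDN config, then its namespace.
  oc patch Network.operator.openshift.io cluster --type='merge' \
    --patch '{ "spec": { "defaultNetwork": { "openshiftSDNConfig": null } } }' || return 1
  oc delete namespace openshift-sdn
}

# Intended use:
#   cleanup_openshift_sdn
```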

 

posted @ 2022-05-17 12:06  ericnie