Geek Time Ops Advanced Training Camp, Week 16 Assignment
1. Summarize the differences between Underlay and Overlay networks, and their advantages and disadvantages
- Underlay: pods use the host network directly, with no encapsulation or decapsulation, so performance is good. Multiple virtual interfaces (sub-interfaces) are carved out of the host's physical NIC; each virtual interface has its own unique MAC address and can be assigned an IP on the sub-interface. It depends heavily on the physical network.
  Drawbacks: it consumes many IP addresses, the subnets must be planned large enough, and broadcast traffic in the segment grows accordingly.
- Overlay: an overlay network encapsulates the container network inside the host network; the container MAC addresses are carried inside packets transported over the host network, so the effect is as if L2 Ethernet frames were being switched within a single broadcast domain. It is one of the most widely used options for private networks.
  Advantages: good compatibility with the physical network and no extra requirements on it; pods can communicate across host subnets; network plugins such as calico and flannel support overlay networks, and it is widely used in private clouds.
  Drawbacks: the extra encapsulation and decapsulation add performance overhead.
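To make the two models concrete, the sketch below creates both kinds of interfaces by hand with iproute2. This only illustrates the underlying Linux primitives (a macvlan sub-interface for underlay, a VXLAN device for overlay); the interface names, VNI and addresses are made up for the example, and CNI plugins such as hybridnet, calico or flannel do the equivalent work automatically:

```bash
# Underlay style: a macvlan sub-interface on the physical NIC eth0.
# It gets its own MAC and an IP taken from the host's physical subnet.
ip link add mac0 link eth0 type macvlan mode bridge
ip addr add 172.31.6.100/21 dev mac0        # example address from the host subnet
ip link set mac0 up

# Overlay style: a VXLAN device that tunnels L2 frames over the host network.
# Inner (container) addresses live in their own CIDR, invisible to the physical switches.
ip link add vxlan100 type vxlan id 100 dev eth0 dstport 4789
ip addr add 10.200.100.1/24 dev vxlan100    # example overlay address
ip link set vxlan100 up
```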
2. Implement an underlay network in a Kubernetes cluster
```bash
# Run on all nodes
apt-get -y update
apt -y install apt-transport-https ca-certificates curl software-properties-common
# Install the GPG key
curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | apt-key add -
# Add the package repository
add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Refresh the package index
apt-get -y update
# List the installable Docker versions
apt-cache madison docker-ce docker-ce-cli
apt install -y docker-ce=5:20.10.23~3-0~ubuntu-jammy docker-ce-cli=5:20.10.23~3-0~ubuntu-jammy
systemctl start docker && systemctl enable docker

mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://9916w1ow.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload && sudo systemctl restart docker

# Install cri-dockerd
cd /usr/local/src/
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.1/cri-dockerd-0.3.1.amd64.tgz
tar xvf cri-dockerd-0.3.1.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/local/bin

tee /lib/systemd/system/cri-docker.service << "EOF"
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket

[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

tee /etc/systemd/system/cri-docker.socket << "EOF"
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service

[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
EOF

systemctl daemon-reload && systemctl restart cri-docker && systemctl enable cri-docker && systemctl enable --now cri-docker.socket
systemctl status cri-docker.service
# Check the CRI socket file
ls /var/run/cri-dockerd.sock

# Install kubeadm
## Configure the Kubernetes package repository
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

## Install kubeadm, kubelet and kubectl
apt-get update
apt-cache madison kubeadm
apt-get install -y kubelet=1.24.10-00 kubeadm=1.24.10-00 kubectl=1.24.10-00
kubeadm config images list --kubernetes-version v1.24.10

tee /opt/images-download.sh << "EOF"
#!/bin/bash
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.6-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.6
EOF
bash /opt/images-download.sh
```
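Before initializing the control plane it can help to confirm that the pinned versions, the CRI socket and the pre-pulled images are actually in place. The quick checks below are an addition to the original notes, not part of them:

```bash
# Sanity checks (not in the original runbook): confirm the pinned versions,
# that cri-dockerd answers on its CRI socket, and that the images were pulled.
kubeadm version -o short
kubelet --version
crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock info   # crictl comes from cri-tools, installed as a kubeadm dependency
docker images | grep google_containers
```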
Scenario 1: pods may use either overlay or underlay, and Services use overlay (if Services were underlay, they would have to be configured with a host subnet). For example, the following initialization is for the overlay case: the pod CIDR will later be used for overlay pods and the service CIDR for overlay Services.

```bash
kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 \
  --kubernetes-version=v1.24.4 --pod-network-cidr=10.200.0.0/16 --service-cidr=10.100.0.0/16 \
  --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock
```

Scenario 2: pods may use overlay or underlay, and Services use underlay. In the underlay initialization, `--pod-network-cidr=10.200.0.0/16` is kept for the later overlay case (overlay and underlay will coexist; the underlay CIDR is specified separately later), while `--service-cidr=172.31.5.0/24` is reserved for the underlay Services, so that pods can be reached directly through the Service. Demo underlay initialization command:

```bash
kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 \
  --kubernetes-version=v1.24.10 --pod-network-cidr=10.200.0.0/16 --service-cidr=172.31.5.0/24 \
  --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock
```

Note: to reach such a Service later, a static route must be configured on the network devices, because a Service is only a set of iptables or IPVS rules and does not answer ARP broadcasts:

```
-A KUBE-SERVICES -d 172.31.5.148/32 -p tcp -m comment --comment "myserver/myserver-tomcat-app1-service-underlay:http cluster IP" -m tcp --dport 80 -j KUBE-SVC-DXPW2IL54XTPIKP5
-A KUBE-SVC-DXPW2IL54XTPIKP5 ! -s 10.200.0.0/16 -d 172.31.5.148/32 -p tcp -m comment --comment "myserver/myserver-tomcat-app1-service-underlay:http cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in  out  source      destination
 1260 83666 RETURN     all  --  *   *    0.0.0.0/0   0.0.0.0/0    mark match ! 0x4000/0x4000
    5   312 MARK       all  --  *   *    0.0.0.0/0   0.0.0.0/0    MARK xor 0x4000
    5   312 MASQUERADE all  --  *   *    0.0.0.0/0   0.0.0.0/0    /* kubernetes service traffic requiring SNAT */ rando
```

```bash
## Initialize Kubernetes with the underlay network
# Run on the master only
kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 \
  --kubernetes-version=v1.24.10 --pod-network-cidr=10.200.0.0/16 --service-cidr=172.31.5.0/24 \
  --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock
```
Important output after initialization completes:

```
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.31.6.201:6443 --token t0ruut.j6toxlfjte31sngo \
        --discovery-token-ca-cert-hash sha256:c78950990035913274f57d8e62f56c2502dab04ec2f578b6ccd58d788f3932c7
```

```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

root@k8s-master:~# kubectl get nodes
NAME                     STATUS     ROLES           AGE    VERSION
k8s-master.iclinux.com   NotReady   control-plane   4m4s   v1.24.10

# Join the worker nodes, run on the node machines only
kubeadm join 172.31.6.201:6443 --token t0ruut.j6toxlfjte31sngo --discovery-token-ca-cert-hash sha256:c78950990035913274f57d8e62f56c2502dab04ec2f578b6ccd58d788f3932c7 --cri-socket unix:///var/run/cri-dockerd.sock

root@k8s-master:~# kubectl get nodes
NAME                     STATUS     ROLES           AGE   VERSION
k8s-master.iclinux.com   NotReady   control-plane   10m   v1.24.10
k8s-node1.iclinux.com    NotReady   <none>          20s   v1.24.10
k8s-node2.iclinux.com    NotReady   <none>          11s   v1.24.10
k8s-node3.iclinux.com    NotReady   <none>          8s    v1.24.10

# Distribute the kubeconfig, run on the master
scp /root/.kube/config 172.31.6.204:/root/.kube

# Deploy the hybridnet network component with helm (master node)
## Install helm
cd /usr/local/src && wget https://get.helm.sh/helm-v3.9.0-linux-amd64.tar.gz
tar xvf helm-v3.9.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/

# Configure the helm repository
helm repo add hybridnet https://alibaba.github.io/hybridnet/
helm repo update

## Install the network component
# note: init.cidr is the pod network CIDR used when the cluster was initialized
helm install hybridnet hybridnet/hybridnet -n kube-system --set init.cidr=10.200.0.0/16

## Check
root@k8s-master:/usr/local/src# kubectl get pod -A
NAMESPACE     NAME                                             READY   STATUS     RESTARTS   AGE
kube-system   calico-typha-6f55876f98-hr7sm                    1/1     Running    0          2m7s
kube-system   calico-typha-6f55876f98-kp47t                    1/1     Running    0          2m7s
kube-system   calico-typha-6f55876f98-sdzmz                    1/1     Running    0          2m7s
kube-system   coredns-7f74c56694-bfwqd                         0/1     Pending    0          85m
kube-system   coredns-7f74c56694-blmcv                         0/1     Pending    0          85m
kube-system   etcd-k8s-master.iclinux.com                      1/1     Running    0          85m
kube-system   hybridnet-daemon-5c2fh                           2/2     Running    0          2m7s
kube-system   hybridnet-daemon-5p4fn                           0/2     Init:0/1   0          2m7s
kube-system   hybridnet-daemon-6hvz6                           0/2     Init:0/1   0          2m7s
kube-system   hybridnet-daemon-tvb7x                           2/2     Running    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-2wm76               0/1     Pending    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-ghxn6               0/1     Pending    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-l6gx6               0/1     Pending    0          2m7s
kube-system   hybridnet-webhook-76dc57b4bf-cf9mj               0/1     Pending    0          2m10s
kube-system   hybridnet-webhook-76dc57b4bf-klqzt               0/1     Pending    0          2m10s
kube-system   hybridnet-webhook-76dc57b4bf-wbsnj               0/1     Pending    0          2m10s
kube-system   kube-apiserver-k8s-master.iclinux.com            1/1     Running    0          85m
kube-system   kube-controller-manager-k8s-master.iclinux.com   1/1     Running    0          85m
kube-system   kube-proxy-864vx                                 1/1     Running    0          75m
kube-system   kube-proxy-h585r                                 1/1     Running    0          75m
kube-system   kube-proxy-m5wd5                                 1/1     Running    0          75m
kube-system   kube-proxy-vctbh                                 1/1     Running    0          85m
kube-system   kube-scheduler-k8s-master.iclinux.com            1/1     Running    0          85m

# Set the selector label so the Pending hybridnet pods can be scheduled
kubectl label node k8s-node1.iclinux.com node-role.kubernetes.io/master=
kubectl label node k8s-node2.iclinux.com node-role.kubernetes.io/master=
kubectl label node k8s-node3.iclinux.com node-role.kubernetes.io/master=
```
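The hybridnet-manager and hybridnet-webhook pods stay Pending above presumably because their deployments select nodes carrying the master role label, which is why the worker nodes are labeled here. A quick way to confirm this (an extra check, not part of the original notes) could be:

```bash
# Extra verification (assumption: the chart sets a master-role nodeSelector):
# show the nodeSelector used by the hybridnet control-plane deployments.
kubectl -n kube-system get deploy hybridnet-manager \
  -o jsonpath='{.spec.template.spec.nodeSelector}{"\n"}'
kubectl -n kube-system get deploy hybridnet-webhook \
  -o jsonpath='{.spec.template.spec.nodeSelector}{"\n"}'
```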
```
# Make sure all pods are up and running
root@k8s-master:/usr/local/src# kubectl get pods -A
NAMESPACE     NAME                                             READY   STATUS    RESTARTS        AGE
kube-system   calico-typha-6f55876f98-hr7sm                    1/1     Running   0               5m37s
kube-system   calico-typha-6f55876f98-kp47t                    1/1     Running   0               5m37s
kube-system   calico-typha-6f55876f98-sdzmz                    1/1     Running   0               5m37s
kube-system   coredns-7f74c56694-bfwqd                         1/1     Running   0               89m
kube-system   coredns-7f74c56694-blmcv                         1/1     Running   0               89m
kube-system   etcd-k8s-master.iclinux.com                      1/1     Running   0               89m
kube-system   hybridnet-daemon-5c2fh                           2/2     Running   1 (3m2s ago)    5m37s
kube-system   hybridnet-daemon-5p4fn                           2/2     Running   1 (43s ago)     5m37s
kube-system   hybridnet-daemon-6hvz6                           2/2     Running   1 (2m44s ago)   5m37s
kube-system   hybridnet-daemon-tvb7x                           2/2     Running   1 (3m2s ago)    5m37s
kube-system   hybridnet-manager-6574dcc5fb-2wm76               1/1     Running   0               5m37s
kube-system   hybridnet-manager-6574dcc5fb-ghxn6               1/1     Running   0               5m37s
kube-system   hybridnet-manager-6574dcc5fb-l6gx6               1/1     Running   0               5m37s
kube-system   hybridnet-webhook-76dc57b4bf-cf9mj               1/1     Running   0               5m40s
kube-system   hybridnet-webhook-76dc57b4bf-klqzt               1/1     Running   0               5m40s
kube-system   hybridnet-webhook-76dc57b4bf-wbsnj               1/1     Running   0               5m40s
kube-system   kube-apiserver-k8s-master.iclinux.com            1/1     Running   0               89m
kube-system   kube-controller-manager-k8s-master.iclinux.com   1/1     Running   0               89m
kube-system   kube-proxy-864vx                                 1/1     Running   0               79m
kube-system   kube-proxy-h585r                                 1/1     Running   0               79m
kube-system   kube-proxy-m5wd5                                 1/1     Running   0               79m

# Check the node network
root@k8s-node3:/usr/local/src# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:0e:b8:b1:9c  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.31.6.204  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:feda:8719  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:da:87:19  txqueuelen 1000  (Ethernet)
        RX packets 545289  bytes 695214333 (695.2 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 231360  bytes 42214413 (42.2 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0.vxlan4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.31.6.204  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:feda:8719  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:da:87:19  txqueuelen 0  (Ethernet)
        RX packets 27  bytes 2580 (2.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 1188 (1.1 KB)
        TX errors 0  dropped 1  overruns 0  carrier 0  collisions 0

hybr2f5133a0152: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 124  bytes 12975 (12.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 134  bytes 33940 (33.9 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

hybrf4727d8f411: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 124  bytes 12904 (12.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 131  bytes 33766 (33.7 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 4030  bytes 374007 (374.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4030  bytes 374007 (374.0 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

root@k8s-node3:/usr/local/src# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.31.0.2      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.31.0.0      0.0.0.0         255.255.248.0   U     0      0        0 eth0
```
```
# Configure the underlay network
## Label the nodes that support underlay
kubectl label node k8s-node1.iclinux.com network=underlay-nethost
kubectl label node k8s-node2.iclinux.com network=underlay-nethost
kubectl label node k8s-node3.iclinux.com network=underlay-nethost

## If a label was set incorrectly, overwrite it
kubectl label --overwrite node k8s-node1.iclinux.com network=underlay-nethost
kubectl label --overwrite node k8s-node2.iclinux.com network=underlay-nethost
kubectl label --overwrite node k8s-node3.iclinux.com network=underlay-nethost

root@k8s-master:~/underlay-cases-files# cat 1.create-underlay-network.yaml
---
apiVersion: networking.alibaba.com/v1
kind: Network
metadata:
  name: underlay-network1
spec:
  netID: 0
  type: Underlay
  nodeSelector:
    network: "underlay-nethost"

---
apiVersion: networking.alibaba.com/v1
kind: Subnet
metadata:
  name: underlay-network1
spec:
  network: underlay-network1
  netID: 0
  range:
    version: "4"           # IPv4
    cidr: "172.31.0.0/21"  # network plan for the whole subnet
    gateway: "172.31.0.2"  # external gateway address
    start: "172.31.6.1"
    end: "172.31.6.254"

root@k8s-master:~/underlay-cases-files# kubectl apply -f 1.create-underlay-network.yaml
network.networking.alibaba.com/underlay-network1 created
subnet.networking.alibaba.com/underlay-network1 created

root@k8s-master:~/underlay-cases-files# kubectl get network
NAME                NETID   TYPE       MODE   V4TOTAL   V4USED   V4AVAILABLE   LASTALLOCATEDV4SUBNET   V6TOTAL   V6USED   V6AVAILABLE   LASTALLOCATEDV6SUBNET
init                4       Overlay           65534     2        65532         init                    0         0        0
underlay-network1   0       Underlay          254       0        254           underlay-network1       0         0        0
```
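To double-check what the Network and Subnet objects ended up with (address range, remaining IPs, node selector), the hybridnet resources created above can be inspected directly. These commands are an extra verification step on top of the original notes:

```bash
# Extra verification (not in the original notes): inspect the underlay
# Network/Subnet objects created above and the labels on the nodes.
kubectl get subnet
kubectl describe network underlay-network1
kubectl describe subnet underlay-network1
kubectl get nodes -L network
```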
```
# Validation
root@k8s-master:~/underlay-cases-files# kubectl create ns myserver
namespace/myserver created

root@k8s-master:~/underlay-cases-files# cat 2.tomcat-app1-overlay.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: myserver-tomcat-app1-deployment-overlay-label
  name: myserver-tomcat-app1-deployment-overlay
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myserver-tomcat-app1-overlay-selector
  template:
    metadata:
      labels:
        app: myserver-tomcat-app1-overlay-selector
    spec:
      #nodeName: k8s-node2.example.com
      containers:
      - name: myserver-tomcat-app1-container
        #image: tomcat:7.0.93-alpine
        image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v1
        imagePullPolicy: IfNotPresent
        ##imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
          name: http
        env:
        - name: "password"
          value: "123456"
        - name: "age"
          value: "18"
#        resources:
#          limits:
#            cpu: 0.5
#            memory: "512Mi"
#          requests:
#            cpu: 0.5
#            memory: "512Mi"

---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: myserver-tomcat-app1-service-overlay-label
  name: myserver-tomcat-app1-service-overlay
  namespace: myserver
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
    nodePort: 30003
  selector:
    app: myserver-tomcat-app1-overlay-selector

root@k8s-master:~/underlay-cases-files# kubectl apply -f 2.tomcat-app1-overlay.yaml
deployment.apps/myserver-tomcat-app1-deployment-overlay created
service/myserver-tomcat-app1-service-overlay created

root@k8s-master:~/underlay-cases-files# kubectl get pod -n myserver
NAME                                                       READY   STATUS    RESTARTS   AGE
myserver-tomcat-app1-deployment-overlay-69dfff68d9-jjg45   1/1     Running   0          2m49s

root@k8s-master:~/underlay-cases-files# kubectl get pod -n myserver -o wide
NAME                                                       READY   STATUS    RESTARTS   AGE     IP           NODE                    NOMINATED NODE   READINESS GATES
myserver-tomcat-app1-deployment-overlay-69dfff68d9-jjg45   1/1     Running   0          3m26s   10.200.0.3   k8s-node2.iclinux.com   <none>           <none>
# the pod got an overlay address by default

root@k8s-master:~/underlay-cases-files# kubectl get svc -n myserver -o wide
NAME                                   TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE    SELECTOR
myserver-tomcat-app1-service-overlay   NodePort   172.31.5.162   <none>        80:30003/TCP   5m4s   app=myserver-tomcat-app1-overlay-selector
# verify via http://172.31.6.203:30003/myapp/

# Create a pod on the underlay network
root@k8s-master:~/underlay-cases-files# cat 3.tomcat-app1-underlay.yaml
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
  labels:
    app: myserver-tomcat-app1-deployment-underlay-label
  name: myserver-tomcat-app1-deployment-underlay
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myserver-tomcat-app1-underlay-selector
  template:
    metadata:
      labels:
        app: myserver-tomcat-app1-underlay-selector
      annotations:  # choose Underlay or Overlay for the pod network
        networking.alibaba.com/network-type: Underlay
    spec:
      #nodeName: k8s-node2.example.com
      containers:
      - name: myserver-tomcat-app1-container
        #image: tomcat:7.0.93-alpine
        image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v2
        imagePullPolicy: IfNotPresent
        ##imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
          name: http
        env:
        - name: "password"
          value: "123456"
        - name: "age"
          value: "18"
#        resources:
#          limits:
#            cpu: 0.5
#            memory: "512Mi"
#          requests:
#            cpu: 0.5
#            memory: "512Mi"

---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: myserver-tomcat-app1-service-underlay-label
  name: myserver-tomcat-app1-service-underlay
  namespace: myserver
spec:
#  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
    #nodePort: 40003
  selector:
    app: myserver-tomcat-app1-underlay-selector

root@k8s-master:~/underlay-cases-files# kubectl apply -f 3.tomcat-app1-underlay.yaml
deployment.apps/myserver-tomcat-app1-deployment-underlay created
service/myserver-tomcat-app1-service-underlay created

root@k8s-master:~/underlay-cases-files# kubectl get pod -n myserver -o wide
NAME                                                       READY   STATUS    RESTARTS   AGE   IP           NODE                    NOMINATED NODE   READINESS GATES
myserver-tomcat-app1-deployment-overlay-69dfff68d9-hgr2h   1/1     Running   0          26m   10.200.0.4   k8s-node2.iclinux.com   <none>           <none>
myserver-tomcat-app1-deployment-underlay-bd7cd59cf-nskp9   1/1     Running   0          54s   172.31.6.3   k8s-node1.iclinux.com   <none>           <none>
# validation URL: http://172.31.6.4:8080/myapp
```
```
# Check the network on node 1
root@k8s-node1:~# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:af:7b:06:5e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.31.6.202  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:fe76:cc0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:76:cc:0a  txqueuelen 1000  (Ethernet)
        RX packets 1258413  bytes 1662158036 (1.6 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 495846  bytes 142946399 (142.9 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0.vxlan4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.31.6.202  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:fe76:cc0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:76:cc:0a  txqueuelen 0  (Ethernet)
        RX packets 27  bytes 2594 (2.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 1394 (1.3 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

hybrcf33ee38b06: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 25  bytes 2961 (2.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 35  bytes 3182 (3.1 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 25208  bytes 2007678 (2.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 25208  bytes 2007678 (2.0 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

root@k8s-node1:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.31.0.2      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.31.0.0      0.0.0.0         255.255.248.0   U     0      0        0 eth0

# The pod address can be pinned
# Access the pod through the Service
root@k8s-node1:~# kubectl get svc -n myserver -o wide
NAME                                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE   SELECTOR
myserver-tomcat-app1-service-overlay    NodePort    172.31.5.47    <none>        80:30003/TCP   36m   app=myserver-tomcat-app1-overlay-selector
myserver-tomcat-app1-service-underlay   ClusterIP   172.31.5.86    <none>        80/TCP         10m   app=myserver-tomcat-app1-underlay-selector
# The underlay Service cannot be reached directly from outside by default; connectivity from the client
# to the Service network has to be set up first. In production the network team adds the route on the
# network devices; in a test environment a local route can be added on the test machine.

## Change hybridnet's default network behaviour from underlay to overlay
helm upgrade hybridnet hybridnet/hybridnet -n kube-system --set defaultNetworkType=Overlay
# or:
kubectl edit deploy hybridnet-webhook -n kube-system
    env:
    - name: DEFAULT_NETWORK_TYPE
      value: Overlay
kubectl edit deploy hybridnet-manager -n kube-system
    env:
    - name: DEFAULT_NETWORK_TYPE
      value: Overlay
```
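As a concrete illustration of the routing note above: on a Linux test client outside the cluster, a temporary route toward the underlay service CIDR can be added by hand. The choice of client machine and of the next hop (k8s-node1 here) is an assumption for the example, not part of the original notes:

```bash
# Hypothetical test client: route the underlay service CIDR (172.31.5.0/24)
# via one of the cluster nodes, e.g. k8s-node1 (172.31.6.202), then test.
sudo ip route add 172.31.5.0/24 via 172.31.6.202
curl http://172.31.5.86/myapp/
# Remove the temporary route afterwards:
sudo ip route del 172.31.5.0/24
```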
3. Summarize the network communication flow of the flannel network component in VXLAN mode
1. The source pod sends the request. At this point the source IP is the pod's eth0 IP, the source MAC is the pod's eth0 MAC, the destination IP is the destination pod's IP, and the destination MAC is the gateway (cni0) MAC.
   Capture: `tcpdump -nn -vvv -i veth91d6f855 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
2. The packet travels over the veth pair to the gateway cni0, which sees that the destination MAC is its own. cni0 checks the destination IP: if it belongs to the same bridge the packet is forwarded directly; otherwise it is handed to flannel.1 and the packet is rewritten.
   Source IP: pod IP, 10.100.2.2; destination IP: pod IP, 10.100.1.6; source MAC: source pod MAC; destination MAC: cni0 MAC.
   Capture: `tcpdump -nn -vvv -i cni0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
3. The packet reaches flannel.1, which sees that the destination MAC is its own, looks up the routing table, and performs the inner overlay encapsulation (mainly rewriting the destination MAC to the peer flannel.1 MAC on the destination host, and the source MAC to the local flannel.1 MAC).
   `bridge fdb show dev flannel.1`
   Capture: `tcpdump -nn -vvv -i flannel.1 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
4. The source host encapsulates the VXLAN packet in UDP.
   UDP source port: random; UDP destination port: 8472; source IP: physical NIC IP of the source pod's host; destination IP: physical NIC IP of the destination pod's host; source MAC: physical NIC of the source pod's host; destination MAC: physical NIC of the destination pod's host.
   Capture: `tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
5. The packet reaches the destination host's physical NIC and is decapsulated. The outer destination IP is the local physical NIC; after the outer header is removed there is an inner destination IP and MAC: the destination IP is 10.100.1.6 and the destination MAC is that of the destination flannel.1, so the packet is handed to flannel.1.
   Capture: `tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
6. The packet reaches flannel.1 on the destination host. flannel.1 checks the destination IP, sees that it belongs to the local cni0 subnet, and forwards the request to cni0.
   Destination IP: 10.100.1.6 (destination pod); source IP: 10.100.2.2 (source pod); destination MAC: flannel.1 MAC on the destination host; source MAC: flannel.1 MAC on the source host.
   Capture: `tcpdump -nn -vvv -i flannel.1 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
7. The packet reaches cni0 on the destination host. cni0 looks up its MAC table by destination IP, rewrites the destination MAC to the destination pod's MAC, and forwards the request to the pod.
   Source IP: source pod IP; destination IP: destination pod IP; source MAC: cni0 MAC; destination MAC: destination pod MAC.
   Capture: `tcpdump -nn -vvv -i cni0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 -w 7.flannel-flannel-vxlan-cni0-in.pcap`
8. The packet reaches the destination pod. cni0 sees the packet is destined for 10.100.1.6, finds the corresponding local bridge port in its MAC table, and delivers it to the pod through that port.
   Destination IP: destination pod IP; source IP: source pod IP; destination MAC: destination pod MAC; source MAC: cni0 MAC.
   Capture: `tcpdump -nn -vvv -i vethf38183ee -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 -w 8.flannel-vxlan-vethf38183ee-in.pcap`
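The per-step captures above can be complemented by a quick look at how flannel wires the VXLAN device. These are standard iproute2/bridge commands, shown here as an optional inspection aid on any node:

```bash
# Optional inspection of the flannel VXLAN data path:
ip -d link show flannel.1          # VXLAN id, local VTEP address and dstport (8472 for flannel)
bridge fdb show dev flannel.1      # peer flannel.1 MACs mapped to remote node (VTEP) IPs
ip route                           # per-node pod subnets routed via flannel.1
ip neigh show dev flannel.1        # ARP entries for the remote flannel.1 addresses
```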
4. Summarize the network communication flow of the calico network component in IPIP mode
1. The source pod sends the request, and the packet arrives at the host-side interface paired with the pod.
2. The packet reaches the cali interface on the host that corresponds to the pod.
   Capture: `tcpdump -nn -vvv -i cali2b2e7c9e43e -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
   At this point the next-hop gateway is 169.254.1.1 and the destination MAC is ee:ee:ee:ee:ee:ee.
3. The packet reaches tunl0 on the source host.
   Capture: `tcpdump -nn -vvv -i cali2b2e7c9e43e -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53`
4. The packet reaches eth0 on the source host.
   Capture: `tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 3.eth0.pca`
5. The packet reaches eth0 on the destination host. What arrives is the IPinIP packet from the source host: the outer layer carries the source and destination MACs and IPs of the two hosts, while the inner layer carries the source and destination pod IPs without MAC addresses. After decapsulation the destination is found to be 10.200.151.205.
6. The packet reaches tunl0 on the destination host.
   Capture: `tcpdump -nn -vvv -i tunl0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 5-tunl0.pca`
7. The packet reaches the host-side interface corresponding to the destination pod.
   Source IP: source pod IP; source MAC: tunl0 MAC; destination IP: destination pod IP; destination MAC: destination pod MAC. The packet is then forwarded to the destination MAC (the destination pod's MAC).
   Capture: `tcpdump -nn -vvv -i cali32ecf57bfbe -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 6-cali32ecf57bfbe.pca`
8. The packet reaches the destination pod.
   Capture: `tcpdump -i eth0 -vvv -nn -w 7-dst-pod.pcap`
   The destination pod accepts the request, builds the response, and returns it to the source pod along the same path.
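For the 169.254.1.1 / ee:ee:ee:ee:ee:ee behaviour in step 2 and the tunl0 encapsulation, the node state can be inspected with standard tools; this is an optional check, not part of the captures above:

```bash
# Optional inspection on a calico node (standard iproute2 / sysctl):
ip -d link show tunl0                         # the IPIP tunnel device
ip route | grep tunl0                         # remote pod subnets routed via tunl0 (learned by BIRD)
cat /proc/sys/net/ipv4/conf/cali*/proxy_arp   # calico enables proxy ARP so 169.254.1.1 resolves to the host side
```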