Kubernetes Cluster Deployment
1. Basic Environment
ip: 192.168.115.149  hostname: node1  CentOS Linux release 7.9.2009, kernel 3.10.0-1160.81.1.el7.x86_64
ip: 192.168.115.151  hostname: node2  CentOS Linux release 7.9.2009, kernel 3.10.0-1160.81.1.el7.x86_64
ip: 192.168.115.152  hostname: node3  CentOS Linux release 7.9.2009, kernel 3.10.0-1160.81.1.el7.x86_64
2. Installation Overview
Installation method: yum-based install
Versions: kubelet-1.21.1, kubeadm-1.21.1, kubectl-1.21.1
3. Preparation
Note: all three machines in the k8s cluster need the preparation steps below.
1. Check IPs and UUIDs: make sure the MAC address and product_uuid are unique on every node.
2. Let iptables see bridged traffic: make sure the br_netfilter module is loaded, that iptables can see bridged traffic, and persist the settings.
3. Disable SELinux, the firewall, and swap.
4. Set the hostnames and add entries to /etc/hosts.
5. Install Docker: mind the Docker/Kubernetes version compatibility, and set the cgroup driver to systemd (see the daemon.json sketch after this list), otherwise kubeadm init will print warnings later.
Docker installation reference: Linux下docker安装部署 - MeeSeeks-B - 博客园 (cnblogs.com)
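A minimal sketch of the systemd cgroup-driver setting from step 5, assuming Docker reads /etc/docker/daemon.json (the default path); if the file already exists, merge this key into it instead of overwriting:

# write the cgroup driver setting (merge with any existing daemon.json content)
cat <<EOF | tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
# verify; this should print: Cgroup Driver: systemd
docker info | grep -i cgroup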
===1. Check IPs and UUIDs
ip a
cat /sys/class/dmi/id/product_uuid

===2. Let iptables see bridged traffic
1. Make sure the br_netfilter module is loaded
# list the modules currently loaded
lsmod | grep br_netfilter
# load the module if it is missing
modprobe br_netfilter
2. Make sure iptables can see bridged traffic
# net.bridge.bridge-nf-call-iptables must be set to 1 in the sysctl config
sysctl -a | grep net.bridge.bridge-nf-call-iptables
3. Persist the module and sysctl settings
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sysctl --system

===3. Disable SELinux
# temporary; resets after a reboot
setenforce 0
# permanent: edit the file, then reboot the server
vim /etc/selinux/config
# change SELINUX=enforcing to
SELINUX=disabled

===4. Disable the firewall
systemctl status firewalld
systemctl stop firewalld
# keep it off across reboots
systemctl disable firewalld

===5. Disable swap
# temporary
swapoff -a
# permanent
vim /etc/fstab
# comment out the swap auto-mount line

vim /etc/sysctl.d/k8s.conf (optional)
# add the line:
vm.swappiness=0

sysctl -p /etc/sysctl.d/k8s.conf
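Step 4 from the list above (hostnames and hosts entries) has no commands in the block; a minimal sketch using this article's own IPs and hostnames:

# run the matching set-hostname on each machine (node2 / node3 on the others)
hostnamectl set-hostname node1
# add all three nodes to /etc/hosts on every machine
cat <<EOF >> /etc/hosts
192.168.115.149 node1
192.168.115.151 node2
192.168.115.152 node3
EOF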
4. Installing kubeadm, kubelet and kubectl
Note: all three machines in the cluster need this step; the versions installed are kubelet-1.21.1, kubeadm-1.21.1, kubectl-1.21.1.
# add the Aliyun Kubernetes YUM repository
vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

# install kubeadm, kubelet and kubectl; mind the Docker version compatibility
yum install -y kubelet-1.21.1 kubeadm-1.21.1 kubectl-1.21.1

# start and enable kubelet (note: on the master it will not run cleanly until kubeadm init; see 8.1)
systemctl start kubelet
systemctl enable kubelet
systemctl status kubelet
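Optionally, confirm that the expected 1.21.1 versions actually landed:

# print the installed versions
kubeadm version -o short
kubectl version --client
kubelet --version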
5. Cluster Deployment
Note: this covers initializing the master node and joining the worker nodes.
# master node: initialize the control plane
kubeadm init --apiserver-advertise-address=192.168.115.149 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.21.1 \
  --service-cidr=10.140.0.0/16 \
  --pod-network-cidr=10.240.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
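The kubeadm init output also notes that the root user can instead just point KUBECONFIG at the admin config:

# alternative for root, instead of copying the config
export KUBECONFIG=/etc/kubernetes/admin.conf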
# worker nodes: copy the join command printed by the successful kubeadm init and run it on each node
kubeadm join 192.168.115.149:6443 --token swshsb.7yu37gx1929902tl \
--discovery-token-ca-cert-hash sha256:626728b1a039991528a031995ed6ec8069382b489c8ae1e61286f96fcd9a3bfc
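If a node is joined later and the original token has expired (kubeadm bootstrap tokens are valid for 24 hours by default), a fresh join command can be printed on the master:

# regenerate a token and print the full join command
kubeadm token create --print-join-command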
# after the nodes join, check them from the master node
kubectl get nodes
At this point the cluster is deployed, but the nodes do not show a Ready status yet; installing a network plugin is the final step that completes cluster creation.
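Illustrative output at this stage (node names are from this article; ages and exact role text will differ):

kubectl get nodes
# NAME    STATUS     ROLES                  AGE   VERSION
# node1   NotReady   control-plane,master   5m    v1.21.1
# node2   NotReady   <none>                 2m    v1.21.1
# node3   NotReady   <none>                 2m    v1.21.1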
6. Installing the Network Plugin
Note: install on the master node. Either the flannel plugin or the calico plugin will do; flannel is installed here.
vim kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  seLinux:
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
# change the network under net-conf.json to the --pod-network-cidr used for the master init above
sed -i 's/10.244.0.0/10.240.0.0/' kube-flannel.yml
# apply the manifest
kubectl apply -f kube-flannel.yml
# check the installation status
kubectl get pods --all-namespaces
# check that the cluster nodes are now Ready
kubectl get nodes
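Since the DaemonSet above labels its pods with app: flannel, the flannel pods can also be checked directly:

# list only the flannel pods, one per node
kubectl get pods -n kube-system -l app=flannel -o wide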
=== Appendix: uninstalling flannel ===
1. On the master node, locate kube-flannel.yml and delete flannel:
kubectl delete -f kube-flannel.yml
2. On the worker nodes, clean up the files the flannel network left behind:
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f /etc/cni/net.d/*
Once the above is done, restart kubelet.
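That restart is a single command on each cleaned node:

# restart kubelet so it picks up the cleaned network state
systemctl restart kubelet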
7. Testing the Kubernetes Cluster
Note: create a pod and expose a port for external access; a random NodePort is mapped, and since no namespace is specified the deployment is created under default.
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
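To verify the test deployment, look up the randomly assigned NodePort and curl any node on it; the jsonpath lookup below is one way to extract the port:

kubectl get pods,svc -n default
# extract the assigned NodePort and hit it on a node IP
curl http://192.168.115.149:$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}')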
8. Troubleshooting
8.1 kubelet fails to start on the master node
Checking the kubelet status at this point shows errors; this is normal (kubelet restarts in a loop until kubeadm init has generated its configuration), so simply proceed with the master initialization.
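The error in question can be inspected with the standard systemd tools:

systemctl status kubelet
journalctl -xeu kubelet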
8.2 Handling kubeadm init failures on the master
Running:
kubeadm init --apiserver-advertise-address=192.168.115.149 --kubernetes-version v1.21.1 --service-cidr=10.140.0.0/16 --pod-network-cidr=10.240.0.0/16
fails.
Cause: because of network conditions in mainland China, kubeadm init hangs for a long time and then errors out. With no image repository specified, kubeadm init defaults to pulling its Docker images from k8s.gcr.io, and https://k8s.gcr.io/v2/ is not reachable from inside China.
Fix: add an image repository to kubeadm init:
kubeadm init --apiserver-advertise-address=192.168.115.149 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.21.1 --service-cidr=10.140.0.0/16 --pod-network-cidr=10.240.0.0/16
This in turn fails: pulling the image registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.0 does not succeed.
Fix: list the images kubeadm needs, pull any missing one by hand, and fix up its tag.
# list the images kubeadm needs
kubeadm config images list
# check the images already present locally
docker images
The coredns v1.8.0 image turns out to be present already, just under a different tag, so retag it:
docker tag registry.aliyuncs.com/google_containers/coredns:v1.8.0 registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
Run kubeadm init once more:
kubeadm init --apiserver-advertise-address=192.168.115.149 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.21.1 --service-cidr=10.140.0.0/16 --pod-network-cidr=10.240.0.0/16
Success!
8.3 kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ksoftirqd/1:14]
A large number of heavily loaded processes can drive a CPU into a soft lockup.
A soft lockup is a kernel soft deadlock: the system does not die outright, but some processes (or kernel threads) get stuck in a particular state, generally inside the kernel; in many cases this comes down to how kernel locks are being used.
Solutions:
https://blog.csdn.net/qq_44710568/article/details/104843432
https://blog.csdn.net/JAVA_LuZiMaKei/article/details/120140987
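One commonly suggested mitigation is raising the soft-lockup watchdog threshold from its default of 10 seconds; a sketch, to be checked against the posts linked above before relying on it:

# persist a higher soft-lockup detection threshold
echo 'kernel.watchdog_thresh=30' >> /etc/sysctl.conf
sysctl -p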