Installing Kubernetes with kubeadm
I. Introduction
Kubernetes can be deployed in two ways. The first is the binary method: customizable, but complex to deploy and error-prone. The second is the kubeadm tool: simple to deploy, but not customizable. This guide uses kubeadm.
Each server needs at least 2 CPU cores and 2 GB of RAM. If a machine falls short, add --ignore-preflight-errors=NumCPU to the cluster initialization command.
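As a sketch, the flag is simply appended to the init command shown in section VI:

# Hypothetical: bypass the 2-CPU preflight check on an undersized VM
kubeadm init --ignore-preflight-errors=NumCPU ...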
II. Deployment Plan
1. Version Plan
Software | Version
---|---
CentOS | 7.5 or later
Docker | 19.03 or later
Kubernetes | v1.20.5 or later
Flannel | v0.13.0 or later
kernel-lt | kernel-lt-4.4.245-1.el7.elrepo.x86_64.rpm or later
kernel-lt-devel | kernel-lt-devel-4.4.245-1.el7.elrepo.x86_64.rpm
2. Node Plan
Hostname | IP | Kernel version
---|---|---
k8s-master-01 | 192.168.15.31 | 5.0 or later
k8s-node-01 | 192.168.15.32 | 5.0 or later
k8s-node-02 | 192.168.15.33 | 5.0 or later
III. Configure the Network and Clone the Hosts (all three hosts)
1. Adjust the Virtual Network Editor
2. Clone the Hosts
192.168.15.31 k8s-master-01 m1
192.168.15.32 k8s-node-01 n1
192.168.15.33 k8s-node-02 n2
3. Set the IP and Gateway on All Three Instances
The internal eth1 interfaces must also get distinct IPs, otherwise the three machines will conflict. After the change, restart the network service and ping baidu.com to confirm connectivity, as in the sketch below.
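A minimal check sequence (assuming the stock CentOS 7 ifcfg file layout; the interface name is illustrative):

vim /etc/sysconfig/network-scripts/ifcfg-eth0    # set IPADDR/GATEWAY per the node plan
systemctl restart network                        # restart networking
ping -c 3 baidu.com                              # verify outbound connectivity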
IV. Set Hostnames and Name Resolution (all three nodes)
1. Set the Hostname
hostnamectl set-hostname k8s-master-01
hostnamectl set-hostname k8s-node-01
hostnamectl set-hostname k8s-node-02
2. Add /etc/hosts Entries
cat /etc/hosts
192.168.15.31 k8s-master-01 m1
192.168.15.32 k8s-node-01 n1
192.168.15.33 k8s-node-02 n2
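To append the same entries on every node in one shot, a heredoc sketch:

cat >> /etc/hosts <<EOF
192.168.15.31 k8s-master-01 m1
192.168.15.32 k8s-node-01 n1
192.168.15.33 k8s-node-02 n2
EOF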
3. Add DNS Resolvers
[root@k8s-master-01 ~]# vim /etc/resolv.conf
# Generated by NetworkManager
nameserver 223.5.5.5
nameserver 114.114.114.114
V. System Optimization (all three nodes)
1. Disable SELinux
# Disable permanently
sed -i 's#enforcing#disabled#g' /etc/selinux/config
# Disable for the current session
setenforce 0
2. Disable the Firewall
systemctl disable --now firewalld
3. Disable Swap
# Turn off swap
swapoff -a
# Tell kubelet to ignore swap
echo 'KUBELET_EXTRA_ARGS="--fail-swap-on=false"' > /etc/sysconfig/kubelet
# Comment out the swap entry in /etc/fstab
vim /etc/fstab
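Instead of editing by hand, a non-interactive sketch that comments out any swap entry (assuming the line contains a whitespace-delimited "swap" field):

sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /etc/fstab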
4. Set Up Passwordless SSH (master node only)
[root@k8s-master-01 ~]# rm -rf /root/.ssh
[root@k8s-master-01 ~]# ssh-keygen    # press Enter through every prompt
[root@k8s-master-01 ~]# cd /root/.ssh/
[root@k8s-master-01 ~/.ssh]# mv id_rsa.pub authorized_keys
[root@k8s-master-01 ~/.ssh]# scp -r /root/.ssh 192.168.15.32:/root
[root@k8s-master-01 ~/.ssh]# scp -r /root/.ssh 192.168.15.33:/root
5. Synchronize Cluster Time
echo '# Scheduled time synchronization' >> /var/spool/cron/root    # comment for the cron entry
echo '0 */1 * * * /usr/sbin/ntpdate ntp1.aliyun.com &>/dev/null' >> /var/spool/cron/root    # add the cron job
crontab -l    # verify the result
6. Update the Yum Repositories
rm -rf /etc/yum.repos.d/*
curl -o /etc/yum.repos.d/CentOS-Base.repo https://repo.huaweicloud.com/repository/conf/CentOS-7-reg.repo
yum remove epel-release
rm -rf /var/cache/yum/x86_64/7/epel/
yum install -y https://repo.huaweicloud.com/epel/epel-release-latest-7.noarch.rpm
sed -i "s/#baseurl/baseurl/g" /etc/yum.repos.d/epel.repo
sed -i "s/metalink/#metalink/g" /etc/yum.repos.d/epel.repo
sed -i "s@https\?://download.fedoraproject.org/pub@https://repo.huaweicloud.com@g" /etc/yum.repos.d/epel.repo
yum clean all
yum makecache
7. Update System Packages (excluding the kernel)
yum update -y --exclude=kernel*
8. Install Common Base Packages
yum install wget expect vim net-tools ntp bash-completion ipvsadm ipset jq iptables conntrack sysstat libseccomp -y
9. Upgrade the Kernel (Docker has high kernel requirements; 4.4+ is recommended)
On the master node:
[root@k8s-master-01 ~]# wget https://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-5.4.107-1.el7.elrepo.x86_64.rpm
[root@k8s-master-01 ~]# wget https://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-devel-5.4.107-1.el7.elrepo.x86_64.rpm
[root@k8s-master-01 ~]# for i in n1 n2 m1 ; do scp kernel-lt-* $i:/opt; done
On all three nodes:
# Install
yum localinstall -y /opt/kernel-lt*
# Make the new kernel the default boot entry
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
# Check the current default kernel
grubby --default-kernel
# Reboot
reboot
10. Install IPVS
1) Install via yum
yum install -y conntrack-tools ipvsadm ipset conntrack libseccomp
2) Load the IPVS Modules
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack"
for kernel_module in \${ipvs_modules}; do
  /sbin/modinfo -F filename \${kernel_module} > /dev/null 2>&1
  if [ \$? -eq 0 ]; then
    /sbin/modprobe \${kernel_module}
  fi
done
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep ip_vs
11. Tune Kernel Parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
# Apply immediately
sysctl --system
12. Install Docker (all three nodes)
1) Remove Old Docker Versions
yum remove docker docker-common docker-selinux docker-engine -y
2) Install Docker Prerequisites
yum install -y yum-utils device-mapper-persistent-data lvm2
3) Add the Docker Yum Repository
wget -O /etc/yum.repos.d/docker-ce.repo https://repo.huaweicloud.com/docker-ce/linux/centos/docker-ce.repo
4) Install Docker
yum install docker-ce -y
# Edit /etc/docker/daemon.json so Docker's cgroup driver matches kubelet's;
# otherwise kubelet will fail to start later
{
    "exec-opts": ["native.cgroupdriver=systemd"],
    "registry-mirrors": ["https://reg-mirror.qiniu.com/"]
}
If the installation does not succeed, rerun it a few times.
5) Start Docker and Enable It at Boot
systemctl enable --now docker.service
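A quick sanity check that Docker is up and picked up the systemd driver (a sketch):

systemctl is-active docker            # expect: active
docker info | grep "Cgroup Driver"    # expect: Cgroup Driver: systemd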
VI. Install Kubernetes
1. Install kubelet (all nodes)
1) Add the Kubernetes Yum Repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2) Install, Start, and Enable at Boot
[root@k8s-m-01 ~]# yum install -y kubelet kubeadm kubectl
[root@k8s-m-01 ~]# systemctl enable --now kubelet
2. Master Node Steps (do not run on worker nodes)
1) Initialize the Master Node (master only)
kubeadm init \
  --image-repository=registry.cn-hangzhou.aliyuncs.com/k8sos \
  --kubernetes-version=v1.20.2 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
# Alternatively, --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers also works
Common problems during kubeadm init
(1) Missing default route
route add default gw xxx.xxx.xxx.xxx dev <interface-name>
(2) A warning about proxy settings: the proxy interferes with installation, and components such as the scheduler report Forbidden errors. Remove the warning as follows:
# Step 1: edit /etc/profile and add
export no_proxy=127.0.0.1,<local-IP>
# Step 2: reload it
source /etc/profile
(3) kubeadm pulls images from k8s.gcr.io by default; specify a reachable mirror repository instead
(4) Cgroup driver mismatch (systemd)
Step 1: Update the Docker configuration and restart the service.
Step 2: Update the kubelet configuration.
Step 3: Restart Docker and kubelet.
Step 4: Finally, verify:
[root@k8s-master-01 ~]# docker info | grep "Cgroup Driver"
Confirm the output is: Cgroup Driver: systemd
[root@xxx ~]# ps aux |grep /usr/bin/kubelet |grep -v grep
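A minimal sketch of the whole fix, assuming the stock file locations (/etc/docker/daemon.json for Docker; on kubelet v1.20 the driver lives in /var/lib/kubelet/config.yaml as cgroupDriver: systemd):

# 1) Point Docker at the systemd cgroup driver
#    (merge with any existing keys such as registry-mirrors)
cat > /etc/docker/daemon.json <<EOF
{ "exec-opts": ["native.cgroupdriver=systemd"] }
EOF
systemctl daemon-reload && systemctl restart docker
# 2) Make sure kubelet's cgroupDriver matches, then restart it
systemctl restart kubelet
# 3) Verify
docker info | grep "Cgroup Driver"    # expect: Cgroup Driver: systemd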
(5) kubeadm init times out while bringing up the cluster and the pause containers are endlessly re-Created. A possible cause is a missing Docker host network. To investigate:
docker container ls -a | grep pause
docker inspect <container-ID>
If Docker's host network is indeed missing, the pause containers cannot be created and none of the component containers will come up; rebuild the Docker network manually, as sketched below.
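A sketch of what to look for (restarting the daemon normally recreates the built-in networks):

docker network ls           # bridge, host, and none should all be listed
systemctl restart docker    # usually recreates any missing built-in network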
(6) The kubectl command is unusable (typically because the kubeconfig has not been set up yet; see step 2 below)
(7) After kubeadm init, some pods stay Pending (often because no network plugin is installed yet; see step 3 below)
(8) The etcd container dies, taking the apiserver down with it. Start with docker inspect <container-ID>; one possible cause is ...
(9) Insufficient disk space
(10) Errors complaining about the port range
(11) kubectl get pods fails, pointing at an unknown IP address
(12) Troubleshooting command roundup:
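The original leaves this list empty; a hedged starter set of standard commands might look like:

journalctl -xeu kubelet                          # kubelet logs
kubectl get nodes -o wide                        # node status
kubectl get pods -n kube-system -o wide          # control-plane pod status
kubectl describe pod <pod-name> -n kube-system   # events for a stuck pod
docker ps -a | grep -v pause                     # raw container states
docker logs <container-ID>                       # logs of a failing container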
(13) Clean up the cluster, then rerun kubeadm init ...
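A typical teardown sequence (a sketch; kubeadm reset is the official cleanup command, and the rm lines clear leftover CNI and kubeconfig state):

kubeadm reset -f                             # tear down what kubeadm init created
rm -rf /etc/cni/net.d $HOME/.kube/config     # clear leftover CNI config and kubeconfig
# then rerun: kubeadm init ...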
(14) Kubernetes port reference:
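For reference, the default ports documented upstream for this release line:

Port | Component
---|---
6443 | kube-apiserver
2379-2380 | etcd client/peer API
10250 | kubelet API
10251 | kube-scheduler
10252 | kube-controller-manager
30000-32767 | NodePort Services (worker nodes)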
2) Set Up User Access to the Cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# As root you can instead use:
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
3) Install the Cluster Network Plugin (flannel.yaml is in the appendix)
[root@k8s-master-01 ~]# kubectl apply -f flannel.yaml
4) Join the Worker Nodes to the Cluster
[root@k8s-master-01 ~]# kubeadm token create --print-join-command
kubeadm join 192.168.15.31:6443 --token zvyidd.gxnw8v1zdv3pdlzf \
    --discovery-token-ca-cert-hash sha256:05b6946a4de3f0e6900291118cf25de7bcdce3bcd19aa53eaaa8ffa86d67e440
## Note: copy the join command printed above and run it on each worker node.
5) Check the Cluster Status
## Method 1
[root@k8s-master-01 ~]# kubectl get nodes
NAME            STATUS   ROLES                  AGE     VERSION
k8s-master-01   Ready    control-plane,master   11m     v1.20.5
k8s-node-01     Ready    <none>                 3m13s   v1.20.5
k8s-node-02     Ready    <none>                 3m9s    v1.20.5

# Method 2
[root@k8s-master-01 ~]# kubectl get pods -n kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-f68b4c98f-mmxkc                 1/1     Running   0          11m
coredns-f68b4c98f-nvp6b                 1/1     Running   0          11m
etcd-k8s-master-01                      1/1     Running   0          11m
kube-apiserver-k8s-master-01            1/1     Running   0          11m
kube-controller-manager-k8s-master-01   1/1     Running   0          11m
kube-flannel-ds-25kk5                   1/1     Running   0          4m49s
kube-flannel-ds-9zkkl                   1/1     Running   0          3m22s
kube-flannel-ds-sx57n                   1/1     Running   0          3m26s
kube-proxy-2gsrl                        1/1     Running   0          11m
kube-proxy-jkdbs                        1/1     Running   0          3m22s
kube-proxy-wqrc2                        1/1     Running   0          3m26s
kube-scheduler-k8s-master-01            1/1     Running   0          11m

# Method 3: verify cluster DNS directly
[root@k8s-master-01 ~]# kubectl run test -it --rm --image=busybox:1.28.3
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
Appendix: flannel.yaml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
    - min: 0
      max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
  - kind: ServiceAccount
    name: flannel
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: registry.cn-hangzhou.aliyuncs.com/alvinos/flanned:v0.13.1-rc1
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: registry.cn-hangzhou.aliyuncs.com/alvinos/flanned:v0.13.1-rc1
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN", "NET_RAW"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg