K8s Cluster Installation
I. Installation Requirements
1.1 Server Node Requirements
Machines used to deploy a Kubernetes cluster must meet the following requirements:
- One or more machines running CentOS 7.x (x86_64)
- Hardware: 2 GB of RAM or more, 2 CPUs or more, and at least 30 GB of disk
- Internet access for pulling images; if the servers cannot reach the internet, download the images in advance and import them onto the nodes
- Swap disabled
1.2 For KubeEdge
KubeEdge consists of a cloud part and an edge part. It is built on top of Kubernetes and provides the core infrastructure for networking, application deployment, and metadata synchronization between cloud and edge. To set up KubeEdge we therefore need a Kubernetes cluster (an existing one can be used), plus the cloud side and the edge side.
On the cloud side we need to install:
- Docker
- a Kubernetes cluster
- cloudcore
On the edge side we need to install:
- Docker
- MQTT (the internal MQTT broker can also be used; this component is optional)
- edgecore
2. Environment Preparation
Role | IP | Workloads |
---|---|---|
master (cloud) | 192.168.16.100 | k8s, docker, cloudcore |
node (edge) | 192.168.16.x | docker, edgecore |
1) Disable the firewall:
systemctl stop firewalld
Disable the firewall at boot:
systemctl disable firewalld
2) Disable SELinux:
Temporarily: setenforce 0
Permanently:
vi /etc/selinux/config # or edit /etc/sysconfig/selinux
SELINUX=disabled
Check: getenforce
3) Disable swap (mandatory since K8s 1.8):
Temporarily: swapoff -a
Permanently: edit /etc/fstab and comment out the swap line
# sed -i 's/.*swap.*/#&/' /etc/fstab   # comments out the line: /dev/mapper/centos-swap swap swap defaults 0 0
Or:
[root@master-node]# cp -p /etc/fstab /etc/fstab.bak$(date '+%Y%m%d%H%M%S')
[root@master-node]# swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
[root@master-node]# systemctl daemon-reload
[root@master-node]# systemctl restart kubelet
Confirm that swap is off with free:
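For example (once swap is disabled, the Swap row should show 0 total and 0 used):
free -m    # the "Swap:" row should read 0 once swap is disabled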
4) Set the hostname according to the plan
hostnamectl set-hostname <hostname>
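For example, matching the roles planned in the table above (the names are only a suggestion):
hostnamectl set-hostname master    # on the cloud/master machine (192.168.16.100)
hostnamectl set-hostname node      # on the edge machine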
5) Add hosts entries on the master
cat >> /etc/hosts << EOF
192.168.16.100 master
EOF
6) Time zone configuration and time synchronization
Check the current time zone: date +"%Z %z"
Set it with tzselect and follow the prompts.
Time synchronization:
yum install ntpdate -y
ntpdate time.windows.com
7) Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Apply the settings
[root@master-node kubeedge]# sysctl --system
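As an optional sanity check, the values can be read back (this assumes the br_netfilter module is loaded; if the keys are missing, load it first with modprobe br_netfilter):
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables    # both should print "= 1"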
3. Install Docker
Install Docker on all nodes.
1) Update yum
yum update
2) Install yum-utils, which provides yum-config-manager for managing yum repositories
yum install -y yum-utils wget
3) Add the yum repository
Option 1:
sudo yum-config-manager \
  --add-repo \
  http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Option 2:
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
4) Install Docker
sudo yum install -y docker-ce
# or pin a specific version:
yum -y install docker-ce-18.06.1.ce-3.el7
5) Start Docker
Enable it at boot: systemctl enable docker
Start the service now: systemctl start docker
Check Docker status:
systemctl status docker
6) Configure /etc/docker/daemon.json
[root@master-node kubeedge]# cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "insecure-registries": ["harbor.domain.io"]
}
EOF
[root@master-node kubeedge]# systemctl daemon-reload && systemctl restart docker
Note: registry-mirrors is a domestic (Aliyun) mirror registry; insecure-registries lists private registries.
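An optional check that the daemon picked up the new daemon.json:
docker info | grep -A 2 "Registry Mirrors"    # should list https://b9pmyelo.mirror.aliyuncs.com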
II. Deploying the K8s Cluster Master Node
Two cluster deployment approaches are covered:
1. Building a K8s cluster with kubeadm
2. Deploying a K8s cluster with the edge-computing framework KubeEdge
1. Deploying the K8s master services
Install kubeadm and kubelet on the master (cloud) node. The default CRI (container runtime) for Kubernetes here is Docker, so Docker is installed first.
1.1 Installing K8s
1) Add the Aliyun YUM repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2) Install kubeadm, kubelet and kubectl
Versions change frequently, so pin the version here:
yum install -y kubelet-1.19.10 kubeadm-1.19.10 kubectl-1.19.10
Start kubelet and enable it at boot:
# Reload unit files
systemctl daemon-reload
# Start kubelet
systemctl start kubelet
# Check kubelet status (if it has not started, ignore the error for now; kubeadm init will bring it up later)
systemctl status kubelet
# Enable it at boot
systemctl enable kubelet
# Check whether it is enabled at boot (enabled / disabled)
systemctl is-enabled kubelet
# View the logs
journalctl -xefu kubelet
# View the last kubelet log entries
journalctl -xef -u kubelet -n 20
Note: after installing kubelet with kubeadm, running systemctl status kubelet
shows that the kubelet service failed to start with exit code 255:
kubelet.service: main process exited, code=exited, status=255/n/a
Running journalctl -xefu kubelet to inspect the systemd logs reveals the real error:
unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
This error is resolved automatically once kubeadm init generates the CA certificates, so it can be ignored for now.
3) Deploy the Kubernetes master
Initialize the cluster with kubeadm init.
Run on 192.168.16.100 (the master):
kubeadm init \
  --apiserver-advertise-address=192.168.16.100 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.19.10 \
  --service-cidr=10.96.0.0/20 \
  --pod-network-cidr=10.244.0.0/16
or
kubeadm init --apiserver-advertise-address=0.0.0.0 \
  --apiserver-cert-extra-sans=127.0.0.1 \
  --image-repository=registry.aliyuncs.com/google_containers \
  --ignore-preflight-errors=all \
  --kubernetes-version=v1.20.5 \
  --service-cidr=10.10.0.0/16 \
  --pod-network-cidr=10.18.0.0/16 \
  --v=5    # without this you may hit an error demanding a verbosity level of 5 or higher
Parameter notes:
# --apiserver-advertise-address   # cluster advertise address (the master machine's IP; the 10GbE network is used here)
# --image-repository              # the default registry k8s.gcr.io is unreachable from inside China, so the Aliyun mirror registry is specified
# --kubernetes-version            # K8s version, matching the packages installed above
# --service-cidr                  # internal virtual cluster network, the unified access entry for Pods; the value above can be used unchanged
# --pod-network-cidr              # Pod network; must match the CNI component's YAML deployed below; the value above can be used unchanged
4) Configure the kubectl tool
Set up the environment for running kubectl commands.
a. Before this is configured, kubectl get nodes fails with a connection error.
b. Configure the kubectl environment:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Now kubectl get nodes and kubectl get pod --all-namespaces return results.
At this point the node shows as NotReady because the coredns pods are not running yet: the network (CNI) pods are missing.
5) Deploy the CNI network plugin (flannel)
a. Download kube-flannel.yml
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
If the download fails because raw.githubusercontent.com cannot be resolved,
edit the hosts file: vi /etc/hosts
# GitHub Start
199.232.28.133 raw.githubusercontent.com
Run the wget command again and the file downloads successfully.
b. Apply the kube-flannel manifest
kubectl apply -f kube-flannel.yml
# or
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Check the system pods to confirm that kube-flannel is running normally:
kubectl get pods -n kube-system
Note: if flannel ends up in ImagePullBackOff, try the following.
Check which image kube-flannel.yml references:
cat kube-flannel.yml |grep image|uniq
Based on the result, pull the flannel Docker image manually:
docker pull quay.io/coreos/flannel:v0.15.1
Check the master node status.
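For example (an optional check; at this stage the master is typically the only node):
kubectl get nodes                           # the master should now report STATUS Ready
kubectl get pods -n kube-system -o wide     # coredns and kube-flannel pods should be Running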
Supplement: the Calico network plugin (currently a mainstream Kubernetes network solution):
curl https://docs.projectcalico.org/manifests/calico.yaml -O
curl https://docs.projectcalico.org/v3.18/manifests/calico.yaml -O    # pin a specific version
# Apply it (it takes a while for the pods to become Running):
kubectl apply -f calico.yaml
# If the installation fails, clean up and reinstall with:
kubeadm reset
6) Resetting the cluster with kubeadm reset
After resetting, repeat step 4 on the master.
# Run on the nodes other than the master
kubeadm reset
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
## Restart kubelet
systemctl restart kubelet
## Restart docker
systemctl restart docker
Problem 1: error from kubeadm init after a kubeadm reset:
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Solution to problem 1:
I got the same error while running $ kubectl get nodes as the root user. I fixed it by exporting kubelet.conf to an environment variable.
Set the environment variable:
export KUBECONFIG=/etc/kubernetes/kubelet.conf
kubectl get nodes
III. Building the K8s Cluster with kubeadm
1. Install kubeadm, kubelet and kubectl on all worker nodes
Versions change frequently, so pin the version here:
yum install -y kubelet-1.19.10 kubeadm-1.19.10 kubectl-1.19.10
Start kubelet and enable it at boot:
# Reload unit files
systemctl daemon-reload
# Start kubelet
systemctl start kubelet
# Check kubelet status
systemctl status kubelet
# If it fails to start, ignore the error for now; the later kubeadm join will bring it up
# Enable at boot
systemctl enable kubelet
# Check whether kubelet is enabled at boot (enabled / disabled)
systemctl is-enabled kubelet
# View the logs
journalctl -xefu kubelet
# View the last kubelet log entries
journalctl -xef -u kubelet -n 20
2. Add the worker nodes to the cluster
Copy the join token printed by kubeadm init on the master and run it on each worker node:
kubeadm join 16.32.15.200:6443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:6839bcc102c7ab089554871dd1a8f3d4261e1482ff13eafdf32fc092ebaf9f7e
If the token was lost, create a new one and print the join command on the master:
kubeadm token create --print-join-command
3. After joining, check the node status on the master (every node must be Ready):
[root@node1 home]# kubectl get nodes
NAME    STATUS   ROLES                  AGE     VERSION
node1   Ready    control-plane,master   63m     v1.23.0
node2   Ready    <none>                 3m57s   v1.23.0
node3   Ready    <none>                 29s     v1.23.0
If the STATUS of every node is Ready, all worker nodes have joined successfully.
4. Label the worker nodes
Run on the master node:
kubectl label nodes node-1 node-role.kubernetes.io/work=work
kubectl label nodes node-2 node-role.kubernetes.io/work=work
....
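An optional check that the labels were applied:
kubectl get nodes --show-labels
kubectl get nodes -l node-role.kubernetes.io/work=work    # filter by the new label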
5. Remove a worker node (run on the master)
# kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
# <node name> is the node name shown by kubectl get nodes
# Example: remove the node3 worker
[root@node1 home]# kubectl drain node3 --delete-local-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3
# After removing it, reset K8s on the worker node itself, which also wipes its configuration:
kubeadm reset
IV. Building the K8s Cluster with KubeEdge
Reference: https://blog.csdn.net/weixin_43216249/article/details/129780122
1. K8s cluster deployment and supporting services (background)
1.1 K8s cluster deployment
KubeEdge supports only relatively old Kubernetes versions, so K8s 1.24, 1.25 and 1.26 clusters are not recommended. See the version support matrix on GitHub:
https://github.com/kubeedge/kubeedge
1.2 Supporting service: the MetalLB load balancer
cloudcore and edgecore need an address over which to communicate, so it is recommended to use a load balancer to give cloudcore either a public IP or an IP on the same subnet as the K8s cluster nodes; in real production a public IP is used.
kubectl edit configmap -n kube-system kube-proxy
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
kubectl rollout restart daemonset kube-proxy -n kube-system
# kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.5/config/manifests/metallb-native.yaml
Create a global IP address pool:
[root@k8s-master01 ~]# vim first-ippool.yaml
[root@k8s-master01 ~]# cat first-ippool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.200-192.168.10.210
Verify that it was created:
[root@k8s-master01 ~]# kubectl get ipaddresspool -n metallb-system
NAME         AGE
first-pool   23s
Enable layer-2 advertisement so the addresses are reachable from outside the cluster nodes:
[root@k8s-master01 ~]# vim l2forward.yaml
[root@k8s-master01 ~]# cat l2forward.yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
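The two manifests above still need to be applied; a minimal sketch, assuming the file names used above:
kubectl apply -f first-ippool.yaml
kubectl apply -f l2forward.yaml
kubectl get l2advertisement -n metallb-system    # verify the L2Advertisement was created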
2. KubeEdge architecture
3. Deploying KubeEdge cloudcore on the master node
Deploy cloudcore on the cloud side.
There are two ways to install it: compiling manually from source, or using keadm, the tool provided by KubeEdge. Manual installation is tedious; compilation and the X.509 certificate generation in particular are often a nightmare, and some build errors have no solution to be found online. That is a heavy mental burden for newcomers, so keadm is used here.
However, the biggest problem with the keadm approach is that, because of the GFW, some resources cannot be downloaded from inside China, for example the CRD YAMLs and the cloudcore service unit. So the two approaches are combined into a half-manual, half-tooled installation to get the KubeEdge cluster set up smoothly.
https://github.com/kubeedge/kubeedge
3.1 Download keadm
wget https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/keadm-v1.7.0-linux-amd64.tar.gz
After downloading, run:
tar -zxvf keadm-v1.7.0-linux-amd64.tar.gz    # unpack the keadm tarball
cd keadm-v1.7.0-linux-amd64/keadm/
cp keadm /usr/sbin/    # put keadm on the PATH for convenience
3.2 Deploy cloudcore with keadm
keadm init --advertise-address=192.168.16.100 --kubeedge-version=1.7.0
or
keadm init --kubeedge-version=1.7.0 --kube-config=$HOME/.kube/config
keadm init --advertise-address=192.168.16.100 --set iptablesManager.mode="external" --profile version=v1.7.0
Note:
--advertise-address=xxx.xx.xx.xx: replace xxx.xx.xx.xx with your master machine's IP; it can be a private address or a public one. --kubeedge-version=1.7.0 pins the KubeEdge version to install; if you omit it, keadm downloads the latest version.
3.3 Troubleshooting keadm init failures
The network is blocked by the GFW; this is especially likely on public-cloud VMs.
1) Add the following to /etc/hosts
This works around raw.githubusercontent.com failing to resolve during keadm init.
# GitHub Start
52.74.223.119 github.com
192.30.253.119 gist.github.com
54.169.195.247 api.github.com
185.199.111.153 assets-cdn.github.com
151.101.76.133 raw.githubusercontent.com
151.101.108.133 user-images.githubusercontent.com
151.101.76.133 gist.githubusercontent.com
151.101.76.133 cloud.githubusercontent.com
151.101.76.133 camo.githubusercontent.com
151.101.76.133 avatars0.githubusercontent.com
151.101.76.133 avatars1.githubusercontent.com
151.101.76.133 avatars2.githubusercontent.com
151.101.76.133 avatars3.githubusercontent.com
151.101.76.133 avatars4.githubusercontent.com
151.101.76.133 avatars5.githubusercontent.com
151.101.76.133 avatars6.githubusercontent.com
151.101.76.133 avatars7.githubusercontent.com
151.101.76.133 avatars8.githubusercontent.com
# GitHub End
2) Semi-manual installation
Error 1:
devices_v1alpha2_device.yaml (the CRD for device access) fails to download:
F0608 11:40:15.689702 5530 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/crds/devices && wget -k --no-check-certificate --progress=bar:force https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml', err: --2021-06-08 11:39:54-- https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml
Download it manually:
wget https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml
mkdir -p /etc/kubeedge/crds/devices && mkdir -p /etc/kubeedge/crds/reliablesyncs
cp devices_v1alpha2_device.yaml /etc/kubeedge/crds/devices/
Error 2:
devices_v1alpha2_devicemodel.yaml fails to download:
F0608 14:10:11.700467 42153 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/crds/devices && wget -k --no-check-certificate --progress=bar:force https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml', err: --2021-06-08 14:10:11-- https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml
Download it manually:
wget https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml
cp devices_v1alpha2_devicemodel.yaml /etc/kubeedge/crds/devices/
Error 3:
Other xxx.yaml files fail to download.
Place the downloaded files into the matching sub-directories under /etc/kubeedge/crds/:
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/reliablesyncs/cluster_objectsync_v1alpha1.yaml    # e.g. this URL shows the file belongs under reliablesyncs
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/reliablesyncs/objectsync_v1alpha1.yaml
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/router/router_v1_rule.yaml
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/router/router_v1_ruleEndpoint.yaml
Error 4:
With the CRD YAMLs configured, keadm now needs cloudcore.service. Note that this file must live in /etc/kubeedge itself, not in the crds directory.
F0608 14:27:07.887553 45073 keadm.go:27] fail to download service file,error:{failed to exec 'bash -c cd /etc/kubeedge/ && sudo -E wget -t 5 -k --no-check-certificate https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service', err: --2021-06-08 14:27:07-- https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service
Manually download the following files and place them into /etc/kubeedge:
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service
https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz
3.4 Re-run the initialization
keadm init --advertise-address=192.168.16.100 --kubeedge-version=1.7.0
Output:
Kubernetes version verification passed, KubeEdge installation will start...
Expected or Default KubeEdge version 1.7.0 is already downloaded and will checksum for it.
kubeedge-v1.7.0-linux-amd64.tar.gz checksum:
checksum_kubeedge-v1.7.0-linux-amd64.tar.gz.txt content:
kubeedge-v1.7.0-linux-amd64.tar.gz in your path checksum failed and do you want to delete this file and try to download again? [y/N]:
Enter N here: the checksum failure does not affect the configuration, so there is no need to worry. Answering y will not get past this step.
The installation succeeds.
3.5 Check whether cloudcore is running
ps -ef|grep cloudcore
If it is not running, start it:
nohup cloudcore > /var/log/kubeedge/cloudcore.log 2>&1 &
Check for errors:
journalctl -xe
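Optionally, confirm that cloudcore is listening on its default ports (10000 for the WebSocket tunnel and 10002 for certificate requests are the KubeEdge defaults; adjust if you changed them):
ss -ntlp | grep cloudcore    # expect LISTEN entries on :10000 and :10002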
3.6 Get the cloud-side token
keadm gettoken
The returned token is needed when deploying edgecore.
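For convenience the token can be captured into a shell variable and pasted into the keadm join command on the edge node; a small sketch, assuming the admin kubeconfig configured earlier:
TOKEN=$(keadm gettoken --kube-config=$HOME/.kube/config)
echo $TOKEN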
4. Deploying KubeEdge edgecore on the node (edge side)
4.1 Installing Mosquitto on the edge node
Optional; skipping it does not affect the configuration.
1) Add the EPEL repository
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
2) Install mosquitto
yum -y install mosquitto
Beyond that, a working Docker environment is all the edge node needs for the installation.
4.2 Download keadm
wget https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/keadm-v1.7.0-linux-amd64.tar.gz
Unpack it:
tar -zxvf keadm-v1.7.0-linux-amd64.tar.gz
Enter the directory:
cd keadm-v1.7.0-linux-amd64/keadm
mv keadm-v1.7.0-linux-amd64/keadm/keadm /usr/local/bin/
Join the cloud:
./keadm join --cloudcore-ipport=192.168.16.100:10000 --edgenode-name=node --kubeedge-version=1.7.0 --token=3dc13e89ee6b907f7346786d018d0fa4c1efa7ddb0017607c7512bc1b4926449.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MjM5OTg0ODd9.hTQMyupZd5d_e5uOVtz3RVsfe9H_BSFnwuLzPRy2ZUg
The --token value is the output of keadm gettoken above.
Output:
F0608 15:05:37.624302 3197 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/ && wget -k --no-check-certificate --progress=bar:force https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz', err: --2021-06-08 15:05:37-- https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz
Resolving github.com (github.com)... 13.229.188.59
Connecting to github.com (github.com)|13.229.188.59|:443... connected.
Unable to establish SSL connection.
As with cloudcore, place the corresponding files into the configuration directory /etc/kubeedge:
https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz
https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/edgecore.service
Start edgecore:
nohup edgecore &
Configure edgecore as a service that starts at boot:
# Check whether edgecore is enabled at boot
systemctl is-enabled edgecore
# If not, create the unit file
vi /etc/systemd/system/edgecore.service
# and add the following:
[Unit]
Description=edgecore.service

[Service]
Type=simple
ExecStart=/usr/local/bin/edgecore
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

# Make the unit file executable and start edgecore
chmod +x /etc/systemd/system/edgecore.service
# Reload unit files
systemctl daemon-reload
# Start edgecore
systemctl start edgecore
# Enable it at boot
systemctl enable edgecore.service
# Check whether it is enabled at boot (enabled / disabled)
systemctl is-enabled edgecore
# Check its status
systemctl status edgecore
Check that edgecore is installed and running:
ps -ef|grep edgecore
If edgecore fails to start, locate the cause with:
journalctl -xe
journalctl -u edgecore.service -f
4.3 Removing an edge node
# On the cloud side
$ keadm reset    # --kube-config=$HOME/.kube/config
# On the edge side
$ keadm reset
# keadm reset on a node only stops edgecore; it does not uninstall or remove any prerequisites.
# Stop the remaining processes one by one and delete the related files and dependencies:
rm -rf /var/lib/kubeedge /var/lib/edged /etc/kubeedge
rm -rf /etc/systemd/system/edgecore.service
rm -rf /usr/local/bin/edgecore
4.4 Verify the edge node configuration
Verify on the master by running:
kubectl get nodes
or
kubectl get nodes -owide
Output:
If a node appears whose version string contains kubeedge, the deployment succeeded.
4.5 Test the Kubernetes cluster
Create a pod in the Kubernetes cluster and verify that it runs normally:
$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort
$ kubectl get pod,svc
Access the service at http://NodeIP:Port.
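To find the randomly assigned NodePort and test the service from the command line (an optional check; <NodeIP> and <NodePort> are placeholders):
kubectl get svc nginx               # the PORT(S) column shows 80:<NodePort>/TCP
curl http://<NodeIP>:<NodePort>     # should return the nginx welcome page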
V. Regenerating Expired Certificates
The following errors appear:
unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
# We already saw this error earlier. If you are familiar with Kubernetes authentication, bootstrap-kubelet.conf belongs to the API server's bootstrap-token mechanism: a token is generated from the certificates and its information is written into this file.
Failed while requesting a signed certificate from the master: cannot create certificate signing request
# This one is self-explanatory: the certificate signing request could not be created.
Check whether the certificates have expired:
[root@master-node pki]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
W1101 17:04:12.738187 1107 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 31, 2024 08:58 UTC   364d                                    no
apiserver                  Oct 31, 2024 08:58 UTC   364d            ca                      no
apiserver-etcd-client      Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Oct 31, 2024 08:58 UTC   364d            ca                      no
controller-manager.conf    Oct 31, 2024 08:58 UTC   364d                                    no
etcd-healthcheck-client    Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no
etcd-peer                  Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no
etcd-server                Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no
front-proxy-client         Oct 31, 2024 08:58 UTC   364d            front-proxy-ca          no
scheduler.conf             Oct 31, 2024 08:58 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Jan 16, 2032 01:46 UTC   8y              no
etcd-ca                 Jan 16, 2032 01:46 UTC   8y              no
front-proxy-ca          Jan 16, 2032 01:46 UTC   8y
Solution:
- Back up and regenerate the certificates
1. Back up the old certificates
# cd /etc/kubernetes/pki/
# mkdir backup
# mv apiserver.crt apiserver-etcd-client.key apiserver-kubelet-client.crt front-proxy-ca.crt front-proxy-client.crt front-proxy-client.key front-proxy-ca.key apiserver-kubelet-client.key apiserver.key apiserver-etcd-client.crt backup
# The following command generates new files under /etc/kubernetes/pki:
kubeadm init phase certs all
# Sometimes the network has changed and certificate IP validation fails, in which case specify the IPs:
kubeadm init phase certs all --apiserver-advertise-address 10.233.0.1 --apiserver-cert-extra-sans 192.168.1.71
# Renew the certificates:
kubeadm alpha certs renew all
- Back up and regenerate the kubeconfig files
1. Back up the config files
# cd /etc/kubernetes/
# mkdir backup
# mv admin.conf controller-manager.conf kubelet.conf scheduler.conf backup
2. Regenerate the config files
# kubeadm init phase kubeconfig all
3. Upload the updated configuration
# kubeadm init phase upload-config kubeadm --config kubeadm.yaml
- Copy the user credential file (usually not needed)
Usually unnecessary, unless your kubectl commands start failing:
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
Overwrite the certificates:
# The services read certificates from the ssl directory path, so copy the new certs back over it
cp -r /etc/kubernetes/pki /etc/kubernetes/ssl
Restart the kubelet service:
systemctl restart kubelet && journalctl -xefu kubelet
VI. Adding IPs to the Cluster and Regenerating Certificates
View the IPs contained in the certificates:
[root@master-node kubernetes]# for i in $(find /etc/kubernetes/pki -type f -name "*.crt");do echo ${i} && openssl x509 -in ${i} -text | grep 'DNS:';done
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/apiserver.crt
DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:master-node, IP Address:10.96.0.1, IP Address:192.168.0.122
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-client.crt
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/server.crt
DNS:localhost, DNS:master-node, IP Address:192.168.0.122, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
/etc/kubernetes/pki/etcd/peer.crt
DNS:localhost, DNS:master-node, IP Address:192.168.0.122, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
/etc/kubernetes/pki/etcd/healthcheck-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
Dump the cluster configuration:
kubeadm config view > /root/kubeadm.yaml
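Note that kubeadm config view was removed in newer kubeadm releases; on those versions the same configuration can be read from the kubeadm-config ConfigMap instead (an equivalent alternative, not part of the original procedure):
kubectl get cm kubeadm-config -n kube-system -o jsonpath='{.data.ClusterConfiguration}' > /root/kubeadm.yaml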
Add the IPs (edit /root/kubeadm.yaml):
apiServer:
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
  # Add the following section
  certSANs:
  - 192.168.11.131
  - 192.168.11.134
  - 192.168.11.136
  # End of added section
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: lb-vip:6443
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
    # Add the following section
    serverCertSANs:
    - 192.168.11.131
    - 192.168.11.135
    - 192.168.11.136
    peerCertSANs:
    - 192.168.11.131
    - 192.168.11.135
    - 192.168.11.136
    # End of added section
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.3
networking:
  dnsDomain: cluster.local
  podSubnet: 172.10.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Regenerate the certificates:
kubeadm init phase certs all --config /root/kubeadm.yaml
Upload the updated configuration:
kubeadm init phase upload-config kubeadm --config kubeadm.yaml
VII. Uninstalling Docker and K8s
Uninstall Docker:
yum remove docker \
  docker-client \
  docker-client-latest \
  docker-common \
  docker-latest \
  docker-latest-logrotate \
  docker-logrotate \
  docker-selinux \
  docker-engine-selinux \
  docker-engine
Uninstall Podman:
yum erase podman buildah
Uninstall K8s:
yum remove -y kubelet kubeadm kubectl
kubeadm reset -f
modprobe -r ipip
lsmod
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd