K8s Cluster Installation

I. Installation Requirements

1.1 Server node requirements
Machines used to deploy a Kubernetes cluster must meet the following conditions:

  • One or more machines running CentOS 7.x x86_64
  • Hardware: 2 GB of RAM or more, 2 CPUs or more, 30 GB of disk or more
  • Internet access for pulling images; if a server cannot reach the internet, download the images in advance and import them onto the node
  • Swap disabled

1.2 For KubeEdge
KubeEdge consists of a cloud part and an edge part. It is built on top of Kubernetes and provides core infrastructure support for networking, application deployment, and metadata synchronization between cloud and edge. So to set up KubeEdge we need a Kubernetes cluster (an existing one can be used), plus the cloud side and the edge side.

On the cloud side, we need to install:

  • Docker
  • a Kubernetes cluster
  • cloudcore

On the edge side, we need to install:

  • Docker
  • MQTT (the internal MQTT broker can also be used; this component is optional)
  • edgecore

2. Prepare the environment

Role             IP               Workload
master (cloud)   192.168.16.100   k8s, docker, cloudcore
node (edge)      192.168.16.x     docker, edgecore

1) Disable the firewall:

systemctl stop firewalld

Disable the firewall at boot:

systemctl disable firewalld

2) Disable SELinux:

Temporarily: setenforce 0

Permanently:

vi /etc/selinux/config    # or edit /etc/sysconfig/selinux

SELINUX=disabled

Check: getenforce

3) Disable swap (required since K8s 1.8):

Temporarily: swapoff -a

Permanently: vi /etc/fstab and comment out the swap line

# sed -i 's/.*swap.*/#&/' /etc/fstab
#/dev/mapper/centos-swap swap           swap   defaults     0 0

[root@master-node]# cp -p /etc/fstab /etc/fstab.bak$(date '+%Y%m%d%H%M%S')
[root@master-node]# swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
[root@master-node]# systemctl daemon-reload
[root@master-node]# systemctl restart kubelet

Check whether swap is now off with free:
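Illustrative output (the numbers will differ on your machine); swap is fully disabled when every value in the Swap row is 0:

[root@master-node ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        1.4G         21M        1.2G        2.3G
Swap:            0B          0B          0B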

4) Set the hostname according to the plan

hostnamectl set-hostname <hostname>

5) Add hosts entries on the master

cat >> /etc/hosts << EOF
192.168.16.100 master
EOF

6) Time zone configuration and time synchronization

Check the current time zone: date +"%Z %z"

Set it: run tzselect and follow the prompts

Synchronize the time:

yum install ntpdate -y
ntpdate time.windows.com

7) Pass bridged IPv4 traffic to the iptables chains

cat > /etc/sysctl.d/k8s.conf << EOF

net.bridge.bridge-nf-call-ip6tables = 1

net.bridge.bridge-nf-call-iptables = 1

EOF

# Apply the settings

[root@master-node kubeedge]# sysctl --system

3. Install Docker

Install Docker on all nodes.

1) Update yum

yum update

2) Install yum-utils (it provides yum-config-manager, used to manage yum repositories) and wget

yum install -y yum-utils  wget

3) Add the Docker yum repository

Option 1:

sudo yum-config-manager \
  --add-repo \
  http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Option 2:

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo

4) Install Docker

sudo yum install -y docker-ce

# or pin a specific version:
yum -y install docker-ce-18.06.1.ce-3.el7

5) Start Docker

Enable it at boot: systemctl enable docker

Start the service now: systemctl start docker

Check the Docker status:

systemctl status docker

 

6) Configure /etc/docker/daemon.json

[root@master-node kubeedge]cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "insecure-registries": ["harbor.domain.io"]
}
EOF
[root@master-node kubeedge]systemctl daemon-reload && systemctl restart docker


Note: registry-mirrors points at a domestic registry mirror, insecure-registries at a private registry.
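To confirm the daemon picked up the new configuration, a quick check (the output layout varies slightly between Docker versions):

docker info | grep -A 2 "Registry Mirrors"
# Expect https://b9pmyelo.mirror.aliyuncs.com/ to be listed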

II. Deploying the K8s Cluster Master Node

Two cluster deployment approaches are covered:

1. Building a K8s cluster with kubeadm

2. Building a K8s cluster with the edge-computing framework KubeEdge

1. Deploying the k8s master services

Install kubeadm/kubelet on the master (cloud) node. Kubernetes' default CRI (container runtime) is Docker, which is why Docker was installed first.

1.1 Installing K8s

1) Add the Alibaba Cloud YUM repository

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

2) Install kubeadm, kubelet, and kubectl

Since versions are updated frequently, pin the version here:

yum install -y kubelet-1.19.10 kubeadm-1.19.10 kubectl-1.19.10

Start kubelet and enable it at boot:

# Reload configuration files
systemctl daemon-reload

# Start kubelet
systemctl start kubelet

# Check kubelet status
systemctl status kubelet
# If it is not running and reports errors, ignore them for now; kubeadm init will bring it up later

# Enable at boot
systemctl enable kubelet

# Check whether kubelet starts at boot (enabled: yes, disabled: no)
systemctl is-enabled kubelet

# View the logs
journalctl -xefu kubelet

  # View the last 20 lines of the kubelet logs
  journalctl -xef -u kubelet -n 20

Note: after installing kubelet with kubeadm, systemctl status kubelet may show the service failing with exit code 255:

kubelet.service: main process exited, code=exited, status=255/n/a

Running journalctl -xefu kubelet to inspect the systemd logs reveals the real error:

unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory

This error resolves itself once kubeadm init generates the CA certificate, so it can be ignored for now.

3) Deploy the Kubernetes master

Initialize the cluster with kubeadm init.

Run on 192.168.16.100 (the master):

kubeadm init \
  --apiserver-advertise-address=192.168.16.100 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.19.10 \
  --service-cidr=10.96.0.0/20 \
  --pod-network-cidr=10.244.0.0/16

or
kubeadm init --apiserver-advertise-address=0.0.0.0 \
--apiserver-cert-extra-sans=127.0.0.1 \
--image-repository=registry.aliyuncs.com/google_containers \
--ignore-preflight-errors=all \
--kubernetes-version=v1.20.5 \
--service-cidr=10.10.0.0/16 \
--pod-network-cidr=10.18.0.0/16 \
--v=5      # without this flag, an error about requiring a level of at least 5 may appear

Parameter notes:

# --apiserver-advertise-address  # cluster advertise address (the master machine's IP; a 10 GbE interface is used here)
# --image-repository             # the default registry k8s.gcr.io is unreachable from mainland China, so the Alibaba Cloud mirror is specified instead
# --kubernetes-version           # K8s version, matching the packages installed above
# --service-cidr                 # internal virtual network for the cluster (Services), the unified Pod access entry; the value above can be used as-is
# --pod-network-cidr             # Pod network; must match the CNI plugin YAML deployed below; the value above can be used as-is
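If initialization succeeds, the tail of the output looks roughly like the sketch below (the token and hash here are placeholders); keep the kubeadm join line, it is needed when adding worker nodes in section III:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.16.100:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>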

4) Use the kubectl tool

Configure the environment needed to run kubectl commands.

a. Before the environment is configured, running kubectl get nodes fails with an error like the following:
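The connection to the server localhost:8080 was refused - did you specify the right host or port?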

 

b. Configure the kubectl environment:

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

Now kubectl get nodes and kubectl get pod --all-namespaces return results.

At this point the node shows as NotReady, because the coredns pods cannot start until a network (CNI) pod is deployed.

5) Deploy the CNI network plugin (flannel)

a. Download kube-flannel.yml

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

If wget fails because raw.githubusercontent.com cannot be resolved, add an entry to the hosts file (vi /etc/hosts):

# GitHub Start

199.232.28.133  raw.githubusercontent.com

Then rerun the wget command; the file downloads successfully.

 

b. Apply the kube-flannel deployment

kubectl apply -f kube-flannel.yml

or

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Check the system pods to verify that kube-flannel is running:

kubectl get pods -n kube-system

Note: if the flannel pod is stuck in ImagePullBackOff, try the following:

Check which image kube-flannel.yml references

cat kube-flannel.yml |grep image|uniq

Based on the result, pull the flannel Docker image manually

docker pull quay.io/coreos/flannel:v0.15.1

Check the master node status (see below)
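Once the flannel pod is Running, the master should report Ready (illustrative output; node name, age, and version will differ):

[root@master-node ~]# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
master-node   Ready    master   25m   v1.19.10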

Supplement: the calico network plugin (currently the mainstream Kubernetes network solution):

curl https://docs.projectcalico.org/manifests/calico.yaml -O

curl https://docs.projectcalico.org/v3.18/manifests/calico.yaml -O   # pin a specific version
Apply it (the pods take a while to reach Running):
kubectl apply -f calico.yaml

If the installation failed, clean up the environment and reinstall with:
kubeadm reset

6) Resetting the cluster with kubeadm reset

After the reset, repeat step 4 on the master.

# Run on nodes other than the master
kubeadm reset
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
## Restart kubelet
systemctl restart kubelet
## Restart docker
systemctl restart docker

Problem 1: after kubeadm reset followed by kubeadm init, kubectl fails with:

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Solution to problem 1:

The same error appears when running kubectl get nodes as root; it can be fixed by exporting the path of kubelet.conf in the KUBECONFIG environment variable.

Set the environment variable:
export KUBECONFIG=/etc/kubernetes/kubelet.conf
kubectl get nodes

III. Building the K8s Cluster with Kubeadm

1. Install kubeadm, kubelet, and kubectl on all node machines

Since versions are updated frequently, pin the version here:

yum install -y kubelet-1.19.10 kubeadm-1.19.10 kubectl-1.19.10

Start kubelet and enable it at boot:

# Reload configuration files
systemctl daemon-reload

# Start kubelet
systemctl start kubelet

# Check kubelet status
systemctl status kubelet
# If it is not running and reports errors, ignore them for now; kubeadm join will bring it up later

# Enable at boot
systemctl enable kubelet

# Check whether kubelet starts at boot (enabled: yes, disabled: no)
systemctl is-enabled kubelet

# View the logs
journalctl -xefu kubelet

  # View the last 20 lines of the kubelet logs
  journalctl -xef -u kubelet -n 20

2. Join the node machines to the cluster

Copy the token information printed by the master during initialization and run on each node:

kubeadm join 16.32.15.200:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:6839bcc102c7ab089554871dd1a8f3d4261e1482ff13eafdf32fc092ebaf9f7e

If the join command has been lost, create a token and print the full join command on the master with:

kubeadm token create --print-join-command

3. After joining, check the node status from the master (all nodes must be Ready):

[root@node1 home]# kubectl get nodes
NAME         STATUS     ROLES                  AGE     VERSION
node1        Ready      control-plane,master   63m     v1.23.0
node2        Ready      <none>                 3m57s   v1.23.0
node3        Ready      <none>                 29s     v1.23.0

If every node's STATUS is Ready, all of the worker nodes have joined successfully.

4. Label the worker nodes

Run on the master (a quick verification example follows the commands below):

kubectl label nodes node-1 node-role.kubernetes.io/work=work
kubectl label nodes node-2 node-role.kubernetes.io/work=work
....
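After labeling, the role should show up in the ROLES column (illustrative output; the node names here are only examples):

[root@node1 home]# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
node1    Ready    control-plane,master   65m   v1.23.0
node-1   Ready    work                   6m    v1.23.0
node-2   Ready    work                   2m    v1.23.0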

5. Remove a worker node (run on the master)

# kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
# <node name> is the node name shown by kubectl get nodes in the cluster
# As an example, remove the node3 worker here
[root@node1 home]# kubectl drain node3 --delete-local-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3

# After deleting the node from the cluster, reset K8s on that node to remove its configuration
kubeadm reset

IV. Building the K8s Cluster with KubeEdge

Reference: https://blog.csdn.net/weixin_43216249/article/details/129780122

1. K8s cluster deployment and supporting services (background)

1.1 K8s cluster deployment

KubeEdge supports only relatively old Kubernetes versions, so avoid K8s 1.24, 1.25, and 1.26. See the version support matrix in the GitHub repository:

https://github.com/kubeedge/kubeedge

 

1.2 Supporting service: the MetalLB load balancer

cloudcore and edgecore need an address to communicate over, so use a load balancer to give cloudcore a public IP or an IP in the same subnet as the K8s cluster nodes; in production a public IP is used.

Switch kube-proxy to IPVS mode with strict ARP:

kubectl edit configmap -n kube-system kube-proxy


apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true


Restart kube-proxy so the change takes effect:

kubectl rollout restart daemonset kube-proxy -n kube-system

 

Install MetalLB:
# kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.5/config/manifests/metallb-native.yaml

Create a global IP address pool:
[root@k8s-master01 ~]# vim first-ippool.yaml
[root@k8s-master01 ~]# cat first-ippool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.200-192.168.10.210
Apply it and verify that the pool was created:
[root@k8s-master01 ~]# kubectl apply -f first-ippool.yaml
[root@k8s-master01 ~]# kubectl get ipaddresspool -n metallb-system
NAME         AGE
first-pool   23s

 

Enable L2 advertisement so the addresses are reachable from outside the K8s cluster nodes:
[root@k8s-master01 ~]# vim l2forward.yaml
[root@k8s-master01 ~]# cat l2forward.yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system

 2. KubeEdge architecture (see the architecture diagram in the KubeEdge repository linked below)

 

3. Deploying KubeEdge cloudcore on the master node

Deploy cloudcore on the cloud side:

There are two installation methods: compiling manually from source, or using keadm, the tool provided by KubeEdge. The manual route is tedious; the X509 certificate handling during the build in particular is often a nightmare, and some build errors turn up no solutions online, which is a heavy burden for newcomers. So keadm is used here.

The biggest problem with installing this way, however, is that network restrictions inside mainland China block many resources from being downloaded, such as the CRD YAMLs and the cloudcore service unit. The approach below therefore combines the two: a half-manual, half-tool installation that gets the KubeEdge cluster up smoothly.

https://github.com/kubeedge/kubeedge

 

3.1 Download keadm

wget https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/keadm-v1.7.0-linux-amd64.tar.gz 

After downloading, run:

tar -zxvf keadm-v1.7.0-linux-amd64.tar.gz # unpack the keadm tarball
cd keadm-v1.7.0-linux-amd64/keadm/
cp keadm /usr/sbin/ # put the binary on the PATH for convenience

3.2 Deploy cloudcore with keadm

keadm init --advertise-address=192.168.16.100 --kubeedge-version=1.7.0

Other examples:

keadm init --kubeedge-version=1.7.0 --kube-config=$HOME/.kube/config
keadm init --advertise-address=192.168.16.100 --set iptablesManager.mode="external" --profile version=v1.7.0

Note:

--advertise-address=xxx.xx.xx.xx is your master machine's IP; it can be an internal or a public address. --kubeedge-version=1.7.0 pins the KubeEdge version to install; if it is omitted, keadm downloads the latest release.

3.3 Dealing with keadm init failures

The downloads are blocked by the network, which is especially likely on public-cloud VMs.

1) Add the following to /etc/hosts

This works around keadm failing to resolve raw.githubusercontent.com during initialization:

# GitHub Start
52.74.223.119 github.com
192.30.253.119 gist.github.com
54.169.195.247 api.github.com
185.199.111.153 assets-cdn.github.com
151.101.76.133 raw.githubusercontent.com
151.101.108.133 user-images.githubusercontent.com
151.101.76.133 gist.githubusercontent.com
151.101.76.133 cloud.githubusercontent.com
151.101.76.133 camo.githubusercontent.com
151.101.76.133 avatars0.githubusercontent.com
151.101.76.133 avatars1.githubusercontent.com
151.101.76.133 avatars2.githubusercontent.com
151.101.76.133 avatars3.githubusercontent.com
151.101.76.133 avatars4.githubusercontent.com
151.101.76.133 avatars5.githubusercontent.com
151.101.76.133 avatars6.githubusercontent.com
151.101.76.133 avatars7.githubusercontent.com
151.101.76.133 avatars8.githubusercontent.com
# GitHub End

2) Half-manual installation

Error 1:

devices_v1alpha2_device.yaml (the CRD for device access) fails to download:

F0608 11:40:15.689702    5530 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/crds/devices && wget -k --no-check-certificate --progress=bar:force https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml', err: --2021-06-08 11:39:54--  https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml

Download it manually:

wget https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_device.yaml

mkdir -p /etc/kubeedge/crds/devices && mkdir -p /etc/kubeedge/crds/reliablesyncs

cp devices_v1alpha2_device.yaml /etc/kubeedge/crds/devices/

Error 2:

devices_v1alpha2_devicemodel.yaml fails to download:

F0608 14:10:11.700467   42153 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/crds/devices && wget -k --no-check-certificate --progress=bar:force https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml', err: --2021-06-08 14:10:11--  https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml

Download it manually:

wget https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/devices/devices_v1alpha2_devicemodel.yaml

cp devices_v1alpha2_devicemodel.yaml /etc/kubeedge/crds/devices/

Error 3:

Other *.yaml files fail to download.

Download each file and place it in the matching subdirectory under /etc/kubeedge/crds/:

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/reliablesyncs/cluster_objectsync_v1alpha1.yaml   # the URL shows this one belongs in reliablesyncs, for example

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/reliablesyncs/objectsync_v1alpha1.yaml

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/router/router_v1_rule.yaml

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/crds/router/router_v1_ruleEndpoint.yaml

Error 4:

With the CRD YAMLs in place, keadm next needs the cloudcore.service unit. Note that this file belongs directly in /etc/kubeedge, not under crds:

F0608 14:27:07.887553   45073 keadm.go:27] fail to download service file,error:{failed to exec 'bash -c cd /etc/kubeedge/ && sudo -E wget -t 5 -k --no-check-certificate https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service', err: --2021-06-08 14:27:07--  https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service

Download the files manually and place them in /etc/kubeedge:

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/cloudcore.service

https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz

3.4 Rerun the initialization, this time successfully

keadm init --advertise-address=192.168.16.100 --kubeedge-version=1.7.0

Output:

Kubernetes version verification passed, KubeEdge installation will start...
Expected or Default KubeEdge version 1.7.0 is already downloaded and will checksum for it. 
kubeedge-v1.7.0-linux-amd64.tar.gz checksum: 
checksum_kubeedge-v1.7.0-linux-amd64.tar.gz.txt content: 
kubeedge-v1.7.0-linux-amd64.tar.gz in your path checksum failed and do you want to delete this file and try to download again? 
[y/N]: 

Answer N here: the checksum failure does not affect the configuration, so there is no need to worry; answering y never gets past this prompt.

The installation then completes successfully.

3.5 Check whether cloudcore is running

ps -ef|grep cloudcore
If it is not running, start it with:
nohup cloudcore > /var/log/kubeedge/cloudcore.log 2>&1 &

Check for errors:

journalctl -xe

3.6 Get the cloud-side token

keadm gettoken

The token obtained here is needed when deploying edgecore.
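If keadm cannot locate the kubeconfig on its own, pointing it at the admin config usually helps (a sketch, assuming the keadm 1.x flag name):

keadm gettoken --kube-config=$HOME/.kube/config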

4. Deploying KubeEdge edgecore on a node (the edge side)

4.1 Install Mosquitto on the edge side

This is optional; skipping it does not affect the configuration.

1) Add the EPEL repository

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Install mosquitto:

yum -y install mosquitto

Beyond this, the edge node only needs a working Docker environment.

4.2 Download keadm

wget https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/keadm-v1.7.0-linux-amd64.tar.gz 

Unpack it:

tar -zxvf keadm-v1.7.0-linux-amd64.tar.gz 

Enter the directory and put the binary on the PATH:

cd keadm-v1.7.0-linux-amd64/keadm

  mv keadm /usr/local/bin/

Join the cloud:

./keadm join --cloudcore-ipport=192.168.16.100:10000 --edgenode-name=node --kubeedge-version=1.7.0 --token=3dc13e89ee6b907f7346786d018d0fa4c1efa7ddb0017607c7512bc1b4926449.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MjM5OTg0ODd9.hTQMyupZd5d_e5uOVtz3RVsfe9H_BSFnwuLzPRy2ZUg
The --token value above is the string returned by keadm gettoken.

Output (the KubeEdge tarball download from GitHub fails because the SSL connection cannot be established):

F0608 15:03:13.805669    3177 keadm.go:27] failed to exec 'bash -c cd /etreleases/download/v1.7.0/kubeedge-v1.7.0-linux-F0608 15:05:37.624302    3197 keadm.go:27] failed to exec 'bash -c cd /etc/kubeedge/ && wget -k --no-check-certificate --progress=bar:force https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz', err: --2021-06-08 15:05:37--  https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz
Resolving github.com (github.com)... 13.250.177.223
Connecting to github.com (github.com)|13.250.177.223|:443... connected.
Unable to establish SSL connection.
Converted 0 files in 0 seconds.
amd64.tar.gz', err: --2021md64.tar.gz
Resolving github.com (github.com)... 13.229.188.59
Connecting to github.com (github.com)|13.229.188.59|:443... connected.
Unable to establish SSL connection.
Converted 0 files in 0 seconds.

As with cloudcore, copy the corresponding files below into the configuration directory /etc/kubeedge:

https://github.com/kubeedge/kubeedge/releases/download/v1.7.0/kubeedge-v1.7.0-linux-amd64.tar.gz

https://raw.githubusercontent.com/kubeedge/kubeedge/release-1.7/build/tools/edgecore.service

Start edgecore:

nohup edgecore &

Configure edgecore as a service that starts at boot:

# Check whether edgecore is enabled at boot
systemctl is-enabled edgecore

# If it is not set up yet, create the unit file
vi /etc/systemd/system/edgecore.service
Add the following content:
[Unit]
Description=edgecore.service

[Service]
Type=simple
ExecStart=/usr/local/bin/edgecore
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

# Make the unit file executable and start edgecore
chmod +x /etc/systemd/system/edgecore.service
# Reload configuration files
systemctl daemon-reload
# Start edgecore
systemctl start edgecore
# Enable at boot
systemctl enable edgecore.service
# Check whether edgecore is enabled at boot (enabled: yes, disabled: no)
systemctl is-enabled edgecore
# Check the status
systemctl status edgecore

Check that edgecore installed successfully and is running:

 ps -ef|grep edgecore

 If edgecore fails to start, locate the cause with:

journalctl -xe
journalctl -u edgecore.service -f

4.3 Removing an edge node

# on the cloud side
$ keadm reset # --kube-config=$HOME/.kube/config

# on the edge side
$ keadm reset # keadm reset on a node stops edgecore; it does not uninstall or remove any prerequisites
# Stop the remaining processes one by one and delete the related files and dependencies
rm -rf /var/lib/kubeedge /var/lib/edged /etc/kubeedge
rm -rf /etc/systemd/system/edgecore.service
rm -rf /usr/local/bin/edgecore

4.4 Check that the edge node joined successfully

Verify from the master.

Input:

kubectl get nodes

kubectl get nodes -owide

Output: if a node appears whose version string contains kubeedge, the deployment succeeded.
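Illustrative output (the exact version string depends on the edgecore build; edge nodes typically report a version of the form vX.Y.Z-kubeedge-v1.7.0 and carry the agent,edge roles):

[root@master-node ~]# kubectl get nodes
NAME          STATUS   ROLES        AGE   VERSION
master-node   Ready    master       2d    v1.19.10
node          Ready    agent,edge   10m   v1.19.3-kubeedge-v1.7.0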

4.5 Test the Kubernetes cluster

Create a pod in the Kubernetes cluster and verify that it runs correctly:
$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort
$ kubectl get pod,svc
Then browse to http://NodeIP:Port
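Illustrative output (the pod suffix and NodePort are assigned at random; NodePorts come from the 30000-32767 range):

$ kubectl get pod,svc
NAME                         READY   STATUS    RESTARTS   AGE
pod/nginx-6799fc88d8-abcde   1/1     Running   0          50s

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        2d
service/nginx        NodePort    10.96.123.45   <none>        80:31234/TCP   30s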

V. Regenerating Expired Certificates

The following errors are reported:

unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory

# We saw this error earlier. If you are familiar with Kubernetes authentication, you will recognize bootstrap-kubelet.conf as the file used by the API server's bootstrap token mechanism: a token is generated from the certificates and the information is written into this file.

Failed while requesting a signed certificate from the master: cannot create certificate signing request
# This one is self-explanatory: the certificate signing request cannot be created.

Check whether the certificates have expired:

[root@master-node pki]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

W1101 17:04:12.738187    1107 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 31, 2024 08:58 UTC   364d                                    no      
apiserver                  Oct 31, 2024 08:58 UTC   364d            ca                      no      
apiserver-etcd-client      Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no      
apiserver-kubelet-client   Oct 31, 2024 08:58 UTC   364d            ca                      no      
controller-manager.conf    Oct 31, 2024 08:58 UTC   364d                                    no      
etcd-healthcheck-client    Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no      
etcd-peer                  Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no      
etcd-server                Oct 31, 2024 08:58 UTC   364d            etcd-ca                 no      
front-proxy-client         Oct 31, 2024 08:58 UTC   364d            front-proxy-ca          no      
scheduler.conf             Oct 31, 2024 08:58 UTC   364d                                    no      

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Jan 16, 2032 01:46 UTC   8y              no      
etcd-ca                 Jan 16, 2032 01:46 UTC   8y              no      
front-proxy-ca          Jan 16, 2032 01:46 UTC   8y           

Fix:

  • Back up and regenerate the certificates
1. Back up the existing certificates
# cd /etc/kubernetes/pki/
# mkdir backup
# mv  apiserver.crt apiserver-etcd-client.key apiserver-kubelet-client.crt front-proxy-ca.crt front-proxy-client.crt front-proxy-client.key front-proxy-ca.key apiserver-kubelet-client.key apiserver.key apiserver-etcd-client.crt backup


2. Regenerate the certificates

# The command below writes new files into the /etc/kubernetes/pki directory
kubeadm init phase certs all 
# Sometimes a network change makes the certificate IP validation fail; in that case specify the IPs explicitly
kubeadm init phase certs all --apiserver-advertise-address 10.233.0.1 --apiserver-cert-extra-sans 192.168.1.71

# Renew the certificates
kubeadm alpha certs renew all
  • Back up and regenerate the kubeconfig files
1. Back up the config files
# cd /etc/kubernetes/
# mkdir backup
# mv admin.conf controller-manager.conf kubelet.conf scheduler.conf backup
2. Regenerate the config files
# kubeadm init phase kubeconfig all
3. Upload the updated configuration
# kubeadm init phase upload-config kubeadm --config kubeadm.yaml
  • Copy the admin credentials (usually not needed)
Only needed if your kubectl commands start failing:
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

Overwrite the certificates used by the services:

# The services reference the ssl path, so copy the regenerated pki directory back over it
cp -r /etc/kubernetes/pki /etc/kubernetes/ssl

Restart the kubelet service:

systemctl restart kubelet && journalctl -xefu kubelet
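After the restart, it is worth re-running the expiration check from above to confirm the new validity dates:

[root@master-node pki]# kubeadm alpha certs check-expiration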

VI. Adding IPs to the Cluster and Regenerating Certificates

List the IPs and DNS names embedded in the certificates:

[root@master-node kubernetes]# for i in $(find /etc/kubernetes/pki -type f -name "*.crt");do echo ${i} && openssl x509 -in ${i} -text | grep 'DNS:';done
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/apiserver.crt
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:master-node, IP Address:10.96.0.1, IP Address:192.168.0.122
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-client.crt
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/server.crt
                DNS:localhost, DNS:master-node, IP Address:192.168.0.122, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
/etc/kubernetes/pki/etcd/peer.crt
                DNS:localhost, DNS:master-node, IP Address:192.168.0.122, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
/etc/kubernetes/pki/etcd/healthcheck-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt

Dump the cluster configuration:

kubeadm config view > /root/kubeadm.yaml

Add the extra IPs:

apiServer:
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
  # add the following certSANs
  certSANs:
  - 192.168.11.131
  - 192.168.11.134
  - 192.168.11.136
  # end of the added certSANs
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: lb-vip:6443
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
    # add the following SANs
    serverCertSANs:
    - 192.168.11.131
    - 192.168.11.135
    - 192.168.11.136
    peerCertSANs:
    - 192.168.11.131
    - 192.168.11.135
    - 192.168.11.136
    # end of the added SANs
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.3
networking:
  dnsDomain: cluster.local
  podSubnet: 172.10.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}

Regenerate the certificates:

kubeadm init phase certs all --config /root/kubeadm.yaml

Update the cluster configuration:

kubeadm init phase upload-config kubeadm --config kubeadm.yaml
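To confirm that the new IPs made it into the certificates, re-run the SAN check from the beginning of this section; the added addresses should now appear next to the existing ones:

for i in $(find /etc/kubernetes/pki -type f -name "*.crt");do echo ${i} && openssl x509 -in ${i} -text | grep 'DNS:';done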

VII. Uninstalling Docker and K8s

Uninstall Docker:

yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine

Uninstall podman:

yum erase podman buildah

Uninstall k8s:

yum remove -y kubelet kubeadm kubectl
kubeadm reset -f
modprobe -r ipip
lsmod
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd

 
