Kubernetes (K8S) Installation
References:
Official documentation:
https://kubernetes.io/zh-cn/docs/concepts/overview/components/#node-components
Installation guides:
尚硅谷 (Shangguigu):
https://www.yuque.com/leifengyang/oncloud/ghnb83
https://www.bilibili.com/video/BV13Q4y1C7hS?p=32&spm_id_from=pageDriver&vd_source=a68414cd60fe26e829ce1cdd4d75a9e6
易文档 (EasyDoc):
https://k8s.easydoc.net/docs/dRiQjyTY/28366845/6GiNOzyZ/9EX8Cp45
Cluster Installation
Bare-metal setup (Bare Metal)
Three CentOS 7.6 servers:
one as the master
two as nodes
- Installation requirements
Before starting, the machines used to deploy the Kubernetes cluster must satisfy the following:
One or more machines running CentOS 7.x-86_x64
Hardware: 2 GB RAM or more, 2 or more CPUs, 30 GB of disk or more
Internet access for pulling images; if a server cannot reach the internet, download the images in advance and import them onto the node
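A quick way to verify a machine meets these requirements before starting (my own check commands, not from the referenced guides):
# Check CPU count, memory, and root-disk size against the requirements above
nproc                                # expect 2 or more
free -h | awk '/^Mem:/{print $2}'    # expect 2G or more
df -h /                              # expect 30G or more
# Check that the server can reach the mirror used below
curl -sI --connect-timeout 5 http://mirrors.aliyun.com >/dev/null && echo "network OK" || echo "no internet access"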
Install Docker
# Remove any previous Docker-related packages
sudo yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
# Install the yum utility package
sudo yum install -y yum-utils
# Configure the Docker yum repository
sudo yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install a specific version
sudo yum install -y docker-ce-20.10.7 docker-ce-cli-20.10.7 containerd.io-1.4.6
# Start Docker and enable it at boot
sudo systemctl enable docker --now
# Configure a Docker registry mirror (plus the systemd cgroup driver)
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://82m9ar63.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
systemctl status docker
docker info
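Because one of the init errors later turns out to hinge on the cgroup driver, it is worth confirming what Docker actually ended up using (a quick check, not part of the original steps):
# Should report "Cgroup Driver: systemd" with the daemon.json above
docker info | grep -i 'cgroup driver'
# Confirm the registry mirror was picked up
docker info | grep -A 1 'Registry Mirrors'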
1. K8s base environment
Run the following on all machines.
# Set the corresponding hostname on each node
hostnamectl set-hostname master
hostnamectl set-hostname node1
hostnamectl set-hostname node2
# Check the hostname
hostname
# Modify hosts on all nodes, using your own servers' IPs
cat >> /etc/hosts << EOF
192.168.0.111 master
192.168.0.112 node1
192.168.0.113 node2
127.0.0.1 localhost
EOF
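A quick sanity check that every node resolves and is reachable (my own addition):
# Each host should answer one ping
for h in master node1 node2; do ping -c 1 -W 2 $h; done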
# Time synchronization
yum install ntpdate -y
/usr/sbin/ntpdate -u pool.ntp.org
# Set SELinux to permissive mode (effectively disabling it)
setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
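To confirm the SELinux change took effect (a small check, assuming standard CentOS tooling):
# Should print "Permissive" after setenforce 0
getenforce
# Confirm the setting that persists across reboots
grep ^SELINUX= /etc/selinux/config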
### Turn off swap and disable the swap partition
swapoff -a
sed -i "s/\/dev\/mapper\/centos-swap/\#\/dev\/mapper\/centos-swap/g" /etc/fstab
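Verifying swap is really off (a quick check, not in the original steps):
# The Swap line should show 0B
free -h | grep -i swap
# No output means no active swap devices
swapon --show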
# Allow iptables to inspect bridged traffic
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
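If the two keys were not applied because br_netfilter is not yet loaded (the modules-load config only takes effect at boot), load it by hand and verify (my own addition):
sudo modprobe br_netfilter
# Both should print "= 1"
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables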
# Make sure the firewall is off on all nodes
systemctl stop firewalld
systemctl disable firewalld
# Check firewall status
firewall-cmd --state
# Disable iptables rules as well
systemctl stop iptables
systemctl disable iptables
systemctl status iptables
### Configure SSH trust between nodes (optional, can be skipped)
With SSH trust configured, nodes can access each other without passwords, which is convenient for automated deployment later.
# On the master node:
ssh-keygen # run this on every machine; just press Enter through all prompts
ssh-copy-id node1 # on the master, copy the public key to node1 (node1 is the worker's hostname; an IP address also works); enter yes and the password when prompted
# Log in from the master to the worker node
ssh node1
# On the node1 worker:
ssh-keygen
ssh-copy-id master # copy node1's public key to the master; enter yes and the password when prompted
# Log in from the worker to the master
ssh master
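A quick check that the trust works in both directions (my own addition; assumes all three hostnames are in /etc/hosts):
# Each command should print the remote hostname without asking for a password
for h in master node1 node2; do ssh -o BatchMode=yes $h hostname; done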
Install kubelet, kubeadm, and kubectl
Run the following on all machines.
# Add the k8s package repository
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
# Install the required components
sudo yum install -y kubelet-1.20.9 kubeadm-1.20.9 kubectl-1.20.9 --disableexcludes=kubernetes
sudo systemctl enable --now kubelet
## Start kubelet and Docker, and enable them at boot (all nodes)
# Reload service configuration files
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
# Note: until kubeadm init runs, kubelet restarts in a crash loop, so a failing status here is expected
systemctl status kubelet
systemctl restart kubelet # (optional, can be skipped)
# Check versions
kubelet --version
kubeadm version
Bootstrap the cluster with kubeadm
1. Download the images each machine needs
Run on all servers.
sudo tee ./images.sh <<-'EOF'
#!/bin/bash
images=(
kube-apiserver:v1.20.9
kube-proxy:v1.20.9
kube-controller-manager:v1.20.9
kube-scheduler:v1.20.9
coredns:1.7.0
etcd:3.4.13-0
pause:3.2
)
for imageName in ${images[@]} ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/$imageName
done
EOF
chmod +x ./images.sh && ./images.sh
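To confirm the script pulled everything (a quick check):
# All seven images above should be listed
docker images | grep lfy_k8s_images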
2. Initialize the master node
# Add the master domain mapping on all machines; replace the IP below with your own master node's IP
echo "192.168.0.111 cluster-endpoint" >> /etc/hosts
# If initialization fails, it can be reset with kubeadm reset
# To re-initialize:
kubeadm reset # reset first
y # confirm
# Run the init command below only on the master node; set apiserver-advertise-address to your own master node's IP and leave the rest unchanged
kubeadm init \
--apiserver-advertise-address=192.168.0.111 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.20.9 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all
# View the logs
journalctl -xeu kubelet
journalctl -fu kubelet
# Troubleshooting "[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed":
# Reference:
# https://blog.csdn.net/leisurelen/article/details/117392370
Initialization errors
Error one:
For the "curl -sSL http://localhost:10248/healthz" error: is the local hosts entry configured?
Fix: configure the local hosts file:
127.0.0.1 localhost
Error two:
7月 03 17:41:45 master kubelet[115473]: E0703 17:41:45.730879 115473 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotRe
7月 03 17:41:50 master kubelet[115473]: W0703 17:41:50.460817 115473 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Fix:
docker pull quay.io/coreos/flannel:v0.10.0-amd64
mkdir -p /etc/cni/net.d/
cat <<EOF> /etc/cni/net.d/10-flannel.conf
{"name":"cbr0","type":"flannel","delegate": {"isDefaultGateway": true}}
EOF
mkdir /usr/share/oci-umount/oci-umount.d -p
mkdir /run/flannel/
cat <<EOF> /run/flannel/subnet.env
FLANNEL_NETWORK=172.100.0.0/16
FLANNEL_SUBNET=172.100.1.0/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
Error three:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Fix:
Cause: kubectl on this machine is not bound to the cluster because the kubeconfig was not set up during initialization; setting the environment variable on this machine resolves it.
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
source /etc/profile
Error four:
Error when initializing the K8S master:
The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Or it keeps reporting the following:
7月 05 00:36:34 master kubelet[106102]: E0705 00:36:34.627900 106102 controller.go:144] failed to ensure lease exists, will retry in 200ms, error: Get "https://192.168.0.111:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/master?timeout=10s": dial tcp 192.168.0.111:6443: connect: connection refused
Fix, per the link below:
Original link: https://blog.csdn.net/qq_26129413/article/details/122207954
In my case it was the opposite: I removed `"exec-opts": ["native.cgroupdriver=systemd"]` from `/etc/docker/daemon.json` (vim /etc/docker/daemon.json),
then restarted Docker:
sudo systemctl daemon-reload
sudo systemctl restart docker
systemctl status docker
docker info
Then running kubeadm init again returned success:
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.111:6443 --token pgshih.lqxd8nxo0stohgzt \
--discovery-token-ca-cert-hash sha256:39b46cd80f5810e06fa255debf442d5a5880f97fdb2ca1b48a680c42cee36a48
The "kubeadm join 192.168.0.111:6443 --token ..." command printed above
is what joins the worker nodes to the master (the token is valid for 24 hours).
To generate a new join command:
kubeadm token create --print-join-command
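To actually join the workers, run the join command printed by kubeadm init (or by the token command above) on each worker node, then check from the master. A sketch, with the token and hash as placeholders for your own values:
# On node1 and node2, as root, paste your own join command:
kubeadm join 192.168.0.111:6443 --token <your-token> \
    --discovery-token-ca-cert-hash sha256:<your-hash>
# Back on the master, the new nodes should appear (NotReady until the network plugin is installed):
kubectl get nodes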
As prompted above, run on the master node:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check that it worked:
[root@master home]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 28m v1.20.9
At this point, the master has been installed successfully.
Install the network plugin. Calico is the network component used in this cluster; compared with flannel, Calico adds access control (network policy) but is also more complex.
# Download on the master
# If the download fails, open the URL in a browser and upload the file to the server
wget https://docs.projectcalico.org/manifests/calico.yaml -O calico.yaml
kubectl apply -f calico.yaml
Error:
error: unable to recognize "calico.yaml": no matches for kind "PodDisruptionBudget" in version "policy/v1"
Fix:
The latest Calico release that supports k8s v1.20.9 is v3.21,
so the correct way to fetch the Calico yaml is:
wget http://docs.projectcalico.org/archive/v3.21/manifests/calico.yaml --no-check-certificate
Then run:
kubectl apply -f calico.yaml
Also, the k8s versions each Calico release supports are listed on the Calico website:
https://projectcalico.docs.tigera.io/archive/v3.20/getting-started/kubernetes/requirements
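After applying the yaml it can take a few minutes for the Calico pods to start; a simple way to watch the progress (not in the original notes):
# Watch until calico-node and calico-kube-controllers are Running (Ctrl-C to exit)
watch -n 2 kubectl get pods -n kube-system
# Nodes move from NotReady to Ready once the network plugin is up
kubectl get nodes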
Common commands:
Run on the master.
#List all cluster nodes
kubectl get nodes
#Create resources in the cluster from a config file
kubectl apply -f xxxx.yaml
#Which applications are deployed in the cluster? (the k8s analogue of docker ps is kubectl get pods -A)
# A running application is called a container in Docker and a Pod in k8s
kubectl get pods -A
# Refresh once per second
watch -n 1 kubectl get pod -A
# Show pod details
kubectl describe pod calico-node-rsmm8 --namespace=kube-system
kubectl describe pod calico-kube-controllers-5cdbdf5595-dxnzd --namespace=kube-system
#Common kubectl delete commands
#https://www.cjavapy.com/article/2420/
# Delete via a yaml file
kubectl delete -f calico_v3.21.yaml
# Delete a single pod
kubectl delete pod calico-node-x2bfz -n kube-system
# Error encountered
[root@master home]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5cdbdf5595-dxnzd 0/1 ImagePullBackOff 0 18m
kube-system calico-node-rsmm8 0/1 Init:ImagePullBackOff 0 18m
# Show details
kubectl describe pod calico-node-rsmm8 --namespace=kube-system
kubectl describe pod calico-kube-controllers-5cdbdf5595-dxnzd --namespace=kube-system
kubectl describe pod calico-node-xlpxv --namespace=kube-system
kubectl describe pod calico-kube-controllers-5cdbdf5595-gkpfq --namespace=kube-system
kubectl describe pod calico-node-5bp25 --namespace=kube-system
kubectl describe pod calico-node-fd5v7 --namespace=kube-system
kubectl describe pod coredns-7f89b7bc75-98dz4 --namespace=kube-system
kubectl describe pod coredns-7f89b7bc75-xvr57 --namespace=kube-system
Warning Failed 10m (x4 over 19m) kubelet Error: ErrImagePull
Normal BackOff 9m48s (x8 over 19m) kubelet Back-off pulling image "docker.io/calico/cni:v3.21.5"
Warning Failed 4m54s (x22 over 19m) kubelet Error: ImagePullBackOff
Warning Failed 9m35s (x4 over 17m) kubelet Error: ErrImagePull
Normal BackOff 9m7s (x7 over 17m) kubelet Back-off pulling image "docker.io/calico/kube-controllers:v3.21.5"
Warning Failed 23s (x29 over 17m) kubelet Error: ImagePullBackOff
Normal BackOff 49s (x2 over 2m55s) kubelet Back-off pulling image "docker.io/calico/pod2daemon-flexvol:v3.21.5"
Warning Failed 49s (x2 over 2m55s) kubelet Error: ImagePullBackOff
Normal BackOff 8m34s (x53 over 46m) kubelet Back-off pulling image "docker.io/calico/node:v3.21.5"
Warning Failed 3m36s (x67 over 46m) kubelet Error: ImagePullBackOff
# From the events above, the image pulls are timing out.
Alas, I'll switch to a different network tomorrow and try again. Nothing else for it.
Finally pulled these two images:
[root@master home]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
calico/kube-controllers v3.21.5 86014674e404 2 months ago 131MB
calico/cni v3.21.5 a830f00c4814 2 months ago 235MB
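Since the pulls only succeeded on this machine, one way to get the images onto the other nodes without pulling again is to ship them over the SSH trust set up earlier (a sketch of my own; the remaining Calico images such as calico/node and calico/pod2daemon-flexvol would need the same treatment):
# Copy the two images to the worker nodes
for h in node1 node2; do
    docker save calico/kube-controllers:v3.21.5 calico/cni:v3.21.5 | ssh $h docker load
done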
# Delete via the yaml file
kubectl delete -f calico_v3.21.yaml
# Then start the pods again
kubectl apply -f calico_v3.21.yaml
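Finally, confirm the cluster is healthy (a quick check to close out):
# All kube-system pods should eventually be Running, and all three nodes Ready
kubectl get pods -A
kubectl get nodes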