Deploying a k8s cluster with kubeadm [v1.18.0]
0. Base environment
IP address | Hostname | Role |
---|---|---|
10.0.0.63 | k8s-master1 | master1 |
10.0.0.65 | k8s-node1 | node1 |
10.0.0.66 | k8s-node2 | node2 |
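1. Overview
kubeadm is the official community tool for quickly deploying a Kubernetes cluster.
The environment built here is meant for learning and for trying out k8s-related software and features.
2. Requirements
3 clean CentOS virtual machines, version 7.x or later
Each of the 3 machines needs at least 2 CPU cores and 4 GB of RAM
Network connectivity between all servers
Swap disabled
3. Learning goal
Learn to install a cluster with kubeadm as a base for studying k8s.
4. Environment preparation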
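# 1. Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# 2. Disable SELinux (permanently in the config file, and for the current boot)
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
# 3. Disable swap
swapoff -a # temporary, for the current boot
sed -ri 's/.*swap.*/#&/' /etc/fstab # permanent
# 4. Host planning (append, so the existing localhost entries are kept)
cat >> /etc/hosts << EOF
10.0.0.63 k8s-master1
#10.0.0.64 k8s-master2
10.0.0.65 k8s-node1
10.0.0.66 k8s-node2
EOF
# 5. Set the hostname (use the matching name on each machine)
hostnamectl set-hostname k8s-master1
bash
# 6. One-off time synchronization
yum install -y ntpdate
ntpdate time.windows.com
# 7. Pass bridged traffic to iptables (the br_netfilter module must be loaded for these keys to exist)
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
# 8. Periodic time synchronization, every 5 minutes
echo '*/5 * * * * /usr/sbin/ntpdate -u ntp.api.bz' >>/var/spool/cron/root
systemctl restart crond.service
crontab -l
# Everything above can be copied and run as-is on every node, except the hostname, which must be changed per machine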
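Before moving on, it can help to confirm that the preparation took effect. The following is a minimal sketch, not part of kubeadm: the helper names `active_swap_count`, `cpu_count` and `preflight_report` are made up for illustration, and they only read `/proc`-style files, so they change nothing on the machine.

```shell
# Preflight sketch: check two of the requirements listed above.
# All function names here are illustrative, not part of kubeadm.

# Count active swap areas: /proc/swaps has one header line,
# followed by one line per active swap area.
active_swap_count() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "${1:-/proc/swaps}"
}

# Count logical CPUs listed in a cpuinfo-style file.
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print one OK/FAIL line per requirement.
preflight_report() {
    swaps=$(active_swap_count "${1:-/proc/swaps}")
    cpus=$(cpu_count "${2:-/proc/cpuinfo}")
    if [ "$swaps" -eq 0 ]; then
        echo "OK swap is off"
    else
        echo "FAIL $swaps swap area(s) still active"
    fi
    if [ "$cpus" -ge 2 ]; then
        echo "OK $cpus CPUs"
    else
        echo "FAIL only $cpus CPU(s), at least 2 are required"
    fi
}

# Usage on a node:
#   preflight_report
```

Running `preflight_report` on each node before `kubeadm init` or `kubeadm join` should surface the swap and CPU problems earlier than the kubeadm preflight phase does.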
5. Install Docker [required on all nodes]
# Add the repositories
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
wget -P /etc/yum.repos.d/ http://mirrors.aliyun.com/repo/epel-7.repo
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum clean all
yum install -y bash-completion.noarch
# Install a specific version
yum -y install docker-ce-18.09.9-3.el7
# Or list the available versions and pick one
yum list docker-ce --showduplicates | sort -r
# Start Docker
systemctl enable docker
systemctl start docker
systemctl status docker
6. Configure the Docker cgroup driver [all nodes]
# Warning: this removes any existing files directly under /etc/docker
rm -f /etc/docker/*
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://ajvcw8qn.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
systemctl enable docker.service
Pull the flannel image:
docker pull lizhenliang/flannel:v0.11.0-amd64
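Since a wrong cgroup driver only shows up later as a kubeadm warning, a quick check right after this step may save a round trip. A small sketch follows; `has_systemd_cgroup` is a hypothetical helper name, and it only greps the configuration file rather than querying the Docker daemon.

```shell
# Check that a daemon.json-style file selects the systemd cgroup driver.
# has_systemd_cgroup is an illustrative helper name, not a Docker command.
has_systemd_cgroup() {
    grep -q 'native\.cgroupdriver=systemd' "${1:-/etc/docker/daemon.json}"
}

# Usage on a node:
#   has_systemd_cgroup && echo "daemon.json requests systemd"
#
# With Docker running, the driver actually in effect can be read with:
#   docker info --format '{{.CgroupDriver}}'
```

The file check tells you what was requested; `docker info` tells you what the daemon picked up after the restart, so the two together cover a forgotten `systemctl restart docker`.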
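7. Registry mirror acceleration [all nodes]
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
systemctl restart docker
# Configuring too many mirrors tends to cause errors; if Docker fails to start, try removing one of the .bak mirror entries
# Keep: curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
Alternatively, use the Alibaba Cloud accelerator: adding the Alibaba Cloud mirror address directly is enough. The address is available at:
https://cr.console.aliyun.com/cn-hangzhou/instances/mirrors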
8. Configure the Kubernetes yum repository [all nodes]
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
9. Install kubeadm, kubelet and kubectl [all nodes]
yum install -y kubelet-1.18.0 kubeadm-1.18.0 kubectl-1.18.0
systemctl enable kubelet
10. Deploy the Kubernetes master [master 10.0.0.63]
kubeadm init \
--apiserver-advertise-address=10.0.0.63 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.18.0 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
# After it succeeds, set up the kubeconfig for kubectl [master]:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
The initialization output ends with a join command that contains the token:
kubeadm join 10.0.0.63:6443 --token 2cdgi6.79j20fhly6xpgfud \
--discovery-token-ca-cert-hash sha256:3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
Save this command; it is needed later when the nodes join.
Note: initialization may fail with output like the following:
W0507 00:43:52.681429 3118 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
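10.1 Error handling
Error 1: the Docker cgroup driver must be changed to systemd. Add the following to /etc/docker/daemon.json and restart Docker: "exec-opts": ["native.cgroupdriver=systemd"]
Error 2: [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
This error means the machine has too few CPUs; change the VM to at least 2 cores and 4 GB of RAM.
Error 3: joining the cluster fails with: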
W0507 01:19:49.406337 26642 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
[root@k8s-master2 yum.repos.d]# kubeadm join 10.0.0.63:6443 --token q8bfij.fipmsxdgv8sgcyq4 \
> --discovery-token-ca-cert-hash sha256:26fc15b6e52385074810fdbbd53d1ba23269b39ca2e3ec3bac9376ed807b595c
W0507 01:20:26.246981 26853 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
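Solution:
These files are left over from an earlier kubeadm init or join on that machine. Run kubeadm reset, then join again.
10.2. Configure the kubectl command-line tool [master]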
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Get node information
# kubectl get nodes
[root@k8s-master1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 NotReady master 2m59s v1.18.0
k8s-node1 NotReady <none> 86s v1.18.0
k8s-node2 NotReady <none> 85s v1.18.0
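# The other hosts are listed, which shows that the cluster has formed. NotReady is expected until the network plugin is installed. k8s-master2 was not joined, because that is only needed for a multi-master setup, which is not covered here.
10.3. Install the network plugin [master]
[Run on the master only] Upload kube-flannel.yaml and apply it:
kubectl apply -f kube-flannel.yaml
kubectl get pods -n kube-system
Download address:
https://www.chenleilei.net/soft/k8s/kube-flannel.yaml
[All pods must reach Running; otherwise something is wrong.]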
[root@k8s-master1 ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-5dq4s 1/1 Running 0 13m
coredns-7ff77c879f-v68pc 1/1 Running 0 13m
etcd-k8s-master1 1/1 Running 0 13m
kube-apiserver-k8s-master1 1/1 Running 0 13m
kube-controller-manager-k8s-master1 1/1 Running 0 13m
kube-flannel-ds-amd64-2ktxw 1/1 Running 0 3m45s
kube-flannel-ds-amd64-fd2cb 1/1 Running 0 3m45s
kube-flannel-ds-amd64-hb2zr 1/1 Running 0 3m45s
kube-proxy-4vt8f 1/1 Running 0 13m
kube-proxy-5nv5t 1/1 Running 0 12m
kube-proxy-9fgzh 1/1 Running 0 12m
kube-scheduler-k8s-master1 1/1 Running 0 13m
[root@k8s-master1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 14m v1.18.0
k8s-node1 Ready <none> 12m v1.18.0
k8s-node2 Ready <none> 12m v1.18.0
11. Join node1 and node2 to the master
Joining node1 and node2 to the cluster
Run the following command on each node that should join:
kubeadm join 10.0.0.63:6443 --token fs0uwh.7yuiawec8tov5igh \
--discovery-token-ca-cert-hash sha256:471442895b5fb77174103553dc13a4b4681203fbff638e055ce244639342701d
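# This command was printed when the master was initialized. Note that the CNI network plugin must be configured first.
# After the join succeeds, check from the master node: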
[root@k8s-master1 docker]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master1 Ready master 14m v1.18.0
k8s-node1 Ready <none> 12m v1.18.0
k8s-node2 Ready <none> 12m v1.18.0
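When scripting the join, reading the table by eye is not practical. The sketch below counts nodes that are not yet Ready from `kubectl get nodes --no-headers` output; `not_ready_count` is an illustrative helper name, not a kubectl feature.

```shell
# Count nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
# A cordoned node ("Ready,SchedulingDisabled") is counted as not ready.
# not_ready_count is an illustrative helper, not part of kubectl.
not_ready_count() {
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Usage on the master:
#   kubectl get nodes --no-headers | not_ready_count
#
# kubectl can also block until every node reports Ready:
#   kubectl wait --for=condition=Ready nodes --all --timeout=300s
```

A result of 0 corresponds to the all-Ready listing shown above.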
12. Creating and querying tokens
By default a token is valid for 24 hours and cannot be used after it expires. To create a new one, run the following on the master node:
kubeadm token create
kubeadm token list
# Compute the sha256 hash of the cluster CA public key
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
Result:
3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
Joining the cluster with the new token (note the sha256: prefix on the hash):
kubeadm join 10.0.0.63:6443 --token nuja6n.o3jrhsffiqs9swnu --discovery-token-ca-cert-hash sha256:3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
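The join command is just the API server endpoint, the token, and the CA hash glued together, and the easiest part to get wrong is the `sha256:` prefix. Here is a small sketch that assembles it; `build_join_cmd` is a hypothetical helper that only prints the command and does not run it.

```shell
# Assemble a `kubeadm join` command from endpoint, token and CA hash.
# build_join_cmd is an illustrative helper; it only prints the command.
build_join_cmd() {
    endpoint=$1
    token=$2
    hash=$3
    # kubeadm expects the hash as "sha256:<hex>"; add the prefix if it is missing.
    case $hash in
        sha256:*) ;;
        *) hash="sha256:$hash" ;;
    esac
    echo "kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash $hash"
}

# Usage, with the values from this section:
#   build_join_cmd 10.0.0.63:6443 nuja6n.o3jrhsffiqs9swnu \
#     3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
```

On the master, `kubeadm token create --print-join-command` creates a fresh token and prints the complete join command in one step, which avoids the manual openssl pipeline altogether.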
13. Install the dashboard
wget https://www.chenleilei.net/soft/k8s/dashboard.yaml
kubectl apply -f dashboard.yaml
[root@k8s-master1 ~]# kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.1.94.43 <none> 8000/TCP 7m58s
kubernetes-dashboard NodePort 10.1.187.162 <none> 443:30001/TCP 7m58s
13.1 Access test
Opening port 30001 over HTTPS on any cluster node (10.0.0.63, 10.0.0.65 or 10.0.0.66) reaches the dashboard page.
13.2 Get a dashboard token, i.e. create a service account and bind it to the built-in cluster-admin cluster role
# kubectl create serviceaccount dashboard-admin -n kubernetes-dashboard
# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:dashboard-admin
# kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
Copy the token from the output into the Token field on the dashboard login page and choose token login.
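The `kubectl describe secrets` output contains several fields besides the token, so copying by hand is error-prone. A minimal sketch that prints only the token value follows; `extract_token` is an illustrative helper name that filters the describe output on stdin.

```shell
# Print only the token value from `kubectl describe secrets` output on stdin.
# extract_token is an illustrative helper name.
extract_token() {
    awk '$1 == "token:" { print $2 }'
}

# Usage on the master:
#   kubectl describe secrets -n kubernetes-dashboard \
#     "$(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')" \
#     | extract_token
```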
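14. Verify that the cluster works
Cluster health is verified in three respects:
1. Whether applications can be deployed
2. Whether the cluster network works
3. Whether in-cluster DNS resolution works
14.1 Verify application deployment and log queries
# Create an nginx application
kubectl create deployment k8s-status-checke --image=nginx
# Expose port 80
kubectl expose deployment k8s-status-checke --port=80 --target-port=80 --type=NodePort
# Query the logs (substitute the pod name reported by kubectl get pods):
[root@k8s-master1 ~]# kubectl logs -f nginx-f89759699-m5k5z
# Delete this deployment when finished
kubectl delete deployment k8s-status-checke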
14.2 Verify the cluster network
1. Get the IP address of an application pod
[root@k8s-master1 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/nginx 1/1 Running 0 25h 10.244.2.18 k8s-node2 <none> <none>
2. Ping this pod IP from any node
[root@k8s-node1 ~]# ping 10.244.2.18
PING 10.244.2.18 (10.244.2.18) 56(84) bytes of data.
64 bytes from 10.244.2.18: icmp_seq=1 ttl=63 time=2.63 ms
64 bytes from 10.244.2.18: icmp_seq=2 ttl=63 time=0.515 ms
3. Access the pod over HTTP
[root@k8s-master1 ~]# curl -I 10.244.2.18
HTTP/1.1 200 OK
Server: nginx/1.17.10
Date: Sun, 10 May 2020 13:19:02 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 14 Apr 2020 14:19:26 GMT
Connection: keep-alive
ETag: "5e95c66e-264"
Accept-Ranges: bytes
4. Query the logs
[root@k8s-master1 ~]# kubectl logs -f nginx
10.244.1.0 - - [10/May/2020:13:14:25 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36" "-"
14.3 Verify in-cluster DNS resolution
Check DNS:
[root@k8s-master1 ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-5dq4s 1/1 Running 1 4d # DNS sometimes misbehaves
coredns-7ff77c879f-v68pc 1/1 Running 1 4d # DNS sometimes misbehaves
etcd-k8s-master1 1/1 Running 4 4d
kube-apiserver-k8s-master1 1/1 Running 3 4d
kube-controller-manager-k8s-master1 1/1 Running 3 4d
kube-flannel-ds-amd64-2ktxw 1/1 Running 1 4d
kube-flannel-ds-amd64-fd2cb 1/1 Running 1 4d
kube-flannel-ds-amd64-hb2zr 1/1 Running 4 4d
kube-proxy-4vt8f 1/1 Running 4 4d
kube-proxy-5nv5t 1/1 Running 2 4d
kube-proxy-9fgzh 1/1 Running 2 4d
kube-scheduler-k8s-master1 1/1 Running 4 4d
# DNS sometimes misbehaves. One way to fix it:
1. Export the deployment to a YAML file
kubectl get deploy coredns -n kube-system -o yaml >coredns.yaml
2. Delete coredns
kubectl delete -f coredns.yaml
Check:
[root@k8s-master1 ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
etcd-k8s-master1 1/1 Running 4 4d
kube-apiserver-k8s-master1 1/1 Running 3 4d
kube-controller-manager-k8s-master1 1/1 Running 3 4d
kube-flannel-ds-amd64-2ktxw 1/1 Running 1 4d
kube-flannel-ds-amd64-fd2cb 1/1 Running 1 4d
kube-flannel-ds-amd64-hb2zr 1/1 Running 4 4d
kube-proxy-4vt8f 1/1 Running 4 4d
kube-proxy-5nv5t 1/1 Running 2 4d
kube-proxy-9fgzh 1/1 Running 2 4d
kube-scheduler-k8s-master1 1/1 Running 4 4d
The coredns pods are gone.
3. Recreate coredns
kubectl apply -f coredns.yaml
[root@k8s-master1 ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-5mmjg 1/1 Running 0 13s
coredns-7ff77c879f-t74th 1/1 Running 0 13s
etcd-k8s-master1 1/1 Running 4 4d
kube-apiserver-k8s-master1 1/1 Running 3 4d
kube-controller-manager-k8s-master1 1/1 Running 3 4d
kube-flannel-ds-amd64-2ktxw 1/1 Running 1 4d
kube-flannel-ds-amd64-fd2cb 1/1 Running 1 4d
kube-flannel-ds-amd64-hb2zr 1/1 Running 4 4d
kube-proxy-4vt8f 1/1 Running 4 4d
kube-proxy-5nv5t 1/1 Running 2 4d
kube-proxy-9fgzh 1/1 Running 2 4d
kube-scheduler-k8s-master1 1/1 Running 4 4d
Review the logs:
coredns-7ff77c879f-5mmjg:
[root@k8s-master1 ~]# kubectl logs coredns-7ff77c879f-5mmjg -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
linux/amd64, go1.13.6, da7f65b
coredns-7ff77c879f-t74th:
[root@k8s-master1 ~]# kubectl logs coredns-7ff77c879f-t74th -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
linux/amd64, go1.13.6, da7f65b
# Start a pod in k8s to verify DNS
[root@k8s-master1 ~]# kubectl run -it --rm --image=busybox:1.28.4 sh
/ # nslookup kubernetes
Server: 10.1.0.10
Address 1: 10.1.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 10.1.0.1 kubernetes.default.svc.cluster.local
# nslookup resolves the name kubernetes, which shows that DNS is working
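15. Handling the cluster certificate problem [solution for kubeadm deployments]
1. Delete the default secret and create a new one from the cluster's self-signed certificate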
kubectl delete secret kubernetes-dashboard-certs -n kubernetes-dashboard
kubectl create secret generic kubernetes-dashboard-certs \
--from-file=/etc/kubernetes/pki/apiserver.key --from-file=/etc/kubernetes/pki/apiserver.crt -n kubernetes-dashboard
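For a binary deployment, change the certificate paths here to wherever the certificates were stored.
2. After configuring the certificate, edit the dashboard manifest and rebuild the dashboard
wget https://www.chenleilei.net/soft/k8s/recommended.yaml
vim recommended.yaml
Find kind: Deployment, then search for args below it. You will see these two lines: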
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
Change them to the following [two certificate lines inserted in between]:
- --auto-generate-certificates
- --tls-key-file=apiserver.key
- --tls-cert-file=apiserver.crt
- --namespace=kubernetes-dashboard
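[An already modified file can be used directly: wget https://www.chenleilei.net/soft/k8s/dashboard.yaml]
3. After the change, re-apply recommended.yaml
kubectl apply -f recommended.yaml
Applying it triggers a rolling update. Reopen the browser afterwards: the certificate is now displayed normally and the page is no longer blocked as insecure.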
[root@k8s-master1 ~]# kubectl get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-694557449d-r9h5r 1/1 Running 0 2d1h
kubernetes-dashboard-5d8766c7cc-trdsv 1/1 Running 0 93s <--- rolling update
4. Check the new access port:
kubectl get svc -n kubernetes-dashboard
[root@k8s-master1 ~]# kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.1.187.60 <none> 8000/TCP 6m34s
kubernetes-dashboard NodePort 10.1.242.240 <none> 443:31761/TCP 6m34s
5. Opening the page in Google Chrome now works
#1. Note: if you have lost the login token, recreate the account and binding
# kubectl create serviceaccount dashboard-admin -n kubernetes-dashboard
# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:dashboard-admin
#2. You can also look up the existing token and log in with it
kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
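15.1 Browser behavior before and after replacing the certificate:
After the certificate is replaced, the browser shows an extra "proceed anyway, but unsafe" option; without the replacement that option is not offered.
15.2 Error handling:
15.2.1 Problem 1: error when a k8s node joins: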
W0315 22:16:20.123204 5795 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
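Fix:
# Takes effect immediately but is lost on reboot; the /etc/sysctl.d/k8s.conf file from section 4 makes it permanent
echo "1" >/proc/sys/net/bridge/bridge-nf-call-iptables
Then join again:
kubeadm join 10.0.0.63:6443 --token 0dr1pw.ejybkufnjpalb8k6 --discovery-token-ca-cert-hash sha256:ca1aa9cb753a26d0185e3df410cad09d8ec4af4d7432d127f503f41bc2b14f2a
The token here is generated on the kubeadm master.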
15.2.2 Problem 2: the web page cannot be reached:
Rebuild the dashboard
# Delete:
kubectl delete -f dashboard.yaml
# Recreate after deleting:
kubectl create -f dashboard.yaml
# Create the account:
kubectl create serviceaccount dashboard-admin -n kubernetes-dashboard
# View the token:
kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
Reopen the page and log in.
Use the following command to see which node each pod was scheduled to:
[root@k8s-master1 ~]# kubectl get pods -n kubernetes-dashboard -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dashboard-metrics-scraper-694557449d-vnrvt 1/1 Running 0 8m56s 10.244.1.13 k8s-node1 <none> <none>
kubernetes-dashboard-85fc8fbf64-t4cdw 1/1 Running 0 3m8s 10.244.2.18 k8s-node2 <none> <none>
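15.2.3 Problem 3: dashboard deployment fails
This may be a network problem; switch to another network, such as a VPN, and redeploy.
1. Or save the content below as dashboard.yaml, delete the existing dashboard and redeploy
2. Or download it from my personal server: wget https://www.chenleilei.net/soft/k8s/recommended.yaml
3. Check the status for problems, such as whether the images were pulled successfully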
kubectl get pods -n kubernetes-dashboard -o wide
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
  # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.0.0-beta8
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --tls-key-file=apiserver.key
            - --tls-cert-file=apiserver.crt
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "beta.kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
    spec:
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.1
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "beta.kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}
16. Deploy nginx in k8s
[root@k8s-master1 ~]# kubectl expose deployment nginx --port=80 --target-port=80 --type=NodePort
service/nginx exposed
[root@k8s-master1 ~]# kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-f89759699-dnfmg 0/1 ImagePullBackOff 0 3m41s
ImagePullBackOff error:
Check the pod events: kubectl describe pod nginx-f89759699-dnfmg
Result:
Normal Pulling 3m27s (x4 over 7m45s) kubelet, k8s-node2 Pulling image "nginx"
Warning Failed 2m55s (x2 over 6m6s) kubelet, k8s-node2 Failed to pull image "nginx": rpc error: code = Unknown desc = Get https://registry-1.docker.io/v2/library/nginx/manifests/sha256:cccef6d6bdea671c394956e24b0d0c44cd82dbe83f543a47fdc790fadea48422: net/http: TLS handshake timeout
The events show that Docker failed to download the image, so a different Docker registry mirror is needed.
[root@k8s-master1 ~]# cat /etc/docker/daemon.json
{
"registry-mirrors": ["https://ajvcw8qn.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
Pull nginx with Docker directly on one of the nodes:
This reveals the error:
[root@k8s-node1 ~]# docker pull nginx
Using default tag: latest
latest: Pulling from library/nginx
54fec2fa59d0: Pulling fs layer
4ede6f09aefe: Pulling fs layer
f9dc69acb465: Pulling fs layer
Get https://registry-1.docker.io/v2/: net/http: TLS handshake timeout # the mirror has not been configured on this node
After fixing the mirror:
[root@k8s-master1 ~]# docker pull nginx
Using default tag: latest
latest: Pulling from library/nginx
54fec2fa59d0: Pull complete
4ede6f09aefe: Pull complete
f9dc69acb465: Pull complete
Digest: sha256:86ae264c3f4acb99b2dee4d0098c40cb8c46dcf9e1148f05d3a51c4df6758c12
Status: Downloaded newer image for nginx:latest
docker.io/library/nginx:latest
Run again:
kubectl delete deployment,svc nginx
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --target-port=80 --type=NodePort
To summarize, the procedure for troubleshooting a failed image pull in k8s is:
1. The nginx deployment fails; check the pods and services: kubectl get pod,svc
2. Check the pod events: Failed to pull image "nginx": rpc error: code = Unknown desc = Get https://registry-
...net/http: TLS handshake timeout [this failure shows that the registry mirror was not configured]
3. Change the Docker registry mirror to Alibaba Cloud, then restart Docker
cat /etc/docker/daemon.json
{
"registry-mirrors": ["https://ajvcw8qn.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
systemctl restart docker.service
4. Pull the nginx image again with docker pull; it now succeeds
5. Remove the nginx image that Docker downloaded: docker image rm -f [image name]
6. Delete the failed nginx deployment in k8s: kubectl delete deployment nginx
7. Recreate the deployment: kubectl create deployment nginx --image=nginx
8. Expose the application again: kubectl expose deployment nginx --port=80 --target-port=80 --type=NodePort
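Step 1 of the procedure above can be narrowed down to just the pods with pull problems. The following sketch filters `kubectl get pods --no-headers` output; `image_pull_failures` is an illustrative helper name, not a kubectl subcommand.

```shell
# Print the names of pods whose STATUS is ImagePullBackOff or ErrImagePull.
# Reads `kubectl get pods --no-headers` output on stdin.
# image_pull_failures is an illustrative helper name.
image_pull_failures() {
    awk '$3 == "ImagePullBackOff" || $3 == "ErrImagePull" { print $1 }'
}

# Usage on the master:
#   kubectl get pods --no-headers | image_pull_failures
# Then inspect the Events section of each reported pod:
#   kubectl describe pod <pod-name>
```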
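17. Expose an application
1. Create the deployment
kubectl create deployment nginx --image=nginx
2. Expose the application
kubectl expose deployment nginx --port=80 --target-port=80 --type=NodePort
18. Optimization: kubectl auto-completion
yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
# Optional: enable completion in every new shell
echo 'source <(kubectl completion bash)' >> ~/.bashrc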
19. Issues encountered in this section:
1. Handling an expired token:
A token created by kubeadm expires after 24 hours. After that, new nodes can no longer use it to join the cluster, so a new token has to be generated.
Generate and list tokens:
kubeadm token create
kubeadm token list
Compute the CA certificate hash:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
Then use the new token to join the new server:
kubeadm join 10.0.0.63:6443 --token 0dr1pw.ejybkufnjpalb8k6 --discovery-token-ca-cert-hash sha256:3d847b858ed649244b4110d4d60ffd57f43856f42ca9c22e12ca33946673ccb4
2. Getting the dashboard login token
kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
3. Troubleshooting a failed image pull in k8s
See the eight-step procedure at the end of section 16: check the pods, read the pull error in the pod events, switch the Docker registry mirror in /etc/docker/daemon.json, restart Docker, then delete and recreate the deployment.
20. YAML attachment [save with a .yaml suffix]
http://www.chenleilei.net/soft/kubeadm快速部署一个Kubernetes集群yaml.zip
