k8s and Surrounding Services (Detailed Summary)
1. Business Deployment Overview
We build algorithms; each algorithm is packaged as an image, for example image1 (intrusion detection) and image2 (safety-belt detection).
Combined with k8s, the flow is: ingress-nginx (running with hostNetwork: true, which removes one layer of Service forwarding) can be reached directly through its host and port -> the client composes the request URL from the host published by the Ingress + the path + the business-defined API path -> ingress-nginx forwards the request to the matching backend Service -> Deployment -> Pods (PVC -> PV).
In short: the client calls a fixed API format, and ingress-nginx proxies the request to the appropriate backend Service according to the matching rules, which then provides the algorithm/compute capability.
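For illustration only (the host, path and port below are placeholders in the style of the examples later in this document, not a fixed contract), the resulting call pattern looks like this:
# ingress-nginx runs with hostNetwork: true, so the node it is scheduled on answers on port 80 directly
# <ingress-node-ip> is that node's IP; "funuo.ai.com" and "/A/infer" stand in for the published host, Ingress path, and business API path
curl -H "Host: funuo.ai.com" http://<ingress-node-ip>/A/infer
# the /A/ rule matches and the request is proxied to the corresponding backend Service (e.g. ai-svc), which load-balances across the Deployment's pods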
2. Completely Uninstalling k8s
# First delete all nodes from the cluster (this also removes the pods running on them)
kubectl delete node --all
# Stop all k8s services with a loop
for service in kube-apiserver kube-controller-manager kubelet etcd kube-proxy kube-scheduler;
do
systemctl stop $service
done
# Uninstall k8s with kubeadm
kubeadm reset -f
# Remove the k8s-related packages
yum -y remove kube*
# Unload the ipip kernel module and verify
modprobe -r ipip
lsmod
# Then manually delete the config files, the flannel network configuration, and the flannel interfaces:
rm -rf /etc/cni
rm -rf /root/.kube
# Delete the cni/flannel network interfaces
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
# Delete the leftover configuration files and data directories
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/multi-user.target.wants/kubelet.service
rm -rf /var/lib/kubelet
rm -rf /usr/libexec/kubernetes/kubelet-plugins
rm -rf /usr/bin/kube*
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
# Rebuild the yum cache
yum clean all
yum makecache
3. Preparation Before Installing k8s
3.1 Server allocation
k8master: 172.16.4.58 #the master runs in a VM and hosts no business workloads
k8node1: 172.16.3.199 #physical machine (GPU: RTX 2060), runs the AI business workloads
OS: CentOS 7.8
k8s version: v1.23.0
Docker version: 19.03.8
ingress-nginx controller version: v1.1.2 (see section 9)
3.2 Configure hostname resolution (all nodes)
[root@k8master ~]# cat /etc/hosts
#append the following
172.16.4.58 k8master
172.16.3.199 k8node1
3.3 Set the hostname (all nodes)
#run on the k8master node
hostnamectl set-hostname k8master
#run on the k8node1 node
hostnamectl set-hostname k8node1
3.4 Install a time server (all nodes)
yum -y install bash-completion chrony iotop sysstat
## Configure and start the chrony time service
cat > /etc/chrony.conf <<EOF
server ntp.aliyun.com iburst
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
keyfile /etc/chrony.keys
commandkey 1
generatecommandkey
logchange 0.5
logdir /var/log/chrony
EOF
systemctl enable chronyd
systemctl start chronyd
3.5 Disable SELinux and firewalld (all nodes)
#stop firewalld
systemctl stop firewalld
systemctl disable firewalld
#disable selinux
sed -i 's/enforcing/disabled/' /etc/selinux/config # takes effect after a reboot
3.6 Disable the swap partition
#disable temporarily
swapoff -a
#disable permanently
sed -i 's/.*swap.*/#&/' /etc/fstab
3.7 Enable bridge filtering and IP forwarding (all nodes)
cat > /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# then run this to apply the settings
sysctl --system
3.8 Install Docker (all nodes)
#install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
#add the Docker CE repository
yum-config-manager --add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
#install Docker CE (the version our business uses; choose your own version as needed)
yum install -y containerd.io-1.2.13 docker-ce-19.03.8 docker-ce-cli-19.03.8
#create the docker config directory
mkdir /etc/docker
#write the docker daemon config file
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"graph": "/data/docker_storage",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
],
"insecure-registries" : ["172.168.4.99:7090","152.199.254.168:7090"],
"registry-mirrors": ["https://g427vmjy.mirror.aliyuncs.com"],
"live-restore": true
}
EOF
#expose the Docker API listening port
sed -i 's/^ExecStart.*/#&/' /lib/systemd/system/docker.service
sed -i '15i ExecStart=/usr/bin/dockerd -H tcp://localhost:2375 -H unix:///var/run/docker.sock -H fd:// --containerd=/run/containerd/containerd.sock' /lib/systemd/system/docker.service
#start docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker
{
// Note: daemon.json must contain this line to set the cgroup driver to systemd; the rest can be adjusted to your environment
"exec-opts": ["native.cgroupdriver=systemd"],
}
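A quick, generic way to confirm that docker picked up the setting after the restart (standard docker CLI, nothing specific to this setup):
# Should print "Cgroup Driver: systemd"
docker info 2>/dev/null | grep -i "cgroup driver"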
3.9 Switch the Kubernetes yum repo to a domestic mirror (all nodes)
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
3.10 Install the specified versions of kubeadm, kubelet and kubectl (I chose 1.23.0 here)
yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0
# enable kubelet at boot
systemctl enable kubelet
3.11 Change the kubelet data directory (optional; skip if you do not need it) (all nodes)
#create the directory
mkdir /data/kubelet
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
#add --root-dir=/data/kubelet/ to point kubelet at the new directory:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --root-dir=/data/kubelet/ --kubeconfig=/etc/kubernetes/kubelet.conf"
#apply the change
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet
4. Deploying the Kubernetes Cluster
4.1 Override the k8s image registry (run the initialization commands only on the master node)
(1) First override kubeadm's image registry: the default registry is not reachable from inside China, so it has to be replaced with a domestic mirror. List the images the cluster needs during setup:
[root@k8master ~]# kubeadm config images list
I0418 18:26:04.047449 19242 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
k8s.gcr.io/kube-apiserver:v1.23.17
k8s.gcr.io/kube-controller-manager:v1.23.17
k8s.gcr.io/kube-scheduler:v1.23.17
k8s.gcr.io/kube-proxy:v1.23.17
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
(2) Switch to the Aliyun image registry
[root@k8master ~]# kubeadm config images list --image-repository registry.aliyuncs.com/google_containers
I0418 18:28:18.740057 20021 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.17
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.17
registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.17
registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17
registry.aliyuncs.com/google_containers/pause:3.6
registry.aliyuncs.com/google_containers/etcd:3.5.1-0
registry.aliyuncs.com/google_containers/coredns:v1.8.6
(3) Then pull the images in advance so the initialization is faster (alternatively, pull them directly with docker; once docker is configured with a domestic mirror it pulls them quickly)
[root@k8master ~]# kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
I0418 18:28:31.795554 20088 version.go:255] remote version is much newer: v1.27.1; falling back to: stable-1.23
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.6
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.1-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.8.6
(4) Initialize Kubernetes (run only on the master node)
# Initialize Kubernetes, specifying the network CIDRs and the image registry (worker nodes can be added later with the join command)
kubeadm init \
  --apiserver-advertise-address=172.16.4.58 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all
# --apiserver-advertise-address  # cluster advertise address (the master machine's IP)
# --image-repository             # the default registry k8s.gcr.io is unreachable from China, so use the Aliyun mirror
# --kubernetes-version           # the K8s version, matching what was installed above
# --service-cidr                 # the cluster-internal virtual (Service) network; the value above can be used as-is
# --pod-network-cidr             # the Pod network; must match the CNI plugin YAML deployed below; the value above can be used as-is
#after it finishes, run the commands it prints (and be sure to copy the join command needed to add nodes later)
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn \
--discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
(5) Run the post-initialization commands
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
vim /root/.bash_profile
#append the following at the end
# kubeconfig for the superuser
export KUBECONFIG=/etc/kubernetes/admin.conf
# set an alias
alias k=kubectl
# enable kubectl command completion
source <(kubectl completion bash)
#apply the changes
source /root/.bash_profile
(6) Copy and save the following (the join command printed after a successful k8s init; worker nodes can only be joined after calico or flannel is configured). Worker nodes will run this command later to join the master:
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn \
--discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
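One practical note: the bootstrap token in the join command expires after 24 hours by default. If it has expired by the time a node is added, a fresh join command can be generated on the master with a standard kubeadm command:
# Run on k8master: prints a new "kubeadm join ..." line with a fresh token
kubeadm token create --print-join-command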
4.2 Deploy the k8s network with the calico plugin (run on k8master only)
(1) Download calico.yaml
wget https://docs.projectcalico.org/v3.20/manifests/calico.yaml --no-check-certificate
(2) Edit calico.yaml
vim /data/calico.yaml
#explanation of the additions/changes to calico.yaml
- name: IP_AUTODETECTION_METHOD
value: interface=ens192
Purpose: tells the Calico plugin to use the ens192 network interface for IP auto-detection, i.e. Calico will look for a usable IP address on that interface and use it for container network communication.
- name: CALICO_IPV4POOL_CIDR
value: "10.244.0.0/16"
Purpose: Calico will allocate container IPs from 10.244.0.0/16 (the /16 means a 16-bit network prefix, i.e. 10.244.0.0 to 10.244.255.255). This value must match the option used at cluster initialization:
kubeadm init \
--apiserver-advertise-address=172.16.4.58 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all
i.e. it must stay consistent with --pod-network-cidr=10.244.0.0/16 above.
apiVersion: policy/v1beta1
change to:
apiVersion: policy/v1
Reason: policy/v1 is the stable version of this API group; it is more mature and stable and is the recommended version on newer Kubernetes releases.
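Before applying, it can help to double-check that both edits are in place; a small sketch, assuming calico.yaml sits under /data as in the vim command above:
# Show the auto-detection method and the IPv4 pool calico will use
grep -n -A1 "IP_AUTODETECTION_METHOD\|CALICO_IPV4POOL_CIDR" /data/calico.yaml
# Client-side validation of the edited manifest (nothing is applied yet)
kubectl apply -f /data/calico.yaml --dry-run=client > /dev/null && echo "calico.yaml parses OK"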
(3) Apply the calico network
kubectl apply -f calico.yaml
(4) Check that calico is running
[root@k8master data]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5b9cd88b65-vg4gq 1/1 Running 3 (17h ago) 21h
calico-node-86l8c 1/1 Running 2 (17h ago) 21h
calico-node-lg2mg 1/1 Running 0 21h
coredns-6d8c4cb4d-wm8d2 1/1 Running 0 23h
coredns-6d8c4cb4d-xxdmm 1/1 Running 0 23h
(5) Download the calicoctl tool
#GitHub address
https://github.com/projectcalico/calicoctl/releases/tag/v3.20.6
mv calicoctl-linux-amd64 calicoctl
chmod +x calicoctl
mv calicoctl /usr/bin/
#Run calicoctl node status; a STATE of "up" means the node mesh is established
[root@k8master data]# calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 172.16.3.199 | node-to-node mesh | up    | 12:15:08 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.
At this point the k8s master node is fully deployed!
5. Join the k8node1 Worker Node to the Cluster (run the following on k8node1, the worker node)
5.1 Install the NVIDIA driver (pick the driver that matches your hardware and business needs)
# Disable the system's Nouveau driver
sed -i "s/blacklist nvidiafb/#&/" /usr/lib/modprobe.d/dist-blacklist.conf
cat >> /usr/lib/modprobe.d/dist-blacklist.conf <<EOF
blacklist nouveau
options nouveau modeset=0
EOF
# Back up the system initramfs image and rebuild it
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)
reboot
# After the reboot, verify that Nouveau is disabled (the command should print nothing)
lsmod | grep nouveau
# Install the driver (complete --kernel-source-path by hand; do not just copy this line)
sh /data/nvidia-drive/NVIDIA-Linux-x86_64-440.82.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.102.1.el7.x86_64/ -k $(uname -r)
#Note: check the kernel source directory manually for --kernel-source-path
# The rest is an interactive installer (omitted)
5.2 Install nvidia-docker2 so k8s can use the GPU
# Install nvidia-container-toolkit && nvidia-container-runtime
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
yum install -y nvidia-container-toolkit nvidia-container-runtime
# Install nvidia-docker2 so that k8s can use the GPU driver
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
# Installing nvidia-docker2 reloads the Docker daemon configuration
# The installation overwrites /etc/docker/daemon.json, so back it up beforehand and merge its content with the newly generated file.
yum install -y nvidia-docker2
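Since the note above warns that the installation overwrites /etc/docker/daemon.json, a simple precaution (path taken from this document; ideally run the copy before the yum install above) is to keep a backup and diff afterwards:
# Keep a copy so the nvidia runtime entries can be merged back by hand
cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
# After installing nvidia-docker2, compare and merge the two files
diff /etc/docker/daemon.json.bak /etc/docker/daemon.json || true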
5.3 Merge the daemon.json files
[root@k8node1 ~]# cat /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"graph": "/data/docker_storage",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
],
"insecure-registries" : ["172.168.4.90:8090","152.188.254.169:8090"],
"registry-mirrors": ["https://g427vmjy.mirror.aliyuncs.com"],
"live-restore": true,
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
}
5.4 Verify that the GPU can be used; output like the following means it works
[root@k8node1 aibox-ai-server]# nvidia-smi
Thu Nov 9 14:17:29 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2060 Off | 00000000:B3:00.0 Off | N/A |
| 0% 38C P8 3W / 160W | 0MiB / 5934MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
5.5 Restart the docker and kubelet services
systemctl restart docker
systemctl restart kubelet
5.6 Join k8node1 to the k8master master node (run on the k8node1 worker)
kubeadm join 172.16.4.58:6443 --token nnzdrq.1mqngk9jnh88nkyn --discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
5.7 Verify the node joined successfully (run on the k8master master)
#give k8node1 a role label (run on k8master)
kubectl label nodes k8node1 node-role.kubernetes.io/work=work
[root@k8master data]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8master Ready control-plane,master 23h v1.23.0 172.16.4.58 <none> CentOS Linux 7 (Core) 3.10.0-1127.el7.x86_64 docker://19.3.8
k8node1 Ready work 23h v1.23.0 172.16.3.199 <none> CentOS Linux 7 (Core) 3.10.0-1160.102.1.el7.x86_64 docker://19.3.8
You can see that k8node1 has joined.
5.8 Remove a worker node (run on the master node)
# kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
# where <node name> is the node name shown by <kubectl get nodes> in this cluster
# suppose we remove the node3 worker here
[root@node1 home]# kubectl drain node3 --delete-local-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3
[root@node3 home]# # reset k8s on the worker node itself
[root@node3 home]# kubeadm reset
6. Deploy a k8s Dashboard (KubePi is used here)
KubePi is a simple and efficient graphical management tool for k8s clusters; it makes day-to-day cluster management easy and helps you query logs quickly to locate problems.
6.1 Deploy KubePi (any node will do; I deploy it on the master node):
[root@k8master ~]# docker pull kubeoperator/kubepi-server
[root@k8master ~]# # run the container
[root@k8master ~]# docker run --privileged -itd --restart=unless-stopped --name kube_dashboard -v /home/docker-mount/kubepi/:/var/lib/kubepi/ -p 8000:80 kubeoperator/kubepi-server
Address: http://172.16.4.58:8000  Default username: admin  Default password: kubepi
6.2 Fill in a cluster name, keep the default authentication mode, and enter the apiserver address and token
6.3 Get the IP address and login token
[root@k8master ~]# # create a user on the k8s master node and get its token
[root@k8master ~]# kubectl create sa k8admin --namespace kube-system
serviceaccount/k8admin created
[root@k8master ~]# kubectl create clusterrolebinding k8admin --clusterrole=cluster-admin --serviceaccount=kube-system:k8admin
clusterrolebinding.rbac.authorization.k8s.io/k8admin created
[root@k8master ~]#
[root@k8master ~]# # on the master, get the token of the newly created k8admin user
[root@k8master ~]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep k8admin | awk '{print $1}') | grep token: | awk '{print $2}'
eyJhbGciOiJSUzI1NiIsImtpZCI6IkhVeUtyc1BpU1JvRnVacXVqVk1PTFRkaUlIZm1KQTV6Wk9WSExSRllmd0kifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcGktdXNlci10b2tlbi10cjVsMiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcGktdXNlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjJiYzlhZDRjLWVjZTItNDE2Mi04MDc1LTA2NTI0NDg0MzExZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcGktdXNlciJ9.QxkR1jBboqTYiVUUVO4yGhfWmlLDA5wHLo_ZnjAuSLZQDyVevCgBluL6l7y7UryRdId6FmBZ-L0QitvOuTsurcjGL2QHxPE_yZsNW7s9K7eikxJ8q-Q_yOvnADtAueH_tcMGRGW9Zyec2TlmcGTZCNaNUme84TfMlWqX7oP3GGJGMbMGN7H4fPXh-Qqrdp-0MJ3tP-dk3koZUEu3amrq8ExSmjIAjso_otrgFWbdSOMkCXKsqb9yuZzaw7u5Cy18bH_HW6RbNCRT5jGs5aOwzuMAd0HQ5iNm-5OISI4Da6jGdjipLXejcC1H-xWgLlJBx0RQWu41yoPNF57cG1NubQ
[root@k8master ~]#
[root@k8master ~]# # on the master, get the apiserver address
[root@k8master ~]# cat ~/.kube/config | grep server: | awk '{print $2}'
https://172.16.4.58:6443
6.4 After confirming, the cluster is visible in the dashboard
7. Install the metrics Plugin for k8s Cluster Monitoring
https://zhuanlan.zhihu.com/p/572406293
Reference for the overall k8s deployment approach:
https://zhuanlan.zhihu.com/p/627310856?utm_id=0
At this point, all of the k8s-related setup is complete!
8. Business Service Deployment
8.1 Deployment content and flow
Create the PV -> create the PVC -> create the aiserver service -> create the Service -> create the Ingress proxy
8.2 The YAML files used in this flow
(1)pv.yaml
[root@k8master new]# cat pv.yaml apiVersion: v1 kind: PersistentVolume metadata: name: pv-aimodel labels: pv: aimodel spec: capacity: storage: 20Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain hostPath: path: /data/aibox-common/aimodel --- apiVersion: v1 kind: PersistentVolume metadata: name: pv-common labels: pv: common spec: capacity: storage: 100Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain hostPath: path: /data/aibox-common/common --- apiVersion: v1 kind: PersistentVolume metadata: name: pv-ai-logs labels: pv: ai-logs spec: capacity: storage: 50Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain hostPath: path: /data/aibox-common/ai-server/logs --- apiVersion: v1 kind: PersistentVolume metadata: name: pv-ai-dmi labels: pv: ai-dmi spec: capacity: storage: 5Gi accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain hostPath: path: /sys/firmware/dmi/ ---
(2)pvc.yaml
[root@k8master new]# cat pvc.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc-aimodel spec: accessModes: - ReadWriteMany resources: requests: storage: 20Gi --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc-common spec: accessModes: - ReadWriteMany resources: requests: storage: 100Gi --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc-ai-logs spec: accessModes: - ReadWriteMany resources: requests: storage: 50Gi --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc-ai-dmi spec: accessModes: - ReadWriteMany resources: requests: storage: 5Gi ---
(3) ai.yaml + service.yaml (the AI service is pinned to the k8node1 node, the business node that has the GPU)
[root@k8master new]# cat ai.yaml --- apiVersion: v1 kind: Service metadata: name: ai-svc labels: app: ai spec: type: NodePort ports: - port: 28865 targetPort: 28865 nodePort: 31000 selector: app: ai --- apiVersion: apps/v1 kind: Deployment metadata: name: ai #namespace: kube-fjyd spec: replicas: 5 selector: matchLabels: app: ai template: metadata: labels: app: ai spec: nodeName: k8node1 #调用到k8node1进行部署,因为业务需要,有GPU containers: - name: ai image: 172.168.4.60:8090/rz4.5.0.0/aiserver:v4.5.3.0019_v4.5.15.16 #自己业务镜像,下载不下来 imagePullPolicy: IfNotPresent ports: - containerPort: 28865 volumeMounts: - name: logs mountPath: /home/nvidia/aibox/logs - name: aimodel mountPath: /home/nvidia/aibox/aimodel - name: common mountPath: /home/nvidia/aibox/common - name: dmi mountPath: /mnt/sys/firmware/dmi - name: localtime mountPath: /etc/localtime readOnly: true resources: limits: nvidia.com/gpu: 1 # 请求使用1个GPU volumes: - name: logs persistentVolumeClaim: claimName: pvc-ai-logs - name: aimodel persistentVolumeClaim: claimName: pvc-aimodel - name: common persistentVolumeClaim: claimName: pvc-common - name: dmi persistentVolumeClaim: claimName: pvc-ai-dmi - name: localtime hostPath: path: /etc/localtime type: "" restartPolicy: Always ---
9. Ingress Deployment
9.1 What is Ingress
Ingress is an API object that manages external access to the Services in a cluster, typically over HTTP. Ingress can also provide load balancing.
Ingress exposes HTTP and HTTPS routes from outside the cluster to Services inside the cluster; traffic routing is controlled by the rules defined on the Ingress resource.
A simple Ingress that sends all traffic to a single Service is shown later in the tomcat example (section 9.5).
9.2 Deploy the ingress-nginx controller
deploy.yaml pitfalls:
The ingress-nginx site https://kubernetes.github.io/ingress-nginx/ points to the deploy.yaml file.
Newer versions of the ingress-nginx deploy.yaml are slightly different and need the following two images:
k8s.gcr.io/ingress-nginx/controller:v1.1.2
k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.1.1
These images usually cannot be pulled (from inside China), so replace them yourself; corresponding images can be found on Docker Hub, for example these two:
[root@k8master ingress]# docker images | egrep "longjianghu|liangjw"
longjianghu/ingress-nginx-controller   v1.1.2   7e5c1cecb086   20 months ago   286MB    #k8s.gcr.io/ingress-nginx/controller:v1.1.2
liangjw/kube-webhook-certgen           v1.1.1   c41e9fcadf5a   2 years ago     47.7MB   #k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.1.1
9.3 Summary of pitfalls:
(1) Newer versions provide IngressClass, which has to be referenced when writing an Ingress.
(2) The images in deploy.yaml cannot be pulled; replace them with the two images above.
(3) Running ingress-nginx-controller with hostNetwork: true saves one layer of forwarding compared to NodePort, but it has to be pinned to labeled nodes with nodeSelector: app: ingress.
9.4 My modified deploy.yaml, for reference
[root@k8master ingress]# cat deploy.yaml #the file is very long and shown collapsed below
apiVersion: v1 kind: Namespace metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx name: ingress-nginx --- apiVersion: v1 automountServiceAccountToken: true kind: ServiceAccount metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx --- apiVersion: v1 kind: ServiceAccount metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx rules: - apiGroups: - "" resources: - namespaces verbs: - get - apiGroups: - "" resources: - configmaps - pods - secrets - endpoints verbs: - get - list - watch - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch - apiGroups: - "" resourceNames: - ingress-controller-leader resources: - configmaps verbs: - get - update - apiGroups: - "" resources: - configmaps verbs: - create - apiGroups: - "" resources: - events verbs: - create - patch --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx rules: - apiGroups: - "" resources: - secrets verbs: - get - create --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx rules: - apiGroups: - "" resources: - configmaps - endpoints - nodes - pods - secrets - namespaces verbs: - list - watch - apiGroups: - "" resources: - nodes verbs: - get - apiGroups: - "" resources: - services verbs: - get - list - watch - apiGroups: - networking.k8s.io resources: - ingresses verbs: - get - list - watch - apiGroups: - "" resources: - events verbs: - create - patch - apiGroups: - networking.k8s.io resources: - ingresses/status 
verbs: - update - apiGroups: - networking.k8s.io resources: - ingressclasses verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission rules: - apiGroups: - admissionregistration.k8s.io resources: - validatingwebhookconfigurations verbs: - get - update --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- apiVersion: v1 data: allow-snippet-annotations: "true" kind: ConfigMap metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 
helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller namespace: ingress-nginx --- apiVersion: v1 kind: Service metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller namespace: ingress-nginx spec: externalTrafficPolicy: Local ipFamilies: - IPv4 ipFamilyPolicy: SingleStack ports: - appProtocol: http name: http port: 80 protocol: TCP targetPort: http - appProtocol: https name: https port: 443 protocol: TCP targetPort: https selector: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx type: LoadBalancer --- apiVersion: v1 kind: Service metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller-admission namespace: ingress-nginx spec: ports: - appProtocol: https name: https-webhook port: 443 targetPort: webhook selector: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx type: ClusterIP --- apiVersion: apps/v1 kind: Deployment metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-controller namespace: ingress-nginx spec: minReadySeconds: 0 revisionHistoryLimit: 10 selector: matchLabels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx template: metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/name: ingress-nginx spec: hostNetwork: true #修改 ingress-nginx-controller 为 hostNetwork模式 nodeSelector: #选择 node label 中有 app=ingress的节点进行部署 app: ingress containers: - args: - /nginx-ingress-controller - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller - --election-id=ingress-controller-leader - --controller-class=k8s.io/ingress-nginx - --ingress-class=nginx - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller - --validating-webhook=:8443 - --validating-webhook-certificate=/usr/local/certificates/cert - --validating-webhook-key=/usr/local/certificates/key env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace - name: LD_PRELOAD value: /usr/local/lib/libmimalloc.so image: longjianghu/ingress-nginx-controller:v1.1.2 #修改镜像地址 imagePullPolicy: IfNotPresent lifecycle: preStop: exec: command: - /wait-shutdown livenessProbe: failureThreshold: 5 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: controller ports: - containerPort: 80 name: http protocol: TCP - containerPort: 443 name: https protocol: TCP - containerPort: 8443 name: webhook protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 successThreshold: 1 
timeoutSeconds: 1 resources: requests: cpu: 100m memory: 90Mi securityContext: allowPrivilegeEscalation: true capabilities: add: - NET_BIND_SERVICE drop: - ALL runAsUser: 101 volumeMounts: - mountPath: /usr/local/certificates/ name: webhook-cert readOnly: true dnsPolicy: ClusterFirst nodeSelector: kubernetes.io/os: linux serviceAccountName: ingress-nginx terminationGracePeriodSeconds: 300 volumes: - name: webhook-cert secret: secretName: ingress-nginx-admission --- apiVersion: batch/v1 kind: Job metadata: annotations: helm.sh/hook: pre-install,pre-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission-create namespace: ingress-nginx spec: template: metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission-create spec: containers: - args: - create - --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc - --namespace=$(POD_NAMESPACE) - --secret-name=ingress-nginx-admission env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace image: liangjw/kube-webhook-certgen:v1.1.1 #修改镜像地址 imagePullPolicy: IfNotPresent name: create securityContext: allowPrivilegeEscalation: false nodeSelector: kubernetes.io/os: linux restartPolicy: OnFailure securityContext: fsGroup: 2000 runAsNonRoot: true runAsUser: 2000 serviceAccountName: ingress-nginx-admission --- apiVersion: batch/v1 kind: Job metadata: annotations: helm.sh/hook: post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission-patch namespace: ingress-nginx spec: template: metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission-patch spec: containers: - args: - patch - --webhook-name=ingress-nginx-admission - --namespace=$(POD_NAMESPACE) - --patch-mutating=false - --secret-name=ingress-nginx-admission - --patch-failure-policy=Fail env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace image: liangjw/kube-webhook-certgen:v1.1.1 #修改镜像地址 imagePullPolicy: IfNotPresent name: patch securityContext: allowPrivilegeEscalation: false nodeSelector: kubernetes.io/os: linux restartPolicy: OnFailure securityContext: fsGroup: 2000 runAsNonRoot: true runAsUser: 2000 serviceAccountName: ingress-nginx-admission --- apiVersion: networking.k8s.io/v1 kind: IngressClass metadata: labels: app.kubernetes.io/component: controller app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm 
app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: nginx spec: controller: k8s.io/ingress-nginx --- apiVersion: admissionregistration.k8s.io/v1 kind: ValidatingWebhookConfiguration metadata: labels: app.kubernetes.io/component: admission-webhook app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/managed-by: Helm app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx app.kubernetes.io/version: 1.1.2 helm.sh/chart: ingress-nginx-4.0.18 name: ingress-nginx-admission webhooks: - admissionReviewVersions: - v1 clientConfig: service: name: ingress-nginx-controller-admission namespace: ingress-nginx path: /networking/v1/ingresses failurePolicy: Fail matchPolicy: Equivalent name: validate.nginx.ingress.kubernetes.io rules: - apiGroups: - networking.k8s.io apiVersions: - v1 operations: - CREATE - UPDATE resources: - ingresses sideEffects: None
[root@k8master ingress]# kubectl get all -n ingress-nginx NAME READY STATUS RESTARTS AGE pod/ingress-nginx-admission-create-fqsl7 0/1 Completed 0 122m pod/ingress-nginx-admission-patch-nmbrd 0/1 Completed 0 122m pod/ingress-nginx-controller-6b68d8cbbf-9xj8t 1/1 Running 0 122m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/ingress-nginx-controller LoadBalancer 10.109.255.117 <pending> 80:30297/TCP,443:31879/TCP 122m service/ingress-nginx-controller-admission ClusterIP 10.99.13.106 <none> 443/TCP 122m NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/ingress-nginx-controller 1/1 1 1 122m NAME DESIRED CURRENT READY AGE replicaset.apps/ingress-nginx-controller-6b68d8cbbf 1 1 1 122m NAME COMPLETIONS DURATION AGE job.batch/ingress-nginx-admission-create 1/1 8s 122m job.batch/ingress-nginx-admission-patch 1/1 24s 122m
9.5 Deploying Ingress-nginx
(1) Preparation
Label the k8node1 node with app=ingress: the ingress-nginx-controller above runs in hostNetwork mode (only the pod's real ports are exposed on the host) together with a nodeSelector, so it only lands on labeled nodes.
[root@k8master ingress]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8master Ready control-plane,master 47h v1.23.0 172.16.4.58 <none> CentOS Linux 7 (Core) 3.10.0-1127.el7.x86_64 docker://19.3.8
k8node1 Ready work 47h v1.23.0 172.16.3.199 <none> CentOS Linux 7 (Core) 3.10.0-1160.102.1.el7.x86_64 docker://19.3.8
[root@k8master ingress]# kubectl label node k8node1 app=ingress
node/k8node1 labeled
[root@k8master ingress]# kubectl get node --show-labels
NAME       STATUS   ROLES                  AGE   VERSION   LABELS
k8master   Ready    control-plane,master   45h   v1.23.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8master,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
k8node1    Ready    work                   44h   v1.23.0   app=ingress,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8node1,kubernetes.io/os=linux,node-role.kubernetes.io/work=work
(2) Deploy deploy.yaml with kubectl apply -f
kubectl apply -f deploy.yaml #deploy with kubectl apply; make sure the images are available first, otherwise it will fail
[root@k8master ingress]# kubectl apply -f deploy.yaml namespace/ingress-nginx created serviceaccount/ingress-nginx created serviceaccount/ingress-nginx-admission created role.rbac.authorization.k8s.io/ingress-nginx created role.rbac.authorization.k8s.io/ingress-nginx-admission created clusterrole.rbac.authorization.k8s.io/ingress-nginx created clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created rolebinding.rbac.authorization.k8s.io/ingress-nginx created rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created configmap/ingress-nginx-controller created service/ingress-nginx-controller created service/ingress-nginx-controller-admission created deployment.apps/ingress-nginx-controller created job.batch/ingress-nginx-admission-create created job.batch/ingress-nginx-admission-patch created ingressclass.networking.k8s.io/nginx created validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
(3) Check the status
kubectl get all -n ingress-nginx #check the deployment status in the ingress-nginx namespace
[root@k8master ingress]# kubectl get all -n ingress-nginx NAME READY STATUS RESTARTS AGE pod/ingress-nginx-admission-create-fqsl7 0/1 Completed 0 147m pod/ingress-nginx-admission-patch-nmbrd 0/1 Completed 0 147m pod/ingress-nginx-controller-6b68d8cbbf-9xj8t 1/1 Running 0 147m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/ingress-nginx-controller LoadBalancer 10.109.255.117 <pending> 80:30297/TCP,443:31879/TCP 147m service/ingress-nginx-controller-admission ClusterIP 10.99.13.106 <none> 443/TCP 147m NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/ingress-nginx-controller 1/1 1 1 147m NAME DESIRED CURRENT READY AGE replicaset.apps/ingress-nginx-controller-6b68d8cbbf 1 1 1 147m NAME COMPLETIONS DURATION AGE job.batch/ingress-nginx-admission-create 1/1 8s 147m job.batch/ingress-nginx-admission-patch 1/1 24s 147m
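Because the controller pod runs with hostNetwork: true on the labeled node, a quick sanity check (run on k8node1, plain iproute2 tooling, not part of the original steps) is to confirm that ports 80 and 443 are now bound directly on the host:
# On k8node1: the ingress controller's nginx should be listening on the host's 80 and 443
ss -lntp | egrep ':80 |:443 '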
(4) Check the logs of the ingress-nginx-controller
kubectl logs -f ingress-nginx-controller-6b68d8cbbf-9xj8t -n ingress-nginx
(5) Test access (k8node1 carries the app=ingress label)
You can access the IP of k8node1 directly: ingress-nginx-controller listens on port 80 by default, and because of the nodeSelector (only nodes labeled app=ingress are selected) it runs on k8node1, the labeled node.
(6) Deploy a tomcat to test Ingress-nginx
Deploy a tomcat to verify that the ingress-nginx proxying works.
1. Write deploy-tomcat.yaml
- a Deployment that runs tomcat:8.0-alpine,
- a Service that exposes the tomcat pod
- an Ingress resource that forwards every request for / on the host tomcat.demo.com to the tomcat-demo Service
- an IngressClass, a resource added in newer versions and referenced when defining an Ingress; useful when the cluster runs more than one Ingress controller
[root@k8master ingress]# cat deploy-tomcat.yaml apiVersion: apps/v1 kind: Deployment metadata: name: tomcat-demo spec: selector: matchLabels: app: tomcat-demo replicas: 1 template: metadata: labels: app: tomcat-demo spec: containers: - name: tomcat-demo image: tomcat:8.0-alpine ports: - containerPort: 8080 --- apiVersion: v1 kind: Service metadata: name: tomcat-demo spec: selector: app: tomcat-demo ports: - port: 80 protocol: TCP targetPort: 8080 --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: tomcat-demo spec: defaultBackend: service: name: default-http-backend port: number: 80 ingressClassName: nginx rules: - host: tomcat.demo.com http: paths: - pathType: Prefix path: "/" backend: service: name: tomcat-demo port: number: 80 --- apiVersion: apps/v1 kind: Deployment metadata: name: default-http-backend labels: app: default-http-backend spec: replicas: 1 selector: matchLabels: app: default-http-backend template: metadata: labels: app: default-http-backend spec: terminationGracePeriodSeconds: 60 containers: - name: default-http-backend image: registry.cn-hangzhou.aliyuncs.com/google_containers/defaultbackend:1.4 livenessProbe: httpGet: path: /healthz port: 8080 scheme: HTTP initialDelaySeconds: 30 timeoutSeconds: 5 ports: - containerPort: 8080 resources: limits: cpu: 10m memory: 20Mi requests: cpu: 10m memory: 20Mi --- apiVersion: v1 kind: Service metadata: name: default-http-backend labels: app: default-http-backend spec: ports: - port: 80 targetPort: 8080 selector: app: default-http-backend
2. Explanation of deploy-tomcat.yaml (resource by resource, as shown in full above)
Deployment tomcat-demo: creates a Deployment named tomcat-demo that deploys the Tomcat application; the label selector matches pods with app: tomcat-demo; the replica count is 1; the pod template uses the tomcat:8.0-alpine image and the container exposes port 8080.
Service tomcat-demo: creates a Service named tomcat-demo that exposes the Tomcat application; it selects pods with the label app: tomcat-demo and publishes port 80, forwarding traffic to port 8080 of the pods.
Ingress tomcat-demo: creates an Ingress named tomcat-demo that defines the routing rules; requests that match no rule go to the default backend service default-http-backend; the Ingress class is nginx; one rule states that when the request host is tomcat.demo.com, traffic for path "/" is forwarded to port 80 of the tomcat-demo Service.
Deployment default-http-backend: creates a Deployment named default-http-backend for the default backend service; the replica count is 1; the label selector matches app: default-http-backend; the pod template uses the default-backend image, exposes container port 8080, and configures a livenessProbe to make sure the service stays healthy.
Service default-http-backend: creates a Service named default-http-backend that exposes the default backend; it publishes port 80, forwards traffic to port 8080 of the pods, and selects pods labeled app: default-http-backend. Together these objects form a complete deployment of the Tomcat application plus a default backend, with the related Service and Ingress resources; the Ingress rules route traffic to the right Service based on host name and path.
3. Apply deploy-tomcat.yaml
[root@k8master ingress]# kubectl apply -f deploy-tomcat.yaml
deployment.apps/tomcat-demo unchanged
service/tomcat-demo unchanged
ingress.networking.k8s.io/tomcat-demo created
deployment.apps/default-http-backend created
service/default-http-backend created
4. Test that it can be reached
Edit the Windows hosts file:
172.16.3.199 tomcat.demo.com
172.16.3.199 api.demo.com
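If you only want to verify the routing from a Linux shell, without editing the Windows hosts file, an equivalent check (IP and host name taken from this document) is:
# Send the request to k8node1 (where the hostNetwork ingress controller runs),
# presenting the Host header that the Ingress rule matches on
curl -H "Host: tomcat.demo.com" http://172.16.3.199/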
10. Reference Deployment Document
https://blog.51cto.com/u_16213624/7693786
11. Our Own Business Requirements
11.1 Business requirements
We build algorithms; each algorithm is packaged as an image, for example image1 (intrusion detection) and image2 (safety-belt detection).
Combined with k8s, the flow is: ingress-nginx (running with hostNetwork: true, which removes one layer of Service forwarding) can be reached directly through its host and port -> the client composes the request URL from the host published by the Ingress + the path + the business-defined API path -> ingress-nginx forwards the request to the matching backend Service -> Deployment -> Pods (PVC -> PV).
The final flow: the client calls a fixed API format, and the matching route proxies the request to a different backend Service, each of which provides one algorithm service.
11.2 Deploy ai-ingress
[root@k8master ingress]# cat ai-ingress.yaml apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ai-ingress annotations: nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1 spec: defaultBackend: service: name: default-http-backend port: number: 80 ingressClassName: nginx rules: - host: funuo.ai.com http: paths: - pathType: Prefix path: "/A/(.*)" backend: service: name: ai-svc port: number: 28865 - pathType: Prefix path: "/B/(.*)" backend: service: name: ai1-svc port: number: 28865 --- apiVersion: apps/v1 kind: Deployment metadata: name: default-http-backend labels: app: default-http-backend spec: replicas: 1 selector: matchLabels: app: default-http-backend template: metadata: labels: app: default-http-backend spec: terminationGracePeriodSeconds: 60 containers: - name: default-http-backend image: registry.cn-hangzhou.aliyuncs.com/google_containers/defaultbackend:1.4 livenessProbe: httpGet: path: /healthz port: 8080 scheme: HTTP initialDelaySeconds: 30 timeoutSeconds: 5 ports: - containerPort: 8080 resources: limits: cpu: 10m memory: 20Mi requests: cpu: 10m memory: 20Mi --- apiVersion: v1 kind: Service metadata: name: default-http-backend labels: app: default-http-backend spec: ports: - port: 80 targetPort: 8080 selector: app: default-http-backend
11.3 Explanation of the Ingress rules
In this Ingress configuration, path: "/B/(.*)" together with the annotation nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1 uses a regular expression to capture part of the request path and reuse the captured value in the rewrite target. Specifically:
path: "/B/(.*)": a regex path rule in which (.*) captures an arbitrary character sequence; the path must start with /B/, and (.*) captures whatever follows.
nginx.ingress.kubernetes.io/rewrite-target: /ai/test/alg/$1: an annotation telling the Ingress controller to rewrite the path before the request is sent to the backend service; $1 is the value of the first capture group of the path match (the (.*) part of the regex).
If the request path is /B/infer/ai/test/alg/infer, then (.*) captures infer/ai/test/alg/infer, $1 is replaced with that value, and the final rewritten path is /ai/test/alg/infer/ai/test/alg/infer.
If the request path is /B/foo/bar, then (.*) captures foo/bar and the final rewritten path is /ai/test/alg/foo/bar.
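A hedged example of what this looks like from the client side (host and IP taken from this document; the paths /A/infer and /B/foo/bar are only illustrative of the pattern described above):
# Request path /A/infer matches path: "/A/(.*)"; $1 = "infer",
# so ingress-nginx proxies it to ai-svc with the path rewritten to /ai/test/alg/infer
curl -H "Host: funuo.ai.com" http://172.16.3.199/A/infer
# Request path /B/foo/bar matches path: "/B/(.*)"; $1 = "foo/bar",
# so the backend ai1-svc receives /ai/test/alg/foo/bar
curl -H "Host: funuo.ai.com" http://172.16.3.199/B/foo/bar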
11.4 Check the Ingress status
[root@k8master ingress]# kubectl get ingress
NAME          CLASS   HOSTS             ADDRESS   PORTS   AGE
ai-ingress    nginx   funuo.ai.com                80      3h34m
tomcat-demo   nginx   tomcat.demo.com             80      3h51m
11.5 Results (the corresponding services must already be running; that is a business concern. Other deployment approaches can be found in the reference documents.)
12. k8s Autoscaling (CPU, MEM)
12.1 Types of autoscaling
Horizontal Pod Autoscaler (HPA): dynamically adjusts the number of Pod replicas based on the application's load. It watches Pod CPU or memory usage and adds or removes Pods as needed so the application can keep up with demand.
Cluster Autoscaler (CA): dynamically adjusts the number of nodes at the cluster level. It watches node resource utilization and adds or removes nodes as needed so the cluster can satisfy the Pods' resource requests. CA handles autoscaling of the cluster as a whole, not just the application's Pods.
Vertical Pod Autoscaler (VPA): adjusts a container's resource allocation based on its resource needs. It watches container CPU and memory usage and adjusts the container's requests and limits so it gets enough resources.
Horizontal Vertical Pod Autoscaler (HVPA): combines HPA and VPA, dynamically adjusting both the replica count and the resource allocation based on application load and container resource needs. (HPA is what the rest of this section uses; a quick imperative way to create one is shown right after this list.)
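As an aside (not part of the original steps, and covering CPU only), an HPA like the one built in 12.5 can also be created imperatively with the standard kubectl autoscale command; the YAML approach used later is kept because it can also target memory:
# Imperative, CPU-only equivalent for a Deployment named php-apache
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10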
12.2 How HPA works
1) For per-pod resource metrics (such as CPU), the controller fetches the metric for every pod targeted by the HorizontalPodAutoscaler from the resource metrics API. If a target utilization is set, the controller reads the resource usage of the containers in each pod and computes the utilization; if a raw target value is used, the raw value is used directly (no percentage is computed). The controller then derives a scaling ratio from the average utilization or the raw values and computes the desired replica count from it (see the formula right after this list). Note that if some containers of a pod do not expose the resource metric, that pod's CPU utilization is not used.
2) If the pod uses custom metrics, the controller works the same way as with resource metrics, except that custom metrics only use raw values, never utilization.
3) If the pod uses object metrics or external metrics (each metric describes a single object), the metric is compared directly against the target value to produce the scaling ratio mentioned above. In the autoscaling/v2beta2 API the value can also be divided by the number of pods before the comparison. The controller normally obtains the metrics from the aggregated APIs (metrics.k8s.io, custom.metrics.k8s.io and external.metrics.k8s.io); the metrics.k8s.io API is usually served by metrics-server, which has to be deployed separately.
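The core calculation documented for the Kubernetes HPA controller is:

desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / desiredMetricValue ) ]

For example, in the watch output in 12.5.2 below, a CPU utilization of 179% against a 50% target on 1 replica gives ceil(1 * 179 / 50) = 4 replicas, which matches the step from 1 to 4 shown there.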
12.3 HPA API versions
[root@k8master data]# kubectl api-versions | grep autoscal
autoscaling/v1
autoscaling/v2
autoscaling/v2beta1
autoscaling/v2beta2
autoscaling/v1:
The earliest autoscaling API version, which introduced the Horizontal Pod Autoscaler (HPA).
HPA automatically adjusts the number of Pods based on metrics (such as CPU utilization) to follow changes in application load.
This version provides basic horizontal scaling but lacks some of the advanced features introduced later.
autoscaling/v2:
The stable v2 API. Together with the Custom Metrics API it allows autoscaling on user-defined metrics.
The Custom Metrics API extends HPA beyond the default CPU and memory metrics to more metric types.
autoscaling/v2beta1:
Introduced in Kubernetes 1.8; added multi-metric support to HPA.
Multiple metrics can be used at the same time, for example CPU and memory together.
This version adds flexibility through multi-metric autoscaling.
autoscaling/v2beta2 (the version used in the test below; this cluster runs k8s 1.23):
A later beta with more features and improvements.
It improves the API's usability and efficiency, adds stability, strengthens the integration with the Custom Metrics API, and supports more complex rules and metrics.
12.4 metrics server
metrics-server is a cluster-wide collector of resource usage data. It only exposes the data; it does not store it. It focuses on implementing the resource metrics API (CPU, file descriptors, memory, request latency, and so on), and the metrics it collects are consumed inside the cluster by kubectl, HPA, the scheduler, and other components.
(1) metrics-server deployment document
https://zhuanlan.zhihu.com/p/572406293
(2) The modified configuration file
[root@k8master data]# cat metrics-server-components.yaml apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-reader rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: system:metrics-server rules: - apiGroups: - "" resources: - nodes/metrics verbs: - get - apiGroups: - "" resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegator roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegator subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: system:metrics-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-server subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: metrics-server spec: containers: - args: - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --metric-resolution=15s #lipc 添加 spec: containers: - args: - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --metric-resolution=15s - --kubelet-insecure-tls image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 4443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS initialDelaySeconds: 20 periodSeconds: 10 resources: requests: cpu: 100m memory: 200Mi securityContext: allowPrivilegeEscalation: false readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - emptyDir: {} name: tmp-dir --- apiVersion: apiregistration.k8s.io/v1 kind: 
APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100
(3) Apply the metrics-server manifest
kubectl apply -f metrics-server-components.yaml
(4) Check the metrics-server status
[root@k8master data]# kubectl get pods -n kube-system NAME READY STATUS RESTARTS AGE calico-kube-controllers-5b9cd88b65-vg4gq 1/1 Running 3 (5d15h ago) 5d20h calico-node-86l8c 1/1 Running 2 (5d15h ago) 5d20h calico-node-lg2mg 1/1 Running 0 5d20h coredns-6d8c4cb4d-wm8d2 1/1 Running 0 5d21h coredns-6d8c4cb4d-xxdmm 1/1 Running 0 5d21h etcd-k8master 1/1 Running 0 5d21h kube-apiserver-k8master 1/1 Running 0 5d21h kube-controller-manager-k8master 1/1 Running 0 5d21h kube-proxy-bbvzc 1/1 Running 2 (5d15h ago) 5d21h kube-proxy-smhnc 1/1 Running 0 5d21h kube-scheduler-k8master 1/1 Running 0 5d21h metrics-server-fd9598766-495c9 1/1 Running 3 (5d15h ago) 5d20h
(5) Test the kubectl top command
[root@k8master data]# kubectl top node NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% k8master 278m 3% 3218Mi 20% k8node1 300m 1% 14085Mi 44% [root@k8master data]# kubectl top pods NAME CPU(cores) MEMORY(bytes) ai1-6d98756bdd-4b86q 3m 1257Mi ai1-6d98756bdd-5dwlz 3m 1251Mi ai1-6d98756bdd-fhgvv 3m 1265Mi ai1-6d98756bdd-jgrxb 3m 1243Mi ai1-6d98756bdd-mc5zp 3m 1248Mi ai1-6d98756bdd-t2sv4 3m 1264Mi ai1-6d98756bdd-w5vsq 3m 1275Mi ai1-6d98756bdd-z6ptz 3m 1262Mi default-http-backend-ff744689f-wxqnf 1m 3Mi php-apache-866cb4fc88-zvq6g 1m 8Mi tomcat-demo-55b6bbcb97-kmt25 1m 336Mi v1 0m 0Mi
(6) Notes on the top output
In Kubernetes, CPU is measured in millicores (m): 1 millicore is 0.001 of a CPU core. So a CPU request of "100m" in a Pod spec means the container requests 0.1 CPU core, because 100m is 100 millicores, which equals 0.1 core. For example: "100m" requests 0.1 core, "200m" requests 0.2 core, and "1000m" requests 1.0 core.
12.5 Testing HPA with autoscaling/v2beta2: autoscaling on CPU and memory
Create a php-apache service with a Deployment, then let HPA scale it automatically. Steps:
12.5.1 Create and run a php-apache service (pods created via a Deployment); operate on the k8s master node
1) Build a new image with a Dockerfile, on the k8s master node
[root@k8master dockerfile]# cat Dockerfile
FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php
2) Create the front-end PHP file
[root@k8master dockerfile]# cat index.php
<?php
$x = 0.0001;
for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
}
echo "OK!";
?>
3) Build and push the image
docker build . -t 172.168.4.177:8090/test/hpa-example:v1
docker push 172.168.4.177:8090/test/hpa-example:v1
4) Deploy a php-apache service via a Deployment
[root@k8master hpa]# cat php-apache.yaml apiVersion: apps/v1 kind: Deployment metadata: name: php-apache spec: selector: matchLabels: run: php-apache replicas: 1 template: metadata: labels: run: php-apache spec: containers: - name: php-apache image: 172.168.4.177:8090/test/hpa-example:v1 imagePullPolicy: IfNotPresent ports: - containerPort: 80 resources: limits: cpu: "200m" memory: "256Mi" requests: cpu: "100m" memory: "128Mi" --- apiVersion: v1 kind: Service metadata: name: php-apache labels: run: php-apache spec: ports: - port: 80 selector: run: php-apache
5) Create the Deployment and Service
kubectl apply -f php-apache.yaml
6) Verify they were created
[root@k8master hpa]# kubectl get pods -l run=php-apache
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-866cb4fc88-zvq6g   1/1     Running   0          147m
12.5.2 Create the HPA
1) Create the HPA manifest
[root@k8master hpa]# cat php-apache-hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment        #监控的是deployment
    name: php-apache        #deployment的名称
  minReplicas: 1            #最小pod个数
  maxReplicas: 10           #最大pod个数
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization          #类型:百分比
        averageUtilization: 50     #50%
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue         #类型:具体数值
        averageValue: 200Mi        #设置目标平均内存值
2)启用自动伸缩
kubectl apply -f php-apache-hpa.yaml
3)查看是否创建成功
[root@k8master hpa]# kubectl get hpa
NAME             REFERENCE               TARGETS                 MINPODS   MAXPODS   REPLICAS   AGE
php-apache-hpa   Deployment/php-apache   9195520/200Mi, 1%/50%   1         10        1          143m
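#若扩缩容不符合预期,可以用describe查看HPA的事件和当前采集到的指标(常规排查命令,按需使用):
kubectl describe hpa php-apache-hpa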
4)压测php-apache服务,只是针对CPU做压测
[root@k8master hpa]# kubectl run v1 -it --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!......(持续输出,Ctrl+C 结束)^C
5)监控hpa伸缩变化
[root@k8master hpa]# kubectl get hpa -w
NAME             REFERENCE               TARGETS                       MINPODS   MAXPODS   REPLICAS   AGE
php-apache-hpa   Deployment/php-apache   6074368/200Mi, 1%/50%         1         10        1          49m
php-apache-hpa   Deployment/php-apache   8585216/200Mi, 179%/50%       1         10        1          49m
php-apache-hpa   Deployment/php-apache   8585216/200Mi, 179%/50%       1         10        4          49m
php-apache-hpa   Deployment/php-apache   7797418666m/200Mi, 141%/50%   1         10        4          49m
php-apache-hpa   Deployment/php-apache   7801856/200Mi, 141%/50%       1         10        4          50m
php-apache-hpa   Deployment/php-apache   8285184/200Mi, 48%/50%        1         10        4          50m
php-apache-hpa   Deployment/php-apache   8591360/200Mi, 49%/50%        1         10        4          50m
php-apache-hpa   Deployment/php-apache   8604672/200Mi, 64%/50%        1         10        4          50m
php-apache-hpa   Deployment/php-apache   8620032/200Mi, 48%/50%        1         10        6          51m
php-apache-hpa   Deployment/php-apache   8141209600m/200Mi, 34%/50%    1         10        6          51m
php-apache-hpa   Deployment/php-apache   8118954666m/200Mi, 44%/50%    1         10        6          51m
php-apache-hpa   Deployment/php-apache   8123733333m/200Mi, 37%/50%    1         10        6          51m
php-apache-hpa   Deployment/php-apache   8444586666m/200Mi, 38%/50%    1         10        6          52m
php-apache-hpa   Deployment/php-apache   8519680/200Mi, 33%/50%        1         10        6          52m
php-apache-hpa   Deployment/php-apache   8624128/200Mi, 35%/50%        1         10        6          52m
php-apache-hpa   Deployment/php-apache   8633002666m/200Mi, 33%/50%    1         10        6          52m
php-apache-hpa   Deployment/php-apache   8658261333m/200Mi, 35%/50%    1         10        6          53m
php-apache-hpa   Deployment/php-apache   8642560/200Mi, 27%/50%        1         10        6          53m
php-apache-hpa   Deployment/php-apache   8647338666m/200Mi, 38%/50%    1         10        6          53m
php-apache-hpa   Deployment/php-apache   8648704/200Mi, 34%/50%        1         10        6          53m
php-apache-hpa   Deployment/php-apache   8649386666m/200Mi, 42%/50%    1         10        6          54m
php-apache-hpa   Deployment/php-apache   8652800/200Mi, 35%/50%        1         10        6          54m
php-apache-hpa   Deployment/php-apache   8655530666m/200Mi, 43%/50%    1         10        6          54m
php-apache-hpa   Deployment/php-apache   8656213333m/200Mi, 25%/50%    1         10        6          54m
php-apache-hpa   Deployment/php-apache   8677376/200Mi, 37%/50%        1         10        6          55m
php-apache-hpa   Deployment/php-apache   8742912/200Mi, 38%/50%        1         10        6          55m
php-apache-hpa   Deployment/php-apache   8746325333m/200Mi, 34%/50%    1         10        6          55m
php-apache-hpa   Deployment/php-apache   8746325333m/200Mi, 24%/50%    1         10        6          55m
php-apache-hpa   Deployment/php-apache   8746325333m/200Mi, 5%/50%     1         10        6          56m
php-apache-hpa   Deployment/php-apache   8746325333m/200Mi, 1%/50%     1         10        6          56m
php-apache-hpa   Deployment/php-apache   8746325333m/200Mi, 1%/50%     1         10        6          59m
php-apache-hpa   Deployment/php-apache   8762163200m/200Mi, 1%/50%     1         10        5          59m
php-apache-hpa   Deployment/php-apache   8762163200m/200Mi, 1%/50%     1         10        5          60m
php-apache-hpa   Deployment/php-apache   8835072/200Mi, 1%/50%         1         10        3          60m
php-apache-hpa   Deployment/php-apache   9195520/200Mi, 1%/50%         1         10        1          61m
6)php-apache的pods个数变化观察
[root@k8master ~]# kubectl get pods -l run=php-apache
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-866cb4fc88-7dmrp   1/1     Running   0          4m26s
php-apache-866cb4fc88-7sx4z   1/1     Running   0          4m26s
php-apache-866cb4fc88-b4hjw   1/1     Running   0          5m56s
php-apache-866cb4fc88-km7nx   1/1     Running   0          5m56s
php-apache-866cb4fc88-mwx5j   1/1     Running   0          5m56s
php-apache-866cb4fc88-zvq6g   1/1     Running   0          62m
[root@k8master ~]# kubectl get pods -l run=php-apache
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-866cb4fc88-7sx4z   1/1     Running   0          9m11s
php-apache-866cb4fc88-b4hjw   1/1     Running   0          10m
php-apache-866cb4fc88-km7nx   1/1     Running   0          10m
php-apache-866cb4fc88-mwx5j   1/1     Running   0          10m
php-apache-866cb4fc88-zvq6g   1/1     Running   0          67m
[root@k8master ~]# kubectl get pods -l run=php-apache
NAME                          READY   STATUS    RESTARTS   AGE
php-apache-866cb4fc88-zvq6g   1/1     Running   0          69m
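停止压测后pod不会立刻缩回,这是因为HPA默认有约300秒的缩容稳定窗口。autoscaling/v2beta2支持通过behavior字段调整缩容行为,下面是一个示意片段(数值仅为示例,可按业务需要调整):
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   #缩容稳定窗口,默认300秒
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60              #每60秒最多缩容2个pod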
12.5.3 基于autoscaling/v2beta2对内存进行压测
1)创建nginx的deployment的pod、service
[root@k8master hpa]# cat deploy-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.9.1
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources:
          requests:
            cpu: 0.01
            memory: 25Mi
          limits:
            cpu: 0.05
            memory: 60Mi
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  selector:
    app: nginx
  type: NodePort
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080
2)应用deploy-nginx的deployment
[root@k8master hpa]# kubectl apply -f deploy-nginx.yaml
3)查看运行状态
[root@k8master hpa]# kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
ai1-6d98756bdd-4b86q                   1/1     Running   0          14h
ai1-6d98756bdd-5dwlz                   1/1     Running   0          14h
ai1-6d98756bdd-fhgvv                   1/1     Running   0          14h
ai1-6d98756bdd-jgrxb                   1/1     Running   0          14h
ai1-6d98756bdd-mc5zp                   1/1     Running   0          14h
ai1-6d98756bdd-t2sv4                   1/1     Running   0          14h
ai1-6d98756bdd-w5vsq                   1/1     Running   0          14h
ai1-6d98756bdd-z6ptz                   1/1     Running   0          14h
default-http-backend-ff744689f-wxqnf   1/1     Running   0          4d
nginx-hpa-7f5d6fbfd9-fp5qm             1/1     Running   0          47s
php-apache-866cb4fc88-zvq6g            1/1     Running   0          168m
tomcat-demo-55b6bbcb97-kmt25           1/1     Running   0          4d1h
v1                                     1/1     Running   0          114m
4)创建一个hpa(可以基于百分比,也可以基于具体数值定义伸缩的阈值)
[root@k8master hpa]# cat nginx-hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 10
  metrics:
  #- type: Resource
  #  resource:
  #    name: cpu
  #    target:
  #      type: Utilization
  #      averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 50Mi     #设置目标平均内存值
5)应用hpa
[root@k8master hpa]# kubectl apply -f nginx-hpa.yaml
Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+
horizontalpodautoscaler.autoscaling/nginx-hpa created
6)查看hpa状态
[root@k8master hpa]# kubectl get hpa
NAME             REFERENCE               TARGETS                 MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa        Deployment/nginx-hpa    1380352/50Mi            1         10        1          17s
php-apache-hpa   Deployment/php-apache   9195520/200Mi, 1%/50%   1         10        1          168m
7)压测nginx内存,会自动伸缩pod
[root@k8master hpa]# kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
nginx-hpa-7f5d6fbfd9-fp5qm   1/1     Running   0          10m
[root@k8master hpa]# kubectl exec -it nginx-hpa-7f5d6fbfd9-fp5qm bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@nginx-hpa-7f5d6fbfd9-fp5qm:/# dd if=/dev/zero of=/tmp/a
8)hpa展示
[root@k8master hpa]# kubectl get hpa -w
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa   Deployment/nginx-hpa   31979520/20Mi   1         10        2          11m
nginx-hpa   Deployment/nginx-hpa   31979520/20Mi   1         10        4          11m
nginx-hpa   Deployment/nginx-hpa   10496k/20Mi     1         10        4          11m
nginx-hpa   Deployment/nginx-hpa   16656384/20Mi   1         10        4          12m
nginx-hpa   Deployment/nginx-hpa   16721920/20Mi   1         10        4          12m
nginx-hpa   Deployment/nginx-hpa   16721920/20Mi   1         10        1          15m
9)nginx pod伸缩展示
[root@k8master hpa]# kubectl get pods -l app=nginx -w
NAME                         READY   STATUS    RESTARTS   AGE
nginx-hpa-7f5d6fbfd9-cf4hg   1/1     Running   0          4m20s
nginx-hpa-7f5d6fbfd9-fp5qm   1/1     Running   0          19m
nginx-hpa-7f5d6fbfd9-p59wk   1/1     Running   0          19s
nginx-hpa-7f5d6fbfd9-pkdn5   1/1     Running   0          19s
[root@k8master hpa]# kubectl get pods -l app=nginx -w
NAME                         READY   STATUS    RESTARTS   AGE
nginx-hpa-7f5d6fbfd9-cf4hg   1/1     Running   0          4m20s
12.5.4 k8s HPA自动伸缩文档
https://www.52dianzi.com/category/article/2a49706c73ced38df05ad1aba2966cf0.html
至此,我们基于autoscaling/v2beta2对CPU和内存自定义Pod伸缩的测试就完成了。
13 k8s基于GPU值的自动伸缩
13.1 参考文档
https://www.cnblogs.com/lixinliang/p/16938630.html
13.2
14 argocd部署与应用
14.1 argocd介绍
14.1.1 argocd简介
- Argo CD 是用于 Kubernetes 的声明式 GitOps 持续交付工具
- Argo CD 的主要职责是 CD(Continuous Delivery,持续交付),将应用部署到 Kubernetes 等环境中,而 CI(Continuous Integration,持续集成)主要是交给 Jenkins,Gitlab CI 等工具来完成
14.1.2 架构图
- Argo CD 从 Git Repo 拉取应用的配置,部署在 Kubernetes 集群中。
- 当有人新增功能时,提交一个 Pull Requests 到 Git Repo 修改应用的部署配置,等待合并。
- 在 Pull Requests 合并之后,通过 Webhook 触发 Argo CD 执行更新操作。
- 应用得到更新,发送通知
14.2 argocd部署
14.2.1 版本信息
k8s集群版本:v1.23.0
argocd版本:v2.10.0-rc1
14.2.2 argocd官网
https://argo-cd.readthedocs.io/en/stable/operator-manual/installation/
#选择我们需要的yaml部署文件,我这里是测试,没有使用ha相关的,可以根据需求自己获取对应的文件
#点击上边会跳转到github,下载对应版本的install.yaml文件,我选择的是目前(2023-12-20)最新的版本,如果wget下载不了,可以先在浏览器中下载,再传到服务器
14.2.3 创建argocd的命名空间
kubectl create namespace argocd
kubectl get ns
14.2.4 将install.yaml上传到服务器(此处为2.10.0-rc1;若较新版本启动报错,可降级到2.3.5,或按下文14.2.5中的方法解决)
rz install.yaml(2.10.0-rc1)
14.2.5 应用argocd yaml
kubectl apply -n argocd -f install.yaml
- 部署遇到问题:在部署最新版本以及较新版本的时候,argocd-repo-server报错
[root@k8master argocd]# kubectl logs -f argocd-repo-server-54c8d6cf58-c9wxz -n argocd
time="2023-12-19T09:39:15Z" level=info msg="ArgoCD Repository Server is starting" built="2023-12-01T23:05:50Z" commit=6eba5be864b7e031871ed7698f5233336dfe75c7 port=8081 version=v2.9.3+6eba5be
time="2023-12-19T09:39:15Z" level=info msg="Generating self-signed TLS certificate for this session"
time="2023-12-19T09:39:15Z" level=info msg="Initializing GnuPG keyring at /app/config/gpg/keys"
time="2023-12-19T09:39:15Z" level=info msg="gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe224381730" dir= execID=96582
time="2023-12-19T09:39:21Z" level=error msg="`gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe224381730` failed exit status 2" execID=96582
time="2023-12-19T09:39:21Z" level=info msg=Trace args="[gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe224381730]" dir= operation_name="exec gpg" time_ms=6036.826093000001
time="2023-12-19T09:39:21Z" level=fatal msg="`gpg --no-permission-warning --logger-fd 1 --batch --gen-key /tmp/gpg-key-recipe224381730` failed exit status 2"
- 目前的解决方法:
(1)降低argocd的版本,部署argocd的2.3.5是没有问题的
(2)通过argocd的GitHub issue可以看到解决方案(推荐使用)
https://github.com/argoproj/argo-cd/issues/9809#issuecomment-1243415495 https://github.com/argoproj/argo-cd/issues/11647#issuecomment-1348758596
- 问题解决:编辑argocd-repo-server的deployment文件,删除下面两行配置
#编辑文件
kubectl edit deploy -n argocd argocd-repo-server
#找到这两行,删除完进行保存
seccompProfile:
type: RuntimeDefault
#如图:
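#也可以不进入交互式编辑,直接用kubectl patch删除该字段(示意命令,容器索引0以实际deployment为准):
kubectl patch deployment argocd-repo-server -n argocd --type=json \
  -p='[{"op":"remove","path":"/spec/template/spec/containers/0/securityContext/seccompProfile"}]'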
14.2.6 查看安装详情
kubectl get pods -n argocd -o wide
#如图
14.2.7 给argocd-server设置nodeport映射端口
kubectl patch svc argocd-server -p '{"spec": {"type": "NodePort"}}' -n argocd
#如图
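#修改完成后查看argocd-server分配到的NodePort端口(后续UI和CLI登陆会用到):
kubectl get svc argocd-server -n argocd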
14.2.8 获取argocd ui登陆密码(admin/z7A7xxxxvxR-h3i1)
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
#如图
14.2.9 argocd ui登陆方式
https://172.16.4.58:32066
admin/z7A7xxxxvxR-h3i1
argocd web UI
14.2.10 argocd CLI命令行工具安装(浏览器登陆下载指定版本)
#下载指定版本
https://github.com/argoproj/argo-cd/releases/download/v2.3.5/argocd-linux-amd64
https://github.com/argoproj/argo-cd/releases/download/v2.10.0-rc1/argocd-linux-amd64   #我当前使用的是这个版本
#上传到服务器,并给执行权限
chmod +x argocd-linux-amd64
#做软连接全局可用
ln -s /data/argocd/argocd-linux-amd64 /usr/bin/argocd
#查看版本信息
argocd version
展示:
[root@k8master argocd]# argocd version
argocd: v2.10.0-rc1+d919606
  BuildDate: 2023-12-18T21:04:29Z
  GitCommit: d9196060c2d8ea3eadafe278900e776760c5fcc6
  GitTreeState: clean
  GoVersion: go1.21.5
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.10.0-rc1+d919606
  BuildDate: 2023-12-18T20:45:12Z
  GitCommit: d9196060c2d8ea3eadafe278900e776760c5fcc6
  GitTreeState: clean
  GoVersion: go1.21.3
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.2.1 2023-10-19T20:13:51Z
  Helm Version: v3.13.2+g2a2fb3b
  Kubectl Version: v0.26.11
  Jsonnet Version: v0.20.0
14.2.11 使用 CLI 登录集群(IP为部署argocd的主机或master机器地址,端口为argocd-server这个svc的NodePort端口)
argocd login 172.16.4.58:32066 --username admin --password z7A7xxxxvxR-h3i1
展示:
[root@k8master argocd]# argocd login 172.16.4.58:32066 --username admin --password z7A7xxxxvxR-h3i1
WARNING: server certificate had error: tls: failed to verify certificate: x509: cannot validate certificate for 172.16.4.58 because it doesn't contain any IP SANs. Proceed insecurely (y/n)? y
'admin:login' logged in successfully
Context '172.16.4.58:32066' updated
14.2.12 修改密码(admin/keshen@1234),修改后用新密码登陆
argocd account update-password --account admin --current-password z7A7xxxxvxR-h3i1 --new-password keshen@1234
展示:
[root@k8master argocd]# argocd account update-password --account admin --current-password z7A7xxxxvxR-h3i1 --new-password keshen@1234
Password updated
Context '172.16.4.58:32066' updated
14.3 gitlab代码仓准备
- 在 Gitlab 上创建项目,取名为 argocd-lab,为了方便实验将仓库设置为 public 公共仓库。
- 在仓库中创建 yfile 目录,在目录中创建两个 yaml 资源文件,分别是 myapp-deployment.yaml 和 myapp-service.yaml。
14.3.1 创建一个项目
14.3.2 创建一个test分支
14.3.3 将yaml文件推送到gitlab
#配置git全局账号
git config --global user.email "lipc@zhengjue-ai.com"
git config --global user.name "lipc"
#拉取gitlab项目
git clone http://172.16.4.53/lipc/argocd-lab.git
#切换到test分支
cd argocd-lab/
git checkout test
#创建yfile目录,存放argocd使用的yaml文件资源,也是后边配置argocd的path资源路径的值
mkdir yfile
#创建yaml文件,并拷贝到yfile目录中,文件内容在下边有展示
myapp-deployment.yaml
myapp-service.yaml
#将yaml文件提交到gitlab
git add .
git commit -m "file yaml"
git push origin test
#yaml文件展示
[root@k8master yfile]# cat myapp-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ksapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ksapp
  template:
    metadata:
      labels:
        app: ksapp
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/public-namespace/myapp:v1
        name: ksapp
        ports:
        - containerPort: 80
[root@k8master yfile]# cat myapp-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: ksapp
spec:
  ports:
  - port: 80
    targetPort: 80
    nodePort: 32060
  type: NodePort
  selector:
    app: ksapp
#如图展示
14.4 创建 Argo CD App
14.4.1 首先创建一个命名空间 devops 用于 Argo CD 部署应用
kubectl create ns devops
kubectl get ns
14.4.2 创建argocd app
方式一:使用 UI 创建 App
- Application Name: 自定义的应用名。
- Project: 使用默认创建好的 default 项目。
- SYNC POLICY: 同步方式,可以选择自动或者手动,这里我们选择手动同步。
- Repository URL: 项目的 Git 地址。
- Revision: 分支名。
- Path: yaml 资源文件所在的相对路径。
- Cluster URL: Kubernetes API Server 的访问地址,由于 Argo CD 和下发应用的 Kubernetes 集群是同一个,因此可以直接使用 https://kubernetes.default.svc 来访问。关于 Kubernetes 中 DNS 解析规则可以查看 Pod 与 Service 的 DNS。
- Namespace: 部署应用的命名空间。
- 创建完成后如下图所示,此时处于 OutOfSync 的状态:
- 由于设置的是手动同步,因此需要点一下下面的 SYNC 进行同步:
- 在弹出框点击 SYNCHRONIZE,确认同步:
- 等待同步完成
- 同步完成
- 在 Argo CD 上点击应用进入查看详情,如下图:
方式二:使用 CLI 创建 APP
argocd app create ksapp2 \
  --repo http://172.16.4.53/lipc/argocd-lab.git \
  --path yfile \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace devops \
  --revision test
- argocd 查看创建信息
[root@k8master yfile]# argocd app list    #列出app应用
NAME           CLUSTER                         NAMESPACE  PROJECT  STATUS     HEALTH   SYNCPOLICY  CONDITIONS                REPO                                    PATH   TARGET
argocd/ksapp   https://kubernetes.default.svc  devops     default  Synced     Healthy  <none>      <none>                    http://172.16.4.53/lipc/argocd-lab.git  yfile  test
argocd/ksapp2  https://kubernetes.default.svc  devops     default  OutOfSync  Healthy  <none>      SharedResourceWarning(2)  http://172.16.4.53/lipc/argocd-lab.git  yfile  test
[root@k8master yfile]# argocd app get ksapp    #查看ksapp应用
Name:               argocd/ksapp
Project:            default
Server:             https://kubernetes.default.svc
Namespace:          devops
URL:                https://172.16.4.58:32066/applications/ksapp
Repo:               http://172.16.4.53/lipc/argocd-lab.git
Target:             test
Path:               yfile
SyncWindow:         Sync Allowed
Sync Policy:        <none>
Sync Status:        Synced to test (42ca17b)
Health Status:      Healthy

GROUP  KIND        NAMESPACE  NAME   STATUS  HEALTH   HOOK  MESSAGE
       Service     devops     myapp  Synced  Healthy        service/myapp created
apps   Deployment  devops     myapp  Synced  Healthy        deployment.apps/myapp created
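#上边ksapp2处于OutOfSync状态,手动同步也可以直接用CLI触发:
argocd app sync ksapp2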
方式三:使用 YAML 文件创建
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ksapp
  namespace: argocd
spec:
  destination:
    namespace: devops                                 # 部署应用的命名空间
    server: https://kubernetes.default.svc            # API Server 地址
  project: default                                    # 项目名
  source:
    path: yfile                                       # 资源文件路径
    repoURL: http://172.16.4.53/lipc/argocd-lab.git   # Git 仓库地址
    targetRevision: test                              # 分支名
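#将上面内容保存为文件后应用即可创建App(文件名ksapp-app.yaml仅为示例):
kubectl apply -f ksapp-app.yaml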
14.5 版本升级
- 将 ksapp 应用从手动同步改成自动同步。点击 APP DETAILS -> SYNC POLICY,点击 ENABLE AUTO-SYNC(也可以用本节末尾的 CLI 命令开启)
- 当前版本
- 编辑 ksapp 资源文件,将版本从 v1 改为 v2,点击 Commit changes,提交更改:
- 等待一会 Argo CD 会自动更新应用,如果等不及可以点击 Refresh,Argo CD 会立即去获取最新的资源文件。可以看到此时 ksapp Deployment 会新创建 v2 版本的 Replicaset,v2 版本的 Replicaset 会创建并管理 v2 版本的 Pod
- 升级之后的版本v2
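上边的 ENABLE AUTO-SYNC 也可以用 CLI 完成(argocd 自带命令,按需使用):
argocd app set ksapp --sync-policy automated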
14.6 版本回滚
- 升级到 v2 版本以后,v1 版本的 Replicaset 并没有被删除,而是继续保留,这是为了方便我们回滚应用。在 ksapp 应用中点击 HISTORY AND ROLLBACK 查看历史记录,可以看到有 2 个历史记录:
- 假设刚刚上线的 v2 版本出现了问题,需要回滚回 v1 版本,那么可以选中 v1 版本,然后点击 Rollback 进行回滚:
- 在回滚的时候需要禁用 AUTO-SYNC 自动同步,点击 OK 确认即可:
- 等待一会可以看到回滚成功,此时 Pod 已经是 v1 版本;由于线上版本不再是 Git 仓库中最新的版本,同步状态显示为 OutOfSync:
14.7 argocd部署参考文档
#主要
https://blog.csdn.net/lzhcoder/article/details/131763500
#辅助
https://blog.csdn.net/Coder_Boy_/article/details/131747958
https://blog.csdn.net/m0_37723088/article/details/129991679
#问题解决
https://linuxea.com/3086.html
15. 新的node节点加入k8s集群
15.1 初始化新的node节点
按照上边步骤——> 3.安装k8s准备工作
15.2 在master上获取加入k8s集群的命令和授权
kubeadm token create --print-join-command
获取的信息:
kubeadm join 172.16.4.58:6443 --token m7ngmk.4mowyu019g6cwttp --discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a
15.3 在新的node节点上执行
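#在新节点执行15.2中获取到的join命令(token等以实际输出为准):
kubeadm join 172.16.4.58:6443 --token m7ngmk.4mowyu019g6cwttp --discovery-token-ca-cert-hash sha256:de19ee27e18341ce9acf6248d76664c3bc932372745930fe28687a20073c179a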
15.4 执行完成后master查看node信息
[root@k8master ~]# kubectl get node
16 core dns验证
参考文档:
http://www.taodudu.cc/news/show-634984.html?action=onClick
可以在容器里边ping或者nc指定服务的域名,格式如下:
#statefulset格式: <PodName>.<ServiceName>.<Namespace>.svc.cluster.local
mysql-ss-0.mysql-svc.default.svc.cluster.local 3306
#deployment格式: <ServiceName>.<Namespace>.svc.cluster.local
nginx-svc.default.svc.cluster.local
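下面是一个验证示意(用busybox临时pod测试,域名和端口以实际业务服务为准):
#启动一个临时busybox容器,退出后自动删除
kubectl run dns-test -it --rm --image=busybox -- /bin/sh
#容器内解析service域名
nslookup nginx-svc.default.svc.cluster.local
#容器内测试指定服务端口连通性
nc mysql-ss-0.mysql-svc.default.svc.cluster.local 3306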