64. K8S: Deploying Prometheus and Grafana on K8S [Software Installation]
1. Preparation
1.1. Tutorial GitHub repository
https://github.com/prometheus-operator/kube-prometheus.git
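1.2. Download the prepared YAML manifests
# -O fixes the local file name so that it matches the tar command in the next step
wget -O kube-prometheus-0.12.0.tar.gz https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.12.0.tar.gz
1.3. Extract the project archive
tar xvf kube-prometheus-0.12.0.tar.gz
cd kube-prometheus-0.12.0/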
2. Create the namespace and the custom resource definitions
2.1. Apply the manifests
]# kubectl create -f kube-prometheus-0.12.0/manifests/setup/
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
namespace/monitoring created
2.2. Manifest walkthrough
2.2.1. Creating the namespace
]# cat kube-prometheus-0.12.0/manifests/setup/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
]# kubectl get ns
NAME                   STATUS   AGE
default                Active   9d
ingress-nginx          Active   4d23h
kube-node-lease        Active   9d
kube-public            Active   9d
kube-system            Active   9d
kubernetes-dashboard   Active   4d20h
monitoring             Active   108s
2.2.2. Creating the custom resource definitions (CRDs)
# The other manifests in setup/ are the CRDs
]# kubectl get customresourcedefinitions.apiextensions.k8s.io | grep coreos
alertmanagerconfigs.monitoring.coreos.com   2023-04-12T06:09:30Z
alertmanagers.monitoring.coreos.com         2023-04-12T06:09:30Z
podmonitors.monitoring.coreos.com           2023-04-12T06:09:30Z
probes.monitoring.coreos.com                2023-04-12T06:09:30Z
prometheuses.monitoring.coreos.com          2023-04-12T06:09:30Z
prometheusrules.monitoring.coreos.com       2023-04-12T06:09:31Z
servicemonitors.monitoring.coreos.com       2023-04-12T06:09:31Z
thanosrulers.monitoring.coreos.com          2023-04-12T06:09:31Z
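Before applying the remaining manifests, it can help to confirm that the eight CRDs listed above are actually established, since resources such as ServiceMonitor cannot be created until they are. The following is a minimal sketch, assuming `kubectl` is on the PATH; the function name `wait_for_crds` and the 60-second timeout are illustrative choices, not part of the original tutorial.

```shell
#!/bin/sh
# wait_for_crds: block until each monitoring.coreos.com CRD reports the
# Established condition. Function name and timeout are illustrative.
wait_for_crds() {
  for crd in alertmanagerconfigs alertmanagers podmonitors probes \
             prometheuses prometheusrules servicemonitors thanosrulers; do
    kubectl wait --for=condition=Established \
      "crd/${crd}.monitoring.coreos.com" --timeout=60s || return 1
  done
}

# Usage (requires access to the cluster):
#   wait_for_crds && echo "all CRDs established"
```

The loop stops at the first CRD that fails to become established, so a non-zero exit status points directly at the missing definition.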
3. Deploy the Prometheus resources
3.1. Group the Prometheus manifests
# Move the Prometheus manifests into their own directory
mkdir prom-server
mv prometheus-*.yaml prom-server/
prom-server]# ll
-rw-rw-r-- 1 root root   483 Jan 24 18:14 prometheus-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root   430 Jan 24 18:14 prometheus-clusterRole.yaml
-rw-rw-r-- 1 root root   922 Jan 24 18:14 prometheus-networkPolicy.yaml
-rw-rw-r-- 1 root root   546 Jan 24 18:14 prometheus-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root 16430 Jan 24 18:14 prometheus-prometheusRule.yaml
-rw-rw-r-- 1 root root  1238 Jan 24 18:14 prometheus-prometheus.yaml
-rw-rw-r-- 1 root root   507 Jan 24 18:14 prometheus-roleBindingConfig.yaml
-rw-rw-r-- 1 root root  1661 Jan 24 18:14 prometheus-roleBindingSpecificNamespaces.yaml
-rw-rw-r-- 1 root root   402 Jan 24 18:14 prometheus-roleConfig.yaml
-rw-rw-r-- 1 root root  2161 Jan 24 18:14 prometheus-roleSpecificNamespaces.yaml
-rw-rw-r-- 1 root root   342 Jan 24 18:14 prometheus-serviceAccount.yaml
-rw-rw-r-- 1 root root   624 Jan 24 18:14 prometheus-serviceMonitor.yaml
-rw-rw-r-- 1 root root   637 Jan 24 18:14 prometheus-service.yaml
3.2. Configure offline images
3.2.1. Prepare the offline image
docker pull quay.io/prometheus/prometheus:v2.41.0
docker tag quay.io/prometheus/prometheus:v2.41.0 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0
docker push 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0
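The pull, tag, and push sequence above is repeated for every image in this tutorial, so it may be worth wrapping in a small helper. The sketch below is one possible way to do that; the names `MIRROR`, `mirror_ref`, and `mirror_image` are hypothetical, and the mapping rule (replace everything up to the first `/` with the private registry prefix) is an assumption that happens to match the target names used in this tutorial.

```shell
#!/bin/sh
# Private registry prefix used throughout this tutorial.
MIRROR="192.168.10.33:80/k8s"

# mirror_ref: map an upstream reference to its private-registry name by
# replacing everything up to the first "/" with $MIRROR.
# Illustrative helper; assumes the reference contains at least one "/".
mirror_ref() {
  printf '%s/%s\n' "$MIRROR" "${1#*/}"
}

# mirror_image: pull from upstream, retag, and push to the private registry.
mirror_image() {
  target=$(mirror_ref "$1")
  docker pull "$1" && docker tag "$1" "$target" && docker push "$target"
}

# Usage (requires docker and access to both registries):
#   mirror_image quay.io/prometheus/prometheus:v2.41.0
```

Keeping the name mapping in its own function makes it easy to check the target reference before anything is pushed.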
3.2.2. Update the image in the manifest
prom-server]# cat prometheus-prometheus.yaml | grep image
  image: 192.168.10.33:80/k8s/prometheus/prometheus:v2.41.0
3.3. Modify the Service configuration
3.3.1. Change the Service to NodePort for testing
]# vi prometheus-service.yaml
spec:
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
    nodePort: 30080
  type: NodePort
  selector:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
3.4. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom-server/
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
networkpolicy.networking.k8s.io/prometheus-k8s created
poddisruptionbudget.policy/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-prometheus-rules created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-k8s created
3.5. Check the running status
# READY is still empty and the Service has no endpoints at this point:
# the Prometheus pods are created by the Prometheus Operator, which is only deployed in section 4.
]# kubectl -n monitoring get prometheus
NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
k8s    2.41.0    2                                          5m55s
]# kubectl -n monitoring get svc
NAME             TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
prometheus-k8s   NodePort   10.111.189.205   <none>        9090:30599/TCP,8080:30090/TCP   6m
]# kubectl -n monitoring get endpoints
NAME             ENDPOINTS   AGE
prometheus-k8s   <none>      9m58s
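The NodePort that ends up assigned is not always the one you expect (the sample output above shows 9090 mapped to 30599), so it can be useful to read it back from the Service rather than assume it. A minimal sketch, assuming `kubectl` is on the PATH; the helper name `node_port` is hypothetical.

```shell
#!/bin/sh
# node_port: print the NodePort assigned to a named port of a Service in
# the monitoring namespace. Helper name is an illustrative assumption.
#   $1 = Service name, $2 = port name as declared in the Service spec
node_port() {
  kubectl -n monitoring get svc "$1" \
    -o jsonpath="{.spec.ports[?(@.name==\"$2\")].nodePort}"
}

# Usage (requires access to the cluster):
#   node_port prometheus-k8s web
```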
4. Deploy prometheusOperator and prometheusAdapter
4.1. Install prometheusOperator
4.1.1. Group the manifests
mkdir prom_opt
mv prometheusOperator-*.yaml prom_opt/
]# cd prom_opt/ && ll
total 36
-rw-rw-r-- 1 root root  471 Jan 24 18:14 prometheusOperator-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root 1401 Jan 24 18:14 prometheusOperator-clusterRole.yaml
-rw-rw-r-- 1 root root 2631 Jan 24 18:14 prometheusOperator-deployment.yaml
-rw-rw-r-- 1 root root  694 Jan 24 18:14 prometheusOperator-networkPolicy.yaml
-rw-rw-r-- 1 root root 5819 Jan 24 18:14 prometheusOperator-prometheusRule.yaml
-rw-rw-r-- 1 root root  321 Jan 24 18:14 prometheusOperator-serviceAccount.yaml
-rw-rw-r-- 1 root root  715 Jan 24 18:14 prometheusOperator-serviceMonitor.yaml
-rw-rw-r-- 1 root root  515 Jan 24 18:14 prometheusOperator-service.yaml
4.1.2. Configure offline images
# Pull the images and push them to the local registry
docker pull quay.io/brancz/kube-rbac-proxy:v0.14.0
docker tag quay.io/brancz/kube-rbac-proxy:v0.14.0 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
docker push 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
docker pull quay.io/prometheus-operator/prometheus-operator:v0.62.0
docker tag quay.io/prometheus-operator/prometheus-operator:v0.62.0 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0
docker push 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0
docker pull quay.io/prometheus-operator/prometheus-config-reloader:v0.62.0
docker tag quay.io/prometheus-operator/prometheus-config-reloader:v0.62.0 192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0
docker push 192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0
# Update the manifest
sed -i 's#quay.io#192.168.10.33:80/k8s#g' prometheusOperator-deployment.yaml
]# cat prometheusOperator-deployment.yaml | grep -E 'image|reloader'
        - --prometheus-config-reloader=192.168.10.33:80/k8s/prometheus-operator/prometheus-config-reloader:v0.62.0
        image: 192.168.10.33:80/k8s/prometheus-operator/prometheus-operator:v0.62.0
        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
4.1.3. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_opt/
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
networkpolicy.networking.k8s.io/prometheus-operator created
prometheusrule.monitoring.coreos.com/prometheus-operator-rules created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created
4.1.4. Check the pod status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
prometheus-k8s-0                      2/2     Running   0          12m     10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          12m     10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          12m     10.244.3.93     node1     <none>           <none>
4.1.5. Query the running status
[root@master1 prom_opt]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
prometheus-k8s-0                      2/2     Running   0          14m     10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          14m     10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          14m     10.244.3.93     node1     <none>           <none>
[root@master1 prom_opt]# kubectl -n monitoring get prometheus
NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
k8s    2.41.0    2         2       True         True        34m
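Instead of re-running `get pods` until everything shows Running, the rollout can be awaited directly. The following is a rough sketch, assuming `kubectl` is on the PATH; the function name `wait_for_monitoring` and the 180-second timeout are illustrative. The StatefulSet name `prometheus-k8s` is inferred from the pod names in the output above.

```shell
#!/bin/sh
# wait_for_monitoring: wait until the Operator Deployment and the
# Prometheus StatefulSet it creates have finished rolling out.
# Function name and the 180s timeout are illustrative assumptions.
wait_for_monitoring() {
  kubectl -n monitoring rollout status deployment/prometheus-operator --timeout=180s &&
  kubectl -n monitoring rollout status statefulset/prometheus-k8s --timeout=180s
}

# Usage (requires access to the cluster):
#   wait_for_monitoring && kubectl -n monitoring get prometheus
```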
4.2. Install prometheusAdapter
4.2.1. Group the manifests
mkdir prom_adapter && mv prometheusAdapter-*.yaml prom_adapter/ && cd prom_adapter/
]# ll
-rw-rw-r-- 1 root root  483 Jan 24 18:14 prometheusAdapter-apiService.yaml
-rw-rw-r-- 1 root root  601 Jan 24 18:14 prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
-rw-rw-r-- 1 root root  519 Jan 24 18:14 prometheusAdapter-clusterRoleBindingDelegator.yaml
-rw-rw-r-- 1 root root  496 Jan 24 18:14 prometheusAdapter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root  403 Jan 24 18:14 prometheusAdapter-clusterRoleServerResources.yaml
-rw-rw-r-- 1 root root  434 Jan 24 18:14 prometheusAdapter-clusterRole.yaml
-rw-rw-r-- 1 root root 2205 Jan 24 18:14 prometheusAdapter-configMap.yaml
-rw-rw-r-- 1 root root 3179 Jan 24 18:14 prometheusAdapter-deployment.yaml
-rw-rw-r-- 1 root root  565 Jan 24 18:14 prometheusAdapter-networkPolicy.yaml
-rw-rw-r-- 1 root root  502 Jan 24 18:14 prometheusAdapter-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root  516 Jan 24 18:14 prometheusAdapter-roleBindingAuthReader.yaml
-rw-rw-r-- 1 root root  324 Jan 24 18:14 prometheusAdapter-serviceAccount.yaml
-rw-rw-r-- 1 root root  907 Jan 24 18:14 prometheusAdapter-serviceMonitor.yaml
-rw-rw-r-- 1 root root  502 Jan 24 18:14 prometheusAdapter-service.yaml
4.2.2. Configure offline images
docker pull registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0
docker tag registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0
docker push 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0
sed -i 's#registry.k8s.io#192.168.10.33:80/k8s#g' prometheusAdapter-deployment.yaml
]# cat prometheusAdapter-deployment.yaml | grep image
        image: 192.168.10.33:80/k8s/prometheus-adapter/prometheus-adapter:v0.10.0
4.2.3. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_adapter/
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader configured
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
networkpolicy.networking.k8s.io/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
servicemonitor.monitoring.coreos.com/prometheus-adapter created
4.2.4. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          48s     10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          48s     10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          25m     10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          25m     10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          25m     10.244.3.93     node1     <none>           <none>
5. Deploy kubernetesControlPlane and kubeStateMetrics
5.1. Install kubernetesControlPlane
5.1.1. Group the manifests
mkdir prom_control_plane && mv kubernetesControlPlane-*.yaml prom_control_plane/ && cd prom_control_plane/
[root@master1 prom_control_plane]# ll
total 104
-rw-rw-r-- 1 root root 71670 Jan 24 18:14 kubernetesControlPlane-prometheusRule.yaml
-rw-rw-r-- 1 root root  6997 Jan 24 18:14 kubernetesControlPlane-serviceMonitorApiserver.yaml
-rw-rw-r-- 1 root root   591 Jan 24 18:14 kubernetesControlPlane-serviceMonitorCoreDNS.yaml
-rw-rw-r-- 1 root root  6516 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubeControllerManager.yaml
-rw-rw-r-- 1 root root  7714 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubelet.yaml
-rw-rw-r-- 1 root root   577 Jan 24 18:14 kubernetesControlPlane-serviceMonitorKubeScheduler.yaml
5.1.2. Apply the manifests
This component has no images to download, so it can be created directly.
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_control_plane/
prometheusrule.monitoring.coreos.com/kubernetes-monitoring-rules created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
5.1.3. Query the running status
]# kubectl get prometheusrules.monitoring.coreos.com -n monitoring
NAME                              AGE
kubernetes-monitoring-rules       63s
prometheus-k8s-prometheus-rules   53m
prometheus-operator-rules         35m
5.2. Install kubeStateMetrics
5.2.1. Group the manifests
]# mkdir prom_kube_state_metric
]# mv kubeStateMetrics-*.yaml prom_kube_state_metric/
]# cd prom_kube_state_metric/
]# ll
-rw-rw-r-- 1 root root  464 Jan 24 18:14 kubeStateMetrics-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root 1903 Jan 24 18:14 kubeStateMetrics-clusterRole.yaml
-rw-rw-r-- 1 root root 3428 Jan 24 18:14 kubeStateMetrics-deployment.yaml
-rw-rw-r-- 1 root root  723 Jan 24 18:14 kubeStateMetrics-networkPolicy.yaml
-rw-rw-r-- 1 root root 3152 Jan 24 18:14 kubeStateMetrics-prometheusRule.yaml
-rw-rw-r-- 1 root root  316 Jan 24 18:14 kubeStateMetrics-serviceAccount.yaml
-rw-rw-r-- 1 root root 1167 Jan 24 18:14 kubeStateMetrics-serviceMonitor.yaml
-rw-rw-r-- 1 root root  580 Jan 24 18:14 kubeStateMetrics-service.yaml
5.2.2. Configure offline images
# Original images
prom_kube_state_metric]# grep 'image' *
kubeStateMetrics-deployment.yaml:        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0
kubeStateMetrics-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0
# Prepare the offline image (kube-rbac-proxy was already mirrored in 4.1.2)
docker pull registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
docker tag registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0
docker push 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0
# Point the manifest at the offline images
sed -i 's#registry.k8s.io#192.168.10.33:80/k8s#g' kubeStateMetrics-deployment.yaml
sed -i 's#quay.io#192.168.10.33:80/k8s#g' kubeStateMetrics-deployment.yaml
# Images after the change
prom_kube_state_metric]# grep 'image' *
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/kube-state-metrics/kube-state-metrics:v2.7.0
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
kubeStateMetrics-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
5.2.3. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_kube_state_metric/
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
networkpolicy.networking.k8s.io/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kube-state-metrics-rules created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
5.2.4. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          30s     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          164m    192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          164m    192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          163m    192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          164m    192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h7m    10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h7m    10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          3h32m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          3h32m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          3h32m   10.244.3.93     node1     <none>           <none>
6. Deploy nodeExporter, blackboxExporter, and alertmanager
6.1. Install nodeExporter
6.1.1. Group the manifests
mkdir prom_nodeExporter && mv nodeExporter-*.yaml prom_nodeExporter/ && cd prom_nodeExporter/
]# ll
-rw-rw-r-- 1 root root   468 Jan 24 18:14 nodeExporter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root   485 Jan 24 18:14 nodeExporter-clusterRole.yaml
-rw-rw-r-- 1 root root  3640 Jan 24 18:14 nodeExporter-daemonset.yaml
-rw-rw-r-- 1 root root   671 Jan 24 18:14 nodeExporter-networkPolicy.yaml
-rw-rw-r-- 1 root root 15004 Jan 24 18:14 nodeExporter-prometheusRule.yaml
-rw-rw-r-- 1 root root   306 Jan 24 18:14 nodeExporter-serviceAccount.yaml
-rw-rw-r-- 1 root root   850 Jan 24 18:14 nodeExporter-serviceMonitor.yaml
-rw-rw-r-- 1 root root   492 Jan 24 18:14 nodeExporter-service.yaml
6.1.2. Prepare the offline images
docker pull quay.io/prometheus/node-exporter:v1.5.0
docker tag quay.io/prometheus/node-exporter:v1.5.0 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0
docker push 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0
sed -i 's#quay.io#192.168.10.33:80/k8s#g' nodeExporter-daemonset.yaml
]# cat nodeExporter-daemonset.yaml | grep image
        image: 192.168.10.33:80/k8s/prometheus/node-exporter:v1.5.0
        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0  # already mirrored earlier, so only the manifest needs updating
6.1.3. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_nodeExporter/
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
networkpolicy.networking.k8s.io/node-exporter created
prometheusrule.monitoring.coreos.com/node-exporter-rules created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
6.1.4. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
node-exporter-6q6d9                   2/2     Running   0          30s     192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          30s     192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          29s     192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          30s     192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          24m     10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          24m     10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          48m     10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          48m     10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          48m     10.244.3.93     node1     <none>           <none>
6.2. Install blackboxExporter
6.2.1. Group the manifests
mkdir prom_blackbox && mv blackboxExporter-*.yaml prom_blackbox && cd prom_blackbox
prom_blackbox]# ll
total 32
-rw-rw-r-- 1 root root  485 Jan 24 18:14 blackboxExporter-clusterRoleBinding.yaml
-rw-rw-r-- 1 root root  287 Jan 24 18:14 blackboxExporter-clusterRole.yaml
-rw-rw-r-- 1 root root 1392 Jan 24 18:14 blackboxExporter-configuration.yaml
-rw-rw-r-- 1 root root 3545 Jan 24 18:14 blackboxExporter-deployment.yaml
-rw-rw-r-- 1 root root  722 Jan 24 18:14 blackboxExporter-networkPolicy.yaml
-rw-rw-r-- 1 root root  315 Jan 24 18:14 blackboxExporter-serviceAccount.yaml
-rw-rw-r-- 1 root root  680 Jan 24 18:14 blackboxExporter-serviceMonitor.yaml
-rw-rw-r-- 1 root root  540 Jan 24 18:14 blackboxExporter-service.yaml
6.2.2. Prepare the offline images
# Original image references
prom_blackbox]# grep 'image' *
blackboxExporter-deployment.yaml:        image: quay.io/prometheus/blackbox-exporter:v0.23.0
blackboxExporter-deployment.yaml:        image: jimmidyson/configmap-reload:v0.5.0
blackboxExporter-deployment.yaml:        image: quay.io/brancz/kube-rbac-proxy:v0.14.0
# Pull the images and push them to the local registry
docker pull quay.io/prometheus/blackbox-exporter:v0.23.0
docker tag quay.io/prometheus/blackbox-exporter:v0.23.0 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0
docker push 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0
docker pull jimmidyson/configmap-reload:v0.5.0
docker tag jimmidyson/configmap-reload:v0.5.0 192.168.10.33:80/k8s/configmap-reload:v0.5.0
docker push 192.168.10.33:80/k8s/configmap-reload:v0.5.0
# Update the manifest
sed -i 's#jimmidyson#192.168.10.33:80/k8s#g' blackboxExporter-deployment.yaml
sed -i 's#quay.io#192.168.10.33:80/k8s#g' blackboxExporter-deployment.yaml
# Image references after the change
prom_blackbox]# grep 'image' *
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/prometheus/blackbox-exporter:v0.23.0
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/configmap-reload:v0.5.0
blackboxExporter-deployment.yaml:        image: 192.168.10.33:80/k8s/brancz/kube-rbac-proxy:v0.14.0
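The per-registry `sed` calls above have to be adapted for each upstream registry. One possible generalisation is sketched below: it rewrites the first path component of every `image:` line, whatever registry or Docker Hub user it names. The helper names `rewrite_images` and `unmirrored_images` are hypothetical, GNU `sed` is assumed for `-i`, and references passed as container arguments (such as `--prometheus-config-reloader=` in the Operator Deployment) are not covered and still need their own substitution.

```shell
#!/bin/sh
# Private registry prefix used throughout this tutorial.
MIRROR="192.168.10.33:80/k8s"

# rewrite_images: point every "image:" line in a manifest at the private
# registry by replacing the first path component of the reference.
# Lines that already contain $MIRROR are skipped, so re-running is safe.
# Illustrative helper; assumes GNU sed and references that contain "/".
rewrite_images() {
  sed -i -E "\\#${MIRROR}#! s#(image:[[:space:]]*)[^/[:space:]]+/#\\1${MIRROR}/#" "$1"
}

# unmirrored_images: list image lines that still point somewhere else.
unmirrored_images() {
  grep -E 'image:' "$1" | grep -v "$MIRROR"
}

# Usage:
#   rewrite_images blackboxExporter-deployment.yaml
#   unmirrored_images blackboxExporter-deployment.yaml   # prints nothing when done
```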
6.2.3. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_blackbox/
clusterrole.rbac.authorization.k8s.io/blackbox-exporter created
clusterrolebinding.rbac.authorization.k8s.io/blackbox-exporter created
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
networkpolicy.networking.k8s.io/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
servicemonitor.monitoring.coreos.com/blackbox-exporter created
6.2.4. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
blackbox-exporter-84bb6f6bd9-49rxv    3/3     Running   0          45s     10.244.4.128    node2     <none>           <none>
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          13m     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          177m    192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          177m    192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          177m    192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          177m    192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h20m   10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h20m   10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          3h45m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          3h45m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          3h45m   10.244.3.93     node1     <none>           <none>
6.3. Install alertmanager
6.3.1. Group the manifests
mkdir prom_alertmanager && mv alertmanager-*.yaml prom_alertmanager/ && cd prom_alertmanager/
prom_alertmanager]# ll
-rw-rw-r-- 1 root root  928 Jan 24 18:14 alertmanager-alertmanager.yaml
-rw-rw-r-- 1 root root  977 Jan 24 18:14 alertmanager-networkPolicy.yaml
-rw-rw-r-- 1 root root  561 Jan 24 18:14 alertmanager-podDisruptionBudget.yaml
-rw-rw-r-- 1 root root 7072 Jan 24 18:14 alertmanager-prometheusRule.yaml
-rw-rw-r-- 1 root root 1443 Jan 24 18:14 alertmanager-secret.yaml
-rw-rw-r-- 1 root root  351 Jan 24 18:14 alertmanager-serviceAccount.yaml
-rw-rw-r-- 1 root root  637 Jan 24 18:14 alertmanager-serviceMonitor.yaml
-rw-rw-r-- 1 root root  650 Jan 24 18:14 alertmanager-service.yaml
6.3.2. Prepare the offline image
# Original image reference
prom_alertmanager]# grep 'image' *
alertmanager-alertmanager.yaml:  image: quay.io/prometheus/alertmanager:v0.25.0
# Pull the image and push it to the local registry
docker pull quay.io/prometheus/alertmanager:v0.25.0
docker tag quay.io/prometheus/alertmanager:v0.25.0 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0
docker push 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0
# Point the manifest at the offline image
sed -i 's#quay.io#192.168.10.33:80/k8s#g' alertmanager-alertmanager.yaml
# Image reference after the change
prom_alertmanager]# grep 'image' alertmanager-alertmanager.yaml
  image: 192.168.10.33:80/k8s/prometheus/alertmanager:v0.25.0
6.3.3. Change the Service to NodePort
prom_alertmanager]# cat alertmanager-service.yaml
spec:
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30093
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
    nodePort: 30081
  type: NodePort
  selector:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
6.3.4. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_alertmanager/
alertmanager.monitoring.coreos.com/main created
networkpolicy.networking.k8s.io/alertmanager-main created
poddisruptionbudget.policy/alertmanager-main created
prometheusrule.monitoring.coreos.com/alertmanager-main-rules created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager-main created
6.3.5. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          26s     10.244.3.97     node1     <none>           <none>
alertmanager-main-1                   2/2     Running   0          26s     10.244.4.129    node2     <none>           <none>
alertmanager-main-2                   2/2     Running   0          26s     10.244.4.130    node2     <none>           <none>
blackbox-exporter-84bb6f6bd9-49rxv    3/3     Running   0          32m     10.244.4.128    node2     <none>           <none>
kube-state-metrics-c7c57885f-t9ct7    3/3     Running   0          45m     10.244.3.96     node1     <none>           <none>
node-exporter-6q6d9                   2/2     Running   0          3h28m   192.168.10.26   master1   <none>           <none>
node-exporter-7ngm9                   2/2     Running   0          3h28m   192.168.10.29   node1     <none>           <none>
node-exporter-k7kzr                   2/2     Running   0          3h28m   192.168.10.30   node2     <none>           <none>
node-exporter-l5cvm                   2/2     Running   0          3h28m   192.168.10.27   master2   <none>           <none>
prometheus-adapter-67d7695cb7-nlz59   1/1     Running   0          3h52m   10.244.3.95     node1     <none>           <none>
prometheus-adapter-67d7695cb7-zqgtd   1/1     Running   0          3h52m   10.244.4.127    node2     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          4h17m   10.244.3.94     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          4h17m   10.244.4.126    node2     <none>           <none>
prometheus-operator-ffcc9958-hffd6    2/2     Running   0          4h17m   10.244.3.93     node1     <none>           <none>
7. Deploy grafana
7.1. Group the manifests
7.1.1. Apply kubePrometheus-prometheusRule.yaml
# One manifest has not been applied yet; apply it now
]# kubectl apply -f kubePrometheus-prometheusRule.yaml
prometheusrule.monitoring.coreos.com/kube-prometheus-rules created
7.1.2. Group the grafana manifests
mkdir prom_grafana && mv grafana-*.yaml prom_grafana && cd prom_grafana
[root@master1 prom_grafana]# ll
-rw-rw-r-- 1 root root     344 Jan 24 18:14 grafana-config.yaml
-rw-rw-r-- 1 root root     680 Jan 24 18:14 grafana-dashboardDatasources.yaml
-rw-rw-r-- 1 root root 1549788 Jan 24 18:14 grafana-dashboardDefinitions.yaml
-rw-rw-r-- 1 root root     658 Jan 24 18:14 grafana-dashboardSources.yaml
-rw-rw-r-- 1 root root    9290 Jan 24 18:14 grafana-deployment.yaml
-rw-rw-r-- 1 root root     651 Jan 24 18:14 grafana-networkPolicy.yaml
-rw-rw-r-- 1 root root    1427 Jan 24 18:14 grafana-prometheusRule.yaml
-rw-rw-r-- 1 root root     293 Jan 24 18:14 grafana-serviceAccount.yaml
-rw-rw-r-- 1 root root     398 Jan 24 18:14 grafana-serviceMonitor.yaml
-rw-rw-r-- 1 root root     452 Jan 24 18:14 grafana-service.yaml
7.2. Modify the Service
7.2.1. Change the Service to NodePort
prom_grafana]# cat grafana-service.yaml
spec:
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30030
  type: NodePort
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
7.3. Switch to the offline image
# Original image
prom_grafana]# grep 'image: ' *
grafana-deployment.yaml:        image: grafana/grafana:9.3.2
# Pull the image and push it to the local registry
docker pull grafana/grafana:9.3.2
docker tag grafana/grafana:9.3.2 192.168.10.33:80/k8s/grafana:9.3.2
docker push 192.168.10.33:80/k8s/grafana:9.3.2
# Point the manifest at the offline image
sed -i 's#grafana/grafana:9.3.2#192.168.10.33:80/k8s/grafana:9.3.2#g' grafana-deployment.yaml
# Image after the change
prom_grafana]# grep 'image: ' *
grafana-deployment.yaml:        image: 192.168.10.33:80/k8s/grafana:9.3.2
7.4. Apply the manifests
]# kubectl apply -f kube-prometheus-0.12.0/manifests/prom_grafana/
secret/grafana-config configured
secret/grafana-datasources configured
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-grafana-overview created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes-darwin created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana configured
networkpolicy.networking.k8s.io/grafana created
prometheusrule.monitoring.coreos.com/grafana-rules created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
7.5. Query the running status
]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          3m33s   10.244.3.7      node1     <none>           <none>
alertmanager-main-1                   2/2     Running   0          3m32s   10.244.4.8      node2     <none>           <none>
alertmanager-main-2                   2/2     Running   0          3m31s   10.244.4.9      node2     <none>           <none>
blackbox-exporter-84bb6f6bd9-2tr2q    3/3     Running   0          2m25s   10.244.3.9      node1     <none>           <none>
grafana-7bdbdbcb4b-5d996              1/1     Running   0          13m     10.244.3.5      node1     <none>           <none>
kube-state-metrics-c7c57885f-scxdh    3/3     Running   0          118s    10.244.3.10     node1     <none>           <none>
node-exporter-27bgj                   2/2     Running   0          53s     192.168.10.27   master2   <none>           <none>
node-exporter-cnzhw                   2/2     Running   0          53s     192.168.10.30   node2     <none>           <none>
node-exporter-knqgv                   2/2     Running   0          53s     192.168.10.29   node1     <none>           <none>
node-exporter-qwbb6                   2/2     Running   0          53s     192.168.10.26   master1   <none>           <none>
prometheus-adapter-67d7695cb7-7wf9j   1/1     Running   0          2m41s   10.244.4.10     node2     <none>           <none>
prometheus-adapter-67d7695cb7-vbdkr   1/1     Running   0          2m41s   10.244.3.8      node1     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          18s     10.244.3.12     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          18s     10.244.4.11     node2     <none>           <none>
prometheus-operator-ffcc9958-2dbgn    2/2     Running   0          101s    10.244.3.11     node1     <none>           <none>
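Running pods do not by themselves prove that the web endpoints answer, so a quick probe through the NodePorts can be a useful final check. The sketch below only builds the URLs; the helper name `health_url` is hypothetical, while `/-/ready` (Prometheus, Alertmanager) and `/api/health` (Grafana) are the health endpoints those projects expose. The node IP and ports are the ones used in this tutorial. External requests may be refused until the NetworkPolicies in section 8 are removed.

```shell
#!/bin/sh
# health_url: build the health-check URL for a component exposed through a
# NodePort. Helper name is an illustrative assumption.
#   $1 = node IP, $2 = component, $3 = NodePort
health_url() {
  node_ip=$1; component=$2; port=$3
  case "$component" in
    prometheus|alertmanager) path="/-/ready" ;;
    grafana)                 path="/api/health" ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
  printf 'http://%s:%s%s\n' "$node_ip" "$port" "$path"
}

# Usage (adjust the node IP and NodePorts to your cluster):
#   curl -fsS "$(health_url 192.168.10.29 grafana 30030)"
#   curl -fsS "$(health_url 192.168.10.29 alertmanager 30093)"
```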
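8. Remove the network policies
8.1. Reason
The bundled NetworkPolicies restrict access to the components and can get in the way while debugging, so delete them for now. If they are needed later, study NetworkPolicies first and then add them back.
8.2. Delete the network policies in bulk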
for i in `kubectl get networkpolicies -n monitoring | grep -v NAME | awk -F " " '{ print $1 }'`; do kubectl -n monitoring delete networkpolicies $i; done
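The loop above can also be expressed as a single call, since `kubectl delete` accepts `--all` to remove every object of a given type in the selected namespace. A minimal sketch, assuming `kubectl` is on the PATH; the function name is illustrative.

```shell
#!/bin/sh
# delete_monitoring_netpols: remove every NetworkPolicy in the monitoring
# namespace in a single call. Function name is an illustrative assumption.
delete_monitoring_netpols() {
  kubectl -n monitoring delete networkpolicy --all
}

# Usage (requires access to the cluster):
#   delete_monitoring_netpols
#   kubectl -n monitoring get networkpolicy   # should report no resources
```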