Kubernetes monitoring with kube-prometheus
https://www.jianshu.com/p/2fbbe767870d kube-prometheus
1 Download and deploy
1.1 Download
git clone https://github.com/coreos/kube-prometheus.git
All installation manifests are in the kube-prometheus/manifests/ directory.
Sort the YAML files into one directory per component:
mkdir prometheus
cp -r kube-prometheus/manifests/* prometheus/
cd prometheus/
mkdir -p operator node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter
mv *-serviceMonitor* serviceMonitor/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
The resulting directory layout is shown below; the setup directory already exists by default.
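One way to confirm the layout from the shell is to list the subdirectories and any YAML files still sitting at the top level. This is only a minimal sketch; the helper name `show_layout` is hypothetical and not part of kube-prometheus.

```shell
# Hypothetical helper: print the subdirectories of a directory and any
# YAML files that were not moved into a component folder.
show_layout() {
  (
    cd "${1:-.}" || exit 1
    echo "directories:"
    for d in */; do
      [ -d "$d" ] && echo "  ${d%/}"
    done
    echo "top-level yaml files:"
    for f in *.yaml; do
      [ -f "$f" ] && echo "  $f"
    done
    true
  )
}
```

Running `show_layout .` inside prometheus/ should list the component directories plus setup, and ideally no top-level YAML files.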
1.2 Deploy
Create the namespace, the CRDs, and the Prometheus Operator first; all of them are defined in setup/:
kubectl apply -f setup/
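The component manifests use custom resources such as ServiceMonitor, so applying them can fail if the CRDs from setup/ are not registered yet. A minimal sketch of a bounded wait, assuming a POSIX shell: the function name `wait_for_crds` and its retry count are hypothetical, and only `kubectl get servicemonitors --all-namespaces` is an actual kubectl invocation.

```shell
# Hypothetical helper: poll until the ServiceMonitor CRD is served by the
# API server, giving up after a number of one-second attempts (default 60).
wait_for_crds() {
  tries="${1:-60}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # The command only succeeds once the CRD has been registered.
    if kubectl get servicemonitors --all-namespaces >/dev/null 2>&1; then
      echo "ServiceMonitor CRD is available"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for the ServiceMonitor CRD" >&2
  return 1
}
```

Calling `wait_for_crds` before applying the component directories gives the API server time to register the new resource types.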
Then install the components one by one:
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/
Check the status of each component:
kubectl get all -n monitoring
[root@k8s-master01 prometheus]# kubectl get all -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          88m
pod/alertmanager-main-1                    2/2     Running   0          88m
pod/alertmanager-main-2                    2/2     Running   0          88m
pod/grafana-86c68fc557-wtb99               1/1     Running   0          170m
pod/kube-state-metrics-584bdcbd9f-wljk6    3/3     Running   0          171m
pod/node-exporter-mmwxl                    2/2     Running   0          113m
pod/node-exporter-mwh5m                    2/2     Running   0          113m
pod/node-exporter-ncx2w                    2/2     Running   0          113m
pod/prometheus-k8s-0                       3/3     Running   1          107m
pod/prometheus-k8s-1                       3/3     Running   1          107m
pod/prometheus-operator-68df4bff8b-wqgg4   2/2     Running   0          144m

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.107.220.244   <none>        9093/TCP                     95m
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   88m
service/grafana                 NodePort    10.104.111.185   <none>        3000:32000/TCP               170m
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP            171m
service/node-exporter           ClusterIP   None             <none>        9100/TCP                     129m
service/prometheus-k8s          NodePort    10.97.17.152     <none>        9090:31000/TCP               129m
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP                     129m
service/prometheus-operator     ClusterIP   None             <none>        8443/TCP                     174m

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/node-exporter   3         3         3       3            3           kubernetes.io/os=linux   113m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/grafana               1/1     1            1           170m
deployment.apps/kube-state-metrics    1/1     1            1           171m
deployment.apps/prometheus-operator   1/1     1            1           174m

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/grafana-86c68fc557               1         1         1       170m
replicaset.apps/kube-state-metrics-584bdcbd9f    1         1         1       171m
replicaset.apps/prometheus-operator-68df4bff8b   1         1         1       174m

NAME                                 READY   AGE
statefulset.apps/alertmanager-main   3/3     88m
statefulset.apps/prometheus-k8s      2/2     129m
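Rather than re-running `kubectl get all` until every pod reports Running, the check can be turned into a blocking wait with `kubectl wait`. A small sketch; the wrapper name `wait_for_pods` and the 300s default timeout are assumptions chosen for illustration.

```shell
# Hypothetical wrapper: block until every pod in the namespace is Ready,
# or fail once the timeout expires.
wait_for_pods() {
  ns="${1:-monitoring}"
  timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout="$timeout"
}
```

With no arguments, `wait_for_pods` targets the monitoring namespace used throughout this walkthrough.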
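2 Change the Service type
To make Grafana and Prometheus reachable from outside the cluster, change the type of both Services from ClusterIP to NodePort.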
grafana:
kubectl edit service/grafana -n monitoring
prometheus:
kubectl edit service/prometheus-k8s -n monitoring
The edited Service looks like the figure below:
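As a non-interactive alternative to `kubectl edit`, the same change can be sketched with `kubectl patch`. The node ports 32000 and 31000 are taken from the `kubectl get all` output in section 1.2; the helper names `nodeport_patch` and `expose_nodeport` are hypothetical. The sketch relies on kubectl's default strategic merge patch, which merges Service ports by their `port` value, so the existing port name and targetPort should be preserved.

```shell
# Hypothetical helper: print a patch that switches a Service to NodePort
# and pins the node port for the given service port.
nodeport_patch() {
  printf '{"spec":{"type":"NodePort","ports":[{"port":%s,"nodePort":%s}]}}\n' "$1" "$2"
}

# Hypothetical helper: apply that patch to a Service in the monitoring namespace.
# Arguments: service name, service port, node port.
expose_nodeport() {
  kubectl patch service "$1" -n monitoring -p "$(nodeport_patch "$2" "$3")"
}
```

For this cluster the calls would be `expose_nodeport grafana 3000 32000` and `expose_nodeport prometheus-k8s 9090 31000`.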
3 Access the web UI
http://192.168.1.200:31000/targets
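The two targets in the red box, kube-scheduler and kube-controller-manager, are not being scraped. The ServiceMonitors for these components select Services in kube-system by the k8s-app label, and on this cluster no matching Services exist yet.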
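The information on the /targets page is also available from the Prometheus HTTP API at /api/v1/targets. A rough sketch that counts active targets by health, assuming curl, grep, sort, uniq, and awk are installed; `target_health_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: summarize active scrape targets by health state.
# Argument: Prometheus base URL, e.g. http://192.168.1.200:31000
target_health_summary() {
  curl -s "$1/api/v1/targets?state=active" |
    grep -o '"health":"[a-z]*"' |   # one line per target
    sort | uniq -c |                # count each health value
    awk '{ gsub(/"/, "", $2); sub(/^health:/, "", $2); print $2, $1 }'
}
```

Any line other than `up` in the output indicates targets that still need attention.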
Solution:
Create prometheus-kubeSchedulerService.yaml:
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler  # must match the selector in the ServiceMonitor
spec:
  selector:
    component: kube-scheduler  # must match the label on the kube-scheduler pod
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
kubectl apply -f prometheus-kubeSchedulerService.yaml
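Similarly, create prometheus-kubeControllerManagerService.yaml:
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager  # must match the selector in the ServiceMonitor
spec:
  selector:
    component: kube-controller-manager  # must match the label on the kube-controller-manager pod
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
kubectl apply -f prometheus-kubeControllerManagerService.yaml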
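Once both Services are applied, it can help to confirm that each selector actually matches a pod and that the Service has endpoints. A minimal sketch; `check_control_plane_service` is a hypothetical helper, and the `component=` label is the one used in the manifests in this section.

```shell
# Hypothetical helper: show the pods a control-plane Service should select
# and the endpoints the Service currently resolves to.
# Argument: kube-scheduler or kube-controller-manager
check_control_plane_service() {
  # Pods carrying the label used in the Service selector
  kubectl get pods -n kube-system -l "component=$1" -o wide
  # Endpoints stay empty if the selector matches no pod
  kubectl get endpoints -n kube-system "$1"
}
```

An empty endpoints list usually means the selector does not match the pod labels.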
4 Access Grafana
http://192.168.1.200:32000/login (the username and password are both admin)
Dashboard templates can be downloaded from https://github.com/loveqx/k8s-study/tree/master/k8s-grafana