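Kubernetes Monitoring: Deploying Prometheus with the Operator
What Is an Operator
An Operator, a concept developed by CoreOS, is an application-specific controller that extends the Kubernetes API. It is used to create, configure, and manage complex stateful applications such as databases, caches, and monitoring systems. An Operator builds on the Kubernetes concepts of resources and controllers, but also encodes domain knowledge about a specific application: to write an Operator for a database, for example, you must understand in detail how that database is operated. The key to building an Operator is the design of its CRD (CustomResourceDefinition).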
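An Operator turns an operations engineer's knowledge of running a piece of software into code, and uses Kubernetes' powerful abstractions to manage software at scale. CoreOS currently provides several official Operator implementations, including today's subject, the Prometheus Operator. The core of an Operator is built on the following two Kubernetes concepts: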
- Resource: the definition of an object's state
- Controller: observes, analyzes, and acts to bring the resource to its desired state
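The observe, analyze, act cycle of a controller can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `ClusterState`, `reconcile`, and `run_until_converged` are illustration-only names, and a real Operator talks to the Kubernetes API rather than mutating an in-memory object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical stand-in for the state a controller declares or observes."""
    replicas: int


def reconcile(desired: ClusterState, observed: ClusterState) -> List[str]:
    """Analyze: compare desired and observed state, return the actions still needed."""
    diff = desired.replicas - observed.replicas
    if diff > 0:
        return [f"create pod #{observed.replicas + i}" for i in range(diff)]
    if diff < 0:
        return [f"delete pod #{observed.replicas - 1 - i}" for i in range(-diff)]
    return []  # already converged, nothing to do


def run_until_converged(desired: ClusterState, observed: ClusterState) -> int:
    """Act on one action per pass, then observe again; return the number of passes."""
    passes = 0
    actions = reconcile(desired, observed)
    while actions:
        observed.replicas += 1 if actions[0].startswith("create") else -1
        passes += 1
        actions = reconcile(desired, observed)
    return passes
```

The point of the sketch is that the controller never executes a fixed script; it repeatedly compares the declared resource with reality and closes the gap.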
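CoreOS currently provides the following four Operators:
- etcd: creates etcd clusters
- Rook: file, block, and object storage services for cloud-native environments
- Prometheus: creates Prometheus monitoring instances
- Tectonic: deploys Kubernetes clusters
Kube-Prometheus (Operator) architecture
Note: A CRD is an extension of the Kubernetes API. Every resource in Kubernetes is a collection of API objects; for example, the spec sections we write in YAML files are definitions of Kubernetes resource objects. Custom resources can be operated on with kubectl just like the built-in Kubernetes resources.
Next we will use the Operator to create Prometheus.
Installation
Here we install directly from the Prometheus-Operator source. A one-command install with Helm is also possible, but installing from source exposes more of the implementation details. First, clone the source: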
git clone https://github.com/coreos/prometheus-operator
cd prometheus-operator/contrib/kube-prometheus/manifests
The manifests directory contains all of our resource manifest files, so run the create command directly in that directory:
kubectl apply -f .
Once deployed, a namespace named monitoring is created and all resource objects are deployed into it. In addition, the Operator automatically creates 4 CRD resource objects:
kubectl get crd |grep coreos
alertmanagers.monitoring.coreos.com       2019-03-18T02:43:57Z
prometheuses.monitoring.coreos.com        2019-03-18T02:43:58Z
prometheusrules.monitoring.coreos.com     2019-03-18T02:43:58Z
servicemonitors.monitoring.coreos.com     2019-03-18T02:43:58Z
You can list all the Pods in the monitoring namespace. alertmanager and prometheus are managed by StatefulSet controllers, and there is also a core prometheus-operator Pod, which controls the other resource objects and watches them for changes:
kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          37m
alertmanager-main-1                    2/2     Running   0          34m
alertmanager-main-2                    2/2     Running   0          33m
grafana-7489c49998-pkl8w               1/1     Running   0          40m
kube-state-metrics-d6cf6c7b5-7dwpg     4/4     Running   0          27m
node-exporter-dlp25                    2/2     Running   0          40m
node-exporter-fghlp                    2/2     Running   0          40m
node-exporter-mxwdm                    2/2     Running   0          40m
node-exporter-r9v92                    2/2     Running   0          40m
prometheus-adapter-84cd9c96c9-n92n4    1/1     Running   0          40m
prometheus-k8s-0                       3/3     Running   1          37m
prometheus-k8s-1                       3/3     Running   1          37m
prometheus-operator-7b74946bd6-vmbcj   1/1     Running   0          40m
Check the Services that were created:
kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
alertmanager-main       ClusterIP   10.110.43.207   <none>        9093/TCP            40m
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,6783/TCP   38m
grafana                 ClusterIP   10.109.160.0    <none>        3000/TCP            40m
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP   40m
node-exporter           ClusterIP   None            <none>        9100/TCP            40m
prometheus-adapter      ClusterIP   10.105.174.21   <none>        443/TCP             40m
prometheus-k8s          ClusterIP   10.97.195.143   <none>        9090/TCP            40m
prometheus-operated     ClusterIP   None            <none>        9090/TCP            38m
prometheus-operator     ClusterIP   None            <none>        8080/TCP            40m
As shown above, a Service of type ClusterIP was created for both grafana and prometheus. To access these two services from outside the cluster, we can create corresponding Ingress objects or use Services of type NodePort. For simplicity we use NodePort here: edit the grafana and prometheus-k8s Services and change their type to NodePort:
kubectl edit svc grafana -n monitoring
kubectl edit svc prometheus-k8s -n monitoring
kubectl get svc -n monitoring
NAME             TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
.....
grafana          NodePort   10.109.160.0    <none>        3000:31740/TCP   42m
prometheus-k8s   NodePort   10.97.195.143   <none>        9090:31310/TCP   42m
After the change, we can access the two services through a node IP and the assigned NodePort, for example to view the Prometheus targets page:
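Kube-Prometheus Configuration
We can see that most targets are healthy; only two have no monitoring targets associated with them, namely the kube-controller-manager and kube-scheduler system components. This has to do with how the ServiceMonitor is defined. Let's first look at the ServiceMonitor resource for the kube-scheduler component (prometheus-serviceMonitorKubeScheduler.yaml):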
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s        # scrape every 30s
    port: http-metrics   # name of the port on the Service
  jobLabel: k8s-app
  namespaceSelector:     # namespaces in which to look for Services; use any: true to match all namespaces
    matchNames:
    - kube-system
  selector:              # labels of the Service to match; with matchLabels every listed label must match,
                         # and any matchExpressions must all be satisfied as well
                         # (a single In expression may list several acceptable values)
    matchLabels:
      k8s-app: kube-scheduler
The above is a typical ServiceMonitor declaration. Through selector.matchLabels it matches Services in the kube-system namespace that carry the label k8s-app=kube-scheduler. However, no such Service exists in our cluster, so we need to create one manually (prometheus-kubeSchedulerService.yaml):
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  #selector:
  #  component: kube-scheduler
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
We create the Service for kube-controller-manager in the same way (prometheus-KubeControllerManagerService.yaml):
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  #selector:
  #  component: kube-controller-manager
  ports:
  - name: https-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
After creating them, kubectl get svc -n kube-system shows that the corresponding Services now exist:
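Why the ServiceMonitor found nothing until these Services existed can be illustrated with a small Python sketch of matchLabels matching. This is a simplified model, not the Operator's actual code: `matches` and `select_services` are hypothetical helpers, and Services are represented as plain dictionaries.

```python
from typing import Dict, List


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """matchLabels semantics: every key/value pair in the selector must be on the object."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_services(selector: Dict[str, str],
                    namespaces: List[str],
                    services: List[dict]) -> List[str]:
    """Return the names of Services in the given namespaces whose labels satisfy the selector."""
    return [
        svc["name"]
        for svc in services
        if svc["namespace"] in namespaces and matches(selector, svc["labels"])
    ]


# Selector and namespace taken from the kube-scheduler ServiceMonitor.
selector = {"k8s-app": "kube-scheduler"}
namespaces = ["kube-system"]

# Before: a cluster with no kube-scheduler Service (kube-dns is just an example entry).
before = [{"name": "kube-dns", "namespace": "kube-system", "labels": {"k8s-app": "kube-dns"}}]
# After: the manually created Service carries the expected label.
after = before + [
    {"name": "kube-scheduler", "namespace": "kube-system", "labels": {"k8s-app": "kube-scheduler"}}
]

print(select_services(selector, namespaces, before))
print(select_services(selector, namespaces, after))
```

The first call selects nothing and the second selects kube-scheduler, which mirrors what changes on the Prometheus side once the Service (and, as shown next, its Endpoints) exist.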
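However, the Prometheus targets page still shows nothing for them. Running kubectl get ep -n kube-system shows that kube-controller-manager and kube-scheduler have no endpoints at all: the Services above have no selector, so Kubernetes does not populate endpoints for them and we have to add them manually. We define two Endpoints resource files as follows:
Edit prometheus-kubeSchedulerServiceEnpoints.yaml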
[root@master1 manifests]# cat prometheus-kubeSchedulerServiceEnpoints.yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.200.11
  - ip: 192.168.200.12
  - ip: 192.168.200.13
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
Edit prometheus-KubeControllerManagerServiceEnpoints.yaml
[root@master1 manifests]# cat prometheus-KubeControllerManagerServiceEnpoints.yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.200.11
  - ip: 192.168.200.12
  - ip: 192.168.200.13
  ports:
  - name: https-metrics
    port: 10252
    protocol: TCP
Once defined, create the two Endpoints resources above, then check the endpoints of kube-controller-manager and kube-scheduler again:
We have now added Endpoints to the kube-controller-manager and kube-scheduler Services. Refreshing the Prometheus targets page, however, shows that monitoring/kube-scheduler/0 is healthy while monitoring/kube-controller-manager/0 is DOWN, with the error: Server Returned HTTP Status 400 Bad Request. Looking closely, the endpoint URL is http://172.16.200.11:10252/metrics. We deployed kube-controller-manager with HTTPS, so why is HTTP being used here?
Looking at prometheus-serviceMonitorKubeControllerManager.yaml, we find that the port in the kube-controller-manager ServiceMonitor is set to http-metrics. We need to change the port to https-metrics and add the token; here we can also disable certificate verification with insecureSkipVerify: true. The details are as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token  # add the token
    interval: 30s
    ......
    - action: drop
      regex: etcd_(debugging|disk|request|server).*
      sourceLabels:
      - __name__
    port: https-metrics  # changed from the default http-metrics
    scheme: https        # use HTTPS
    tlsConfig:
      insecureSkipVerify: true  # skip certificate verification
Apply the updated file with kubectl apply -f prometheus-serviceMonitorKubeControllerManager.yaml, wait about 30 seconds, and refresh the Prometheus page. The monitoring/kube-controller-manager/0 target is now UP. We can then check whether Grafana shows the corresponding data, or verify with a PromQL query:
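Custom Monitoring Items in Kube-Prometheus
So far we have covered how to deploy the Kube-Prometheus monitoring system quickly. Next we look at how to add a custom monitoring item to it. Besides the resource objects, nodes, and components of the Kubernetes cluster itself, we sometimes need to add custom monitoring items driven by business requirements. Adding one takes only a few steps:
- Step 1: create a ServiceMonitor object, which adds the monitoring item to Prometheus;
- Step 2: associate the ServiceMonitor with a Service object that exposes the metrics endpoint;
- Step 3: make sure the Service can actually serve the metrics data.
Next, let's see how to add monitoring for an etcd cluster. Whether the etcd cluster runs outside Kubernetes or was installed inside the cluster by kubeadm, we treat it here as an independent external cluster, since there is no real difference in how the two are handled.
3.1. Obtaining the etcd certificates
Our etcd cluster was set up with HTTPS certificate authentication, so for Kube-Prometheus to read etcd's monitoring data we need to supply certificates. We can start from systemctl status etcd to locate the certificate paths: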
systemctl status etcd
Next, we look at the unit file to find the etcd certificate paths:
cat /etc/systemd/system/etcd.service
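Note: Most of the material I found covers monitoring for kubeadm-based clusters, whereas this article targets a Kubernetes cluster deployed from binaries. The kubeadm case is not covered in detail here; please look it up separately.
Next we create a secret for the Kube-Prometheus Pods to mount: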
kubectl create secret generic etcd-ssl --from-file=/etc/kubernetes/cert/ca.pem --from-file=/etc/etcd/cert/etcd.pem --from-file=/etc/etcd/cert/etcd-key.pem -n monitoring
Once it is created, check it with the following command:
kubectl describe secrets -n monitoring etcd-ssl
The secret has been created successfully. Next we configure the etcd-ssl secret into the Prometheus resource object, either by editing the object directly with the edit command or by modifying prometheus-prometheus.yaml and re-applying it:
# edit the object directly with the edit command
kubectl edit prometheus k8s -n monitoring
# or modify the prometheus-prometheus.yaml file
vim kube-prometheus-master/manifests/prometheus-prometheus.yaml
Here we modify the prometheus-prometheus.yaml resource file directly and add the following:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
......
  replicas: 2
  secrets:
  - etcd-ssl  # add the secret name
......
After the change, apply the resource file; the corresponding directory then becomes visible inside the Kube-Prometheus Pod:
# apply the updated resource file
kubectl apply -f prometheus-prometheus.yaml
# check the Pod status
kubectl get pod -n monitoring
# open a shell in the prometheus-k8s-0 Pod
kubectl exec -it -n monitoring prometheus-k8s-0 -- /bin/sh
# list the certificates
ls /etc/prometheus/secrets/etcd-ssl/
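The mount paths follow a predictable pattern: kubectl create secret --from-file uses each file's base name as the key, and the secrets listed in the Prometheus resource appear under /etc/prometheus/secrets/<secret-name>/. The Python sketch below computes the expected in-container paths for the files used above; `secret_mount_paths` is a hypothetical helper written for illustration.

```python
import posixpath
from typing import Dict, List

# Directory under which secrets listed in the Prometheus resource are mounted.
SECRETS_ROOT = "/etc/prometheus/secrets"


def secret_mount_paths(secret_name: str, source_files: List[str]) -> Dict[str, str]:
    """Map each source file on the host to its expected path inside the Prometheus container."""
    return {
        src: posixpath.join(SECRETS_ROOT, secret_name, posixpath.basename(src))
        for src in source_files
    }


paths = secret_mount_paths(
    "etcd-ssl",
    [
        "/etc/kubernetes/cert/ca.pem",
        "/etc/etcd/cert/etcd.pem",
        "/etc/etcd/cert/etcd-key.pem",
    ],
)
for src, dst in paths.items():
    print(f"{src} -> {dst}")
```

These are the paths that the ServiceMonitor's tlsConfig has to reference, since Prometheus reads the certificates from inside its own container, not from the host.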
3.2. Creating the ServiceMonitor
Now that Kube-Prometheus has the etcd certificate files mounted, we can create the ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: etcd-k8s
  namespace: monitoring
  labels:
    k8s-app: etcd-k8s
spec:
  jobLabel: k8s-app
  endpoints:
  - port: port
    interval: 30s
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/etcd-ssl/ca.pem  # certificate path (inside the Pod)
      certFile: /etc/prometheus/secrets/etcd-ssl/etcd.pem
      keyFile: /etc/prometheus/secrets/etcd-ssl/etcd-key.pem
      insecureSkipVerify: true
  selector:
    matchLabels:
      k8s-app: etcd
  namespaceSelector:
    matchNames:
    - kube-system
This file matches Services in the kube-system namespace that carry the label k8s-app=etcd; jobLabel names the label whose value is used as the job name. Because the serverName may not match the certificate issued for etcd, we add insecureSkipVerify: true so that the server certificate is not verified. Now create the ServiceMonitor:
# create the resource
kubectl apply -f prometheus-serviceMonitorEtcd.yaml
# list the ServiceMonitor resources
kubectl get servicemonitors -n monitoring |grep etcd
After the ServiceMonitor is created, wait 30 seconds and refresh the Prometheus targets page. A new entry monitoring/etcd-k8s/0 (0/0 up) appears, but with no data. That is because the ServiceMonitor has no matching Service yet, so we need to create one.
We need to define a Service object and an Endpoints object; the resource files are as follows.
prometheus-EtcdService.yaml:
apiVersion: v1
kind: Service
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: port
    port: 2379
    protocol: TCP
prometheus-EtcdServiceEnpoints.yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
subsets:
- addresses:
  - ip: 172.16.200.11  # etcd node address
    #nodeName: k8s-01  # node name as shown by kubectl get node
  - ip: 172.16.200.12
    #nodeName: k8s-02
  - ip: 172.16.200.13
    #nodeName: k8s-03
  ports:
  - name: port
    port: 2379
    protocol: TCP
Then create the Service and Endpoints objects above:
kubectl apply -f prometheus-EtcdService.yaml
kubectl apply -f prometheus-EtcdServiceEnpoints.yaml
# check the etcd Service
kubectl describe svc -n kube-system etcd-k8s
After a short wait, the etcd monitoring targets appear on the Prometheus targets page:
Note: If you see ip:2379 connection refused, first check with telnet that the port is reachable, then check whether etcd is configured to listen on 0.0.0.0:2379.
Once data is being collected, you can import a dashboard into Grafana. Here we can import https://grafana.com/grafana/dashboards/3070, or the Chinese-language etcd cluster dashboard at https://grafana.com/grafana/dashboards/9733. The import procedure is not described here; please look it up if you are unfamiliar with it:
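If telnet is not installed, the same reachability check can be done with a few lines of Python. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, and it only tests that a TCP connection can be opened, not that TLS or etcd itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Addresses taken from the Endpoints object above; adjust to your own etcd nodes.
    for ip in ("172.16.200.11", "172.16.200.12", "172.16.200.13"):
        print(ip, "reachable" if can_connect(ip, 2379) else "NOT reachable")
```

Run it from a node that hosts a Prometheus Pod: a port that is reachable from the etcd host itself but not from other nodes usually points at etcd listening only on 127.0.0.1.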
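4. Kube-Prometheus data persistence
After changing the Prometheus configuration earlier, the Prometheus Pods were restarted, and if you look closely the previously collected data is gone. This is because the Prometheus instance created through the Prometheus CRD has no data persistence configured. Inspecting the volumes of the generated Prometheus Pod makes this clear: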
kubectl get pod prometheus-k8s-0 -n monitoring -o yaml
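From the output we can see that the Prometheus data directory /prometheus is mounted from an emptyDir. Data in an emptyDir shares the Pod's lifecycle, so when the Pod dies the data is lost, which is why the earlier data disappeared after the Pod was recreated. Production monitoring data certainly needs to be persisted, and the Prometheus CRD provides a way to configure this. Since Prometheus is ultimately deployed by a StatefulSet controller, we persist data through a StorageClass. First create a StorageClass object (prometheus-storageclass.yaml):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-data-db
provisioner: fuseim.pri/ifs
Here we declare a StorageClass object with provisioner=fuseim.pri/ifs, because our cluster uses NFS as the storage backend and the nfs-client-provisioner we created earlier sets PROVISIONER_NAME to fuseim.pri/ifs. The two names must match, so this value cannot be changed arbitrarily. Then create the resource:
kubectl apply -f prometheus-storageclass.yaml
kubectl get storageclass
Then add the following configuration under spec in the Prometheus custom resource: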
storage:
  volumeClaimTemplate:
    spec:
      storageClassName: prometheus-data-db
      resources:
        requests:
          storage: 100Gi
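Note that storageClassName is the name of the StorageClass object created above. Then update the Prometheus custom resource. After the update, two PVC and PV resource objects are generated automatically:
# apply the updated resource file
kubectl apply -f prometheus-prometheus.yaml
# list the PVCs
kubectl get pvc -n monitoring
# list the PVs
kubectl get pv
The data directory of the Prometheus Pod is now backed by a PVC, so the data survives even if the Pod dies.
5. Kube-Prometheus data retention period
We have covered persistence for Prometheus Operator, but there is another issue many people overlook: how long the data is retained. According to the official documentation, Prometheus Operator keeps data for 1d by default. In that case persistence alone does not help: only one day of data is kept, so you cannot view anything older.
The retention period is changed through the retention parameter, which is set under prometheus.spec. Here we modify prometheus-prometheus.yaml directly and add that parameter, for example retention: 365d, the value used in the full manifest in section 7:
Note: If Kube-Prometheus is already installed, simply modify prometheus-prometheus.yaml and apply it with kubectl apply -f; afterwards, remember to check that the Pods are running normally.
Next, check in Grafana or the Prometheus UI (I waited 2 days after the change and then checked that the data was intact).
Before the change
After the change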
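A longer retention period also needs more disk, so it is worth checking that the volume requested earlier is large enough. The Python sketch below applies the rule of thumb from the Prometheus storage documentation (needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample). The ingestion rates used are hypothetical; the real figure has to be read from your own Prometheus instance.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GIB = 1024 ** 3


def estimate_disk_bytes(retention_days: int,
                        samples_per_second: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb disk usage: retention seconds * ingestion rate * bytes per sample."""
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


def fits(retention_days: int,
         samples_per_second: float,
         volume_gib: float,
         bytes_per_sample: float = 2.0) -> bool:
    """Check whether the estimated usage fits into a volume of the given size in GiB."""
    needed = estimate_disk_bytes(retention_days, samples_per_second, bytes_per_sample)
    return needed <= volume_gib * BYTES_PER_GIB


if __name__ == "__main__":
    # 365d retention on the 100Gi volume, at two hypothetical ingestion rates.
    for rate in (1_000, 10_000):
        gib = estimate_disk_bytes(365, rate) / BYTES_PER_GIB
        print(f"{rate} samples/s -> {gib:.1f} GiB, fits in 100Gi: {fits(365, rate, 100)}")
```

Under these assumptions a small cluster fits comfortably, while a tenfold higher ingestion rate would exceed the 100Gi claim, so retention and storage size should be chosen together.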
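6. Grafana data persistence
We have covered data persistence for Prometheus but not for Grafana. Without persistence, the dashboards, accounts, passwords, and other settings configured in Grafana are lost when the service restarts, so persisting Grafana's data is just as necessary.
By default the data lives in an emptyDir inside the Pod and shares the Pod's lifecycle; when something goes wrong and the Pod is recreated, everything configured in Grafana disappears.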
volumeMounts:
- mountPath: /var/lib/grafana
  name: grafana-storage
  readOnly: false
...
volumes:
- emptyDir: {}
  name: grafana-storage
As shown above, Grafana stores data such as dashboards and plugins under /var/lib/grafana. To persist it, we need to declare a volume mount for this directory.
We change emptyDir to a PVC:
volumes:
- name: grafana-storage
  persistentVolumeClaim:
    claimName: grafana
To persist data with a PVC, we need an available PV for the PVC to bind to. grafana-volume.yaml contains:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: 172.16.200.10
    path: /mnt/lv/k8s
Then create the PV and PVC above and update grafana-deployment.yaml:
# kubectl apply -f grafana-volume.yaml
After creation, check the Pod status. The Pod stays in CrashLoopBackOff and never starts. Looking at the Pod's logs, the error is:
mkdir: cannot create directory '/var/lib/grafana/plugins': Permission denied
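This error only appears from Grafana 5.1 onward. The cause is clear: insufficient permissions on the /var/lib/grafana directory. The Grafana Deployment manifest contains the following property:
securityContext:
  runAsNonRoot: true
  runAsUser: 65534
Let's see which user 65534 is: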
cat /etc/passwd | grep 65534
So we only need to change the owner of the /mnt/lv/k8s directory to nfsnobody. Changing the permissions to 777 also works:
chown nfsnobody /mnt/lv/k8s
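Delete the failing Pod and the new Grafana Pod starts successfully. You can then add dashboards, and the data is no longer lost when the Pod is recreated.
7. Kube-Prometheus service discovery
Earlier we defined a custom monitoring item in Kube-Prometheus and used custom alerting rules. Can we still use the automatic discovery feature covered previously? If the cluster has many Services and Pods, do we have to create a ServiceMonitor for each one? That would quickly become tedious again.
To solve this, Kube-Prometheus supports additional scrape configuration, which we can use for service discovery and automatic monitoring. As with the custom approach earlier, we can have Kube-Prometheus automatically discover and monitor Services that carry the annotation prometheus.io/scrape=true. The Prometheus configuration we defined previously is: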
- job_name: 'kubernetes-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
To have a Service discovered automatically, add the prometheus.io/scrape=true annotation to the Service. Save the configuration above as prometheus-additional.yaml, then create a Secret from this file:
kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring
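The least obvious rule in the scrape configuration above is the one that rewrites __address__ using the prometheus.io/port annotation. The Python sketch below imitates it under two documented Prometheus behaviours: the source label values are joined with ";" before matching, and the regex must match the whole joined string. `rewrite_address` is a hypothetical helper for illustration, not part of Prometheus.

```python
import re

# Same pattern as the relabel rule: host, optional existing port, ";", annotated port.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address: str, port_annotation: str) -> str:
    """Return the scrape address after applying the rule; unchanged if the regex does not match."""
    joined = f"{address};{port_annotation}"  # source_labels are joined with ';'
    match = ADDRESS_RULE.fullmatch(joined)   # Prometheus anchors the regex at both ends
    if match is None:
        return address
    return f"{match.group(1)}:{match.group(2)}"  # replacement: $1:$2


print(rewrite_address("10.244.1.5:8080", "9100"))  # port replaced by the annotation
print(rewrite_address("10.244.1.5", "9100"))       # port added from the annotation
print(rewrite_address("10.244.1.5:8080", ""))      # no annotation: address left as is
```

So a Service annotated with prometheus.io/port is scraped on that port regardless of the port discovered from its endpoints, and a Service without the annotation keeps its discovered address.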
Then reference this additional configuration through the additionalScrapeConfigs property in the Prometheus resource file (prometheus-prometheus.yaml):
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  retention: 365d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: prometheus-data-db
        resources:
          requests:
            storage: 100Gi
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 2
  secrets:
  - etcd-ssl
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.11.0
  # additional scrape configuration
  additionalScrapeConfigs:
    name: additional-configs
    key: prometheus-additional.yaml
After adding it, update the Prometheus custom resource:
kubectl apply -f prometheus-prometheus.yaml
After a short while, the Prometheus dashboard shows that the configuration has taken effect:
But switching to the targets page, the corresponding job is not there. Check the Prometheus Pod logs:
kubectl logs -f prometheus-k8s-0 prometheus -n monitoring | grep cluster
There are many error logs of the form xxx is forbidden, which indicates an RBAC permission problem. From the Prometheus resource configuration we know that Prometheus uses a ServiceAccount named prometheus-k8s, which is bound to a ClusterRole also named prometheus-k8s (prometheus-clusterRole.yaml):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
These rules clearly grant no list permission on Services or Pods, hence the errors. To fix this, add the required permissions:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
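The difference between the two ClusterRole versions can be checked with a simplified model of RBAC rule evaluation. The Python sketch below is an illustration only: `is_allowed` is a hypothetical helper, it ignores wildcards ("*"), resourceNames, and nonResourceURLs, and the real decision is made by the Kubernetes API server.

```python
from typing import Dict, List

Rule = Dict[str, List[str]]


def is_allowed(rules: List[Rule], verb: str, resource: str, api_group: str = "") -> bool:
    """Return True if any single rule grants the verb on the resource in the API group."""
    for rule in rules:
        if (
            api_group in rule.get("apiGroups", [])
            and resource in rule.get("resources", [])
            and verb in rule.get("verbs", [])
        ):
            return True
    return False


# Resource rules of the original ClusterRole.
ORIGINAL: List[Rule] = [
    {"apiGroups": [""], "resources": ["nodes/metrics"], "verbs": ["get"]},
]

# Resource rules of the updated ClusterRole.
UPDATED: List[Rule] = [
    {
        "apiGroups": [""],
        "resources": ["nodes", "services", "endpoints", "pods", "nodes/proxy"],
        "verbs": ["get", "list", "watch"],
    },
    {"apiGroups": [""], "resources": ["configmaps", "nodes/metrics"], "verbs": ["get"]},
]

for resource in ("services", "endpoints", "pods"):
    print(resource, is_allowed(ORIGINAL, "list", resource), is_allowed(UPDATED, "list", resource))
```

Endpoints discovery needs list and watch on services, endpoints, and pods, which is exactly what the original role lacks and the updated role grants.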
Update the ClusterRole, then recreate all Prometheus Pods. The kubernetes-endpoints job should then appear on the targets page:
As shown above, these targets are scraped because their Services carry the prometheus.io/scrape=true annotation.
{ "__inputs": [ { "name": "DS_PROMETHEUS", "label": "Prometheus", "description": "", "type": "datasource", "pluginId": "prometheus", "pluginName": "Prometheus" } ], "__requires": [ { "type": "panel", "id": "bargauge", "name": "Bar Gauge", "version": "" }, { "type": "grafana", "id": "grafana", "name": "Grafana", "version": "6.7.3" }, { "type": "panel", "id": "graph", "name": "Graph", "version": "" }, { "type": "datasource", "id": "prometheus", "name": "Prometheus", "version": "1.0.0" }, { "type": "panel", "id": "singlestat", "name": "Singlestat", "version": "" }, { "type": "panel", "id": "table", "name": "Table", "version": "" } ], "annotations": { "list": [ { "$$hashKey": "object:5598", "builtIn": 1, "datasource": "-- Grafana --", "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] }, "description": "【中文版本】2020.06.28更新,增加整体资源展示!支持 Grafana6&7,Node Exporter v0.16及以上的版本,优化重要指标展示。包含整体资源展示与资源明细图表:CPU 内存 磁盘 IO 网络等监控指标。https://github.com/starsliao/Prometheus", "editable": true, "gnetId": 8919, "graphTooltip": 0, "id": null, "iteration": 1593281500982, "links": [ { "icon": "external link", "tags": [], "targetBlank": true, "title": "更新node_exporter", "tooltip": "", "type": "link", "url": "https://github.com/prometheus/node_exporter/releases" }, { "icon": "external link", "tags": [], "targetBlank": true, "title": "更新当前仪表板", "tooltip": "", "type": "link", "url": "https://grafana.com/dashboards/8919" }, { "icon": "external link", "tags": [], "targetBlank": true, "title": "StarsL.cn", "tooltip": "", "type": "link", "url": "https://starsl.cn" }, { "asDropdown": true, "icon": "external link", "tags": [], "targetBlank": true, "title": "", "type": "dashboards" } ], "panels": [ { "collapsed": false, "datasource": "${DS_PROMETHEUS}", "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, "id": 187, "panels": [], "title": "资源总览(关联JOB项)当前选中主机:【$show_hostname】实例:$node", "type": "row" }, { "columns": [], "datasource": 
"${DS_PROMETHEUS}", "description": "分区使用率、磁盘读取、磁盘写入、下载带宽、上传带宽,如果有多个网卡或者多个分区,是采集的使用率最高的网卡或者分区的数值。", "fontSize": "100%", "gridPos": { "h": 12, "w": 24, "x": 0, "y": 1 }, "id": 185, "pageSize": 10, "showHeader": true, "sort": { "col": 5, "desc": false }, "styles": [ { "$$hashKey": "object:5955", "alias": "主机名", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 1, "link": false, "linkTooltip": "", "linkUrl": "", "mappingType": 1, "pattern": "nodename", "thresholds": [], "type": "string", "unit": "bytes" }, { "$$hashKey": "object:5956", "alias": "IP(链接到明细)", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "link": true, "linkTargetBlank": false, "linkTooltip": "浏览主机明细", "linkUrl": "/d/9CWBz0bik/node-exporter?orgId=1&var-job=${job}&var-hostname=All&var-node=${__cell}&var-device=All", "mappingType": 1, "pattern": "instance", "thresholds": [], "type": "number", "unit": "short" }, { "$$hashKey": "object:5957", "alias": "内存", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "link": false, "mappingType": 1, "pattern": "Value #B", "thresholds": [], "type": "number", "unit": "bytes" }, { "$$hashKey": "object:5958", "alias": "CPU核", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": null, "mappingType": 1, "pattern": "Value #C", "thresholds": [], "type": "number", "unit": "short" }, { "$$hashKey": "object:5959", "alias": " 运行时间", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], 
"dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #D", "thresholds": [], "type": "number", "unit": "s" }, { "$$hashKey": "object:5960", "alias": "分区使用率*", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #E", "thresholds": [ "70", "85" ], "type": "number", "unit": "percent" }, { "$$hashKey": "object:5961", "alias": "CPU使用率", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #F", "thresholds": [ "70", "85" ], "type": "number", "unit": "percent" }, { "$$hashKey": "object:5962", "alias": "内存使用率", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #G", "thresholds": [ "70", "85" ], "type": "number", "unit": "percent" }, { "$$hashKey": "object:5963", "alias": "磁盘读取*", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #H", "thresholds": [ "10485760", "20485760" ], "type": "number", "unit": "Bps" }, { "$$hashKey": "object:5964", "alias": "磁盘写入*", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #I", "thresholds": [ "10485760", "20485760" ], "type": "number", "unit": "Bps" }, { "$$hashKey": "object:5965", "alias": "下载带宽*", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 
0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #J", "thresholds": [ "30485760", "104857600" ], "type": "number", "unit": "bps" }, { "$$hashKey": "object:5966", "alias": "上传带宽*", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #K", "thresholds": [ "30485760", "104857600" ], "type": "number", "unit": "bps" }, { "$$hashKey": "object:5967", "alias": "5m负载", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "Value #L", "thresholds": [], "type": "number", "unit": "short" }, { "$$hashKey": "object:5968", "alias": "", "align": "right", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "decimals": 2, "pattern": "/.*/", "thresholds": [], "type": "hidden", "unit": "short" } ], "targets": [ { "expr": "node_uname_info{job=~\"$job\"} - 0", "format": "table", "instant": true, "interval": "", "legendFormat": "主机名", "refId": "A" }, { "expr": "sum(time() - node_boot_time_seconds{job=~\"$job\"})by(instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "运行时间", "refId": "D" }, { "expr": "node_memory_MemTotal_bytes{job=~\"$job\"} - 0", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "总内存", "refId": "B" }, { "expr": "count(node_cpu_seconds_total{job=~\"$job\",mode='system'}) by (instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "总核数", "refId": "C" }, { "expr": "node_load5{job=~\"$job\"}", "format": "table", "instant": true, "interval": "", "legendFormat": "5分钟负载", "refId": 
"L" }, { "expr": "(1 - avg(irate(node_cpu_seconds_total{job=~\"$job\",mode=\"idle\"}[5m])) by (instance)) * 100", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "CPU使用率", "refId": "F" }, { "expr": "(1 - (node_memory_MemAvailable_bytes{job=~\"$job\"} / (node_memory_MemTotal_bytes{job=~\"$job\"})))* 100", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "内存使用率", "refId": "G" }, { "expr": "max((node_filesystem_size_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}-node_filesystem_free_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}) *100/(node_filesystem_avail_bytes {job=~\"$job\",fstype=~\"ext.?|xfs\"}+(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}-node_filesystem_free_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"})))by(instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "分区使用率", "refId": "E" }, { "expr": "max(irate(node_disk_read_bytes_total{job=~\"$job\"}[5m])) by (instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "最大读取", "refId": "H" }, { "expr": "max(irate(node_disk_written_bytes_total{job=~\"$job\"}[5m])) by (instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "最大写入", "refId": "I" }, { "expr": "max(irate(node_network_receive_bytes_total{job=~\"$job\"}[5m])*8) by (instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "下载带宽", "refId": "J" }, { "expr": "max(irate(node_network_transmit_bytes_total{job=~\"$job\"}[5m])*8) by (instance)", "format": "table", "hide": false, "instant": true, "interval": "", "legendFormat": "上传带宽", "refId": "K" } ], "timeFrom": null, "timeShift": null, "title": "服务器资源总览表(每页10行)", "transform": "table", "type": "table" }, { "aliasColors": { "192.168.200.241:9100_Total": "dark-red", "Idle - Waiting for something to happen": "#052B51", "guest": "#9AC48A", "idle": "#052B51", "iowait": "#EAB839", "irq": 
"#BF1B00", "nice": "#C15C17", "sdb_每秒I/O操作%": "#d683ce", "softirq": "#E24D42", "steal": "#FCE2DE", "system": "#508642", "user": "#5195CE", "磁盘花费在I/O操作占比": "#ba43a9" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": null, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 0, "y": 13 }, "hiddenSeries": false, "id": 191, "legend": { "alignAsTable": false, "avg": false, "current": true, "hideEmpty": true, "hideZero": true, "max": false, "min": false, "rightSide": false, "show": true, "sideWidth": null, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "maxPerRow": 6, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "repeat": null, "seriesOverrides": [ { "alias": "总平均使用率", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 }, { "alias": "总核数", "color": "#C4162A" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "count(node_cpu_seconds_total{job=~\"$job\", mode='system'})", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "总核数", "refId": "B", "step": 240 }, { "expr": "sum(node_load5{job=~\"$job\"})", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "总5分钟负载", "refId": "A", "step": 240 }, { "expr": "avg(1 - avg(irate(node_cpu_seconds_total{job=~\"$job\",mode=\"idle\"}[5m])) by (instance)) * 100", "format": "time_series", "hide": false, "interval": "30m", "intervalFactor": 1, "legendFormat": "总平均使用率", "refId": "F", "step": 240 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "$job:整体总负载与整体平均CPU使用率", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, 
"mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "short", "label": "总负载", "logBase": 1, "max": null, "min": null, "show": true }, { "decimals": 0, "format": "percent", "label": "平均使用率", "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "192.168.200.241:9100_总内存": "dark-red", "内存_Avaliable": "#6ED0E0", "内存_Cached": "#EF843C", "内存_Free": "#629E51", "内存_Total": "#6d1f62", "内存_Used": "#eab839", "可用": "#9ac48a", "总内存": "#bf1b00" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 1, "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 8, "y": 13 }, "height": "300", "hiddenSeries": false, "id": 195, "legend": { "alignAsTable": false, "avg": false, "current": true, "max": false, "min": false, "rightSide": false, "show": true, "sort": "current", "sortDesc": false, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "总内存", "color": "#C4162A", "fill": 0 }, { "alias": "总平均使用率", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "sum(node_memory_MemTotal_bytes{job=~\"$job\"})", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "总内存", "refId": "A", "step": 4 }, { "expr": "sum(node_memory_MemTotal_bytes{job=~\"$job\"} - node_memory_MemAvailable_bytes{job=~\"$job\"})", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "总已用", "refId": "B", "step": 4 }, { "expr": "(sum(node_memory_MemTotal_bytes{job=~\"$job\"} - 
node_memory_MemAvailable_bytes{job=~\"$job\"}) / sum(node_memory_MemTotal_bytes{job=~\"$job\"}))*100", "format": "time_series", "hide": false, "interval": "30m", "intervalFactor": 1, "legendFormat": "总平均使用率", "refId": "H" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "$job:整体总内存与整体平均内存使用率", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "bytes", "label": "总内存量", "logBase": 1, "max": null, "min": "0", "show": true }, { "decimals": null, "format": "percent", "label": "平均使用率", "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 1, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 16, "y": 13 }, "hiddenSeries": false, "id": 197, "legend": { "alignAsTable": false, "avg": false, "current": true, "hideEmpty": false, "hideZero": false, "max": false, "min": false, "rightSide": false, "show": true, "sideWidth": null, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "总平均使用率", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 }, { "alias": "总磁盘量", "color": "#C4162A" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "sum(avg(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance))", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "总磁盘量", "refId": "E" }, { 
"expr": "sum(avg(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance))", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "总使用量", "refId": "C" }, { "expr": "(sum(avg(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance))) *100/(sum(avg(node_filesystem_avail_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance))+(sum(avg(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~\"$job\",fstype=~\"xfs|ext.*\"})by(device,instance))))", "format": "time_series", "instant": false, "interval": "30m", "intervalFactor": 1, "legendFormat": "总平均使用率", "refId": "A" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "$job:整体总磁盘与整体平均磁盘使用率", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": 1, "format": "bytes", "label": "总磁盘量", "logBase": 1, "max": null, "min": "0", "show": true }, { "decimals": null, "format": "percent", "label": "平均使用率", "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "collapsed": false, "datasource": "${DS_PROMETHEUS}", "gridPos": { "h": 1, "w": 24, "x": 0, "y": 21 }, "id": 189, "panels": [], "title": "资源明细:【$show_hostname】", "type": "row" }, { "cacheTimeout": null, "colorBackground": false, "colorPostfix": false, "colorPrefix": false, "colorValue": true, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "datasource": "${DS_PROMETHEUS}", "decimals": 0, "description": "", "fieldConfig": { "defaults": { 
"custom": {} }, "overrides": [] }, "format": "s", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "threshcisLabels": false, "threshcisMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 0, "y": 22 }, "hideTimeOverride": true, "id": 15, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "nullPointMode": "null", "nullText": null, "pluginVersion": "6.4.2", "postfix": "", "postfixFontSize": "50%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": false }, "tableColumn": "", "targets": [ { "expr": "avg(time() - node_boot_time_seconds{instance=~\"$node\"})", "format": "time_series", "hide": false, "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "", "refId": "A", "step": 40 } ], "threshciss": "1,2", "thresholds": "1,3", "title": "运行时间", "type": "singlestat", "valueFontSize": "70%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "current" }, { "datasource": "${DS_PROMETHEUS}", "fieldConfig": { "defaults": { "color": { "mode": "thresholds" }, "custom": {}, "decimals": 2, "displayName": "", "mappings": [ { "from": "", "id": 1, "operator": "", "text": "N/A", "to": "", "type": 1, "value": "0" } ], "max": 100, "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 70 }, { "color": "#EAB839", "value": 90 } ] }, "unit": "percent" }, "overrides": [] }, "gridPos": { "h": 6, "w": 3, "x": 2, "y": 22 }, "id": 177, "options": { "displayMode": "lcd", "fieldOptions": { "calcs": [ "last" ], "defaults": { "decimals": 1, "mappings": [ { "from": "", "id": 1, "operator": "", "text": "N/A", "to": "", "type": 1, "value": "0" } ], "max": 100, "min": 0.1, "thresholds": { "mode": 
"absolute", "steps": [ { "color": "green", "value": null }, { "color": "#EAB839", "value": 70 }, { "color": "red", "value": 90 } ] }, "unit": "percent" }, "overrides": [], "values": false }, "orientation": "horizontal", "reduceOptions": { "calcs": [ "mean" ], "values": false }, "showUnfilled": true }, "pluginVersion": "6.7.3", "targets": [ { "expr": "100 - (avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"idle\"}[5m])) * 100)", "instant": true, "interval": "", "legendFormat": "总CPU使用率", "refId": "A" }, { "expr": "avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"iowait\"}[5m])) * 100", "hide": true, "instant": true, "interval": "", "legendFormat": "IOwait使用率", "refId": "C" }, { "expr": "(1 - (node_memory_MemAvailable_bytes{instance=~\"$node\"} / (node_memory_MemTotal_bytes{instance=~\"$node\"})))* 100", "instant": true, "interval": "", "legendFormat": "内存使用率", "refId": "B" }, { "expr": "(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint=\"$maxmount\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint=\"$maxmount\"})*100 /(node_filesystem_avail_bytes {instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint=\"$maxmount\"}+(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint=\"$maxmount\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint=\"$maxmount\"}))", "hide": false, "instant": true, "interval": "", "legendFormat": "最大分区({{mountpoint}})使用率", "refId": "D" }, { "expr": "(1 - ((node_memory_SwapFree_bytes{instance=~\"$node\"} + 1)/ (node_memory_SwapTotal_bytes{instance=~\"$node\"} + 1))) * 100", "instant": true, "legendFormat": "交换分区使用率", "refId": "F" } ], "timeFrom": null, "timeShift": null, "title": "", "type": "bargauge" }, { "columns": [], "datasource": "${DS_PROMETHEUS}", "description": "本看板中的:磁盘总量、使用量、可用量、使用率保持和df命令的Size、Used、Avail、Use% 列的值一致,并且Use%的值会四舍五入保留一位小数,会更加准确。\n\n注:df中Use%算法为:(size - free) * 100 / (avail + (size - 
free)),结果是整除则为该值,非整除则为该值+1,结果的单位是%。\n参考df命令源码:", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fontSize": "100%", "gridPos": { "h": 6, "w": 10, "x": 5, "y": 22 }, "id": 181, "links": [ { "targetBlank": true, "title": "https://github.com/coreutils/coreutils/blob/master/src/df.c", "url": "https://github.com/coreutils/coreutils/blob/master/src/df.c" } ], "pageSize": null, "scroll": true, "showHeader": true, "sort": { "col": 6, "desc": false }, "styles": [ { "$$hashKey": "object:818", "alias": "分区", "align": "auto", "colorMode": null, "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "mappingType": 1, "pattern": "mountpoint", "thresholds": [ "" ], "type": "string", "unit": "bytes" }, { "$$hashKey": "object:819", "alias": "可用空间", "align": "auto", "colorMode": "value", "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 1, "mappingType": 1, "pattern": "Value #A", "thresholds": [ "10000000000", "20000000000" ], "type": "number", "unit": "bytes" }, { "$$hashKey": "object:820", "alias": "使用率", "align": "auto", "colorMode": "cell", "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "rgba(245, 54, 54, 0.9)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 1, "mappingType": 1, "pattern": "Value #B", "thresholds": [ "70", "85" ], "type": "number", "unit": "percent" }, { "$$hashKey": "object:821", "alias": "总空间", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 0, "link": false, "mappingType": 1, "pattern": "Value #C", "thresholds": [], "type": "number", "unit": "bytes" }, { "$$hashKey": "object:822", "alias": "文件系统", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", 
"rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "link": false, "mappingType": 1, "pattern": "fstype", "thresholds": [], "type": "string", "unit": "short" }, { "$$hashKey": "object:823", "alias": "设备名", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "dateFormat": "YYYY-MM-DD HH:mm:ss", "decimals": 2, "link": false, "mappingType": 1, "pattern": "device", "preserveFormat": false, "sanitize": false, "thresholds": [], "type": "string", "unit": "short" }, { "$$hashKey": "object:824", "alias": "", "align": "auto", "colorMode": null, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "decimals": 2, "pattern": "/.*/", "preserveFormat": true, "sanitize": false, "thresholds": [], "type": "hidden", "unit": "short" } ], "targets": [ { "expr": "node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-0", "format": "table", "hide": false, "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "总量", "refId": "C" }, { "expr": "node_filesystem_avail_bytes {instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-0", "format": "table", "hide": false, "instant": true, "interval": "10s", "intervalFactor": 1, "legendFormat": "", "refId": "A" }, { "expr": "(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}) *100/(node_filesystem_avail_bytes {instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}+(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}))", "format": "table", "hide": false, "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "", "refId": "B" } ], "title": 
"【$show_hostname】:各分区可用空间(EXT.*/XFS)", "transform": "table", "type": "table" }, { "cacheTimeout": null, "colorBackground": false, "colorValue": true, "colors": [ "rgba(50, 172, 45, 0.97)", "rgba(237, 129, 40, 0.89)", "#d44a3a" ], "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "format": "percent", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "thresholdLabels": false, "thresholdMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 15, "y": 22 }, "id": 20, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "nullPointMode": "connected", "nullText": null, "pluginVersion": "6.4.2", "postfix": "", "postfixFontSize": "50%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": true, "lineColor": "#3274D9", "show": true, "ymax": null, "ymin": null }, "tableColumn": "", "targets": [ { "expr": "avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"iowait\"}[5m])) * 100", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "", "refId": "A", "step": 20 } ], "thresholds": "20,50", "timeFrom": null, "timeShift": null, "title": "CPU iowait", "type": "singlestat", "valueFontSize": "80%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "avg" }, { "aliasColors": { "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_cni0_in": "light-red", "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_cni0_in下载": "green", "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_cni0_out上传": "yellow", "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_eth0_in下载": "purple", "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_eth0_out": "purple", "cn-shenzhen.i-wz9cq1dcb6zwc39ehw59_eth0_out上传": "blue" }, "bars": true, "dashLength": 10, "dashes": false, 
"datasource": "${DS_PROMETHEUS}", "editable": true, "error": false, "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 0, "grid": {}, "gridPos": { "h": 6, "w": 7, "x": 17, "y": 22 }, "hiddenSeries": false, "id": 183, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": false, "show": false, "sort": "current", "sortDesc": true, "total": true, "values": true }, "lines": false, "linewidth": 2, "links": [], "nullPointMode": "null as zero", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 1, "points": false, "renderer": "flot", "repeat": null, "seriesOverrides": [ { "alias": "/.*_out上传$/", "transform": "negative-Y" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "increase(node_network_receive_bytes_total{instance=~\"$node\",device=~\"$device\"}[60m])", "interval": "60m", "intervalFactor": 1, "legendFormat": "{{device}}_in下载", "metric": "", "refId": "A", "step": 600, "target": "" }, { "expr": "increase(node_network_transmit_bytes_total{instance=~\"$node\",device=~\"$device\"}[60m])", "hide": false, "interval": "60m", "intervalFactor": 1, "legendFormat": "{{device}}_out上传", "refId": "B", "step": 600 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "每小时流量$device", "tooltip": { "msResolution": false, "shared": true, "sort": 0, "value_type": "cumulative" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "bytes", "label": "上传(-)/下载(+)", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "cacheTimeout": null, "colorBackground": false, "colorPostfix": false, "colorValue": true, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 
0.97)" ], "datasource": "${DS_PROMETHEUS}", "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "format": "short", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "thresholdLabels": false, "thresholdMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 0, "y": 24 }, "id": 14, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "maxPerRow": 6, "nullPointMode": "null", "nullText": null, "postfix": "", "postfixFontSize": "50%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": false }, "tableColumn": "", "targets": [ { "expr": "count(node_cpu_seconds_total{instance=~\"$node\", mode='system'})", "format": "time_series", "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "", "refId": "A", "step": 20 } ], "thresholds": "1,2", "title": "CPU 核数", "type": "singlestat", "valueFontSize": "80%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "current" }, { "cacheTimeout": null, "colorBackground": false, "colorPostfix": false, "colorValue": true, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "datasource": "${DS_PROMETHEUS}", "decimals": null, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "format": "short", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "thresholdLabels": false, "thresholdMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 15, "y": 24 }, "id": 179, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "maxPerRow": 6, "nullPointMode": "null", "nullText": null, "postfix": "", 
"postfixFontSize": "50%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": false }, "tableColumn": "", "targets": [ { "expr": "avg(node_filesystem_files_free{instance=~\"$node\",mountpoint=\"$maxmount\",fstype=~\"ext.?|xfs\"})", "format": "time_series", "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "", "refId": "A", "step": 20 } ], "thresholds": "100000,1000000", "title": "剩余节点数:$maxmount ", "type": "singlestat", "valueFontSize": "70%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "current" }, { "cacheTimeout": null, "colorBackground": false, "colorValue": true, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "datasource": "${DS_PROMETHEUS}", "decimals": 0, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "format": "bytes", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "thresholdLabels": false, "thresholdMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 0, "y": 26 }, "id": 75, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "maxPerRow": 6, "nullPointMode": "null", "nullText": null, "postfix": "", "postfixFontSize": "70%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": false }, "tableColumn": "", "targets": [ { "expr": "sum(node_memory_MemTotal_bytes{instance=~\"$node\"})", "format": "time_series", "instant": true, "interval": "", "intervalFactor": 1, "legendFormat": "{{instance}}", "refId": "A", "step": 20 } ], "thresholds": "2,3", "title": "总内存", "type": 
"singlestat", "valueFontSize": "80%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "current" }, { "cacheTimeout": null, "colorBackground": false, "colorPostfix": false, "colorValue": true, "colors": [ "rgba(245, 54, 54, 0.9)", "rgba(237, 129, 40, 0.89)", "rgba(50, 172, 45, 0.97)" ], "datasource": "${DS_PROMETHEUS}", "decimals": null, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "format": "locale", "gauge": { "maxValue": 100, "minValue": 0, "show": false, "thresholdLabels": false, "thresholdMarkers": true }, "gridPos": { "h": 2, "w": 2, "x": 15, "y": 26 }, "id": 178, "interval": null, "links": [], "mappingType": 1, "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ], "maxDataPoints": 100, "maxPerRow": 6, "nullPointMode": "null", "nullText": null, "postfix": "", "postfixFontSize": "50%", "prefix": "", "prefixFontSize": "50%", "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ], "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": false }, "tableColumn": "", "targets": [ { "expr": "avg(node_filefd_maximum{instance=~\"$node\"})", "format": "time_series", "instant": true, "intervalFactor": 1, "legendFormat": "", "refId": "A", "step": 20 } ], "thresholds": "1024,10000", "title": "总文件描述符", "type": "singlestat", "valueFontSize": "70%", "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ], "valueName": "current" }, { "aliasColors": { "192.168.200.241:9100_Total": "dark-red", "Idle - Waiting for something to happen": "#052B51", "guest": "#9AC48A", "idle": "#052B51", "iowait": "#EAB839", "irq": "#BF1B00", "nice": "#C15C17", "sdb_每秒I/O操作%": "#d683ce", "softirq": "#E24D42", "steal": "#FCE2DE", "system": "#508642", "user": "#5195CE", "磁盘花费在I/O操作占比": "#ba43a9" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "", 
"fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 0, "y": 28 }, "hiddenSeries": false, "id": 7, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "rightSide": false, "show": true, "sideWidth": null, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "maxPerRow": 6, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "repeat": null, "seriesOverrides": [ { "$$hashKey": "object:263", "alias": "/.*总使用率/", "color": "#C4162A", "fill": 0 } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"system\"}[5m])) by (instance) *100", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "系统使用率", "refId": "A", "step": 20 }, { "expr": "avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"user\"}[5m])) by (instance) *100", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "用户使用率", "refId": "B", "step": 240 }, { "expr": "avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"iowait\"}[5m])) by (instance) *100", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "磁盘IO使用率", "refId": "D", "step": 240 }, { "expr": "(1 - avg(irate(node_cpu_seconds_total{instance=~\"$node\",mode=\"idle\"}[5m])) by (instance))*100", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "总使用率", "refId": "F", "step": 240 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "CPU使用率", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { 
"buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:278", "decimals": 0, "format": "percent", "label": "", "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:279", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "192.168.200.241:9100_总内存": "dark-red", "使用率": "yellow", "内存_Avaliable": "#6ED0E0", "内存_Cached": "#EF843C", "内存_Free": "#629E51", "内存_Total": "#6d1f62", "内存_Used": "#eab839", "可用": "#9ac48a", "总内存": "#bf1b00" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 8, "y": 28 }, "height": "300", "hiddenSeries": false, "id": 156, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "rightSide": false, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "总内存", "color": "#C4162A", "fill": 0 }, { "alias": "使用率", "color": "rgb(0, 209, 255)", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "node_memory_MemTotal_bytes{instance=~\"$node\"}", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "总内存", "refId": "A", "step": 4 }, { "expr": "node_memory_MemTotal_bytes{instance=~\"$node\"} - node_memory_MemAvailable_bytes{instance=~\"$node\"}", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": 
"已用", "refId": "B", "step": 4 }, { "expr": "node_memory_MemAvailable_bytes{instance=~\"$node\"}", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "可用", "refId": "F", "step": 4 }, { "expr": "node_memory_Buffers_bytes{instance=~\"$node\"}", "format": "time_series", "hide": true, "intervalFactor": 1, "legendFormat": "内存_Buffers", "refId": "D", "step": 4 }, { "expr": "node_memory_MemFree_bytes{instance=~\"$node\"}", "format": "time_series", "hide": true, "intervalFactor": 1, "legendFormat": "内存_Free", "refId": "C", "step": 4 }, { "expr": "node_memory_Cached_bytes{instance=~\"$node\"}", "format": "time_series", "hide": true, "intervalFactor": 1, "legendFormat": "内存_Cached", "refId": "E", "step": 4 }, { "expr": "node_memory_MemTotal_bytes{instance=~\"$node\"} - (node_memory_Cached_bytes{instance=~\"$node\"} + node_memory_Buffers_bytes{instance=~\"$node\"} + node_memory_MemFree_bytes{instance=~\"$node\"})", "format": "time_series", "hide": true, "intervalFactor": 1, "refId": "G" }, { "expr": "(1 - (node_memory_MemAvailable_bytes{instance=~\"$node\"} / (node_memory_MemTotal_bytes{instance=~\"$node\"})))* 100", "format": "time_series", "hide": false, "interval": "30m", "intervalFactor": 10, "legendFormat": "使用率", "refId": "H" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "内存信息", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "bytes", "label": null, "logBase": 1, "max": null, "min": "0", "show": true }, { "format": "percent", "label": "内存使用率", "logBase": 1, "max": "100", "min": "0", "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "192.168.10.227:9100_em1_in下载": "super-light-green", "192.168.10.227:9100_em1_out上传": "dark-blue" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": 
"${DS_PROMETHEUS}", "decimals": 2, "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 16, "y": 28 }, "height": "300", "hiddenSeries": false, "id": 157, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "rightSide": false, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/.*_out上传$/", "transform": "negative-Y" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "irate(node_network_receive_bytes_total{instance=~'$node',device=~\"$device\"}[5m])*8", "format": "time_series", "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_in下载", "refId": "A", "step": 4 }, { "expr": "irate(node_network_transmit_bytes_total{instance=~'$node',device=~\"$device\"}[5m])*8", "format": "time_series", "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_out上传", "refId": "B", "step": 4 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "每秒网络带宽使用$device", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "bps", "label": "上传(-)/下载(+)", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "15分钟": "#6ED0E0", "1分钟": "#BF1B00", "5分钟": "#CCA300" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "editable": true, "error": false, "fieldConfig": { 
"defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 1, "grid": {}, "gridPos": { "h": 8, "w": 8, "x": 0, "y": 36 }, "height": "300", "hiddenSeries": false, "id": 13, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "rightSide": false, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "maxPerRow": 6, "nullPointMode": "null as zero", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "repeat": null, "seriesOverrides": [ { "alias": "/.*总核数/", "color": "#C4162A" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "node_load1{instance=~\"$node\"}", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "1分钟负载", "metric": "", "refId": "A", "step": 20, "target": "" }, { "expr": "node_load5{instance=~\"$node\"}", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "5分钟负载", "refId": "B", "step": 20 }, { "expr": "node_load15{instance=~\"$node\"}", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "15分钟负载", "refId": "C", "step": 20 }, { "expr": " sum(count(node_cpu_seconds_total{instance=~\"$node\", mode='system'}) by (cpu,instance)) by(instance)", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "CPU总核数", "refId": "D", "step": 20 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "系统平均负载", "tooltip": { "msResolution": false, "shared": true, "sort": 2, "value_type": "cumulative" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "short", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", 
"logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "vda_write": "#6ED0E0" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "Read bytes 每个磁盘分区每秒读取的比特数\nWritten bytes 每个磁盘分区每秒写入的比特数", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 1, "gridPos": { "h": 8, "w": 8, "x": 8, "y": 36 }, "height": "300", "hiddenSeries": false, "id": 168, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/.*_读取$/", "transform": "negative-Y" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "irate(node_disk_read_bytes_total{instance=~\"$node\"}[5m])", "format": "time_series", "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_读取", "refId": "A", "step": 10 }, { "expr": "irate(node_disk_written_bytes_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_写入", "refId": "B", "step": 10 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "每秒磁盘读写容量", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "Bps", "label": "读取(-)/写入(+)", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": 
null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 1, "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 8, "w": 8, "x": 16, "y": 36 }, "hiddenSeries": false, "id": 174, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "rightSide": false, "show": true, "sideWidth": null, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/Inodes.*/", "yaxis": 2 } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}) *100/(node_filesystem_avail_bytes {instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}+(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}-node_filesystem_free_bytes{instance=~'$node',fstype=~\"ext.*|xfs\",mountpoint !~\".*pod.*\"}))", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{mountpoint}}", "refId": "A" }, { "expr": "node_filesystem_files_free{instance=~'$node',fstype=~\"ext.?|xfs\"} / node_filesystem_files{instance=~'$node',fstype=~\"ext.?|xfs\"}", "hide": true, "interval": "", "legendFormat": "Inodes:{{instance}}:{{mountpoint}}", "refId": "B" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "磁盘使用率", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, 
"show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "percent", "label": "", "logBase": 1, "max": "100", "min": "0", "show": true }, { "decimals": 2, "format": "percentunit", "label": null, "logBase": 1, "max": "1", "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "vda_write": "#6ED0E0" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "Reads completed: 每个磁盘分区每秒读完成次数\n\nWrites completed: 每个磁盘分区每秒写完成次数\n\nIO now 每个磁盘分区每秒正在处理的输入/输出请求数", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 9, "w": 8, "x": 0, "y": 44 }, "height": "300", "hiddenSeries": false, "id": 161, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/.*_读取$/", "transform": "negative-Y" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "irate(node_disk_reads_completed_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_读取", "refId": "A", "step": 10 }, { "expr": "irate(node_disk_writes_completed_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_写入", "refId": "B", "step": 10 }, { "expr": "node_disk_io_now{instance=~\"$node\"}", "format": "time_series", "hide": true, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}", "refId": "C" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": 
"磁盘读写速率(IOPS)", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "iops", "label": "读取(-)/写入(+)I/O ops/sec", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "Idle - Waiting for something to happen": "#052B51", "guest": "#9AC48A", "idle": "#052B51", "iowait": "#EAB839", "irq": "#BF1B00", "nice": "#C15C17", "sdb_每秒I/O操作%": "#d683ce", "softirq": "#E24D42", "steal": "#FCE2DE", "system": "#508642", "user": "#5195CE", "磁盘花费在I/O操作占比": "#ba43a9" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": null, "description": "每一秒钟的自然时间内,花费在I/O上的耗时。(wall-clock time)\n\nnode_disk_io_time_seconds_total:\n磁盘花费在输入/输出操作上的秒数。该值为累加值。(Milliseconds Spent Doing I/Os)\n\nirate(node_disk_io_time_seconds_total[1m]):\n计算每秒的速率:(last值-last前一个值)/时间戳差值,即:1秒钟内磁盘花费在I/O操作的时间占比。", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 8, "x": 8, "y": 44 }, "hiddenSeries": false, "id": 175, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": false, "rightSide": false, "show": true, "sideWidth": null, "sort": null, "sortDesc": null, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "maxPerRow": 6, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "irate(node_disk_io_time_seconds_total{instance=~\"$node\"}[5m])", "format": "time_series", "interval": "", "intervalFactor": 1, 
"legendFormat": "{{device}}_每秒I/O操作%", "refId": "C" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "每1秒内I/O操作耗时占比", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "decimals": null, "format": "percentunit", "label": "", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "vda": "#6ED0E0" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "Read time seconds 每个磁盘分区读操作花费的秒数\n\nWrite time seconds 每个磁盘分区写操作花费的秒数\n\nIO time seconds 每个磁盘分区输入/输出操作花费的秒数\n\nIO time weighted seconds每个磁盘分区输入/输出操作花费的加权秒数", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 1, "fillGradient": 1, "gridPos": { "h": 9, "w": 8, "x": 16, "y": 44 }, "height": "300", "hiddenSeries": false, "id": 160, "legend": { "alignAsTable": true, "avg": true, "current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": true, "show": true, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null as zero", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/,*_读取$/", "transform": "negative-Y" } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "irate(node_disk_read_time_seconds_total{instance=~\"$node\"}[5m]) / irate(node_disk_reads_completed_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_读取", "refId": "B" }, { "expr": 
"irate(node_disk_write_time_seconds_total{instance=~\"$node\"}[5m]) / irate(node_disk_writes_completed_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_写入", "refId": "C" }, { "expr": "irate(node_disk_io_time_seconds_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": true, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}", "refId": "A", "step": 10 }, { "expr": "irate(node_disk_io_time_weighted_seconds_total{instance=~\"$node\"}[5m])", "format": "time_series", "hide": true, "interval": "", "intervalFactor": 1, "legendFormat": "{{device}}_加权", "refId": "D" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "每次IO读写的耗时(参考:小于100ms)(beta)", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "s", "label": "读取(-)/写入(+)", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": false } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "192.168.200.241:9100_TCP_alloc": "semi-dark-blue", "TCP": "#6ED0E0", "TCP_alloc": "blue" }, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "decimals": 2, "description": "Sockets_used - 已使用的所有协议套接字总量\n\nCurrEstab - 当前状态为 ESTABLISHED 或 CLOSE-WAIT 的 TCP 连接数\n\nTCP_alloc - 已分配(已建立、已申请到sk_buff)的TCP套接字数量\n\nTCP_tw - 等待关闭的TCP连接数\n\nUDP_inuse - 正在使用的 UDP 套接字数量\n\nRetransSegs - TCP 重传报文数\n\nOutSegs - TCP 发送的报文数\n\nInSegs - TCP 接收的报文数", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 8, "w": 16, "x": 0, "y": 53 }, "height": "300", "hiddenSeries": false, "id": 158, "interval": "", "legend": { "alignAsTable": true, "avg": false, 
"current": true, "hideEmpty": true, "hideZero": true, "max": true, "min": false, "rightSide": true, "show": true, "sideWidth": null, "sort": "current", "sortDesc": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/.*Sockets_used/", "color": "#E02F44", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 } ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "expr": "node_netstat_Tcp_CurrEstab{instance=~'$node'}", "format": "time_series", "hide": false, "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "CurrEstab", "refId": "A", "step": 20 }, { "expr": "node_sockstat_TCP_tw{instance=~'$node'}", "format": "time_series", "interval": "", "intervalFactor": 1, "legendFormat": "TCP_tw", "refId": "D" }, { "expr": "node_sockstat_sockets_used{instance=~'$node'}", "hide": false, "interval": "30m", "intervalFactor": 1, "legendFormat": "Sockets_used", "refId": "B" }, { "expr": "node_sockstat_UDP_inuse{instance=~'$node'}", "interval": "", "legendFormat": "UDP_inuse", "refId": "C" }, { "expr": "node_sockstat_TCP_alloc{instance=~'$node'}", "interval": "", "legendFormat": "TCP_alloc", "refId": "E" }, { "expr": "irate(node_netstat_Tcp_PassiveOpens{instance=~'$node'}[5m])", "hide": true, "interval": "", "legendFormat": "{{instance}}_Tcp_PassiveOpens", "refId": "G" }, { "expr": "irate(node_netstat_Tcp_ActiveOpens{instance=~'$node'}[5m])", "hide": true, "interval": "", "legendFormat": "{{instance}}_Tcp_ActiveOpens", "refId": "F" }, { "expr": "irate(node_netstat_Tcp_InSegs{instance=~'$node'}[5m])", "interval": "", "legendFormat": "Tcp_InSegs", "refId": "H" }, { "expr": "irate(node_netstat_Tcp_OutSegs{instance=~'$node'}[5m])", "interval": "", "legendFormat": "Tcp_OutSegs", "refId": "I" }, { "expr": 
"irate(node_netstat_Tcp_RetransSegs{instance=~'$node'}[5m])", "hide": false, "interval": "", "legendFormat": "Tcp_RetransSegs", "refId": "J" }, { "expr": "irate(node_netstat_TcpExt_ListenDrops{instance=~'$node'}[5m])", "hide": true, "interval": "", "legendFormat": "", "refId": "K" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "网络Socket连接信息", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "transformations": [], "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": "已使用的所有协议套接字总量", "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": { "filefd_192.168.200.241:9100": "super-light-green", "switches_192.168.200.241:9100": "semi-dark-red", "使用的文件描述符_10.118.72.128:9100": "red", "每秒上下文切换次数_10.118.71.245:9100": "yellow", "每秒上下文切换次数_10.118.72.128:9100": "yellow" }, "bars": false, "cacheTimeout": null, "dashLength": 10, "dashes": false, "datasource": "${DS_PROMETHEUS}", "description": "", "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] }, "fill": 0, "fillGradient": 1, "gridPos": { "h": 8, "w": 8, "x": 16, "y": 53 }, "hiddenSeries": false, "hideTimeOverride": false, "id": 16, "legend": { "alignAsTable": false, "avg": false, "current": true, "max": false, "min": false, "rightSide": false, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 2, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pluginVersion": "6.4.2", "pointradius": 1, "points": false, "renderer": "flot", "seriesOverrides": [ { "alias": "/每秒上下文切换次数.*/", "color": "#FADE2A", "lines": false, "pointradius": 1, "points": true, "yaxis": 2 }, { "alias": "/使用的文件描述符.*/", "color": "#F2495C" } ], "spaceLength": 10, "stack": 
false, "steppedLine": false, "targets": [ { "expr": "node_filefd_allocated{instance=~\"$node\"}", "format": "time_series", "instant": false, "interval": "", "intervalFactor": 5, "legendFormat": "使用的文件描述符", "refId": "B" }, { "expr": "irate(node_context_switches_total{instance=~\"$node\"}[5m])", "interval": "", "intervalFactor": 5, "legendFormat": "每秒上下文切换次数", "refId": "A" }, { "expr": " (node_filefd_allocated{instance=~\"$node\"}/node_filefd_maximum{instance=~\"$node\"}) *100", "format": "time_series", "hide": true, "instant": false, "interval": "", "intervalFactor": 5, "legendFormat": "使用的文件描述符占比_{{instance}}", "refId": "C" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "打开的文件描述符(左 )/每秒上下文切换次数(右)", "tooltip": { "shared": true, "sort": 2, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "format": "short", "label": "使用的文件描述符", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": "context_switches", "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } } ], "refresh": "", "schemaVersion": 22, "style": "dark", "tags": [ "Prometheus", "node_exporter", "StarsL.cn" ], "templating": { "list": [ { "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "label_values(node_uname_info, job)", "hide": 0, "includeAll": false, "index": -1, "label": "JOB", "multi": false, "name": "job", "options": [], "query": "label_values(node_uname_info, job)", "refresh": 1, "regex": "", "skipUrlSync": false, "sort": 5, "tagValuesQuery": "", "tags": [], "tagsQuery": "", "type": "query", "useTags": false }, { "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "label_values(node_uname_info{job=~\"$job\"}, nodename)", "hide": 0, "includeAll": true, "index": -1, "label": "主机名", "multi": false, "name": "hostname", "options": 
[], "query": "label_values(node_uname_info{job=~\"$job\"}, nodename)", "refresh": 1, "regex": "", "skipUrlSync": false, "sort": 5, "tagValuesQuery": "", "tags": [], "tagsQuery": "", "type": "query", "useTags": false }, { "allFormat": "glob", "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "label_values(node_uname_info{job=~\"$job\",nodename=~\"$hostname\"},instance)", "hide": 0, "includeAll": false, "index": -1, "label": "Instance", "multi": true, "multiFormat": "regex values", "name": "node", "options": [], "query": "label_values(node_uname_info{job=~\"$job\",nodename=~\"$hostname\"},instance)", "refresh": 1, "regex": "", "skipUrlSync": false, "sort": 5, "tagValuesQuery": "", "tags": [], "tagsQuery": "", "type": "query", "useTags": false }, { "allFormat": "glob", "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "label_values(node_network_info{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*|cni.*'},device)", "hide": 0, "includeAll": true, "index": -1, "label": "网卡", "multi": true, "multiFormat": "regex values", "name": "device", "options": [], "query": "label_values(node_network_info{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*|cni.*'},device)", "refresh": 1, "regex": "", "skipUrlSync": false, "sort": 1, "tagValuesQuery": "", "tags": [], "tagsQuery": "", "type": "query", "useTags": false }, { "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "query_result(topk(1,sort_desc (max(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.?|xfs\",mountpoint!~\".*pods.*\"}) by (mountpoint))))", "hide": 2, "includeAll": false, "index": -1, "label": "最大挂载目录", "multi": false, "name": "maxmount", "options": [], "query": "query_result(topk(1,sort_desc (max(node_filesystem_size_bytes{instance=~'$node',fstype=~\"ext.?|xfs\",mountpoint!~\".*pods.*\"}) by (mountpoint))))", "refresh": 2, "regex": "/.*\\\"(.*)\\\".*/", "skipUrlSync": false, "sort": 5, "tagValuesQuery": "", "tags": 
[], "tagsQuery": "", "type": "query", "useTags": false }, { "allValue": null, "current": {}, "datasource": "${DS_PROMETHEUS}", "definition": "label_values(node_uname_info{job=~\"$job\",instance=~\"$node\"}, nodename)", "hide": 2, "includeAll": false, "index": -1, "label": "展示使用的主机名", "multi": false, "name": "show_hostname", "options": [], "query": "label_values(node_uname_info{job=~\"$job\",instance=~\"$node\"}, nodename)", "refresh": 1, "regex": "", "skipUrlSync": false, "sort": 5, "tagValuesQuery": "", "tags": [], "tagsQuery": "", "type": "query", "useTags": false } ] }, "time": { "from": "now-12h", "to": "now" }, "timepicker": { "hidden": false, "now": true, "refresh_intervals": [ "15s", "30s", "1m", "5m", "15m", "30m" ], "time_options": [ "5m", "15m", "1h", "6h", "12h", "24h", "2d", "7d", "30d" ] }, "timezone": "browser", "title": "1 Node Exporter for Prometheus Dashboard CN v20200628", "uid": "9CWBz0bik", "variables": { "list": [] }, "version": 91 }