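Integrating Prometheus with VictoriaMetrics

Reposted from the blog: https://blog.csdn.net/alex_yangchuansheng/article/details/107852927

4. Practice
With the plan settled, let's put it into practice.

Deploying VictoriaMetrics
First, deploy a single-instance VictoriaMetrics. The complete YAML is as follows: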

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: victoriametrics
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: victoriametrics
  name: victoriametrics
  namespace: kube-system
spec:
  serviceName: victoriametrics
  selector:
    matchLabels:
      app: victoriametrics
  replicas: 1
  template:
    metadata:
      labels:
        app: victoriametrics
    spec:
      nodeSelector:
        blog: "true"
      containers:    
      - args:
        - --storageDataPath=/storage
        - --httpListenAddr=:8428
        - --retentionPeriod=1
        image: victoriametrics/victoria-metrics
        imagePullPolicy: IfNotPresent
        name: victoriametrics
        ports:
        - containerPort: 8428
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /health
            port: 8428
          initialDelaySeconds: 30
          timeoutSeconds: 30
        livenessProbe:
          httpGet:
            path: /health
            port: 8428
          initialDelaySeconds: 120
          timeoutSeconds: 30
        resources:
          limits:
            cpu: 2000m
            memory: 2000Mi
          requests:
            cpu: 2000m
            memory: 2000Mi
        volumeMounts:
        - mountPath: /storage
          name: storage-volume
      restartPolicy: Always
      priorityClassName: system-cluster-critical
      volumes:
      - name: storage-volume
        persistentVolumeClaim:
          claimName: victoriametrics
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: victoriametrics
  name: victoriametrics
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 8428
    protocol: TCP
    targetPort: 8428
  selector:
    app: victoriametrics
  type: ClusterIP

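A few startup flags deserve attention:

storageDataPath : Path to the data directory. VictoriaMetrics stores all of its data in this directory.

retentionPeriod : How long data is retained, in months. Older data is deleted automatically. The default is 1 month.

httpListenAddr : The TCP address to listen on for HTTP requests. By default it listens on port 8428 on all network interfaces.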
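
As an illustration of adjusting these flags, the fragment below is a minimal sketch of the container args with a longer retention. The value 3 is an assumption chosen for the example, not a recommendation from the original setup; size the PVC to match whatever period you choose.

```yaml
# Fragment of the StatefulSet container spec shown above.
args:
- --storageDataPath=/storage   # must match the volumeMount path
- --httpListenAddr=:8428       # must match containerPort and the Service targetPort
- --retentionPeriod=3          # assumed example value: keep 3 months instead of 1
```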

Labeling namespaces
To restrict which namespaces targets are scraped from, label the namespaces so that each Prometheus instance only scrapes metrics from specific namespaces. Following the plan above, label kube-system with monitoring-role=system:

$ kubectl label ns kube-system monitoring-role=system

Label the other namespaces with monitoring-role=others. For example:

$ kubectl label ns monitoring monitoring-role=others
$ kubectl label ns default monitoring-role=others
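
If you prefer to keep the labels in version control rather than applying them by hand, the same result can be expressed declaratively. This is a minimal sketch, assuming the namespace is managed through a manifest applied with kubectl apply; it uses the monitoring namespace from the commands above.

```yaml
# Declarative equivalent of: kubectl label ns monitoring monitoring-role=others
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    monitoring-role: others   # matched by the namespace selectors of the "others" Prometheus instance
```

Whichever way the label is set, the value must match the namespace selectors exactly. A namespace labeled with a different value, or with no label at all, is not picked up by either instance.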

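Splitting Prometheus
The next step is to split the Prometheus instance. According to the plan above, it is split into two instances: one monitors the kube-system namespace, and the other monitors all remaining namespaces: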

# prometheus-prometheus-system.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: system 
  name: system
  namespace: monitoring
spec:
  remoteWrite:
    - url: http://victoriametrics.kube-system.svc.cluster.local:8428/api/v1/write
      queueConfig:
        maxSamplesPerSend: 10000
  retention: 2h 
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  image: quay.io/prometheus/prometheus:v2.17.2
  nodeSelector:
    beta.kubernetes.io/os: linux
  podMonitorNamespaceSelector:
    matchLabels:
      monitoring-role: system 
  podMonitorSelector: {}
  replicas: 1 
  resources:
    requests:
      memory: 400Mi
    limits:
      memory: 2Gi
  ruleSelector:
    matchLabels:
      prometheus: system 
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: 
    matchLabels:
      monitoring-role: system 
  serviceMonitorSelector: {}
  version: v2.17.2
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: others
  name: others
  namespace: monitoring
spec:
  remoteWrite:
    - url: http://victoriametrics.kube-system.svc.cluster.local:8428/api/v1/write
      queueConfig:
        maxSamplesPerSend: 10000
  retention: 2h
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  image: quay.io/prometheus/prometheus:v2.17.2
  nodeSelector:
    beta.kubernetes.io/os: linux
  podMonitorNamespaceSelector: 
    matchLabels:
      monitoring-role: others 
  podMonitorSelector: {}
  replicas: 1
  resources:
    requests:
      memory: 400Mi
    limits:
      memory: 2Gi
  ruleSelector:
    matchLabels:
      prometheus: others 
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring-role: others 
  serviceMonitorSelector: {}
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml
  version: v2.17.2

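Configuration points to note:

remoteWrite specifies the remote storage that remote write sends samples to.

ruleSelector selects the PrometheusRule objects to load.

The memory limit is set to 2Gi; adjust it to suit your environment.

retention sets how long data is kept on local disk, here 2 hours. Because remote storage is configured, local data does not need to be kept for long, so keep this as short as possible.

Custom Prometheus scrape configuration can be supplied through additionalScrapeConfigs on the others instance. You can of course split further and put it in a separate instance.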
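
The additionalScrapeConfigs field in the manifest above points at a key inside a Secret, which has to exist in the same namespace as the Prometheus object. The following is a minimal sketch of such a Secret. The Secret name and key come from the manifest; the job name and target address are hypothetical placeholders, not part of the original setup.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs      # referenced by additionalScrapeConfigs.name
  namespace: monitoring                # same namespace as the "others" Prometheus object
type: Opaque
stringData:
  # Key referenced by additionalScrapeConfigs.key; the content is a list of scrape configs.
  prometheus-additional.yaml: |
    - job_name: example-static         # hypothetical job name
      static_configs:
        - targets:
            - example.default.svc.cluster.local:9100   # hypothetical target
```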

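Changing the Grafana data source
Once Prometheus has been split, the last step is to point the Grafana data source at VictoriaMetrics. Grafana then shows a global view across both instances and can run aggregate queries over all of the data.

Open the Grafana settings page and change the data source URL to http://victoriametrics.kube-system.svc.cluster.local:8428.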
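
If Grafana is configured through provisioning files instead of the UI, the same change can be sketched as a data source file. This assumes Grafana's file-based provisioning is in use; the data source name is an arbitrary choice. The type is prometheus because VictoriaMetrics serves the Prometheus query API.

```yaml
# Grafana data source provisioning file (a sketch; the name is an assumption).
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus      # VictoriaMetrics is queried through the Prometheus-compatible API
    access: proxy         # the Grafana backend issues the requests
    url: http://victoriametrics.kube-system.svc.cluster.local:8428
    isDefault: true
```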


Posted on 2024-07-24 22:14 by luzhouxiaoshuai
