Configuring the scrape timeout for a Kubernetes ServiceMonitor
Background
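When writing an exporter, one of its metrics endpoints can be slow to return data, sometimes taking 10 to 20 seconds. If the ServiceMonitor is left at its defaults, the scrape times out: an endpoint that does not set scrapeTimeout falls back to the Prometheus global scrape_timeout, which defaults to 10s. On the Prometheus Targets page, the target belonging to that ServiceMonitor then shows State Down. The fix is to raise the scrape timeout.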
Changes
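Option 1: Change the Prometheus global configuration directly. The global value applies to every exporter and ServiceMonitor that does not set its own timeout. Note that scrape_timeout must not be greater than scrape_interval.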
prometheus.yml: |
  global:
    scrape_interval: 30s
    scrape_timeout: 30s
    evaluation_interval: 30s
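When Prometheus is deployed through the Prometheus Operator (which is what provides the ServiceMonitor resource), the prometheus.yml is normally generated by the operator, so hand edits to it may be overwritten. In that setup the same global values are usually set on the Prometheus custom resource instead. A minimal sketch, assuming a Prometheus resource named k8s in the monitoring namespace (the resource name is an assumption and will differ per installation):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                 # assumed name; use the name of your own Prometheus resource
  namespace: monitoring
spec:
  scrapeInterval: 30s       # rendered as global.scrape_interval
  scrapeTimeout: 30s        # rendered as global.scrape_timeout; keep it <= scrapeInterval
  evaluationInterval: 30s   # rendered as global.evaluation_interval
```

Only the three timing fields are shown; the rest of the existing spec stays as it is.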
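Option 2: Change the configuration of the specific ServiceMonitor. The timeout is set per endpoint with the scrapeTimeout field, so only the slow endpoint is affected.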
kind: Service
apiVersion: v1
metadata:
  name: external-test-exporter
  labels:
    app: external-test-exporter
  namespace: monitoring
spec:
  type: ClusterIP
  ports:
  - name: metrics
    port: 20666
    protocol: TCP
    targetPort: 20666
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-test-exporter
  labels:
    app: external-test-exporter
  namespace: monitoring
subsets:
- addresses:
  - ip: 10.5.5.6
  ports:
  - name: metrics
    port: 20666
    protocol: TCP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: external-test-exporter
spec:
  selector:
    matchLabels:
      app: external-test-exporter
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - port: metrics
    path: /info
    interval: 90s
  - port: metrics
    path: /util
    interval: 90s
  - port: metrics
    path: /report
    interval: 120s
    scrapeTimeout: 30s  # scrape timeout for this endpoint
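Because scrapeTimeout is a per-endpoint field, the setting above only affects /report; /info and /util keep using the global timeout. If more than one path is slow, each endpoint entry needs its own value. The fragment below is a sketch of the endpoints list only, under the assumption (not stated in the original) that /info is also slow; the 20s value is illustrative. Each scrapeTimeout should stay at or below that endpoint's interval.

```yaml
# Fragment of spec.endpoints in the ServiceMonitor; the 20s timeout on /info is an assumed example value
endpoints:
- port: metrics
  path: /info
  interval: 90s
  scrapeTimeout: 20s   # assumed: /info is slow as well
- port: metrics
  path: /util
  interval: 90s        # no scrapeTimeout set, so the global timeout applies
- port: metrics
  path: /report
  interval: 120s
  scrapeTimeout: 30s   # value from the original example
```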