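Monitoring Process Liveness with Prometheus Custom Metrics
1. Monitoring process liveness
Sometimes we need to monitor whether specific processes are running. The commonly used node_exporter does not cover every monitoring item out of the box, so here we monitor processes with a custom metric.
2. Defining the metric value with a custom Python script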
2.1 Install pip and the script's dependencies
yum install -y python-pip
pip install psutil prometheus_client
2.2 Write the Python script
# coding: utf-8
import psutil
from prometheus_client import CollectorRegistry, Gauge, write_to_textfile

# Processes to monitor; 'name' is matched as a case-insensitive substring.
monitor_list = [
    {'name': 'gitlab', 'desc': 'gitlab-process'},
    {'name': 'nginx', 'desc': 'nginx-process'},
]


def checkProcessCount(process_name):
    """Return the number of running processes whose name contains process_name."""
    count = 0
    for proc in psutil.process_iter():
        try:
            if process_name.lower() in proc.name().lower():
                count += 1
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            # The process exited or is not readable; skip it.
            pass
    print(count)
    return count


def save_metrics():
    registry = CollectorRegistry()
    gauge = Gauge('process_number', 'Number of Process', ['name'], registry=registry)
    for p in monitor_list:
        count = checkProcessCount(p['name'])
        gauge.labels(name=p['name']).set(count)
    # Write the file where node_exporter's textfile collector picks up *.prom files.
    write_to_textfile('/var/lib/node_exporter/textfile_collector/metadata.prom', registry)


if __name__ == '__main__':
    save_metrics()
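To make the matching rule and the resulting file easier to picture, here is a minimal standard-library sketch. It is not part of the script above: `count_matching`, `render_gauge`, and the `sample` list are hypothetical names invented for illustration, and the sample list stands in for what psutil.process_iter() would report. It applies the same case-insensitive substring match and prints lines in roughly the shape the textfile collector reads.

```python
def count_matching(process_names, keyword):
    """Count names that contain keyword, case-insensitively (substring match)."""
    keyword = keyword.lower()
    return sum(1 for name in process_names if keyword in name.lower())


def render_gauge(metric, help_text, label, counts):
    """Render counts as Prometheus text exposition lines for a single gauge."""
    lines = [
        "# HELP {} {}".format(metric, help_text),
        "# TYPE {} gauge".format(metric),
    ]
    for value, count in sorted(counts.items()):
        lines.append('{}{{{}="{}"}} {}'.format(metric, label, value, float(count)))
    return "\n".join(lines) + "\n"


# Hypothetical snapshot of process names (an assumption, not real psutil output).
sample = ["nginx", "nginx", "Nginx-helper", "sshd", "gitlab-runner"]
counts = {name: count_matching(sample, name) for name in ("gitlab", "nginx")}
print(render_gauge("process_number", "Number of Process", "name", counts))
```

Note that because the match is a substring test, "nginx" also counts a process named "Nginx-helper". If that is not wanted, an exact comparison on the process name would be a stricter choice.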
Add the script to a cron job so that it runs periodically, then write alerting rules that match on the value of the custom metric.
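As a rough sketch of that last step, the rule below fires when no matching process has been counted for two minutes. It assumes the script runs every minute from cron (for example, a crontab entry of the form `* * * * * python /opt/scripts/process_metrics.py`, where the path is hypothetical) and that node_exporter was started with --collector.textfile.directory=/var/lib/node_exporter/textfile_collector. The group name, alert name, `for` duration, and severity label are illustrative choices, not values from the original setup.

```yaml
groups:
  - name: process-liveness            # hypothetical group name
    rules:
      - alert: ProcessNotRunning      # hypothetical alert name
        # process_number is the gauge written by the script; 0 means no match found.
        expr: process_number == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "No {{ $labels.name }} process found on {{ $labels.instance }}"
```

One caveat with this approach: if the cron job stops running, the .prom file keeps its last values, so the rule only detects a dead process while the script itself is still updating the file.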