Google Cloud Monitoring Alerts

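Google Cloud does not offer a DingTalk alert channel. Being used to DingTalk alerts, I found that inconvenient, so I read through the documentation and wrote one myself.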

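(1) Create a log stream with Google Cloud Pub/Sub.

(2) Add Pub/Sub as the notification channel of the monitoring alert, so that alerts are published as subscription messages.

(3) Write a Cloud Function, triggered by the Pub/Sub subscription, to process the messages in real time.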
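
To make step (3) more concrete, here is a minimal sketch of the message shape a Pub/Sub-triggered function receives: the alert JSON arrives base64-encoded under the `data` key of the event. The incident values (resource name, condition name, summary text) are made-up placeholders, limited to the fields the script below reads, and the helper names `encode_pubsub_event` and `decode_pubsub_event` are hypothetical, intended only for local testing.

```python
import base64
import json


def encode_pubsub_event(payload):
    """Wrap a dict the way a Pub/Sub-triggered function sees it (base64 JSON under 'data')."""
    raw = json.dumps(payload).encode('utf-8')
    return {'data': base64.b64encode(raw).decode('utf-8')}


def decode_pubsub_event(event):
    """Reverse of encode_pubsub_event: base64-decode event['data'] and parse the JSON."""
    return json.loads(base64.b64decode(event['data']).decode('utf-8'))


# Hypothetical incident; every value here is a placeholder, not real alert data.
sample_incident = {
    'incident': {
        'state': 'open',
        'resource_name': 'example-instance',
        'condition_name': 'example-condition',
        'summary': 'Example metric is above the threshold of 0.8 with a value of 0.95.',
        'condition': {
            'conditionThreshold': {
                'thresholdValue': 0.8,
                'trigger': {'count': 1},
            }
        },
    }
}

event = encode_pubsub_event(sample_incident)
decoded = decode_pubsub_event(event)
print(decoded['incident']['state'])
```

An event built this way can be passed to the handler locally, which allows the parsing logic to be checked without deploying anything.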

The script is as follows:

## ==================================================
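##    Make reading a way of life, something you do every day just like eating and drinking,
## and one day your manner, speech, and bearing will be different.
##                                        -- 5sdba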
##
## Created Date: Saturday, 2021-05-08, 11:55:05 am
## copyright (c):    SZWW Tech. LTD. 
## Engineer:   async 
## Module Name:   
## Revision:   v0.01 
## Description:
##   
## Revision History : 
## Revision  editor date         Description         
## v0.01  async  2021-05-08 File Created
## ==================================================
import base64
import datetime
import json
import re

import pytz
import requests

tz = pytz.timezone('Asia/Shanghai')

def send_msg(event, context):
    url = "xxxxxxx"  # DingTalk robot webhook URL
    # Pub/Sub delivers the alert as base64-encoded JSON in event['data'].
    es_str = base64.b64decode(event['data']).decode('utf-8')
    incident = json.loads(es_str)['incident']
    # s_region = incident['resource']['labels']['region']  # region
    s_resource_name = incident['resource_name']  # resource name
    s_policy_name = incident['condition_name']  # condition name, shown as the monitored metric
    # The observed value is parsed from the text after "value of" in the summary;
    # this assumes the summary keeps that wording. 'nan' is used when it does not match.
    m = re.search(r'value of\s*(\S+)', incident['summary'])
    s_max_value = m.group(1).rstrip('.') if m else 'nan'
    s_thresholdvalue = str(incident['condition']['conditionThreshold']['thresholdValue'])  # threshold
    s_cnt = incident['condition']['conditionThreshold']['trigger']['count']  # trigger count
    # The reported time is when this function runs (Asia/Shanghai), not the incident start.
    s_time = datetime.datetime.now(tz).strftime('%Y-%m-%d %H:%M:%S')
    s_state = incident['state']  # incident state
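    if "open" == s_state:
        title = "<font color=#FF0000 size=3>Google_cloud alert triggered</font>"
        tag_word = " reached "
    elif "closed" == s_state:
        title = "<font color=#008000 size=3>Google_cloud alert recovered</font>"
        tag_word = " not reached "
    elif s_state in (None, "null"):
        # json.loads turns a JSON null into None, so both forms are checked.
        title = "Google_cloud alert abnormal (insufficient data)"
        tag_word = " insufficient data, not reached "
    else:
        # Without this branch an unexpected state leaves title and tag_word undefined.
        title = "Google_cloud alert: unrecognized state " + str(s_state)
        tag_word = " not confirmed as reached "

    # The observed value comes from the summary; the threshold comes from the condition.
    pagrem = {
        "msgtype": "markdown",
        "markdown": {
            "title": "Google Cloud alert" + "....",
            "text": "@15527453712 \n" +
                    "Alert subject: " + title +
                    "\n\n>Metric: " + s_policy_name +
                    "\n\n>Alert time: " + str(s_time) +
                    "\n\n>Resource: " + s_resource_name +
                    "\n\n>Details: current value=" + str(size_b_to_other(float(s_max_value))) +
                    ", threshold=" + str(size_b_to_other(float(s_thresholdvalue))) +
                    "," + tag_word + str(s_cnt) + " consecutive time(s)"
            },
        "at": {
            "atMobiles": [15527453712],
            "isAtAll": False}
    }
    headers = {'Content-Type': 'application/json'}
    # A timeout keeps the function from hanging if the webhook does not respond.
    requests.post(url, data=json.dumps(pagrem), headers=headers, timeout=10)
    print(json.dumps(pagrem))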


def size_b_to_other(size):
    """Convert a byte count to a larger unit (KB, MB, GB, TB)."""
    units = ['B', 'KB', 'MB', 'GB', 'TB']
    # Values below 1024 are returned unchanged, without a unit.
    if size < 1024:
        return size

    # Floor-divide by 1024 until the value fits the unit or the largest unit is reached.
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024
        index += 1
    return '{} {}'.format(size, units[index])
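
Pulling the observed value out of `summary` depends on the summary's wording, so it can help to isolate that step in a function that is easy to test locally. The sketch below assumes the summary contains the phrase `value of <number>`, as the script's regex does; `extract_observed_value` is a hypothetical helper name and the sample summaries are invented.

```python
import re


def extract_observed_value(summary):
    """Return the number following 'value of' in an alert summary, or None if absent."""
    # Matches an optional sign, digits, an optional fraction and an optional exponent.
    pattern = r'value of\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)'
    match = re.search(pattern, summary)
    return float(match.group(1)) if match else None


print(extract_observed_value('Metric is above the threshold of 0.8 with a value of 0.95.'))
print(extract_observed_value('no number here'))
```

Returning `None` instead of raising lets the caller decide how to report an alert whose summary could not be parsed.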

The final alert message looks like this:

posted @ 2021-05-16 18:58  5sdba