用ELK分析每天4亿多条腾讯云MySQL审计日志(2)--EQL

    上一篇介绍了用ELK分析4亿多条审计日志过程,现在介绍如何用Python3分析ES的程序

    需要分析的核心库审计数据:

         1,950多张表,几十个账号,

         2,5种操作类型(select,update,insert,delete,replace),14个实例(1主13从库)

   分析说明:

    select汇总: 必须包含from关键字:排除INSERT INTO 表 SELECT '609818','1' 这样情况

    insert汇总: 必须有into关键字: 排除select中有insert关键字

    update汇总: 不包含for 关键字: 排查 for update 的select查询

   delete汇总: 不包含into 关键字: 排除insert等内容里包含delete的数据情况

    replace汇总: 必须有into 关键字: 排除select查询中有replace函数

   分析方法:

   为保证把以上数据都能分析出来, 将950多张表,存放到数据表中,循环950多次表,每个表循环5次类型:  

    sql="select id,name,ea_time from tab order by id desc --查询全部表
    dml={'select','update','insert','delete','replace'} --每个表循环5次类型

    后来研发发现,分析程序表ol_list统计,但"库名.表",如line.ol_list,不会统计出来。后来修改EQL解决,具体代码如下:

    下列是“”select“查询EQL的代码:        

 if op.find('select')>=0:  # select,包含from
     body ={"query":{
        "bool":{ "must":[{
                "match":{"Sql":'{op}'.format(op=op) }},{
                "match":{"Sql": 'from'}}],
            "should": [{
                "term": {"Sql": '{name}'.format(name=tabname)}}, {
                "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
            "minimum_should_match": 1,
            "filter":{
                "range":{
                    "Timestamp.keyword":{
                        "lte": "{date}".format(date=end_time),
                        "gte": "{begindate}".format(begindate=begin_time),
                    }
                } }}},
    "size":0,  "aggs":{ "aggr_mame":{
            "terms":{
                "field":"User.keyword",
                "size":2000
            },
            "aggs":{
                "aggr_der":{
                    "terms":{
                        "field":"PolicyName.keyword"
                    },
                    "aggs":{
                        "top_tag_hits":{
                            "top_hits":{
                                "size":1
                            }
                        }  }}}}}
       }
View Code

  说明:

    1,使用:"minimum_should_match": 1,   ,这个是兼容: "表名","库名.表名“

    5种类型的全部EQL:   

 if op.find('select')>=0:  # select,包含from
     body ={"query":{
        "bool":{ "must":[{
                "match":{"Sql":'{op}'.format(op=op) }},{
                "match":{"Sql": 'from'}}],
            "should": [{
                "term": {"Sql": '{name}'.format(name=tabname)}}, {
                "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
            "minimum_should_match": 1,
            "filter":{
                "range":{
                    "Timestamp.keyword":{
                        "lte": "{date}".format(date=end_time),
                        "gte": "{begindate}".format(begindate=begin_time),
                    }
                } }}},
    "size":0,  "aggs":{ "aggr_mame":{
            "terms":{
                "field":"User.keyword",
                "size":2000
            },
            "aggs":{
                "aggr_der":{
                    "terms":{
                        "field":"PolicyName.keyword"
                    },
                    "aggs":{
                        "top_tag_hits":{
                            "top_hits":{
                                "size":1
                            }
                        }  }}}}}
       }
    elif op.find('update')>=0:  # update 不能有for关键字
      body = {"query": {
         "bool": {"must": [{
             "match": {"Sql": '{op}'.format(op=op)}},{
             "match": {"PolicyName.keyword": 'd8t'}}],
             "must_not": [{"match": {"Sql": "for"}}],
             "should": [{
                 "term": {"Sql": '{name}'.format(name=tabname)}}, {
                 "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
             "minimum_should_match": 1,
             "filter": {
                 "range": {
                     "Timestamp.keyword": {
                         "lte": "{date}".format(date=end_time),
                         "gte": "{begindate}".format(begindate=begin_time),
                     }
                 }}}},
         "size": 0, "aggs": {"aggr_mame": {
             "terms": {
                 "field": "User.keyword",
                 "size": 2000
             },
             "aggs": {
                 "aggr_der": {
                     "terms": {
                         "field": "PolicyName.keyword"
                     },
                     "aggs": {
                         "top_tag_hits": {
                             "top_hits": {
                                 "size": 1
                             }
                         }}}}}}
     }
    elif op.find('replace') >= 0:  # replace  必须有into关键字
        body = {"query": {
            "bool": {"must": [{
                "match": {"Sql": '{op}'.format(op=op)}}, {
                "match": {"PolicyName.keyword": 'd8t'}},{
                "match": {"Sql": 'into'}}],
                "should": [{
                    "term": {"Sql": '{name}'.format(name=tabname)}}, {
                    "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                "minimum_should_match": 1,
                "filter": {
                    "range": {
                        "Timestamp.keyword": {
                            "lte": "{date}".format(date=end_time),
                            "gte": "{begindate}".format(begindate=begin_time),
                        }
                    }}}},
            "size": 0, "aggs": {"aggr_mame": {
                "terms": {
                    "field": "User.keyword",
                    "size": 2000
                },
                "aggs": {
                    "aggr_der": {
                        "terms": {
                            "field": "PolicyName.keyword"
                        },
                        "aggs": {
                            "top_tag_hits": {
                                "top_hits": {
                                    "size": 1
                                }
                            }}}}}}
    }
    elif op.find('insert') >= 0:  # insert  必须有into关键字
        body = {"query": {
            "bool": {"must": [{
                "match": {"Sql": '{op}'.format(op=op)}}, {
                "match": {"PolicyName.keyword": 'd8t'}},{
                "match": {"Sql": 'into'}}],
                "should": [{
                    "term": {"Sql": '{name}'.format(name=tabname)}}, {
                    "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                "minimum_should_match":1,
                "filter": {
                    "range": {
                        "Timestamp.keyword": {
                            "lte": "{date}".format(date=end_time),
                            "gte": "{begindate}".format(begindate=begin_time),
                        }
                    }}}},
            "size": 0, "aggs": {"aggr_mame": {
                "terms": {
                    "field": "User.keyword",
                    "size": 2000
                },
                "aggs": {
                    "aggr_der": {
                        "terms": {
                            "field": "PolicyName.keyword"
                        },
                        "aggs": {
                            "top_tag_hits": {
                                "top_hits": {
                                    "size": 1
                                }
                            }}}}}}
    }
    else:  # delete  不能有into关键字
        body = {"query": {
            "bool": {"must": [{
                "match": {"Sql": '{op}'.format(op=op)}},{
                "match": {"PolicyName.keyword": 'd8t'}}],
                "must_not": [{"match": {"Sql": "into"}}],
                "should": [{
                    "term": {"Sql": '{name}'.format(name=tabname)}}, {
                    "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                "minimum_should_match": 1,
                "filter": {
                    "range": {
                        "Timestamp.keyword": {
                            "lte": "{date}".format(date=end_time),
                            "gte": "{begindate}".format(begindate=begin_time),
                        }
                    }}}},
            "size": 0, "aggs": {"aggr_mame": {
                "terms": {
                    "field": "User.keyword",
                    "size": 2000
                },
                "aggs": {
                    "aggr_der": {
                        "terms": {
                            "field": "PolicyName.keyword"
                        },
                        "aggs": {
                            "top_tag_hits": {
                                "top_hits": {
                                    "size": 1
                                }
                            }}}}}}
        }
View Code

  写入统计数据Py:

    doc = res["aggregations"]["aggr_mame"]['buckets']
    cn = conn()
    cur = cn.cursor()
    if len(doc):
      for item in doc:
        user=item['key']  # 账号
        total=str(item['doc_count'])  # 该账号在全部实例下的调用次数
        if len(item["aggr_der"]["buckets"]):
           for bucket in item["aggr_der"]["buckets"]:
             server=bucket['key']  # 服务器实例
             s_total =str(bucket['doc_count'])  # 该服务器实例下的调用次数
             sql=bucket["top_tag_hits"]["hits"]["hits"][0]["_source"]["Sql"] # 样例Sql
             sql=emoji.demojize(transferContent(sql))  #转义并去掉表情符号
             tsql="replace into ea_tj(tab,username,op,num,server,sqltext,dt) "\
               " values ('{0}','{1}','{2}','{3}','{4}','{5}','{6}')".format(tabname,user,op,s_total,server,sql,dt)
             cur.execute(tsql)
             cn.commit()
View Code

存放分析结果表:

CREATE TABLE `ea_tj` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tab` varchar(200) NOT NULL COMMENT '表名',
  `username` varchar(200) NOT NULL COMMENT '账号',
  `op` varchar(50) DEFAULT NULL COMMENT '操作类型',
  `num` bigint(11) NOT NULL COMMENT '次数',
  `server` varchar(200) NOT NULL COMMENT '实例策略名',
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  `sqltext` text COMMENT '样例SQL',
  `dt` date DEFAULT NULL COMMENT '线上SQL执行日期',
  `update_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `un` (`tab`,`username`,`op`,`server`,`dt`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4
View Code

   具体汇总数据:

    1,每天13个实例,表,账号,操作之间关系汇总(用来Online表拆分)
    2,每天13个数据库实例账号的连接IP汇总(用来迁移VPC)
    3,统计调用总次数 (用来分析调用次数异常)

    汇总1的结果数据:

   

  

    相关文档:
 
posted @ 2021-08-12 11:23  zping  阅读(236)  评论(0编辑  收藏  举报