A complete set of grok patterns for matching multiple Tomcat access log formats

<pre name="code" class="html"><pre name="code" class="html"> Multiple patterns

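Sometimes a single log can appear in several possible formats. Writing one regex that covers all of them is hard, and joining every variant with | gets ugly. Logstash's syntax offers a neat way out.
The documentation states that the match parameter of the logstash/filters/grok plugin takes a Hash value. However, because early Logstash syntax also wrote Hash values with [], passing an Array to match still works today. So we can pass several patterns to match against the same field: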
match => [
    "message", "(?<request_time>\d+(?:\.\d+)?)",
    "message", "%{SYSLOGBASE} %{DATA:message}",
    "message", "(?m)%{WORD}"
]
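Logstash tries these patterns in the order they are defined and stops at the first one that matches. The effect is the same as one big regex joined with |, but the result is much more readable.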
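The "try each pattern in order, stop at the first hit" behavior described above can be sketched outside Logstash. The following is a minimal Python illustration, not grok itself: `PATTERNS` and `first_match` are hypothetical names, and the two regexes are simplified stand-ins for the grok expressions.

```python
import re

# Ordered candidate patterns, tried top to bottom.
# These are simplified, hypothetical stand-ins for grok expressions.
PATTERNS = [
    re.compile(r"(?P<request_time>\d+(?:\.\d+)?)"),
    re.compile(r"(?P<word>\w+)"),
]


def first_match(message, patterns=PATTERNS):
    """Return (index, captured fields) for the first pattern that matches.

    Like grok, the search is unanchored, and later patterns are not
    attempted once an earlier one has matched.
    """
    for index, pattern in enumerate(patterns):
        found = pattern.search(message)
        if found:
            return index, found.groupdict()
    return None, {}


if __name__ == "__main__":
    for sample in ("took 0.25 s", "hello", "!!!"):
        print(sample, "->", first_match(sample))
```

Because the first pattern that matches wins, the more specific patterns should come first in the list, which is also how the grok patterns in the config below are ordered.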
 [elk@dr-mysql01 api-access]$ cat /usr/local/logstash-2.3.4/config/api-access/logstash_access.conf 
input {
    file {
        type => "zj_api_access"
        path => ["/data01/applog_backup/zjzc_log/zj-api*access*"]
    }

    file {
        type => "wj_api_access"
        path => ["/data01/applog_backup/winfae_log/wj-api*access*"]
    }
}
filter {
    grok {
        match => [
             "message" , "\s*%{IPORHOST:clientip}\s+\-\s+\-\s+\[%{HTTPDATE:time}\]\s+\"%{WORD:verb}\s+(?<api>(\S+))\?.*\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:http_status_code}\s+%{NUMBER:bytes}\s+(%{BASE16FLOAT:request_time})\s+%{IPORHOST:remoteip}",
              "message" ,"\s*%{IPORHOST:clientip}\s+\-\s+\-\s+\[%{HTTPDATE:time}\]\s+\"%{WORD:verb}\s+(?<api>(\S+))\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:http_status_code}\s+%{NUMBER:bytes}\s+(%{BASE16FLOAT:request_time})\s+%{IPORHOST:remoteip}",
             "message" ,"\s*%{IPORHOST:clientip}\s+\-\s+\-\s+\[%{HTTPDATE:time}\]\s+\"%{WORD:verb}\s+(?<api>(\S+))\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:http_status_code}\s+\-\s+(%{BASE16FLOAT:request_time})\s+%{IPORHOST:remoteip}",
             "message","\s*%{IPORHOST:clientip}\s+\-\s+\-\s+\[%{HTTPDATE:time}\]\s+\"%{WORD:verb}\s+(?<api>(\S+))\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:http_status_code}\s+\-\s+(%{BASE16FLOAT:request_time})\s+(%{IPORHOST:remoteip}|-)"
        ]
    }   
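    mutate {
        # Rename first, then convert: mutate applies rename before convert, so
        # response_time stays a float and the numeric comparison in the output
        # section works. Copying it with add_field "%{request_time}" would
        # produce a string instead.
        rename => [ "request_time", "response_time" ]
        convert => [ "response_time", "float" ]
        add_field => [
            "[@metadata][zabbix_key]", "logstash-api-access",
            "[@metadata][zabbix_host]", "dr-mysql01",
            "messager", "%{type}-%{message}"
        ]
        remove_field => [ "message" ]
    }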
    date {
        match => ["time", "dd/MMM/yyyy:HH:mm:ss Z"]
    }
}

output {
    if [response_time] >= 5 {
        zabbix {
            zabbix_host => "[@metadata][zabbix_host]"
            zabbix_key => "[@metadata][zabbix_key]"
            zabbix_server_host => "192.168.32.55"
            zabbix_server_port => "10051"
            zabbix_value => "messager"
        }
    }
    if [type] == "zj_api_access" {
        redis {
            host => "192.168.32.67"
            data_type => "list"
            key => "zj_api_access:redis"
            port => "6379"
            password => "1234567"
        }
    } else if [type] == "wj_api_access" {
        redis {
            host => "192.168.32.67"
            data_type => "list"
            key => "wj_api_access:redis"
            port => "6379"
            password => "1234567"
        }
    }
}
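To sanity-check what the filter and output sections are meant to do, here is a rough Python sketch of the same pipeline: extract the fields, turn the request time into a float, parse the timestamp, and flag requests that took 5 seconds or longer. It is an approximation under stated assumptions, not a port of grok: `ACCESS_RE`, `parse_access_line`, and `is_slow` are hypothetical names, the single regex folds the four grok alternatives into optional parts, it splits the path at the first `?` (the grok version backtracks to the last one), and the sample log lines are made up to fit the pattern.

```python
import re
from datetime import datetime

# One regex approximating the four grok alternatives: the query string,
# the byte count, and the remote IP may each be absent or "-".
ACCESS_RE = re.compile(
    r'\s*(?P<clientip>\S+)\s+-\s+-\s+\[(?P<time>[^\]]+)\]\s+'
    r'"(?P<verb>\w+)\s+(?P<api>[^\s?]+)(?:\?\S*)?\s+'
    r'HTTP/(?P<httpversion>[\d.]+)"\s+'
    r'(?P<http_status_code>\d+)\s+(?P<bytes>\d+|-)\s+'
    r'(?P<request_time>\d+(?:\.\d+)?)\s+(?P<remoteip>\S+)'
)


def parse_access_line(line):
    """Return a dict of fields, or None if the line does not match."""
    found = ACCESS_RE.match(line)
    if found is None:
        return None
    fields = found.groupdict()
    # Mirrors the mutate step: request_time becomes a float response_time.
    fields["response_time"] = float(fields.pop("request_time"))
    # Mirrors the date filter pattern dd/MMM/yyyy:HH:mm:ss Z.
    fields["timestamp"] = datetime.strptime(
        fields["time"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


def is_slow(fields, threshold=5.0):
    """Mirrors the output condition [response_time] >= 5."""
    return fields["response_time"] >= threshold


if __name__ == "__main__":
    # Made-up sample line, shaped to fit the pattern above.
    sample = (
        '10.0.0.1 - - [07/Sep/2016:09:38:00 +0800] '
        '"GET /api/user/info?id=42 HTTP/1.1" 200 512 5.231 10.0.0.2'
    )
    parsed = parse_access_line(sample)
    print(parsed["api"], parsed["response_time"], is_slow(parsed))
```

The comparison only works because `response_time` is a number; the same holds for the `>= 5` conditional in the Logstash output section.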




   

posted @ 2016-09-07 09:38  czcb