Logstash

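 Logstash:

  • Supports multiple data-ingestion mechanisms, including TCP/UDP, files, syslog, Windows EventLogs, and STDIN; once it has the data, it can filter and modify it.
  • Written in JRuby and runs on the JVM.
  • Uses an agent/server model.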

 Configuration skeleton:

input {
...
}

filter {
...
}

output {
...
}   
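  • Four types of plugins:
input
filter
codec
output
  • Data types:
Array: [item1, item2, ...]
Boolean: true, false
Bytes:
Codec: an encoder/decoder
Hash: key => value
Number:
Password:
Path: a filesystem path
String: a string
  • Field references:
[fieldname], nested fields as [outer][inner]
  • Conditionals: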
==, !=, <, <=, >, >=
=~, !~
in, not in
and, or
()   
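
As a small illustration of field references and conditionals working together, the sketch below tags web-server events by their type field and prints only the tagged ones. The type values and the "web" tag are assumptions chosen for the example; mutate and add_tag are standard Logstash pieces.

```
filter {
    # [type] is a field reference; == and "or" come from the operator list above
    if [type] == "nginxlog" or [type] == "apachelog" {
        mutate {
            add_tag => ["web"]
        }
    }
}

output {
    # "in" tests membership in the tags array
    if "web" in [tags] {
        stdout {
            codec => rubydebug
        }
    }
}
```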

 Logstash workflow:

input | filter | output, much like a shell pipeline; if the data needs no extra processing, the filter stage can be omitted.

 

input {
    stdin {}
}

output {
    stdout {
        codec => rubydebug
    }
}  

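 Logstash plugins:

  • input plugins:
1. file: reads a stream of events from the specified files.
    It uses FileWatch (a Ruby gem) to watch the files for changes.
    .sincedb: records the inode, major device number, minor device number, and pos (the byte offset read so far) of each watched file.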

 

input {
    file {
        path => ["/var/log/messages"]
        type => "system"
        start_position => "beginning"
    }
}

output {
    stdout {
        codec	=> rubydebug
    }
}  
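
Because the read position is persisted in .sincedb, start_position => "beginning" only takes effect for files Logstash has not seen before. The sketch below assumes you want the sincedb file in an explicit location; the path shown is hypothetical.

```
input {
    file {
        path           => ["/var/log/messages"]
        type           => "system"
        start_position => "beginning"
        # hypothetical location; if unset, the file input picks its own sincedb path
        sincedb_path   => "/var/lib/logstash/sincedb-messages"
    }
}
```

While testing, a common trick is to point sincedb_path at /dev/null so that no position is remembered and the file is read from the beginning on every run.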

 

2. udp: reads messages from the network over UDP. Its required parameter is port, the port to listen on; host specifies the address to listen on.

 

collectd: a performance-monitoring daemon that can send the current host's performance metrics to Logstash over UDP.

CentOS 7, from the EPEL repository:
# yum install collectd -y

# vim /etc/collectd.conf

    Hostname    "node3.magedu.com"
    LoadPlugin syslog
    LoadPlugin cpu
    LoadPlugin df
    LoadPlugin interface
    LoadPlugin load
    LoadPlugin memory
    LoadPlugin network
    <Plugin network>
            <Server "172.16.100.70" "25826">
            # 172.16.100.70 is the address of the Logstash host; 25826 is the UDP port it listens on
            </Server>
    </Plugin>
    Include "/etc/collectd.d"

# systemctl start collectd.service

 

# On the Logstash side:

input {
    udp {
        port 	=> 25826
        codec 	=> collectd {}
        type	=> "collectd"
    }
}

output {
    stdout {
        codec	=> rubydebug
    }
}  

 

3. redis:
    Reads data from Redis; supports both Redis channels and lists.

 

input {
    redis {
        host    => "localhost"
        port 	=> "6379"
        data_type 	=> "list"
        key	=> "redisdata"
    }
}

output {
    stdout {
        codec	=> rubydebug
    }
}
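
The two modes behave differently: with data_type => "list" Logstash pops entries off the list (BLPOP), so each event is consumed by exactly one reader, whereas with "channel" it subscribes, so every subscriber receives every message. A sketch of the channel variant, reusing the same key name purely for illustration:

```
input {
    redis {
        host      => "localhost"
        port      => 6379
        data_type => "channel"      # SUBSCRIBE instead of popping from a list
        key       => "redisdata"    # interpreted as the channel name in this mode
    }
}
```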
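  • filter plugins:
    • Used to process an event before it is sent out through an output.
    • grok: parses and structures text data; it is currently the go-to choice in Logstash for turning unstructured log data into structured, queryable data.
      • Commonly applied to syslog, apache, and nginx logs
      • Pattern definitions: /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns
Syntax:

%{SYNTAX:SEMANTIC}
    SYNTAX: the name of a predefined pattern
    SEMANTIC: a custom identifier for the matched text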

 

1.1.1.1 GET /index.html 30 0.23

%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

input {
    stdin {}
}

filter {
    grok {
        match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
}

output {
    stdout {
        codec 	=> rubydebug
    }
}


{
       "message" => "1.1.1.1 GET /index.html 30 0.23",
      "@version" => "1",
    "@timestamp" => "2018-01-27T02:13:52.558Z",
          "host" => "node1",
      "clientip" => "1.1.1.1",
        "method" => "GET",
       "request" => "/index.html",
         "bytes" => "30",
      "duration" => "0.23"
}  
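
In the output above, bytes and duration are strings, because grok captures are strings by default. grok also accepts an optional third part, %{SYNTAX:SEMANTIC:TYPE}, where the type can be int or float. A sketch of the same filter with conversions added:

```
filter {
    grok {
        # :int and :float convert the captured text to numbers
        match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
    }
}
```

With this in place, rubydebug should show bytes and duration as numbers rather than quoted strings.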

 

grok: dropping the original message field once it has been parsed
    remove_field  => "message"

grok {
        match => { "message" => "XXX" }
        remove_field  => "message"
}    

 

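Custom grok patterns:
    grok patterns are written as regular expressions, and their metacharacters differ little from those of other regex-based tools such as awk, sed, grep, and PCRE.

    In a patterns file:   PATTERN_NAME the pattern here
    Inline in a match:    (?<field_name>the pattern here)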
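
Custom patterns do not have to go into the bundled grok-patterns file; they can live in a separate directory that the filter loads through patterns_dir. The sketch below follows the example in the grok documentation; the ./patterns directory, the file name, and the POSTFIX_QUEUEID pattern are illustrative.

```
# contents of ./patterns/postfix (one "NAME regex" pair per line):
#   POSTFIX_QUEUEID [0-9A-F]{10,11}

filter {
    grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
    }
}
```

For a one-off pattern, the inline named-capture form (?<queue_id>[0-9A-F]{10,11}) can be written directly in the match string instead.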

 

# Matching Apache logs

input {
    file {
        path    => ["/var/log/httpd/access_log"]
        type    => "apachelog"
        start_position => "beginning"
    }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}

output {
    stdout {
        codec   => rubydebug
    }
}  

 

# Matching nginx logs

How to match nginx logs:
    Append the following to the end of /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns:

    NGUSERNAME [a-zA-Z\.\@\-\+_%]+
    NGUSER %{NGUSERNAME}
    NGINXACCESS %{IPORHOST:clientip} - %{NOTSPACE:remote_user} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NOTSPACE:http_x_forwarded_for}
    
input {
    file {
        path 	=> ["/var/log/nginx/access.log"]
        type	=> "nginxlog"
        start_position => "beginning"
    }
}

filter {
    grok {
        match => { "message" => "%{NGINXACCESS}" }
    }
}

output {
    stdout {
        codec	=> rubydebug
    }
} 
  • output plugins:
1. redis:
  Writes data to Redis; supports both Redis channels and lists.

  

input {
    file {
        path => ["/var/log/nginx/access.log"]
        type => "nginxlog"
        start_position => "beginning"
    }
}

output {
    redis {
        host    => "localhost"
        port 	=> "6379"
        data_type 	=> "list"
        key	=> "logstash-%{type}"
    }
}

  

2. elasticsearch:
    Writes data to an Elasticsearch cluster.

  

input {
    redis {
        host    => "localhost"
        port 	=> "6379"
        data_type 	=> "list"
        key	=> "logstash-nginxlog"
    }
}

output {
    elasticsearch {
        cluster    => "loges"
        index 	=> "logstash-%{+YYYY.MM.dd}"
    }
} 
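
The cluster option belongs to the older elasticsearch output, which joined the cluster by name. In later Logstash releases the output talks to Elasticsearch over HTTP and takes a hosts list instead. A sketch under that assumption, with localhost:9200 as a placeholder address:

```
output {
    elasticsearch {
        # placeholder address; list one or more Elasticsearch HTTP endpoints
        hosts => ["localhost:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
    }
}
```

Which form applies depends on the installed Logstash version, so it is worth checking the documentation of the elasticsearch output plugin for that release.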

  

posted @ 2018-02-02 14:37  evescn