Logstash Introduction and Configuration & Collecting Java Logs with Logstash
1. Introduction and Workflow
Logstash is developed in Ruby. Like Beats, Logstash is a data shipper, but it is more heavyweight and supports far more functionality.
1. Introduction
The official one-line description: transform and store your data.
Logstash is a free and open server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to your favorite "stash."
Logstash can dynamically ingest, transform, and ship data regardless of format or complexity: derive structure from unstructured data with Grok, decipher geo coordinates from IP addresses, anonymize or exclude sensitive fields, and ease overall processing.
2. Workflow
1. Input: ingest data of all shapes, sizes, and sources
Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs and can capture events from a multitude of common sources at the same time, easily ingesting from logs, metrics, web applications, data stores, and various AWS services in a continuous streaming fashion.
For the input plugins it supports, see: Input plugins.
2. Filter: parse and transform data in real time
As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for more powerful analysis and business value.
Logstash dynamically transforms and parses data regardless of format or complexity: derive structure from unstructured data with Grok, decipher geo coordinates from IP addresses, anonymize PII data and exclude sensitive fields entirely, and ease overall processing independent of data source, format, or schema.
With the rich library of filters and the versatile Elastic Common Schema, the possibilities are nearly limitless.
3. Output: choose your stash, ship your data
Elasticsearch is the preferred output and opens up a world of search and analytics possibilities, but it is not the only choice. Logstash offers many outputs, so you can route data wherever you like, with the flexibility to unlock a wide range of downstream use cases.
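Putting the three stages together, a pipeline is just a configuration file with input, filter, and output sections. Below is a minimal sketch (the added field name is purely illustrative, not from any example in this article) that reads lines from stdin, tags each event with a mutate filter, and prints the result to stdout:

input {
  stdin { }
}

filter {
  mutate {
    # add a literal field to every event (field name is illustrative)
    add_field => { "pipeline_demo" => "true" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}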
2. Download and Install
1. Download Logstash.
2. After extracting the archive, the directory layout is as follows:
3. Look at the logstash/config directory:
The sample configuration logstash-sample.conf is as follows:
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    #user => "elastic"
    #password => "changeme"
  }
}
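To try the sample, pass it to Logstash with the -f option, e.g. bin/logstash -f config/logstash-sample.conf; it will then sit listening on port 5044, waiting for a Beats client to connect.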
3. Getting Started
1. Collecting nginx access logs
We first collect via the console (stdin/stdout), which makes debugging easier.
(1) Look at two lines of the nginx access log (Git is installed, so Linux-style commands are available on Windows through Git Bash):
liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ pwd
/e/nginx/nginx-1.12.2/logs

liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ head -n 2 ./access.log
127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" 200 142 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
(2) Create logstash_nginx.conf in the $logstash/config/ directory with the following content:
input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] "%{WORD:request_action} %{DATA:request} HTTP/%{NUMBER:http_version}" %{NUMBER:response} %{NUMBER:bytes} "%{DATA:referrer}" "%{DATA:agent}"'
    }
  }
  date {
    match => [ "time", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }
  geoip {
    source => "remote_ip"
    target => "geoip"
  }
  useragent {
    source => "agent"
    target => "user_agent"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
grok: parses each unstructured log line into named, structured fields.
date: parses the captured time field and uses it to set @timestamp.
geoip: looks up the geographic location of the client IP. Note that private addresses such as 127.0.0.1 cannot be resolved, which is why the output below carries a _geoip_lookup_failure tag and an empty geoip object.
useragent: parses the raw user agent string into browser, OS, and device fields.
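Before piping real data through, the configuration syntax can be validated with ./bin/logstash -f ./config/logstash_nginx.conf --config.test_and_exit, which parses the file, reports any errors, and exits.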
(3) Test log collection:
liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ head -n 2 /e/nginx/nginx-1.12.2/logs/access.log | /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_nginx.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-23T12:31:18,218][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-23T12:31:18,857][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-23T12:31:25,229][INFO ][org.reflections.Reflections] Reflections took 122 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-23T12:31:36,465][INFO ][logstash.filters.geoip   ][main] Using geoip database {:path=>"E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.3-java/vendor/GeoLite2-City.mmdb"}
[2020-08-23T12:31:36,994][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2020-08-23T12:31:37,019][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_nginx.conf"], :thread=>"#<Thread:0x1e1b9b66 run>"}
[2020-08-23T12:31:40,502][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-23T12:31:40,731][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "142",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:48:00 +0800",
           "request" => "/Test.html",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:48:00.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] \"GET /Test.html HTTP/1.1\" 200 142 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "612",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:45:59 +0800",
           "request" => "/",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:45:59.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
[2020-08-23T12:31:43,718][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-23T12:31:44,453][INFO ][logstash.runner          ] Logstash shut down.
2. Collecting Java logs
Here we collect Java application logs into ES.
1. A Spring Boot web project ships logs directly to Logstash via Logback
(1) Configure Logstash to listen on TCP port 4560, then start it.
Create logstash_java.conf in the $logstash/config directory:
# Logstash configuration for a simple
# Logback (TCP) -> Logstash -> Elasticsearch pipeline.
input {
  tcp {
    mode => "server"
    host => "127.0.0.1"
    port => 4560
    codec => json_lines
  }
}

output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "springboot-logstash-%{+YYYY.MM.dd}"
  }
}
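While wiring up the Spring Boot side, it helps to see events as they arrive. A debugging variant of the output section (a sketch; remove the stdout output once everything works, as the file-collection example later in this article also does) prints each incoming event to the console in addition to indexing it:

output {
  stdout {
    # print each incoming event to the console while debugging
    codec => rubydebug
  }
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "springboot-logstash-%{+YYYY.MM.dd}"
  }
}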
(2) Start Logstash:
$ /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_java.conf
(3) Add the dependency to the Spring Boot project's pom.xml:
<!-- logstash -->
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>5.3</version>
</dependency>
(4) Create logback-spring.xml under src/main/resources:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml" />
    <appender name="LOGSTASH" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <!-- Logstash server address -->
        <destination>127.0.0.1:4560</destination>
        <!-- Log output encoder -->
        <encoder charset="UTF-8" class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp>
                    <timeZone>UTC</timeZone>
                </timestamp>
                <pattern>
                    <pattern>
                        {
                        "logLevel": "%level",
                        "serviceName": "${springAppName:-}",
                        "pid": "${PID:-}",
                        "thread": "%thread",
                        "class": "%logger{40}",
                        "rest": "%message"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>
    <root level="DEBUG">
        <appender-ref ref="LOGSTASH" />
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>
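Note how the two ends fit together: LogstashTcpSocketAppender writes each log event over the TCP connection as one JSON object per line, which is exactly the framing the json_lines codec on the tcp input expects. The keys defined in the <pattern> block (logLevel, serviceName, pid, thread, class, rest) become top-level fields of the documents indexed into ES.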
(5) Start the application and check the logs:
(6) Create an index pattern in Kibana, then analyze the data.
Step 1:
Step 2:
View:
3. Collecting log files generated by Java log4j
1. The log file format is as follows:
2020/08/13-13:09:09 [main] INFO com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE
2. Write the config file logstash_file.conf and test it with standard input/output.
input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }
  date {
    match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
    locale => en
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
The test looks like this:
$ head -n 2 /g/logs/test.log | ./bin/logstash -f ./config/logstash_file.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-25T20:44:25,769][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-25T20:44:26,519][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-25T20:44:33,369][INFO ][org.reflections.Reflections] Reflections took 220 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-25T20:44:40,149][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2020-08-25T20:44:40,189][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_file.conf"], :thread=>"#<Thread:0x5f39f2d0 run>"}
[2020-08-25T20:44:43,482][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-25T20:44:43,692][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => " com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] INFO com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "INFO"
}
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => "com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "DEBUG"
}
[2020-08-25T20:44:45,492][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-25T20:44:45,902][INFO ][logstash.runner          ] Logstash shut down.
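Notice that @timestamp came out as year 0020 and the captured time field is only "20/08/13-13:09:09": %{DATESTAMP} expects day/month layouts with a short year, so the unanchored grok match latched onto the tail of "2020", capturing "20/08/13" as day/month/two-digit-year, and the date filter then parsed "20" as the year. One possible fix (a sketch along the same lines, not verified against every log line) is to capture the timestamp with explicit date components and parse it with a matching Joda pattern; the same correction would apply to the file-based config in the next step:

filter {
  grok {
    match => {
      # capture year/month/day-time explicitly instead of relying on DATESTAMP
      "message" => '(?<time>%{YEAR}/%{MONTHNUM}/%{MONTHDAY}-%{TIME}) \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }
  date {
    match => [ "time", "yyyy/MM/dd-HH:mm:ss" ]
    locale => en
  }
}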
3. Modify logstash_file.conf to read the log file and write it into ES, creating an index.
input {
  file {
    path => "G:/logs/test.log"
    type => "testfile"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => {
      "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }
  date {
    match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
    locale => en
  }
}

output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["127.0.0.1:9200", "127.0.0.1:19200"]
    index => "testfile-%{+YYYY.MM.dd}"
    template_overwrite => true
  }
}
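One caveat with the file input: it records how far it has read in a "sincedb" file, and start_position => "beginning" only applies the first time a file is seen. If you rerun the test and want the file re-read from the start, disabling position tracking is a common trick; a sketch of the input block (on Windows, NUL plays the role of /dev/null):

input {
  file {
    path => "G:/logs/test.log"
    type => "testfile"
    start_position => "beginning"
    # discard read-position tracking so every run starts from the top (for testing only)
    sincedb_path => "NUL"
  }
}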
Run it as follows:
liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ ./bin/logstash -f ./config/logstash_file.conf
4. The index field mapping, as seen in Kibana, looks like this:
{ "mapping": { "_doc": { "properties": { "@timestamp": { "type": "date" }, "@version": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "host": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "logLevel": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "message": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "path": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "syslog_message": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "threadName": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "time": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }, "type": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } } } } } }
Summary:
For commonly used grok patterns, refer to the Alibaba documentation.