Logstash: Introduction, Configuration, and Collecting Java Logs with Logstash

1. Introduction and Workflow

   Logstash is written in Ruby (it runs on the JVM via JRuby). Like Beats, it is a data shipper, but Logstash is considerably more heavyweight and supports far more functionality.

1. Introduction

  The official description: transform and store your data.

  Logstash is a free and open server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to your favorite "stash".

  Logstash dynamically ingests, transforms, and ships data regardless of format or complexity. Derive structure from unstructured data with Grok, decipher geographic coordinates from IP addresses, anonymize or exclude sensitive fields, and ease overall processing.

2. Workflow

1. Input — ingest data of all shapes, sizes, and sources

  Data is often scattered or siloed across many systems in many formats. Logstash supports a wide variety of inputs and can capture events from numerous common sources simultaneously, ingesting data continuously from your logs, metrics, web applications, data stores, and various AWS services in a streaming fashion.

  For the input plugins it supports, see the input plugins reference.

2. Filter — parse and transform data in real time

  As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for easier, more powerful analysis and business value.

  Logstash dynamically transforms and parses data regardless of format or complexity: derive structure from unstructured data with Grok, decipher geographic coordinates from IP addresses, anonymize PII data and exclude sensitive fields entirely, and simplify overall processing, independent of the data source, format, or schema.

  The rich filter library, together with the versatile Elastic Common Schema, opens up a wealth of possibilities.

3. Output — choose your stash, ship your data

   Elasticsearch is the preferred output and unlocks endless possibilities for search and analytics, but it is not the only choice. Logstash offers many outputs, so you can route data wherever you want it and flexibly unlock a wide range of downstream use cases.

 

2. Download and Installation

1. Download Logstash from the Elastic website (https://www.elastic.co/downloads/logstash); this post uses version 7.6.2.

2. Extract the archive; the top-level directory contains bin/, config/, vendor/, and other folders.

3. Look at the logstash/config directory, which holds the bundled configuration files.

 

 The logstash-sample.conf sample configuration looks like this:

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    #user => "elastic"
    #password => "changeme"
  }
}
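
A pipeline file can be checked for syntax errors without actually starting it by using the --config.test_and_exit flag; a quick sketch, run from the Logstash root:

$ ./bin/logstash -f ./config/logstash-sample.conf --config.test_and_exit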

3. Getting Started

1. Collecting nginx access logs

  Collect to the console first; this makes debugging easier.

(1) Look at two lines of the nginx access log (I have git installed, so Linux-style commands are available on Windows):

liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ pwd
/e/nginx/nginx-1.12.2/logs

liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
$ head -n 2 ./access.log
127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" 200 142 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"

 

(2) Create logstash_nginx.conf under the $logstash/config/ directory with the following content:

input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] "%{WORD:request_action} %{DATA:request} HTTP/%{NUMBER:http_version}" %{NUMBER:response} %{NUMBER:bytes} "%{DATA:referrer}" "%{DATA:agent}"'
    }
  }

  date {
    match => [ "time", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }

  geoip {
    source => "remote_ip"
    target => "geoip"
  }

  useragent {
    source => "agent"
    target => "user_agent"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

  grok: parses the unstructured log line into named, structured fields.

  date: parses the captured time field and uses it to set @timestamp.

  geoip: derives a geographic location from the client IP address.

  useragent: parses the User-Agent string to identify the client's browser, OS, and device.
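
Optionally, once grok has extracted the named fields, the raw message line and the intermediate time field become redundant. A minimal sketch of dropping them with a mutate filter (my addition, not part of the config above; place it after the date filter, which still needs time):

filter {
  mutate {
    # Drop fields that are redundant once parsing has succeeded
    remove_field => [ "message", "time" ]
  }
}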

(3) Test the log collection:

liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ head -n 2 /e/nginx/nginx-1.12.2/logs/access.log | /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_nginx.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-23T12:31:18,218][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-23T12:31:18,857][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-23T12:31:25,229][INFO ][org.reflections.Reflections] Reflections took 122 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-23T12:31:36,465][INFO ][logstash.filters.geoip   ][main] Using geoip database {:path=>"E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.3-java/vendor/GeoLite2-City.mmdb"}
[2020-08-23T12:31:36,994][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-08-23T12:31:37,019][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_nginx.conf"], :thread=>"#<Thread:0x1e1b9b66 run>"}
[2020-08-23T12:31:40,502][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-23T12:31:40,731][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "142",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:48:00 +0800",
           "request" => "/Test.html",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:48:00.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] \"GET /Test.html HTTP/1.1\" 200 142 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
{
         "user_name" => "-",
          "@version" => "1",
              "host" => "root",
      "http_version" => "1.1",
             "bytes" => "612",
              "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "request_action" => "GET",
          "referrer" => "-",
             "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
          "response" => "200",
             "geoip" => {},
              "time" => "09/Mar/2018:17:45:59 +0800",
           "request" => "/",
         "remote_ip" => "127.0.0.1",
        "@timestamp" => 2018-03-09T09:45:59.000Z,
           "message" => "127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36\"\r",
        "user_agent" => {
           "minor" => "0",
           "build" => "",
            "name" => "Chrome",
        "os_major" => "8",
        "os_minor" => "1",
          "device" => "Other",
         "os_name" => "Windows",
           "major" => "64",
           "patch" => "3282",
              "os" => "Windows"
    }
}
[2020-08-23T12:31:43,718][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-23T12:31:44,453][INFO ][logstash.runner          ] Logstash shut down.
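
Both events carry the _geoip_lookup_failure tag because 127.0.0.1 is a loopback address, for which the bundled GeoLite2 database has no record; the geoip field stays empty. A sketch that skips the lookup for local and private addresses (an optional refinement, not in the config above):

filter {
  # Only attempt the geoip lookup for addresses that can be resolved
  if [remote_ip] !~ /^(127\.|10\.|192\.168\.)/ {
    geoip {
      source => "remote_ip"
      target => "geoip"
    }
  }
}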

2. Collecting Java logs

  Ship the Java application's logs into ES.

 1. A Spring Boot web project writing logs directly to Logstash via logback

(1) Configure Logstash to listen on TCP port 4560

Create logstash_java.conf in the $logstash/config directory:

# Logstash configuration for a simple
# TCP (JSON lines) -> Logstash -> Elasticsearch pipeline.

input {
  tcp {
    mode => "server"
    host => "127.0.0.1"
    port => 4560
    codec => json_lines
  }
}

output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "springboot-logstash-%{+YYYY.MM.dd}"
  }
}
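
While wiring this up, it can help to also echo events to the console. A variant of the output section with an added stdout output (optional; remove it once everything works):

output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "springboot-logstash-%{+YYYY.MM.dd}"
  }
  # Print each event to the console as well, for debugging
  stdout { codec => rubydebug }
}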

(2) Start Logstash

$ /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_java.conf

(3) Add the dependency to the Spring Boot project's pom.xml:

        <!--logStash -->
        <dependency>
            <groupId>net.logstash.logback</groupId>
            <artifactId>logstash-logback-encoder</artifactId>
            <version>5.3</version>
        </dependency>

(4) Create logback-spring.xml under src/main/resources:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml" />
    <appender name="LOGSTASH"
        class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <!-- Logstash server address -->
        <destination>127.0.0.1:4560</destination>
        <!-- Log output encoder -->
        <encoder charset="UTF-8"
            class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp>
                    <timeZone>UTC</timeZone>
                </timestamp>
                <pattern>
                    <pattern>
                        {
                        "logLevel": "%level",
                        "serviceName": "${springAppName:-}",
                        "pid": "${PID:-}",
                        "thread": "%thread",
                        "class": "%logger{40}",
                        "rest": "%message"
                        }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>

    <root level="DEBUG">
        <appender-ref ref="LOGSTASH" />
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>

(5) Start the application; log events now flow over TCP to Logstash and on into Elasticsearch.
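
A quick way to confirm that events are reaching Elasticsearch (assuming the default port and that curl is available) is to list the indices and look for springboot-logstash-*:

$ curl "http://127.0.0.1:9200/_cat/indices?v"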

(6) Create an index pattern in Kibana, then analyze the logs

Step 1: in Kibana, open Management → Index Patterns and create a pattern matching springboot-logstash-*.

Step 2: select @timestamp as the time filter field and finish creating the pattern.

View: the log events can now be browsed and filtered in Discover.

 

 3. Collecting log files generated by Java log4j

1. The log file format is as follows:

2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE

2. Write a conf file, logstash_file.conf, and test it with standard input/output

input {
  stdin { }
}

filter {
  grok {
    match => {
      "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }
  
  date {
    match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
    locale => en
  }  
}

output {
  stdout {
    codec => rubydebug
  }
}

Test as follows:

$ head -n 2 /g/logs/test.log | ./bin/logstash -f ./config/logstash_file.conf
Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-08-25T20:44:25,769][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-08-25T20:44:26,519][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-08-25T20:44:33,369][INFO ][org.reflections.Reflections] Reflections took 220 ms to scan 1 urls, producing 20 keys and 40 values
[2020-08-25T20:44:40,149][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-08-25T20:44:40,189][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_file.conf"], :thread=>"#<Thread:0x5f39f2d0 run>"}
[2020-08-25T20:44:43,482][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-08-25T20:44:43,692][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => " com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\\xiangmu\\icc-server\\trunk\\target\\classes started by Administrator in E:\\xiangmu\\icc-server\\trunk)\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "INFO"
}
{
        "@timestamp" => 0020-08-13T05:03:26.000Z,
              "time" => "20/08/13-13:09:09",
    "syslog_message" => "com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
          "@version" => "1",
           "message" => "2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE\r",
        "threadName" => "main",
              "host" => "root",
          "logLevel" => "DEBUG"
}
[2020-08-25T20:44:45,492][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-25T20:44:45,902][INFO ][logstash.runner          ] Logstash shut down.
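
Note the @timestamp values above: 0020-08-13. %{DATESTAMP} starts matching at the second pair of digits, so time is captured as "20/08/13-13:09:09" and the date filter then reads "20" as the year (even the odd minutes come from the historical local-mean-time offset applied to such an ancient date). A sketch of a filter that keeps the four-digit year (my own fix, not from the original setup):

filter {
  grok {
    match => {
      # Capture the full timestamp, four-digit year included
      "message" => '(?<time>%{YEAR}/%{MONTHNUM}/%{MONTHDAY}-%{TIME}) \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
    }
  }

  date {
    match => [ "time", "yyyy/MM/dd-HH:mm:ss" ]
    locale => en
  }
}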

 3. Modify logstash_file.conf to read the log file directly and also write the events into ES, creating an index

input {
    file {
        path => "G:/logs/test.log"
        type => "testfile"
        start_position => "beginning"
    }
}

filter {
    grok {
        match => {
            "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
        }
    }

    date {
        match => ["time", "YYYY/MM/dd-HH:mm:ss"]
        locale => en
    }
}

output {
    stdout {
        codec => rubydebug
    }
    elasticsearch {
        hosts => ["127.0.0.1:9200", "127.0.0.1:19200"] 
        index => "testfile-%{+YYYY.MM.dd}"
        template_overwrite => true
    }
}
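
One gotcha with the file input: start_position => "beginning" only applies the first time Logstash sees a file; on later runs it resumes from the offset recorded in its sincedb. For repeatable tests the sincedb can be discarded (a sketch; "NUL" is the Windows null device, use "/dev/null" on Linux):

input {
  file {
    path => "G:/logs/test.log"
    type => "testfile"
    start_position => "beginning"
    # Do not persist read offsets, so the file is re-read on every run
    sincedb_path => "NUL"
  }
}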

Run as follows:

liqiang@root MINGW64 /e/ELK/logstash-7.6.2
$ ./bin/logstash -f ./config/logstash_file.conf

 

4. The index field mapping shown in Kibana is below. Every extracted field is dynamically mapped as text with a keyword sub-field, so aggregations (e.g. counting by log level) should target the keyword variant such as logLevel.keyword:

{
  "mapping": {
    "_doc": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "host": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "logLevel": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "syslog_message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "threadName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "time": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "type": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

 

Summary:
  For commonly used grok patterns, see the Alibaba grok reference.
