Logstash: Getting Started and Architecture Overview

Pipeline

A Pipeline has three stages, and every Event flows through them in order: input / filter / output
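As a rough illustration of the three stages, a minimal pipeline could look like the sketch below. The field name "stage" and its value are made up for this example.

```conf
# Minimal three-stage pipeline (illustrative sketch).
input {
  stdin { }                        # stage 1: read Events from standard input
}

filter {
  mutate {
    # "stage" is a hypothetical field, added only to show the filter running
    add_field => { "stage" => "filtered" }
  }
}

output {
  stdout { codec => rubydebug }    # stage 3: pretty-print each Event
}
```

Saved under a name of your choice (for example pipeline.conf), it can be started with logstash -f pipeline.conf.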

 

Input Plugins

  • Stdin / File
  • Log4j / JDBC / Kafka
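A sketch of an input section that combines two of these plugins might look as follows. The log path, broker address, and topic name are assumptions for illustration. The Kafka option names follow recent versions of the plugin, so check them against the version you have installed.

```conf
input {
  # Tail log files; the file input expects absolute paths.
  file {
    path => "/var/log/app/*.log"            # hypothetical path
    start_position => "beginning"
  }

  # Consume messages from Kafka.
  kafka {
    bootstrap_servers => "localhost:9092"   # assumed broker address
    topics => ["app-logs"]                  # hypothetical topic name
    group_id => "logstash"
  }
}
```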

 

Output Plugins

Output plugins send Events to a specific destination. This is the last stage of the Pipeline.

Common Output Plugins

  • Elasticsearch
  • Kafka
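For illustration, an output section that writes to both destinations could be sketched like this. The index pattern, broker address, and topic name are assumptions, not values from the original article.

```conf
output {
  # Index each Event into Elasticsearch, one index per day.
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"      # hypothetical index pattern
  }

  # Publish the same Event to Kafka as JSON.
  kafka {
    bootstrap_servers => "localhost:9092"   # assumed broker address
    topic_id => "processed-logs"            # hypothetical topic name
    codec => json
  }
}
```

Every Event is delivered to each output listed in the section, so both destinations receive a copy.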

 

Codec Plugin

A Codec decodes raw data into Events on the input side and encodes Events into the target data format on the output side.

Built-in Codec plugins

  • Line / Multiline
  • Json / Avro
  • Dots / Rubydebug
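The decode and encode roles can be seen side by side in a small sketch like the one below, which assumes plain text on standard input.

```conf
input {
  stdin { codec => line }     # decode: each line of text becomes one Event
}

output {
  stdout { codec => json }    # encode: each Event is serialized as JSON
}
```

Swapping the output codec for rubydebug or dots changes only how the Event is rendered, not the Event itself.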

 

Filter Plugin

Filter plugins process Events.

Built-in Filter plugins

  • Mutate - modify Event fields
  • Metrics - aggregate metrics
  • Ruby - execute Ruby code
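A filter section using all three plugins might be sketched as follows. The field names "level" and "message_length" are hypothetical and chosen only for this example.

```conf
filter {
  # Mutate: normalize a field in place.
  mutate {
    lowercase => ["level"]                # hypothetical field
  }

  # Ruby: run arbitrary Ruby code against each Event.
  ruby {
    code => "event.set('message_length', event.get('message').to_s.length)"
  }

  # Metrics: periodically emit a separate Event carrying rate counters.
  metrics {
    meter => "events"
    add_tag => "metric"                   # tag that marks the generated metric Events
  }
}
```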

 

Queue

 

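In-Memory Queue (the default; a process crash or machine failure causes data loss)

Persistent Queue (Events are buffered on disk so they survive a restart; enabled by setting queue.type: persisted in logstash.yml)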

 

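Examples:

① Read single-line data and convert it into an Event. Because stdin uses the json codec here, each input line should be a JSON document.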

logstash -e "input{stdin{codec=>json}}output{stdout{codec=>rubydebug}}"

 

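② Read multi-line data. Any line that starts with whitespace is appended to the previous line, so an indented continuation becomes part of a single Event.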

multiline.conf

input {
  stdin {
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}


filter {}

output {
  stdout { codec => rubydebug }
}
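A common variant, sketched here under the assumption that every new log entry begins with an ISO 8601 timestamp, inverts the match with negate. Any line that does not start with a timestamp is then folded into the previous Event, which suits stack traces.

```conf
input {
  stdin {
    codec => multiline {
      # A line that does NOT start with a timestamp belongs to the previous Event.
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => "previous"
    }
  }
}
```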

 

③ Putting it all together

Download the CSV file from https://grouplens.org/datasets/movielens/

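input {
  file {
    # The file input requires an absolute path; replace this placeholder
    # with the location where movies.csv was saved.
    path => "/path/to/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}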
filter {
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }

  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host","@timestamp","message"]
  }

  mutate {
    split => ["content", "("]
    add_field => { "title" => "%{[content][0]}"}
    add_field => { "year" => "%{[content][1]}"}
  }

  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host","@timestamp","message","content"]
  }

}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "movies"
    document_id => "%{id}"
  }
  stdout {}
}
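One thing to watch for is that movies.csv starts with a header row (movieId,title,genres), which the pipeline above would index like any other line. A possible way to skip it, shown only as a sketch, is a conditional placed right after the csv filter. It relies on the column names id, content, and genre from the configuration above.

```conf
filter {
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }

  # Drop the header row: after parsing, its "id" column holds the literal text "movieId".
  if [id] == "movieId" {
    drop { }
  }
}
```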

 


Posted on 2020-11-01 15:39 by Lemo_wd
