Logstash
Logstash:
- Supports multiple data-acquisition mechanisms: TCP/UDP, files, syslog, Windows Event Logs, STDIN, and more; once data is acquired, it can filter and modify it;
- Written in JRuby and runs on the JVM;
- agent/server model
Configuration framework:

```
input {
  ...
}

filter {
  ...
}

output {
  ...
}
```
- Four plugin types:

```
input
filter
codec
output
```
- Data types:

```
Array:    [item1, item2, ...]
Boolean:  true, false
Bytes:    a quantity of bytes, e.g. "10MB"
Codec:    an encoder/decoder, e.g. rubydebug
Hash:     key => value
Number:   integer or floating point
Password: a string whose value is not logged in plain text
String:   a character string
Path:     a filesystem path
```
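As a sketch of how several of these value types look in a real configuration (the field names env, temp1, and temp2 are made up for illustration), a mutate filter might take a Hash, an Array, and Strings as option values:

```
filter {
  mutate {
    # Hash: key => value
    add_field => { "env" => "prod" }
    # Array: [item1, item2, ...]
    remove_field => [ "temp1", "temp2" ]
  }
}
```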
- Field references:

```
[fieldname]
```
- Conditionals:

```
==, !=, <, <=, >, >=
=~, !~
in, not in
and, or
()
```
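These operators are typically combined with field references inside conditionals to route events; a minimal sketch, assuming the type values used elsewhere in these notes:

```
output {
  # [type] is a field reference; == and "in" are comparison operators
  if [type] == "apachelog" or [type] in [ "nginxlog", "system" ] {
    stdout {
      codec => rubydebug
    }
  }
}
```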
Logstash workflow:

input | filter | output, which works like a pipeline; if the data needs no extra processing, the filter stage can be omitted;

```
input {
  stdin {}
}

output {
  stdout {
    codec => rubydebug
  }
}
```
Logstash plugins:
- input plugins:

1. file: reads an event stream from the specified files;
   uses FileWatch (a Ruby gem) to monitor files for changes.
   .sincedb: records each monitored file's inode, major number, minor number, and pos (read position);

```
input {
  file {
    path => [ "/var/log/messages" ]
    type => "system"
    start_position => "beginning"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```
2. udp: reads messages from the network over the UDP protocol; its required parameter is port, which sets the port to listen on, while host sets the address to listen on;

collectd: a performance-monitoring daemon that can send the local host's performance metrics to Logstash over UDP;

On CentOS 7 (EPEL repository):

```
# yum install collectd -y
# vim /etc/collectd.conf
Hostname "node3.magedu.com"
LoadPlugin syslog
LoadPlugin cpu
LoadPlugin df
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin network
<Plugin network>
  <Server "172.16.100.70" "25826">  # 172.16.100.70 is the Logstash host; 25826 is its UDP listening port
  </Server>
</Plugin>
Include "/etc/collectd.d"
# systemctl start collectd.service
```

On the Logstash side:

```
input {
  udp {
    port => 25826
    codec => collectd {}
    type => "collectd"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```
3. redis plugin: reads data from Redis; supports both Redis channels and lists;

```
input {
  redis {
    host => "localhost"
    port => "6379"
    data_type => "list"
    key => "redisdata"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```
- filter plugins:
- Used to apply processing to an event before it is emitted through output.
- grok: parses and structures text data; it is currently the tool of choice in Logstash for turning unstructured log data into structured, queryable data.
- syslog, apache, nginx
- Pattern definitions are located at: /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns
Syntax:

```
%{SYNTAX:SEMANTIC}
```

- SYNTAX: the name of a predefined pattern;
- SEMANTIC: a custom identifier for the matched text;

For example, given this sample line and matching pattern:

```
1.1.1.1 GET /index.html 30 0.23
%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
```

```
input {
  stdin {}
}

filter {
  grok {
    match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```

Output:

```
{
       "message" => "1.1.1.1 GET /index.html 30 0.23",
      "@version" => "1",
    "@timestamp" => "2018-01-27T02:13:52.558Z",
          "host" => "node1",
      "clientip" => "1.1.1.1",
        "method" => "GET",
       "request" => "/index.html",
         "bytes" => "30",
      "duration" => "0.23"
}
```
grok: to discard the raw message field after parsing, add remove_field => "message":

```
grok {
  match => { "message" => "XXX" }
  remove_field => "message"
}
```
Custom grok patterns:
grok patterns are written as regular expressions, and their metacharacters differ little from those of other regex-based tools (awk/sed/grep/pcre).

```
PATTERN_NAME the pattern here       # in a patterns file
(?<field_name>the pattern here)     # inline, as a named capture
```
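As a sketch of the inline named-capture form (the field name session_id and the sample regex are made up for illustration):

```
filter {
  grok {
    # capture eight hex characters from the message into a "session_id" field
    match => { "message" => "(?<session_id>[0-9a-f]{8})" }
  }
}
```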
Matching the Apache access log:

```
input {
  file {
    path => [ "/var/log/httpd/access_log" ]
    type => "apachelog"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```
Matching the nginx log:
Append the following to the end of the /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns/grok-patterns file:

```
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - %{NOTSPACE:remote_user} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NOTSPACE:http_x_forwarded_for}
```

```
input {
  file {
    path => [ "/var/log/nginx/access.log" ]
    type => "nginxlog"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{NGINXACCESS}" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```
- output plugins:

1. redis plugin: writes data to Redis; supports both Redis channels and lists;

```
input {
  file {
    path => [ "/var/log/nginx/access.log" ]
    type => "nginxlog"
    start_position => "beginning"
  }
}

output {
  redis {
    host => "localhost"
    port => "6379"
    data_type => "list"
    key => "logstash-%{type}"
  }
}
```
2. elasticsearch plugin: writes data to an Elasticsearch cluster;

```
input {
  redis {
    host => "localhost"
    port => "6379"
    data_type => "list"
    key => "logstash-nginxlog"
  }
}

output {
  elasticsearch {
    cluster => "loges"
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
```