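Sending logs directly from Fluentd to Elasticsearch
Official documentation: https://docs.fluentd.org/output/elasticsearch
td-agent v3.0.1 and later bundle the out_elasticsearch plugin, so it can be used without a separate installation.
If you are running plain Fluentd, install the plugin yourself: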
$ fluent-gem install fluent-plugin-elasticsearch
Configuration example
<match my.logs>
@type elasticsearch
host localhost
port 9200
logstash_format true
</match>
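Parameters
- @type: required; must be elasticsearch
- host: optional; the Elasticsearch address to connect to. Defaults to localhost
- port: optional; the port Elasticsearch listens on. Defaults to 9200
- hosts: optional; used to connect to multiple Elasticsearch nodes. When it is set, host and port are ignored. Usage: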
hosts host1:port1,host2:port2,host3:port3
# or
hosts https://customhost.com:443/path,https://username:password@host-failover.com:443
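- user: optional; defaults to nil
- password: optional; defaults to nil
- scheme: optional; the connection protocol. Defaults to http
- path: optional; the Elasticsearch REST API endpoint that write requests are sent to. Defaults to nil
- index_name: optional; the name of the index to write to. Defaults to fluentd. Examples: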
# index by tags
index_name fluentd.${tag}
# by tags and timestamps
# this form also requires tag and time to be set as chunk keys in the <buffer> section, as shown below:
index_name fluentd.${tag}.%Y%m%d
<match my.logs>
@type elasticsearch
host localhost
port 9200
# fluentd.${tag}.%Y%m%d expands to, for example, fluentd.my.logs.20201105
index_name fluentd.${tag}.%Y%m%d
<buffer tag,time>
timekey 1m
</buffer>
</match>
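- logstash_format: optional; defaults to false. When true, the index name takes the form logstash-%Y.%m.%d, and this setting takes precedence over index_name
- logstash_prefix: optional; the index name prefix used when logstash_format is true. Defaults to logstash
- @log_level: optional; the log level. Accepted values are fatal, error, warn, info, debug, trace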
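To make the two index naming schemes above concrete, here is a minimal Python sketch of how the resulting names come out. It is only an illustration, not the plugin's code: the function names expand_index_name and logstash_index_name are made up for this example, and the caller is assumed to pass in the event time.

```python
from datetime import datetime, timezone


def expand_index_name(template: str, tag: str, event_time: datetime) -> str:
    """Illustrative stand-in for the index_name placeholder expansion."""
    # Expand the strftime-style time placeholders first, then the tag,
    # so a tag containing '%' is never misread as a time format.
    with_time = event_time.strftime(template)
    return with_time.replace("${tag}", tag)


def logstash_index_name(event_time: datetime, prefix: str = "logstash") -> str:
    """Name produced when logstash_format is true: <prefix>-%Y.%m.%d."""
    return f"{prefix}-{event_time.strftime('%Y.%m.%d')}"


if __name__ == "__main__":
    when = datetime(2020, 11, 5, tzinfo=timezone.utc)
    print(expand_index_name("fluentd.${tag}.%Y%m%d", "my.logs", when))
    print(logstash_index_name(when))
```

With the tag my.logs and an event on 2020-11-05, the first call gives the same fluentd.my.logs.20201105 name shown in the example config.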
Other notes
%{} style placeholders can be used to escape characters that require URL encoding.
For example:
# valid configuration
user %{demo+}
password %{@secret}
hosts https://%{j+hn}:%{passw@rd}@host1:443/elastic/,http://host2
# invalid configuration
user demo+
password @secret
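The effect of the %{} placeholders can be approximated in a few lines of Python. This is a sketch under assumptions, not the plugin's implementation: escape_placeholders is a hypothetical helper, and urllib.parse.quote is used as a stand-in for whatever URL encoding routine the plugin actually calls, so edge cases such as spaces may be encoded differently.

```python
import re
from urllib.parse import quote

# Matches a %{...} placeholder and captures the text inside the braces.
_PLACEHOLDER = re.compile(r"%\{(.*?)\}")


def escape_placeholders(value: str) -> str:
    """Replace each %{...} with the percent-encoded form of its content."""
    return _PLACEHOLDER.sub(lambda m: quote(m.group(1), safe=""), value)


if __name__ == "__main__":
    print(escape_placeholders("%{demo+}"))
    print(escape_placeholders("%{@secret}"))
    print(escape_placeholders("https://%{j+hn}:%{passw@rd}@host1:443/elastic/,http://host2"))
```

Text outside the placeholders is left alone, which is why the unescaped demo+ and @secret values would end up in the URL as-is and break it.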
Real-world example
Collecting openresty (nginx) logs
# cat /etc/td-agent/td-agent.conf
<source>
@type tail
@id input_tail
<parse>
@type nginx
</parse>
path /usr/local/openresty/nginx/logs/host.access.log
tag td.nginx.access
</source>
<match td.nginx.access>
@type elasticsearch
host localhost
port 9200
index_name fluentd.${tag}.%Y%m%d
<buffer tag,time>
timekey 1m
</buffer>
</match>
About the log parsing done by @type nginx
Official documentation: https://docs.fluentd.org/parser/nginx
The regular expression used:
expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
time_format %d/%b/%Y:%H:%M:%S %z
remote, host, user, method, path, code, size, referer, agent and http_x_forwarded_for are all included in the record; time is used as the event time.
# log line
127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0" -
# parsed result
time:
1362020400 (28/Feb/2013:12:00:00 +0900)
record:
{
"remote" : "127.0.0.1",
"host" : "192.168.0.1",
"user" : "-",
"method" : "GET",
"path" : "/",
"code" : "200",
"size" : "777",
"referer" : "-",
"agent" : "Opera/12.0",
"http_x_forwarded_for": "-"
}
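The parsing above can be reproduced outside Fluentd with a short Python sketch. The regular expression is the one quoted earlier, with the Ruby-style (?<name>...) groups rewritten as Python's (?P<name>...); parse_nginx_line is a hypothetical helper name, and the code only mimics the parser's result, it is not how Fluentd implements it.

```python
import re
from datetime import datetime

# Same expression as the nginx parser, using Python named-group syntax.
NGINX_PATTERN = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^\"]*?)(?: +\S*)?)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^\"]*)" "(?P<agent>[^\"]*)"'
    r'(?:\s+(?P<http_x_forwarded_for>[^ ]+))?)?$'
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_nginx_line(line: str):
    """Return (event_time_epoch, record), or None if the line does not match."""
    match = NGINX_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()  # optional groups that did not match are None
    # The time field becomes the event time and is dropped from the record.
    event_time = int(datetime.strptime(record.pop("time"), TIME_FORMAT).timestamp())
    return event_time, record


if __name__ == "__main__":
    sample = ('127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] '
              '"GET / HTTP/1.1" 200 777 "-" "Opera/12.0" -')
    print(parse_nginx_line(sample))
```

Running it on the sample log line gives the event time 1362020400 and the same ten string fields listed in the record above.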
If you leave this parameter out, for example by deleting
<parse>
@type nginx
</parse>
Fluentd reports an error on startup:
<parse> section is required
The <parse> section cannot be dropped, so if you do not want nginx parsing, use none in its place:
<parse>
@type none
</parse>
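For comparison with the nginx parser, the none parser does not split the line into fields at all: the whole line is stored under a single key, which is message by default. A minimal Python sketch of that behaviour follows; parse_none is a hypothetical name used only for this illustration.

```python
def parse_none(line: str, message_key: str = "message") -> dict:
    """Mimic @type none: keep the raw line under one key, with no field extraction."""
    return {message_key: line}


if __name__ == "__main__":
    sample = ('127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] '
              '"GET / HTTP/1.1" 200 777 "-" "Opera/12.0" -')
    print(parse_none(sample))
```

The record sent to Elasticsearch then has only that one field, so per-field queries on code, path and so on are not possible without further processing.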