ELK: Log Collection, Storage, Analysis, and Display

20180813 Chenxin
Simple ELK setup: https://www.cnblogs.com/huangxincheng/p/7918722.html
Simple Java log collection: https://blog.csdn.net/bluetjs/article/details/78770447
ELK setup, advanced: https://www.cnblogs.com/yuhuLin/p/7018858.html
Elasticsearch-Head: https://www.sojson.com/blog/85.html A plugin that can create, read, update, and delete Elasticsearch data (not recommended for production)
Logstash best practices: https://doc.yonyoucloud.com/doc/logstash-best-practice-cn/get_start/hello_world.html

Installation and Configuration
logstash: collects logs locally and sends them to elasticsearch
elasticsearch: receives the logs sent by logstash, then indexes and stores them
kibana: visualizes the data in elasticsearch

Add a group and user
The elasticsearch process is not allowed to run as root, so create a dedicated account:
groupadd elasticsearch
useradd elasticsearch -g elasticsearch -p elasticsearch
echo "xxx" | passwd --stdin elasticsearch

Download the software (from the official site, https://www.elastic.co/cn/ ) and install it
cd /opt/
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.3.2-linux-x86_64.tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.tar.gz
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.3.2.tar.gz
tar xzvf elasticsearch-6.3.2.tar.gz
tar xzvf logstash-6.3.2.tar.gz
tar xzvf kibana-6.3.2-linux-x86_64.tar.gz
mv elasticsearch-6.3.2 /usr/local/elasticsearch
mv logstash-6.3.2 /usr/local/logstash
mv kibana-6.3.2-linux-x86_64 /usr/local/kibana

logstash (listens on port 9600 by default)
Configuration
cd logstash/config
vim logstash.conf (a new file; for automated setup see the scripts at the bottom)
[elasticsearch@MiWiFi-R3P-srv config]$ cat logstash.conf
See the logstash.conf contents at the bottom of this page.
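The referenced file is not actually reproduced below. Purely as a hedged sketch (assuming logs live under /usr/local/log_test and Elasticsearch is reachable at 10.0.0.201:9200, the address used later in this document; the type tag and index naming pattern are assumptions), such a logstash.conf could be created like this:

cat > /usr/local/logstash/config/logstash.conf <<'EOF'
input {
  file {
    path => "/usr/local/log_test/*/*.log"   # adjust to the real log location
    start_position => "beginning"
    type => "app_log"                       # hypothetical tag; used in the index name below
  }
}
output {
  elasticsearch {
    hosts => ["10.0.0.201:9200"]            # assumed ES address
    index => "%{type}-%{+YYYY.MM.dd}"       # one index per type and day (assumed naming)
  }
  stdout { }                                # optional: also print events while debugging
}
EOF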

Collecting all logs in a directory with logstash (single-level / multi-level directories)
Single-level directory
input {
  file {
    path => "/usr/local/log_test/*.log"
    start_position => "beginning"
  }
}
output {
  stdout { }  # standard output; events are printed to the screen
}
Multi-level directories
Create the config file and set the path in the /*/*.log form; each /*/ matches one directory level, so for deeper trees add more /*/ segments.
input {
  file {
    path => "/usr/local/log_test/*/*.log"
    start_position => "beginning"
  }
}
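To confirm the glob actually matches, a quick smoke test (assuming logstash is already running with the config above and printing to stdout) might look like:

mkdir -p /usr/local/log_test/app1
echo "hello logstash $(date)" >> /usr/local/log_test/app1/test.log
# the new line should show up in logstash's stdout within a few seconds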
Notes on the other configuration files
logstash.yml: where logstash's bind IP address and port can be configured
jvm.options: JVM options used when starting logstash (1 GB heap by default)
startup.options: only applies to yum-based installs; not relevant here
log4j2.properties: the configuration file for Log4j 2 (Log for Java, version 2), a standard logging library released by Apache; it can be ignored here.

Start logstash
./bin/logstash -f config/logstash.conf
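The configuration can also be syntax-checked first, and the running instance probed through its monitoring port (9600); a possible check, assuming the default monitoring settings:

./bin/logstash -f config/logstash.conf --config.test_and_exit   # validate the config only, then exit
curl -s 'http://127.0.0.1:9600/?pretty'                         # node info from the monitoring API once logstash is up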

elasticsearch (listens on ports 9200 and 9300 by default)
Configure and start elasticsearch (it cannot be started as root)
The default configuration is used without modification.
cd elasticsearch
cd bin/
nohup ./elasticsearch > nohup.out 2>&1 &

Notes on the other configuration files
elasticsearch.yml: sets the bind IP, port, and data path; this must be adjusted for production.
Visit http://192.168.31.129:9200/ and a JSON document with node information is displayed.
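Besides opening that URL in a browser, the node and cluster state can be checked from the shell, for example (assuming the same address):

curl -s 'http://192.168.31.129:9200/_cluster/health?pretty'   # cluster status, node and shard counts
curl -s 'http://192.168.31.129:9200/_cat/nodes?v'             # one line per node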

Error handling
cat /etc/sysctl.conf
vm.max_map_count=262144
sysctl -p

cat /etc/security/limits.conf

* soft nproc 65536
* hard nproc 65536
* soft nofile 65536
* hard nofile 65536
Log out of the current terminal and log back in for the new limits to take effect.
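After logging back in, the new values can be verified with standard commands, for example:

ulimit -n                  # expect 65536 (open files)
ulimit -u                  # expect 65536 (max user processes)
sysctl vm.max_map_count    # expect 262144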

bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks # reported when binding to a non-localhost address (it can also be left alone).
To address it, add one parameter to the configuration file:
vim /etc/elasticsearch/elasticsearch.yml
bootstrap.system_call_filter: false

kibana (listens on port 5601 by default)
Configure and start kibana
vim kibana/config/kibana.yml
cat kibana.yml |grep -v "#"
server.host: 0.0.0.0
elasticsearch.url: "http://localhost:9200"

cd bin/
nohup ./kibana >nohup.out 2>&1 &
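A quick way to confirm Kibana is listening before opening the browser (assuming it runs on this host) is something like:

curl -sI 'http://127.0.0.1:5601' | head -n 1   # any HTTP status line means the service is up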
Using the Kibana web UI
logstash ships the configured logs to ES, and Kibana reads the data from ES.
Open the Kibana page at http://192.168.31.129:5601 , go to Management and create the matching index pattern first, then go to Discover to browse the contents of that index.
It supports pattern/wildcard queries (for example message : * ) and time-range selection (for example the last 15 minutes).

Notes on login authentication
Paid authentication
Built-in user authentication in ELK is a paid feature (since version 5.0).
If a license has been purchased, proceed as follows.
Edit the configuration file kibana.yml:
elasticsearch.username: "elastic"
elasticsearch.password: "xxx"
After restarting Kibana, run the following curl to update the password in elasticsearch (the default username is elastic and the default password is changeme):
curl -H "Content-Type: application/json" -XPUT -u elastic '192.168.31.129:9200/_xpack/security/user/kibana/_password' -d '{
"password" : "xxx"
}'
For free (unlicensed) users, the call fails with an error like the following:

Free authentication
An nginx reverse proxy can be used instead: nginx sits in front, with elasticsearch and kibana behind it, and nginx basic authentication provides the access control (though if the backend machines' IPs are reachable from outside, they can presumably still be accessed directly).
https://www.cnblogs.com/configure/p/7607302.html
https://birdben.github.io/2017/02/08/Kibana/

Install nginx:
yum -y install nginx
Install the Apache password generation tool:
yum install -y httpd-tools

Generate the password file:
mkdir -p /etc/nginx/passwd; htpasswd -c -b /etc/nginx/passwd/kibana.passwd yunwei xxx

Configure nginx:
vim /etc/nginx/nginx.conf and add the following:
server {
    listen 10.0.0.30:18080;    # externally exposed port
    auth_basic "Kibana Auth";
    auth_basic_user_file /etc/nginx/passwd/kibana.passwd;
    location / {
        proxy_pass http://10.0.0.30:25601;
        proxy_redirect off;
    }
}

Modify the Kibana configuration file:
vim /usr/local/elk/kibana/config/kibana.yml
server.port: 25601
server.host: "10.0.0.30"
elasticsearch.url: "http://10.0.0.30:9200"
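After changing both configs, a possible way to apply and verify them (the username/password are the ones created with htpasswd above):

nginx -t                                           # check the nginx config syntax
service nginx restart                              # or reload, if nginx is already running
curl -u yunwei:xxx -I 'http://10.0.0.30:18080/'    # should reach Kibana through the proxy
curl -I 'http://10.0.0.30:18080/'                  # without credentials, expect 401 Unauthorized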

Notes on ES metadata
View ES node information: http://192.168.31.129:9200/
List the log indices: http://192.168.31.129:9200/_cat/indices

Periodically cleaning up ES data
First list the index names:
curl http://192.168.31.129:9200/_cat/indices
curl http://10.0.0.201:9200/_cat/indices

Then run the delete (note the date format):
curl -X DELETE http://192.168.31.129:9200/system`date +%Y.%m.%d -d "-0 days"` # delete today's index whose name starts with "system"
Return value: {"acknowledged":true}
curl -X DELETE http://192.168.31.129:9200/system`date +%Y.%m.%d -d "-1 days"` # delete yesterday's

e.g. delete all logstash data for the 19th:
curl -X DELETE 'http://127.0.0.1:9200/logstash-2017.06.19'
e.g. delete the data from 2 months ago:
_last_data=`date -d '-2 months' +%Y.%m`
curl -X DELETE 'http://127.0.0.1:9200/*-'${_last_data}'-*'
Delete a specific month:
curl -X DELETE 'http://10.0.0.201:9200/*-2018.09*'

Delete the log index from exactly 30 days ago (crontab):
10 18 * * * /usr/bin/curl -X DELETE http://10.0.0.201:9200/*-`date +%Y.%m.%d -d "-30 days"`* # seems to have problems taking effect from cron
10 17 * * * /usr/bin/curl -X DELETE "http://10.0.0.201:9200/*-`date +%Y.%m.%d -d \"-15 days\"`*" # added later; wait a while and then check whether it takes effect from cron
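A likely reason the date-based entries misbehave in cron is that % is a special character in crontab (everything after an unescaped % is treated as input to the command), so date +%Y.%m.%d gets cut off there. Either escape each % as \% in the crontab line, or move the logic into a small script; a sketch of the latter, assuming the <name>-YYYY.MM.dd index naming and the ES address above:

cat > /usr/local/bin/es_clean.sh <<'EOF'
#!/bin/bash
# Delete the indices dated exactly 30 days ago (pattern *-YYYY.MM.dd*).
DAY=$(date +%Y.%m.%d -d "-30 days")
curl -s -X DELETE "http://10.0.0.201:9200/*-${DAY}*"
EOF
chmod 755 /usr/local/bin/es_clean.sh
# crontab entry: 10 18 * * * /usr/local/bin/es_clean.sh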

Can the logs be displayed by IP or some other key, to make it easy to tell which machine they came from?
Should this be done in logstash/config/logstash.conf with something like type => "app1"? See the sketch below.
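A hedged sketch of that approach: tag each input with a type and let the type show up in the index name; the file input normally also adds a host field with the machine's hostname, which can be used as a Kibana filter to tell machines apart. The path, type value, and index pattern below are assumptions:

cat > /usr/local/logstash/config/logstash.conf <<'EOF'
input {
  file {
    path => "/usr/local/log_test/*/*.log"
    start_position => "beginning"
    type => "app1"                     # per-application (or per-host) tag
  }
}
output {
  elasticsearch {
    hosts => ["10.0.0.201:9200"]
    index => "%{type}-%{+YYYY.MM.dd}"  # the tag becomes part of the index name
  }
}
EOF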

The following covers automation improvements and can be skipped.
Backup of the configuration files and scripts (as implemented on the test server), 2018-08-14
logstash configuration file:

ES configuration file:

Also, reduce the JVM heap settings in the jvm configuration file.

Kibana configuration file:

nginx configuration file on the ES host:

Start/stop scripts for the three services:

Configuration on the ES and Kibana machine (the two services run on the same host)
Start ES from root:
sudo -u elasticsearch -E /usr/local/elasticsearch/bin/elasticsearch.sh --start # here -E means the restricted sudoers environment (the most locked-down set of variables) is not used; root's current environment is kept instead.

Adding it to /etc/rc.local
JAVA_HOME must be given explicitly, because rc.local runs after the OS has fully booted but before any login shell is started, so the environment variables configured in /etc/profile or bashrc have not been applied yet and are not visible while rc.local executes. Add the following to rc.local:
export JAVA_HOME=/usr/local/jvm;sudo -u elasticsearch -E /usr/local/elasticsearch/bin/elasticsearch.sh --start
/usr/local/kibana/bin/kibana.sh --start
service nginx start
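Note that on CentOS/RHEL 7, /etc/rc.local is only executed at boot if the underlying file is executable; if these entries do not run after a reboot, that is the first thing to check:

chmod +x /etc/rc.d/rc.local   # /etc/rc.local is a symlink to this file on CentOS 7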

For convenient file distribution, an nginx download server can be set up:
server {
    listen 18081;              # port
    server_name 10.0.0.201;    # server name
    charset utf-8;             # avoid garbled Chinese filenames
    root /opt/download;        # root directory to index; change to your own and make sure it exists
    location / {
        autoindex on;              # enable directory listing
        autoindex_exact_size off;  # show approximate sizes (KB/MB/GB) instead of exact bytes
        autoindex_localtime on;    # show local time instead of GMT
    }
}

Deploy the logstash service to all game servers
Download, extract, and place it under /usr/local/; replace jvm.options, logstash.yml, and logstash.conf; add logstash.sh; add an entry to /etc/rc.local.

cd /opt/
wget http://13.251.64.203:18081/logstash-6.3.2.tar.gz
tar xzvf logstash-6.3.2.tar.gz
mv logstash-6.3.2 /usr/local/logstash
cd /usr/local/logstash/config/
mv jvm.options logstash.yml /home/admin/
wget http://13.251.64.203:18081/jvm.options
wget http://13.251.64.203:18081/logstash.yml
wget http://13.251.64.203:18081/logstash.conf
cd /usr/local/logstash/bin/;
wget http://13.251.64.203:18081/logstash.sh
chmod 755 logstash.sh
echo "/usr/local/logstash/bin/logstash.sh --start" >> /etc/rc.local
cd /usr/local/logstash/config/

Modify the contents of logstash.conf as needed.
Then add the corresponding index pattern in Kibana.

Error handling
Because the ES host was originally a low-spec instance, its remaining CPU credit was frequently 0.
This eventually blocked log delivery from logstash (the problem was on the ES receiving side: some machines were blocked, while a few could still ship their logs to ES).
It was resolved by upgrading the ES instance to t3.medium and the disk to 50 GB,
and then restarting the java (logstash) process on each logstash host.

[root@ip-10-30-0-100 ~]# /usr/local/logstash/bin/logstash.sh --stop

[root@ip-10-30-0-100 ~]# vim /usr/local/logstash/logs/logstash-plain.log
Error log before the restart:
[2018-09-25T11:53:24,302][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow
delete (api)];"})
[2018-09-25T11:53:24,302][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
[2018-09-25T11:53:24,302][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow
delete (api)];"})
[2018-09-25T11:53:24,302][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

Normal log messages after the restart:
[2018-09-25T11:54:01,275][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-09-25T11:54:01,749][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.3.2"}
[2018-09-25T11:54:04,007][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-09-25T11:54:04,424][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://10.0.0.201:9200/]}}
[2018-09-25T11:54:04,433][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.0.0.201:9200/, :path=>"/"}
[2018-09-25T11:54:04,749][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://10.0.0.201:9200/"}
[2018-09-25T11:54:04,872][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-09-25T11:54:04,875][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}
[2018-09-25T11:54:04,901][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//10.0.0.201:9200"]}
[2018-09-25T11:54:04,924][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-09-25T11:54:04,926][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://10.0.0.201:9200/]}}
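For reference, the FORBIDDEN/12/index read-only block seen in the error log above is what Elasticsearch applies when a node hits the disk flood-stage watermark; after freeing or enlarging the disk, in 6.x it usually has to be cleared manually, along the lines of:

curl -H 'Content-Type: application/json' -X PUT 'http://10.0.0.201:9200/_all/_settings' -d '{"index.blocks.read_only_allow_delete": null}'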
