Open-falcon Monitoring
https://book.open-falcon.org/zh_0_2/
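This document records the deployment procedure for the open-falcon v2 monitoring system on CentOS 7.4, along with a few points that need attention.
Environment preparation
Install Redis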
0.InstallRedis3.2.sh redis-3.2.3.tar.gz
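Install MySQL
(Omitted.)
Set the host of the root account to % so that it can connect from any host, symlink /tmp/mysql.sock to /app/mysqldata/3306/mysql.sock, and set the parameter bind_address=127.0.0.1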
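The socket symlink from the step above can be created with a small helper. This is only a sketch: the function name link_mysql_socket is hypothetical, and the paths in the usage comment are the ones stated in this document.

```shell
# link_mysql_socket REAL_SOCKET LINK_PATH
# Make LINK_PATH a symlink to REAL_SOCKET, replacing an existing link if present.
# (Hypothetical helper name; not part of MySQL or open-falcon.)
link_mysql_socket() {
    local real_socket link_path
    real_socket="$1"
    link_path="$2"
    # -s: symbolic link, -f: replace an existing entry, -n: do not follow an existing link
    ln -sfn "$real_socket" "$link_path"
}

# Usage with the paths from this document:
#   link_mysql_socket /app/mysqldata/3306/mysql.sock /tmp/mysql.sock
```

Because of -f and -n, running the helper a second time simply recreates the same link instead of failing.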
Initialize the MySQL schema
cd /tmp/ && git clone https://github.com/open-falcon/falcon-plus.git
cd /tmp/falcon-plus/scripts/mysql/db_schema/
mysql -h 127.0.0.1 -uroot -pmsds007 < 1_uic-db-schema.sql
mysql -h 127.0.0.1 -uroot -pmsds007 < 2_portal-db-schema.sql
mysql -h 127.0.0.1 -uroot -pmsds007 < 3_dashboard-db-schema.sql
mysql -h 127.0.0.1 -uroot -pmsds007 < 4_graph-db-schema.sql
mysql -h 127.0.0.1 -uroot -pmsds007 < 5_alarms-db-schema.sql
rm -rf /tmp/falcon-plus/
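The five schema imports above follow one pattern, so they can also be written as a loop. The sketch below assumes the numeric file-name prefixes give the intended load order; load_schemas is a hypothetical helper name, and the command to run is passed in as arguments.

```shell
# load_schemas DIR COMMAND [ARGS...]
# Feed every *.sql file in DIR to COMMAND on stdin, in sorted (numeric-prefix) order.
# Stops at the first failure so a later schema is never loaded after a failed one.
load_schemas() {
    local dir f
    dir="$1"
    shift
    for f in "$dir"/*.sql; do
        [ -e "$f" ] || continue          # no match: the glob stays literal, skip it
        echo "loading $(basename "$f")" >&2
        "$@" < "$f" || return 1
    done
}

# Usage (password as used in this document):
#   load_schemas /tmp/falcon-plus/scripts/mysql/db_schema mysql -h 127.0.0.1 -uroot -pmsds007
```

The loop has to run while the cloned repository still exists, that is, before the rm -rf cleanup.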
Start the backend
Binary package: open-falcon-v0.2.1.tar.gz
Create the working directory
export FALCON_HOME=/home/work
export WORKSPACE=$FALCON_HOME/open-falcon
mkdir -p $WORKSPACE
Extract the binary package
tar -xzvf open-falcon-v0.2.1.tar.gz -C $WORKSPACE
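Start all backend components on a single machine
Edit the configuration files and confirm that the database account and password in them match the real ones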
# cd $WORKSPACE
# grep -Ilr 3306 ./
./nodata/config/cfg.json
./graph/config/cfg.json
./hbs/config/cfg.json
./alarm/config/cfg.json
./aggregator/config/cfg.json
./api/config/cfg.json
Some modules depend on a database connection, so their configuration files must be modified
vim /home/work/open-falcon/aggregator/config/cfg.json
vim /home/work/open-falcon/graph/config/cfg.json
vim /home/work/open-falcon/hbs/config/cfg.json
vim /home/work/open-falcon/nodata/config/cfg.json
vim /home/work/open-falcon/api/config/cfg.json
vim /home/work/open-falcon/alarm/config/cfg.json
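Instead of editing each file by hand, the password can be patched in with sed. This sketch assumes the shipped cfg.json files contain MySQL DSNs of the form root:@tcp(127.0.0.1:3306)/... (root with an empty password); check your files first, since that form is an assumption and set_db_password is a hypothetical helper name.

```shell
# set_db_password PASSWORD FILE...
# Replace the empty root password in MySQL DSNs of the form "root:@tcp(" (assumed default).
# The password must not contain '|', '&' or '\', which are special to this sed expression.
set_db_password() {
    local password f
    password="$1"
    shift
    for f in "$@"; do
        sed -i "s|root:@tcp(|root:${password}@tcp(|g" "$f"
    done
}

# Usage (run from $WORKSPACE; password as used in this document):
#   set_db_password msds007 $(grep -Ilr 3306 ./)
```

Running grep -r 'root:' ./*/config/cfg.json afterwards is a quick way to confirm that every DSN was updated.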
Start
cd $WORKSPACE
./open-falcon start
# Check the startup status of all modules
./open-falcon check
falcon-graph UP 3962
falcon-hbs UP 3970
falcon-judge UP 3978
falcon-transfer UP 3984
falcon-nodata UP 3990
falcon-aggregator UP 3996
falcon-agent UP 4003
falcon-gateway UP 4009
falcon-api UP 4016
falcon-alarm UP 4024
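The check output above is easy to scan by eye, but a script can do it too. The sketch below treats the second column as the status and reports every module that is not UP; down_modules is a hypothetical helper name.

```shell
# down_modules
# Read `./open-falcon check` output on stdin and print the name of every module
# whose status column is not "UP". Exit status is 1 if any such module was found.
down_modules() {
    awk 'NF >= 2 && $2 != "UP" { print $1; bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Usage:
#   cd $WORKSPACE && ./open-falcon check | down_modules || echo "some modules are not running"
```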
Install the frontend
Install pip for Python 2.7
# unzip setuptools-38.5.1.zip
# cd setuptools-38.5.1
# python setup.py install
# cd ..
# tar -zxvf pip-9.0.1.tar.gz
# cd pip-9.0.1
# python setup.py install
Clone the frontend component code
cd $WORKSPACE
git clone https://github.com/open-falcon/dashboard.git
Install dependencies
yum install -y python-virtualenv
yum install -y python-devel
yum install -y openldap-devel
yum install -y mysql-devel
yum groupinstall "Development tools" -y
cd $WORKSPACE/dashboard/
virtualenv ./env
./env/bin/pip install -r pip_requirements.txt
ln -s /usr/local/mysql/lib/libmysqlclient.so.20 /usr/lib64/libmysqlclient.so.20
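The library linked above is needed by MySQL-python-1.2.5.zip
Modify the configuration
The dashboard configuration file is '/home/work/open-falcon/dashboard/rrd/config.py'; adjust it to match your environment
## API_ADDR is the address of the backend api component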
API_ADDR = "http://127.0.0.1:8080/api/v1"
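## Adjust PORTAL_DB_* to match your environment; the default username is root and the default password is ""
## Adjust ALARM_DB_* to match your environment; the default username is root and the default password is ""
Start in production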
bash control start
bash control tail
http://127.0.0.1:8081
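System initialization
Frontend users must register. The first registered root account becomes the administrator, so the first thing to do is register the root user (password 123456)
After registering root, it is best to disable sign-up; other users can then be created by root. To do this, edit the api module's configuration file and set signup_disable to true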
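The signup_disable change can be scripted as well. This sketch assumes the key appears in the api module's cfg.json as a plain JSON boolean on a single line; disable_signup is a hypothetical helper name.

```shell
# disable_signup CFG_FILE
# Flip "signup_disable": false to true in place.
# Assumes the key and its value sit on one line, as in a pretty-printed JSON file.
disable_signup() {
    sed -i 's/"signup_disable"[[:space:]]*:[[:space:]]*false/"signup_disable": true/' "$1"
}

# Usage (path follows the module layout used in this document):
#   disable_signup /home/work/open-falcon/api/config/cfg.json
```

The api module presumably reads its configuration at startup, so it needs to be restarted for the change to take effect.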
Deploy the agent
Copy the agent directory to the hosts to be monitored
scp -r /home/work/open-falcon/agent/ mydb2:/home/work/open-falcon
scp -r /home/work/open-falcon/agent/ mydb3:/home/work/open-falcon
Copy the open-falcon binary to the hosts to be monitored
scp /home/work/open-falcon/open-falcon mydb2:/home/work/open-falcon
scp /home/work/open-falcon/open-falcon mydb3:/home/work/open-falcon
Edit the configuration file
# cat cfg.json
{
"debug": true,
"hostname": "mydb2",
"ip": "192.168.1.102",
"plugin": {
"enabled": false,
"dir": "./plugin",
"git": "https://github.com/open-falcon/plugin.git",
"logs": "./logs"
},
"heartbeat": {
"enabled": true,
"addr": "192.168.1.101:6030",
"interval": 60,
"timeout": 1000
},
"transfer": {
"enabled": true,
"addrs": [
"192.168.1.101:8433"
],
"interval": 60,
"timeout": 1000
},
"http": {
"enabled": true,
"listen": ":1988",
"backdoor": false
},
"collector": {
"ifacePrefix": ["eth", "em"],
"mountPoint": []
},
"default_tags": {
},
"ignore": {
"cpu.busy": true,
"df.bytes.free": true,
"df.bytes.total": true,
"df.bytes.used": true,
"df.bytes.used.percent": true,
"df.inodes.total": true,
"df.inodes.free": true,
"df.inodes.used": true,
"df.inodes.used.percent": true,
"mem.memtotal": true,
"mem.memused": true,
"mem.memused.percent": true,
"mem.memfree": true,
"mem.swaptotal": true,
"mem.swapused": true,
"mem.swapfree": true
}
}
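Only "hostname" and "ip" differ between monitored hosts in the configuration above, so they can be patched per host after copying. This is a sketch under assumptions: set_agent_identity is a hypothetical helper, and the host name and address in the usage comment are placeholders.

```shell
# set_agent_identity CFG_FILE HOSTNAME IP
# Rewrite the "hostname" and "ip" values of an agent cfg.json in place.
# The values must not contain '/', '&' or '"'.
set_agent_identity() {
    local cfg host ip
    cfg="$1"
    host="$2"
    ip="$3"
    sed -i \
        -e "s/\"hostname\": \"[^\"]*\"/\"hostname\": \"$host\"/" \
        -e "s/\"ip\": \"[^\"]*\"/\"ip\": \"$ip\"/" \
        "$cfg"
}

# Usage on a monitored host (placeholder values):
#   set_agent_identity /home/work/open-falcon/agent/config/cfg.json mydb3 192.168.1.103
```

The heartbeat and transfer addresses are left untouched because they point at the central server and are the same on every host.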
/home/work/open-falcon/open-falcon start agent      # start the process
/home/work/open-falcon/open-falcon stop agent       # stop the process
/home/work/open-falcon/open-falcon monitor agent    # view the log
mydb1 is a standalone MySQL instance
mydb2 and mydb3 form a replication setup
Monitor MySQL
# tar -zxvf mymon.tar.gz
Modify the configuration file
# cat mon.cfg
[default]
log_file=/soft/mymon.log
# Panic 0
# Fatal 1
# Error 2
# Warn 3
# Info 4
# Debug 5
log_level=5
falcon_client=http://127.0.0.1:1988/v1/push
# Custom endpoint
endpoint=mydb2
[mysql]
user=root
password=msds007
host=127.0.0.1
port=3306
# cat slavestatus.sh
#!/bin/bash
# Report the state of the MySQL replication threads to the local falcon-agent.
source /etc/profile
ts=$(date +%s)
MySql_CMD="/usr/local/mysql/bin/mysql"
User=root
passwd=msds007
Socket=/app/mysqldata/3306/mysql.sock
LogLevel="error"
#LogLevel="debug"
Endpoint=$(hostname)
Port=3306
Slave_IO_Metric="Slave_IO_Status"
Slave_SQL_Metric="Slave_SQL_Status"
Normal_Value=1
NonNormal_Value=0
CurrentPath=$(dirname "$0")
Logfile=$CurrentPath/out.log
#Ip_Address=`ifconfig -a|grep "inet addr"|grep 10|awk '{print $2}'|cut -d : -f 2`
Ip_Address=127.0.0.1
# Query the slave status once, then extract the state of both replication threads.
SlaveStatus=$("$MySql_CMD" -u"$User" -p"$passwd" -S "$Socket" -e "show slave status \G")
IOSTATUS=$(echo "$SlaveStatus" | grep "Slave_IO_Running:" | cut -d ':' -f 2 | sed -e 's/ //g')
SQLSTATUS=$(echo "$SlaveStatus" | grep "Slave_SQL_Running:" | cut -d ':' -f 2 | sed -e 's/ //g')
echo "$IOSTATUS"
echo "$SQLSTATUS"
# push_metric METRIC STATUS: report 1 when the thread state is "Yes", otherwise 0.
# In debug mode the curl command is only written to the log file, not executed.
push_metric() {
    local value=$NonNormal_Value
    [ "$2" = "Yes" ] && value=$Normal_Value
    local payload="[{\"metric\": \"$1\", \"endpoint\": \"$Endpoint\", \"timestamp\": $ts,\"step\": 60,\"value\": $value,\"counterType\": \"GAUGE\",\"tags\": \"port=$Port\"}]"
    if [ "$LogLevel" = "debug" ]; then
        echo "curl -X POST -d '$payload' http://$Ip_Address:1988/v1/push" >> "$Logfile"
    else
        curl -X POST -d "$payload" "http://$Ip_Address:1988/v1/push"
    fi
}
push_metric "$Slave_IO_Metric" "$IOSTATUS"
push_metric "$Slave_SQL_Metric" "$SQLSTATUS"
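The script extracts the thread states with grep, which matches substrings of a line. As an alternative, the sketch below matches a field name exactly using awk, so that, for example, Slave_SQL_Running cannot be confused with Slave_SQL_Running_State. slave_field is a hypothetical helper, not part of the script above.

```shell
# slave_field NAME
# Read `show slave status \G` output on stdin and print the value of the field
# whose name is exactly NAME. Only the text up to the next ": " in the value is
# returned, which is sufficient for Yes/No fields.
slave_field() {
    awk -F': ' -v name="$1" '
        { key = $1; gsub(/^[ \t]+/, "", key) }   # strip the leading alignment padding
        key == name { print $2; exit }
    '
}

# Usage (socket path and credentials as used in this document):
#   mysql -uroot -pmsds007 -S /app/mysqldata/3306/mysql.sock -e "show slave status \G" | slave_field Slave_IO_Running
```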
Add to crontab
* * * * * /soft/mymon -c /soft/etc/mon.cfg
* * * * * sh /soft/etc/slavestatus.sh
Add a template, add a host group, and add a screen
Template
Host group
Screen -> Graph section
Com_delete/port=3309
Com_insert/port=3309
Com_select/port=3309
Com_update/port=3309
Innodb_row_lock_current_waits/port=3309
Innodb_row_lock_time/port=3309
Innodb_row_lock_time_avg/port=3309
Innodb_row_lock_time_max/port=3309
Innodb_row_lock_waits/port=3309
Innodb_rows_deleted/port=3309
Innodb_rows_inserted/port=3309
Innodb_rows_updated/port=3309
Queries/port=3309
Questions/port=3309
Seconds_Behind_Master/port=3309
Slave_IO_Status/port=3309
Slave_SQL_Status/port=3309
Slow_queries/port=3309
Threads_connected/port=3309
Threads_running/port=3309
cpu.idle
disk.io.read_bytes/device=sdb
disk.io.read_requests/device=sdb
disk.io.util/device=sdb
disk.io.write_bytes/device=sdb
disk.io.write_requests/device=sdb
innodb_autoinc_lock_mode/port=3309
innodb_lock_wait_timeout/port=3309
mem.memfree.percent
mem.swapfree.percent
mem.swapused.percent
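Innodb_row_lock_current_waits: number of row locks currently being waited for;
Innodb_row_lock_time: total time spent waiting for row locks since server startup, in ms;
Innodb_row_lock_time_avg: average time spent per wait;
Innodb_row_lock_time_max: longest single wait since server startup;
Innodb_row_lock_waits: total number of waits since server startup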
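Monitor Redis
Download the monitoring script
Test run
Adjust the settings near the commented locations in the script to match your deployment
Test: python redis-monitor.py
Then add the script to crontab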
* * * * * /usr/bin/python /soft/redis-monitor.py
Note: set the full path of redis-cli in redis-monitor.py
Note: set the full path of redis.conf in redis-monitor.py
Start Redis
/usr/local/redis/bin/redis-server /etc/redis/6379.conf
Monitor MongoDB
Python 2.6
PyYAML > 3.10
python-requests > 0.11
pip install pyyaml
pip install requests
pip install pymongo
Extract the package to /soft/mongomon
Configure the MongoDB instances running on this server (mongod, config servers, mongos) in /soft/mongomon/conf/mongomon.conf, one instance per line: port, username, password
{port: 27017, user: "root",password: "abc123"}
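For a server with several instances, the lines can be generated rather than typed. This sketch simply reproduces the line format shown above; mongomon_entry is a hypothetical helper, and the second port in the usage comment is a placeholder.

```shell
# mongomon_entry PORT USER PASSWORD
# Print one mongomon.conf line in the format used in this document.
mongomon_entry() {
    printf '{port: %s, user: "%s",password: "%s"}\n' "$1" "$2" "$3"
}

# Usage (27018 is a placeholder for a second instance):
#   for port in 27017 27018; do mongomon_entry "$port" root abc123; done > /soft/mongomon/conf/mongomon.conf
```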
Configure crontab: edit the mongomon installation path in mongomon/conf/mongomon_cron, then cp mongomon_cron /etc/cron.d/
After a few minutes, the MongoDB metrics can be viewed in the open-falcon dashboard
The endpoint defaults to the hostname
Start MongoDB
/usr/local/mongodb/bin/mongod -f /app/mongodb/27017/mongodb.config