EC2自定义监控推到cloudwatch

已ec2自建gp连接数监控为例:

编写监控脚本monitor.sh配置文件:

 1 #!/usr/bin/env bash
 2 . /home/gpadmin/.bashrc
 3 export PGPASSWORD=xxxxx
 4 home=/home/gpadmin/scripts/
 5 host=xxxx
 6 ## transactions
 7 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.transactions.sql > /home/gpadmin/scripts/logs/transactions.txt
 8 ## diskfree
 9 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.diskfree.sql|sort -r > /home/gpadmin/scripts/logs/diskfree.txt
10 ## connections
11 psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.connections.sum.sql > /home/gpadmin/scripts/logs/connections.txt
12 ## bgwriter
13 #psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.bgwriter.sql > /home/gpadmin/scripts/logs/bgwriter.txt
14 ## tps_qps
15 #psql -qtAX -h$host -p5432 -Udev_user -dcamel -f /home/gpadmin/scripts/sql/pgsql.tpqs.sql > /home/gpadmin/scripts/logs/tqps.txt
16 transactions_idle=$(cat /$home/logs/transactions.txt|jq .idle)
17 transactions_active=$(cat /$home/logs/transactions.txt|jq .active)
18 transactions_waiting=$(cat /$home/logs/transactions.txt|jq .waiting)
19 transactions_prepared=$(cat /$home/logs/transactions.txt|jq .prepared)
20 active_cnt=$(cat /$home/logs/connections.txt|jq .active)
21 idle_cnt=$(cat /$home/logs/connections.txt|jq .idle)
22 total_cnt=$(cat /$home/logs/connections.txt|jq .total)
27 #tps=$(cat /home/gpadmin/scripts/logs/tqps.txt|jq .segment_disk_free_gb|sed -n 4p)
28 
29 #checkpoints_timed=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed)
30 #checkpoints_req=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed)
31 #buffers_checkpoint=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed)
32 #buffers_backend=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed)
33 #buffers_alloc=$(cat /home/gpadmin/scripts/logs/bgwriter.txt|jq .checkpoints_timed)
34  
35 aws cloudwatch put-metric-data --metric-name transactions_idle --dimensions Instance=i-xxxx  --namespace "GreenPlum" --value $transactions_idle
36 aws cloudwatch put-metric-data --metric-name transactions_active --dimensions Instance=i-xxxx  --namespace "GreenPlum" --value $transactions_active
37 aws cloudwatch put-metric-data --metric-name transactions_waiting --dimensions Instance=i-xxx  --namespace "GreenPlum" --value $transactions_waiting
38 aws cloudwatch put-metric-data --metric-name transactions_prepared --dimensions Instance=i-xxx  --namespace "GreenPlum" --value $transactions_prepared
39 aws cloudwatch put-metric-data --metric-name active_cnt --dimensions Instance=i-xxx  --namespace "GreenPlum" --value $active_cnt
40 aws cloudwatch put-metric-data --metric-name idle_cnt --dimensions Instance=i-xxx  --namespace "GreenPlum" --value $idle_cnt
41 aws cloudwatch put-metric-data --metric-name total_cnt --dimensions Instance=i-xxx  --namespace "GreenPlum" --value $total_cnt

以上监控了gp会话,可用磁盘空间以及事物情况,可以自己根据实际需求修改。

然后把再服务器上放置好配置文件里面所需.sql查询语句。

 1 gpadmin@ip-10-10-34-163:~/scripts$ cat sql/pgsql.connections.sum.sql 
 2 DO LANGUAGE plpgsql $$
 3 DECLARE
 4         ver integer;
 5         res text;
 6 BEGIN
 7         SELECT current_setting('server_version_num') INTO ver;
 8 
 9         IF (ver >= 90600) THEN
10                 SELECT row_to_json(T) INTO res from (
11                         SELECT
12                                 sum(CASE WHEN state = 'active' THEN 1 ELSE 0 END) AS active,
13                                 sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle,
14                                 sum(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END) AS idle_in_transaction,
15                                 count(*) AS total,
16                                 count(*)*100/(SELECT current_setting('max_connections')::int) AS total_pct,
17                                 sum(CASE WHEN waiting_reason IS NOT NULL THEN 1 ELSE 0 END) AS waiting,
18                                 (SELECT count(*) FROM pg_prepared_xacts) AS prepared
19                         FROM pg_stat_activity WHERE datid is not NULL
20                         ) T;
21 
22         ELSE
23                 SELECT row_to_json(T) INTO res from (
24                         SELECT
25                                 sum(CASE WHEN state = 'active' THEN 1 ELSE 0 END) AS active,
26                                 sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END) AS idle,
27                                 sum(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END) AS idle_in_transaction,
28                                 count(*) AS total,
29                                 count(*)*100/(SELECT current_setting('max_connections')::int) AS total_pct,
30                                 sum(CASE WHEN waiting IS TRUE THEN 1 ELSE 0 END) AS waiting,
31                                 (SELECT count(*) FROM pg_prepared_xacts) AS prepared
32                         FROM pg_stat_activity
33                         ) T;
34         END IF;
35 
36         perform set_config('conn_json_res', res, false);
37 END $$;
38 
39 select current_setting('conn_json_res');

每一个监控指标数据,都可以用sql写出来,写好sql查询指标放到指定目录,然后配置好awsconfig,然后在服务器上就可以取到所需值了。

最后把monitor.sh加到crontab,1min推送一次。

1 gpadmin@ip-10-10-34-163:~/scripts$ crontab -l
2 
3 #*/1 * * * * script -c "/home/gpadmin/scripts/monitor.sh >>/tmp/crontab.log"
4 */1 * * * * /home/gpadmin/scripts/monitor.sh >>/tmp/crontab.log  2>&1

最后查看cloudwatch指标,发现之前的监控图所示:

有数据收集上来表示没有问题。

posted @ 2021-01-18 21:15  5sdba  阅读(326)  评论(0编辑  收藏  举报