GreenPlum 大数据平台--segment 失效问题排查
01,segment
检查一:
在master节点上检查失效的segment
正常情况下:
1 20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -e 2 20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44' 3 20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15' 4 20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master... 5 20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Gathering data fromsegments... 6 .. 7 20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:----------------------------------------------------- 8 20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-Segment Mirroring Status Report 9 20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:----------------------------------------------------- 10 20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-All segments are running normally
检查二:
psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
[gpadmin@greenplum01 ~]$ psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';" dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port ------+---------+------+----------------+------+--------+------+----------+---------+------------------ (0 rows)
检查三:
gpstate -m
1 0190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -m 2 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44' 3 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15' 4 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master... 5 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-------------------------------------------------------------- 6 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Current GPDB mirror list and status 7 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Type = Group 8 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-------------------------------------------------------------- 9 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status 10 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum03 /greenplum/data/mirror/gpseg0 43000 Passive Synchronized 11 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum03 /greenplum/data/mirror/gpseg1 43001 Passive Synchronized 12 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum03 /greenplum/data2/mirror/gpseg2 43002 Passive Synchronized 13 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum03 /greenplum/data2/mirror/gpseg3 43003 Passive Synchronized 14 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum02 /greenplum/data/mirror/gpseg4 43000 Passive Synchronized 15 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum02 /greenplum/data/mirror/gpseg5 43001 Passive Synchronized 16 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum02 /greenplum/data2/mirror/gpseg6 43002 Passive Synchronized 17 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:- greenplum02 /greenplum/data2/mirror/gpseg7 43003 Passive Synchronized 18 20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
检查四:日志检查
gplogfilter -t
1 [gpadmin@greenplum01 ~]$ gplogfilter -t 2 requested timestamp range from beginning of data to end of data 3 ---------- /greenplum/data/master/gpseg-1/pg_log/startup.log ---------- 4 in: 21 lines, 21 log entries; timestamps from 2019-07-11 11:30:36.331409 to 2019-07-11 11:31:29.331627 5 match: 0 lines 6 out: 0 lines, 0 log entries 7 ---------- /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113036.csv ---------- 8 2019-07-11 11:31:22.514551 CST|||p10925|th2011645824||||0|||seg-1|||||FATAL: |57P01|terminating connection due to administrator command|||||||0||postgres.c|3670| 9 in: 88 lines, 88 log entries; timestamps from 2019-07-11 11:30:36.469747 to 2019-07-11 11:31:22.514551 10 match: 1 lines, 1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551 11 out: 1 lines, 1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551 12 ---------- /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113124.csv ---------- 13 in: 63 lines, 63 log entries; timestamps from 2019-07-11 11:31:24.020944 to 2019-07-11 11:31:25.144209 14 match: 0 lines 15 out: 0 lines, 0 log entries 16 ---------- /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113129.csv ---------- 17 2019-07-11 13:53:29.393443 CST|gpadmin|gpdb|p21035|th280524672|[local]||2019-07-11 13:53:29 CST|0|con20||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790| 18 2019-07-11 14:02:15.111734 CST|kingle|gpdb|p21208|th280524672|[local]||2019-07-11 14:02:15 CST|0|con22||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623| 19 2019-07-11 14:02:39.905762 CST|gpadmin|gpdb|p21274|th280524672|[local]||2019-07-11 14:02:39 CST|0|con23||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790| 20 2019-07-11 14:03:15.951249 CST|kingle|gpdb|p21283|th280524672|[local]||2019-07-11 14:03:15 CST|0|con25||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623| 21 2019-07-11 14:03:26.389797 CST|kingle|postgres|p21289|th280524672|[local]||2019-07-11 14:03:26 CST|0|con26||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 22 2019-07-11 14:06:12.037982 CST|kingle|postgres|p21541|th280524672|192.168.0.221|2702|2019-07-11 14:06:12 CST|0|con27||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 23 2019-07-11 14:07:01.948006 CST|kingle|postgres|p21561|th280524672|192.168.0.221|2720|2019-07-11 14:07:01 CST|0|con28||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 24 2019-07-11 14:07:13.876319 CST|kingle|postgres|p21564|th280524672|192.168.0.221|2722|2019-07-11 14:07:13 CST|0|con29||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 25 2019-07-11 14:08:18.729975 CST|gpadmin|gpdb|p21582|th280524672|[local]||2019-07-11 14:08:18 CST|0|con30||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790| 26 2019-07-11 14:08:50.351436 CST|gpadmin|gpdb|p21609|th280524672|[local]||2019-07-11 14:08:50 CST|0|con33||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790| 27 2019-07-11 14:09:05.416505 CST|gpadmin|postgres|p21614|th280524672|[local]||2019-07-11 14:09:05 CST|0|con35|cmd1|seg-1||dx11||sx1|ERROR: |42P04|database "gpdb" already exists||||||CREATE DATABASE gpdb; 28 |0||dbcommands.c|901| 29 2019-07-11 14:09:48.636153 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd1|seg-1||dx12||sx1|ERROR: |42601|syntax error at or near ";"||||||grant 30 ;|8||scan.l|982| 31 2019-07-11 14:10:17.089067 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd2|seg-1||dx13||sx1|ERROR: |42601|syntax error at or near ";"||||||grant 32 ;|8||scan.l|982| 33 2019-07-11 14:10:32.484569 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd5|seg-1||dx15||sx1|ERROR: |3D000|database "demo" does not exist||||||GRANT all on database demo to kingle 34 ;|0||dbcommands.c|2519| 35 2019-07-11 14:11:10.314802 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd9|seg-1||dx18||sx1|ERROR: |3F000|schema "gpdb" does not exist||||||GRANT USAGE on SCHEMA gpdb to kingle 36 ;|0||aclchk.c|598| 37 2019-07-11 14:13:19.213757 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd21|seg-1||dx25||sx1|ERROR: |42P07|relation "test001" already exists||||||create table test001(id int,name varchar(128));|0||heap.c|1546| 38 2019-07-11 14:13:43.227208 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd25|seg-1||dx29||sx1|ERROR: |42601|syntax error at or near ","||||||create table test005(id int primary,name varchar(128));|36||scan.l|982| 39 2019-07-11 14:14:05.356883 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd39|seg-1||dx36||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286| 40 2019-07-11 14:14:12.261512 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd40|seg-1||dx37||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286| 41 2019-07-11 14:14:25.038044 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd41|seg-1||dx38||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286| 42 2019-07-11 14:14:48.737385 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd42|seg-1||dx39||sx1|ERROR: |42P01|relation "test1" does not exist||||||select * from test1 x,test2 y where x.id=y.id;|15||namespace.c|286| 43 2019-07-11 14:47:15.035344 CST|kingle|postgres|p22272|th280524672|192.168.0.221|3476|2019-07-11 14:47:15 CST|0|con38||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 44 2019-07-11 14:52:35.122438 CST|kingle|postgres|p22360|th280524672|192.168.0.221|3558|2019-07-11 14:52:35 CST|0|con39||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 45 2019-07-11 14:52:41.158396 CST|kingle|postgres|p22378|th280524672|[local]||2019-07-11 14:52:41 CST|0|con40||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 46 2019-07-11 14:52:51.572521 CST|kingle|postgres|p22380|th280524672|192.168.0.221|3576|2019-07-11 14:52:51 CST|0|con41||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 47 2019-07-11 14:53:06.302376 CST|kingle|postgres|p22383|th280524672|192.168.0.221|3578|2019-07-11 14:53:06 CST|0|con42||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623| 48 2019-07-11 15:20:56.537899 CST|kingle|postgres|p22922|th280524672|192.168.0.221|4066|2019-07-11 15:20:40 CST|0|con46|cmd1|seg-1||dx41||sx1|ERROR: |42P01|relation "test0001" does not exist||||||select * from test0001 49 ;|15||namespace.c|286| 50 2019-07-11 15:30:44.055204 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd3|seg-1||dx46||sx1|ERROR: |42501|permission denied for relation test001||||||select * from test001;|0||aclchk.c|1870| 51 2019-07-11 15:34:19.475082 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd6|seg-1||dx48||sx1|ERROR: |42501|permission denied for relation test001||||||GRANT all on TABLE test001 to kingle;|0||aclchk.c|1870| 52 2019-07-11 15:35:33.309475 CST|kingle|postgres|p23222|th280524672|192.168.0.221|4310|2019-07-11 15:35:21 CST|0|con49|cmd1|seg-1||dx50||sx1|ERROR: |42P01|relation "test001" does not exist||||||select * from test001 53 ;|15||namespace.c|286| 54 2019-07-11 15:45:58.918525 CST|gpadmin|gpdb|p23517|th280524672|[local]||2019-07-11 15:45:37 CST|0|con56|cmd1|seg-1||dx56||sx1|ERROR: |42P01|relation "schema" does not exist||||||grant all on schema to kingle 55 ;|0||namespace.c|286| 56 2019-07-11 16:03:46.770910 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd16|seg-1||dx60||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*) 57 FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286| 58 2019-07-11 16:03:48.903080 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd17|seg-1||dx61||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*) 59 FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286| 60 2019-07-11 16:11:22.854459 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd1|seg-1||dx72||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';" 61 q 62 ;|1||scan.l|982| 63 2019-07-11 16:11:33.673982 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd2|seg-1||dx73||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d'; 64 65 \q 66 67 " 68 ;|1||scan.l|982| 69 2019-07-11 16:11:54.103690 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd3|seg-1||dx74||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';" 70 ' 71 ';|1||scan.l|982| 72 in: 460 lines, 460 log entries; timestamps from 2019-07-11 11:31:29.463761 to 2019-07-11 16:12:19.107822 73 match: 36 lines, 36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690 74 out: 36 lines, 36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690 75 ---------- /greenplum/data/master/gpseg-1/pg_log/gp_era ---------- 76 in: 3 lines, 1 log entries; no timestamps found 77 match: 0 lines 78 out: 0 lines, 0 log entries
对于WARNING、ERROR、FATAL或者PANIC日志级别的消息,使用gplogfilter检查Master的日志文件
每个Segment实例上的WARNING、ERROR、FATAL或者PANIC日志级别的消息,使用gpssh检查
gpssh -f seg_hosts -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t /data1/primary/*/pg_log/gpdb*.log' > seglog.out
1 [gpadmin@greenplum01 conf]$ gpssh -f seg_hosts_file -e 'source 2 > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 3 > gpssh -f seg_hosts_file -e 'source 4 /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t ^C 5 [gpadmin@greenplum01 conf]$ gpssh -f seg_hosts -e 'source 6 > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 7 > /data1/primary/*/pg_log/gpdb*.log' > seglog.out 8 [gpadmin@greenplum01 conf]$ more seglog.out 9 [greenplum02] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 10 [greenplum02] > /data1/primary/*/pg_log/gpdb*.log"; source 11 [greenplum02] source 12 [greenplum02] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 13 [greenplum02] /data1/primary/*/pg_log/gpdb*.log 14 [greenplum02] -bash: source: filename argument required 15 [greenplum02] source: usage: source filename [arguments] 16 [greenplum03] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 17 [greenplum03] > /data1/primary/*/pg_log/gpdb*.log"; source 18 [greenplum03] source 19 [greenplum03] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t 20 [greenplum03] /data1/primary/*/pg_log/gpdb*.log 21 [greenplum03] -bash: source: filename argument required 22 [greenplum03] source: usage: source filename [arguments]
人生就像一滴水,非要落下才后悔!
--kingle