Greenplum Big Data Platform: Troubleshooting Failed Segments

01. Segment

  Check 1:

  On the master node, check for failed segments (the output below is from gpstate -e)

  Under normal conditions:

20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -e
20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Gathering data from segments...
..
20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-----------------------------------------------------
20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-Segment Mirroring Status Report
20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-----------------------------------------------------
20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-All segments are running normally
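
  This check can be scripted for routine monitoring. The following is a minimal sketch, assuming the healthy state is reported with exactly the "All segments are running normally" message shown in the output above. The function name segments_healthy is hypothetical; it reads gpstate -e output from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the gpstate -e output on stdin
# contains the healthy-state message from the sample output above.
segments_healthy() {
    grep -q 'All segments are running normally'
}

# Example usage on the master (needs a running Greenplum cluster):
#   gpstate -e | segments_healthy && echo "OK" || echo "segment problem detected"
```

  Matching on the message text is fragile if the wording differs between versions, so treat a failed match as a prompt to look at the full gpstate output rather than as proof of a failure.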

  Check 2:

psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
[gpadmin@greenplum01 ~]$ psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
 dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
------+---------+------+----------------+------+--------+------+----------+---------+------------------
(0 rows)
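
  For scripting, a count is easier to test than the full table. A minimal sketch, assuming psql is on the PATH and connects to the master with the default environment, as in the command above; the function name count_down_segments is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many segments are marked down (status 'd').
# -A (unaligned) and -t (tuples only) make psql print just the number.
count_down_segments() {
    psql -At -c "SELECT count(*) FROM gp_segment_configuration WHERE status='d';"
}

# Example usage:
#   down=$(count_down_segments)
#   [ "$down" -eq 0 ] || echo "$down segment(s) down"
```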

  Check 3:

gpstate -m
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -m
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Current GPDB mirror list and status
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Type = Group
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   Mirror        Datadir  Port    Status    Data Status
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg0  43000   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg1  43001   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg2  43002   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg3  43003   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg4  43000   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg5  43001   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg6  43002   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg7  43003   Passive   Synchronized
20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
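
  With many mirrors it helps to list only the rows that need attention. A minimal sketch, assuming the whitespace-separated column layout of the gpstate -m output above (host, data directory, port, status, data status) and treating any data status other than "Synchronized" as worth a look. The function name unsynced_mirrors is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads gpstate -m output on stdin and prints
# "host datadir data-status" for every mirror row whose data status is not
# "Synchronized". Mirror rows are recognised by a data directory (4th
# whitespace-separated field) that starts with "/".
unsynced_mirrors() {
    awk '$4 ~ /^\// && $7 != "Synchronized" {
        status = $7
        # A data status may span several words; join the remaining fields.
        for (i = 8; i <= NF; i++) status = status " " $i
        print $3, $4, status
    }'
}

# Example usage:
#   gpstate -m | unsynced_mirrors
```

  No output means every mirror row reported Synchronized.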

  Check 4: log inspection

gplogfilter -t
[gpadmin@greenplum01 ~]$ gplogfilter -t
requested timestamp range from beginning of data to end of data
----------  /greenplum/data/master/gpseg-1/pg_log/startup.log ----------
       in:      21 lines,      21 log entries; timestamps from 2019-07-11 11:30:36.331409 to 2019-07-11 11:31:29.331627
    match:       0 lines
      out:       0 lines,       0 log entries
----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113036.csv ----------
2019-07-11 11:31:22.514551 CST|||p10925|th2011645824||||0|||seg-1|||||FATAL: |57P01|terminating connection due to administrator command|||||||0||postgres.c|3670|
       in:      88 lines,      88 log entries; timestamps from 2019-07-11 11:30:36.469747 to 2019-07-11 11:31:22.514551
    match:       1 lines,       1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551
      out:       1 lines,       1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551
----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113124.csv ----------
       in:      63 lines,      63 log entries; timestamps from 2019-07-11 11:31:24.020944 to 2019-07-11 11:31:25.144209
    match:       0 lines
      out:       0 lines,       0 log entries
----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113129.csv ----------
2019-07-11 13:53:29.393443 CST|gpadmin|gpdb|p21035|th280524672|[local]||2019-07-11 13:53:29 CST|0|con20||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
2019-07-11 14:02:15.111734 CST|kingle|gpdb|p21208|th280524672|[local]||2019-07-11 14:02:15 CST|0|con22||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623|
2019-07-11 14:02:39.905762 CST|gpadmin|gpdb|p21274|th280524672|[local]||2019-07-11 14:02:39 CST|0|con23||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
2019-07-11 14:03:15.951249 CST|kingle|gpdb|p21283|th280524672|[local]||2019-07-11 14:03:15 CST|0|con25||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623|
2019-07-11 14:03:26.389797 CST|kingle|postgres|p21289|th280524672|[local]||2019-07-11 14:03:26 CST|0|con26||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:06:12.037982 CST|kingle|postgres|p21541|th280524672|192.168.0.221|2702|2019-07-11 14:06:12 CST|0|con27||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:07:01.948006 CST|kingle|postgres|p21561|th280524672|192.168.0.221|2720|2019-07-11 14:07:01 CST|0|con28||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:07:13.876319 CST|kingle|postgres|p21564|th280524672|192.168.0.221|2722|2019-07-11 14:07:13 CST|0|con29||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:08:18.729975 CST|gpadmin|gpdb|p21582|th280524672|[local]||2019-07-11 14:08:18 CST|0|con30||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
2019-07-11 14:08:50.351436 CST|gpadmin|gpdb|p21609|th280524672|[local]||2019-07-11 14:08:50 CST|0|con33||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
2019-07-11 14:09:05.416505 CST|gpadmin|postgres|p21614|th280524672|[local]||2019-07-11 14:09:05 CST|0|con35|cmd1|seg-1||dx11||sx1|ERROR: |42P04|database "gpdb" already exists||||||CREATE DATABASE gpdb;
|0||dbcommands.c|901|
2019-07-11 14:09:48.636153 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd1|seg-1||dx12||sx1|ERROR: |42601|syntax error at or near ";"||||||grant
;|8||scan.l|982|
2019-07-11 14:10:17.089067 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd2|seg-1||dx13||sx1|ERROR: |42601|syntax error at or near ";"||||||grant
;|8||scan.l|982|
2019-07-11 14:10:32.484569 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd5|seg-1||dx15||sx1|ERROR: |3D000|database "demo" does not exist||||||GRANT all on database demo to kingle
;|0||dbcommands.c|2519|
2019-07-11 14:11:10.314802 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd9|seg-1||dx18||sx1|ERROR: |3F000|schema "gpdb" does not exist||||||GRANT USAGE  on SCHEMA  gpdb to kingle
;|0||aclchk.c|598|
2019-07-11 14:13:19.213757 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd21|seg-1||dx25||sx1|ERROR: |42P07|relation "test001" already exists||||||create table test001(id int,name varchar(128));|0||heap.c|1546|
2019-07-11 14:13:43.227208 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd25|seg-1||dx29||sx1|ERROR: |42601|syntax error at or near ","||||||create table test005(id int primary,name varchar(128));|36||scan.l|982|
2019-07-11 14:14:05.356883 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd39|seg-1||dx36||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
2019-07-11 14:14:12.261512 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd40|seg-1||dx37||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
2019-07-11 14:14:25.038044 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd41|seg-1||dx38||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
2019-07-11 14:14:48.737385 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd42|seg-1||dx39||sx1|ERROR: |42P01|relation "test1" does not exist||||||select * from test1 x,test2 y where x.id=y.id;|15||namespace.c|286|
2019-07-11 14:47:15.035344 CST|kingle|postgres|p22272|th280524672|192.168.0.221|3476|2019-07-11 14:47:15 CST|0|con38||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:52:35.122438 CST|kingle|postgres|p22360|th280524672|192.168.0.221|3558|2019-07-11 14:52:35 CST|0|con39||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:52:41.158396 CST|kingle|postgres|p22378|th280524672|[local]||2019-07-11 14:52:41 CST|0|con40||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:52:51.572521 CST|kingle|postgres|p22380|th280524672|192.168.0.221|3576|2019-07-11 14:52:51 CST|0|con41||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 14:53:06.302376 CST|kingle|postgres|p22383|th280524672|192.168.0.221|3578|2019-07-11 14:53:06 CST|0|con42||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
2019-07-11 15:20:56.537899 CST|kingle|postgres|p22922|th280524672|192.168.0.221|4066|2019-07-11 15:20:40 CST|0|con46|cmd1|seg-1||dx41||sx1|ERROR: |42P01|relation "test0001" does not exist||||||select * from test0001
;|15||namespace.c|286|
2019-07-11 15:30:44.055204 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd3|seg-1||dx46||sx1|ERROR: |42501|permission denied for relation test001||||||select * from test001;|0||aclchk.c|1870|
2019-07-11 15:34:19.475082 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd6|seg-1||dx48||sx1|ERROR: |42501|permission denied for relation test001||||||GRANT all on TABLE test001 to kingle;|0||aclchk.c|1870|
2019-07-11 15:35:33.309475 CST|kingle|postgres|p23222|th280524672|192.168.0.221|4310|2019-07-11 15:35:21 CST|0|con49|cmd1|seg-1||dx50||sx1|ERROR: |42P01|relation "test001" does not exist||||||select * from test001
;|15||namespace.c|286|
2019-07-11 15:45:58.918525 CST|gpadmin|gpdb|p23517|th280524672|[local]||2019-07-11 15:45:37 CST|0|con56|cmd1|seg-1||dx56||sx1|ERROR: |42P01|relation "schema" does not exist||||||grant all on schema to kingle
;|0||namespace.c|286|
2019-07-11 16:03:46.770910 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd16|seg-1||dx60||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*)
   FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286|
2019-07-11 16:03:48.903080 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd17|seg-1||dx61||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*)
   FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286|
2019-07-11 16:11:22.854459 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd1|seg-1||dx72||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
q
;|1||scan.l|982|
2019-07-11 16:11:33.673982 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd2|seg-1||dx73||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';

\q

"
;|1||scan.l|982|
2019-07-11 16:11:54.103690 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd3|seg-1||dx74||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
'
';|1||scan.l|982|
       in:     460 lines,     460 log entries; timestamps from 2019-07-11 11:31:29.463761 to 2019-07-11 16:12:19.107822
    match:      36 lines,      36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690
      out:      36 lines,      36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690
----------  /greenplum/data/master/gpseg-1/pg_log/gp_era ----------
       in:       3 lines,       1 log entries; no timestamps found
    match:       0 lines
      out:       0 lines,       0 log entries

  For messages at the WARNING, ERROR, FATAL, or PANIC log levels, use gplogfilter to check the master's log files.
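
  When gplogfilter returns many entries, a per-severity count gives a quick overview. A minimal sketch, assuming the '|'-separated line layout seen in the gplogfilter output above, where the 17th field holds the severity (for example "FATAL: "). The function name count_by_severity is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts log entries per severity from '|'-separated
# Greenplum log lines on stdin. Assumes field 17 is the severity, as in the
# sample output above. Continuation lines of multi-line entries and the
# gplogfilter summary lines have too few fields and are skipped.
count_by_severity() {
    awk -F'|' 'NF >= 19 && $17 ~ /^(WARNING|ERROR|FATAL|PANIC):/ {
        sev = $17
        sub(/:.*$/, "", sev)   # "FATAL: " -> "FATAL"
        count[sev]++
    }
    END { for (s in count) print s, count[s] }' | sort
}

# Example usage on the master:
#   gplogfilter -t | count_by_severity
```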

  For WARNING, ERROR, FATAL, or PANIC messages on each segment instance, use gpssh to run the same check on every segment host:

gpssh -f seg_hosts -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t /data1/primary/*/pg_log/gpdb*.log' > seglog.out
[gpadmin@greenplum01 conf]$ gpssh -f seg_hosts_file -e 'source
> /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
> gpssh -f seg_hosts_file -e 'source
/usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t ^C
[gpadmin@greenplum01 conf]$ gpssh -f seg_hosts -e 'source
> /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
> /data1/primary/*/pg_log/gpdb*.log' > seglog.out
[gpadmin@greenplum01 conf]$ more seglog.out
[greenplum02] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
[greenplum02] > /data1/primary/*/pg_log/gpdb*.log"; source
[greenplum02] source
[greenplum02] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
[greenplum02] /data1/primary/*/pg_log/gpdb*.log
[greenplum02] -bash: source: filename argument required
[greenplum02] source: usage: source filename [arguments]
[greenplum03] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
[greenplum03] > /data1/primary/*/pg_log/gpdb*.log"; source
[greenplum03] source
[greenplum03] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
[greenplum03] /data1/primary/*/pg_log/gpdb*.log
[greenplum03] -bash: source: filename argument required
[greenplum03] source: usage: source filename [arguments]

  Note: in this run the command was typed with a line break right after source, so each host executed a bare source and reported "source: filename argument required". Entering the whole command on a single line avoids this. The /data1/primary path must also match the primary data directories actually used on the segment hosts.

  

posted on 2019-07-11 16:26  kingle-l
