How to Restart CDH, and Fixing a Failed Restart
To restart CDH, run:

    service cloudera-scm-server-db restart
    service cloudera-scm-server restart
    service cloudera-scm-agent restart

(The last command also has to be run on every slave node.)

When starting the cloudera-scm-server service, you may find that it dies on its own after a while, and that status then reports "cloudera-scm-server dead but pid file exists". The case below is one where the root cause was that cloudera-scm-server-db had not started properly.

[Process]

cloudera-scm-server dies on its own some time after it is started:

    [root@gyvm-4 data]# service cloudera-scm-server start
    Starting cloudera-scm-server: [ OK ]
    [root@gyvm-4 data]#
    [root@gyvm-4 data]# service cloudera-scm-server status
    cloudera-scm-server (pid 60761) is running...
    [root@gyvm-4 data]# service cloudera-scm-server status
    cloudera-scm-server (pid 60761) is running...
    [root@gyvm-4 data]# service cloudera-scm-server status
    cloudera-scm-server (pid 60761) is running...
    [root@gyvm-4 data]# service cloudera-scm-server status
    cloudera-scm-server dead but pid file exists

At this point, attempting a full restart of both cloudera-scm-server-db and cloudera-scm-server shows that cloudera-scm-server-db cannot be restarted:

    [root@gyvm-4 data]# service cloudera-scm-server-db stop
    waiting for server to shut down............................................................... failed
    pg_ctl: server does not shut down

server-db could not be stopped because a leftover pid file made status report the wrong state. After deleting that file, status shows that server-db had in fact already stopped:

    [root@gyvm-4 data]# cd /var/lib/cloudera-scm-server-db/data
    [root@gyvm-4 data]# service cloudera-scm-server-db status
    pg_ctl: server is running (PID: 17378)
    /usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data"
    [root@gyvm-4 data]# rm postmaster.pid
    rm: remove regular file `postmaster.pid'? y
    [root@gyvm-4 data]# service cloudera-scm-server-db status
    pg_ctl: no server running

Starting server-db at this point fails:

    [root@gyvm-4 data]# service cloudera-scm-server-db start
    DB initialization done.
    waiting for server to start...............................................................could not start server

The log shows that TCP/IP port 7432 is already in use:

    [root@gyvm-4 cloudera-scm-server]# tail db.log
    LOG: could not bind IPv4 socket: Address already in use
    HINT: Is another postmaster already running on port 7432? If not, wait a few seconds and retry.
    LOG: could not bind IPv6 socket: Address already in use
    HINT: Is another postmaster already running on port 7432? If not, wait a few seconds and retry.
    WARNING: could not create listen socket for "*"
    FATAL: could not create any TCP/IP sockets

Kill the process occupying that port:

    [root@gyvm-4 cloudera-scm-server]# netstat -ntp | grep 7432
    tcp  0  0  192.168.1.17:7432   192.168.1.17:49784  ESTABLISHED  37118/postgres
    tcp  0  0  192.168.1.17:7432   192.168.1.8:35818   ESTABLISHED  36807/postgres
    tcp  0  0  192.168.1.17:7432   192.168.1.17:49779  ESTABLISHED  37060/postgres
    tcp  0  0  192.168.1.17:49783  192.168.1.17:7432   ESTABLISHED  36306/java
    tcp  0  0  192.168.1.17:7432   192.168.1.8:35813   ESTABLISHED  36778/postgres
    tcp  0  0  192.168.1.17:49779  192.168.1.17:7432   ESTABLISHED  36306/java
    tcp  0  0  192.168.1.17:49784  192.168.1.17:7432   ESTABLISHED  36306/java
    tcp  0  0  192.168.1.17:49778  192.168.1.17:7432   ESTABLISHED  36306/java
    tcp  0  0  192.168.1.17:7432   192.168.1.17:49778  ESTABLISHED  37059/postgres
    tcp  0  0  192.168.1.17:7432   192.168.1.8:35814   ESTABLISHED  36779/postgres
    tcp  0  0  192.168.1.17:7432   192.168.1.8:35817   ESTABLISHED  36804/postgres
    tcp  0  0  192.168.1.17:7432   192.168.1.17:49783  ESTABLISHED  37117/postgres
    [root@gyvm-4 cloudera-scm-server]# kill -9 37118

Start server-db again, which now succeeds, then start server, which also succeeds:

    [root@gyvm-4 data]# service cloudera-scm-server-db start
    DB initialization done.
    waiting for server to start.... done
    server started
    [root@gyvm-4 data]# service cloudera-scm-server start
    Starting cloudera-scm-server: [ OK ]

The Cloudera management UI is now accessible again.

[Conclusion]

The root cause was that cloudera-scm-server-db had not started properly but had left behind its pid file, postmaster.pid. As a result, checking the status of cloudera-scm-server-db gave a misleading answer and reported the service as running. With the database in that state, every attempt to start cloudera-scm-server failed. cloudera-scm-server-db itself failed to start because the port it needs was already in use.
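Before deleting postmaster.pid by hand, it may be worth checking whether the PID recorded in it still belongs to a live process. The sketch below is not part of the original troubleshooting session. It assumes the usual PostgreSQL layout, in which the first line of postmaster.pid holds the postmaster's PID, and it assumes the check runs as root (as in the session above), because `kill -0` cannot probe another user's process otherwise. The function name `pid_file_state` is made up for illustration.

```shell
#!/bin/sh
# Sketch: classify a postmaster.pid file before deciding whether to delete it.
# Assumption: the first line of the file is the postmaster PID.
# Assumption: run as root, so `kill -0` can probe processes of any user.
# `pid_file_state` is a hypothetical helper name, not a Cloudera or PostgreSQL tool.

pid_file_state() {
    pid_file="$1"

    # No pid file at all: nothing to clean up.
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi

    # Read the first line and strip whitespace.
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')

    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*)
            echo "invalid"
            return 0
            ;;
    esac

    # Signal 0 sends nothing; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "alive"
    else
        echo "stale"
    fi
}
```

For example, if `pid_file_state /var/lib/cloudera-scm-server-db/data/postmaster.pid` prints `alive`, postgres is probably still running and should be stopped before the file is touched, which would also be consistent with port 7432 still being in use. Only a `stale` result suggests the file is a genuine leftover. In the port-in-use case, `netstat -ntlp | grep 7432` (where `-l` restricts the output to listening sockets) may point more directly at the process holding the port than the list of established connections does.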