DoubleLi


The watchdog for a recent project has gone through four versions.

Version 1:

Check with ps -ef; if the program has died, start it again.
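
The script for this first version is not shown in the post, so the following is only a rough sketch of the idea. The names `is_listed` and `ensure_running` are made up for illustration, and `ps -e -o comm=` is used in place of `ps -ef | grep` so that the grep process itself can never match:

```shell
#!/bin/sh
# Sketch only: is_listed and ensure_running are hypothetical helper names.

# Succeeds if the program name in $1 appears as a whole line on stdin.
# Intended input: one command name per line, e.g. from `ps -e -o comm=`.
is_listed() {
    grep -Fqx -- "$1"
}

# Starts the program in $1 in the background unless it is already running.
# Assumes the program can be found on PATH.
ensure_running() {
    if ps -e -o comm= | is_listed "$1"; then
        echo "$1 is running"
    else
        echo "$1 is not running, starting it"
        "$1" &
    fi
}
```

Calling `ensure_running scp_platform` from a loop or from cron would give roughly the behaviour described above.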

 

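Version 2:

While running, the program sometimes stops listening on port 7905, so it is not enough to check whether the process is still alive. The watchdog checks whether the port is still being listened on instead.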
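
A minimal sketch of such a port check, assuming the Linux `netstat -an` column layout (protocol in column 1, local address in column 4, state in the last column). `count_listeners` and `port_is_listening` are hypothetical names. Unlike the awk filter in the scripts further down, this one also counts `tcp6` lines and anchors the port at the end of the local address:

```shell
#!/bin/sh
# Sketch only: count_listeners and port_is_listening are hypothetical names.

# Reads `netstat -an` style output on stdin and prints how many TCP
# sockets are in LISTEN state on the port given in $1.
count_listeners() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}

# Succeeds if at least one socket is listening on the port in $1.
port_is_listening() {
    [ "$(netstat -an | count_listeners "$1")" -ge 1 ]
}
```

Testing for "at least one" instead of "exactly one" keeps the check working when the program listens on both an IPv4 and an IPv6 address.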

 

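Version 3:

When port 7905 is no longer being listened on, first killall the old process and then start it again. Every line written to the log file is prefixed with the date and time; otherwise there is no way to tell when the event happened.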
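
The timestamped logging and the kill-then-restart step could be factored out roughly like this. `log_line` and `restart_program` are hypothetical names, and the timestamp leaves out the GNU-only `%N` field so the sketch also works with other `date` implementations:

```shell
#!/bin/sh
# Sketch only: log_line and restart_program are hypothetical names.

# Appends one line to the log file in $1, prefixed with the current
# date and time. The remaining arguments form the message.
log_line() {
    log_file=$1
    shift
    printf '[%s] %s\n' "$(date +'%Y-%m-%d %H:%M:%S')" "$*" >> "$log_file"
}

# Kills any running instance of the program in $2, then starts it again.
# $1 is the log file. Assumes the program can be found on PATH.
restart_program() {
    log_line "$1" "killall $2 now"
    killall "$2" 2>/dev/null
    log_line "$1" "restarting $2"
    "$2" &
}
```

Because the timestamp is taken inside `log_line`, each entry carries the time it was actually written.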

 

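Version 4:

Use nohup so that the output of the program and the echo output of the monitor go to a file instead of the standard devices. This detaches both from the shell completely, so they really keep running in the background after the shell exits.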
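
As an illustration, a detached start with all three standard streams redirected away from the terminal might look like the sketch below. `start_detached` is a hypothetical helper, and the explicit log file is an assumption (plain `nohup` without redirection writes to `nohup.out` when stdout is a terminal):

```shell
#!/bin/sh
# Sketch only: start_detached is a hypothetical helper name.

# Runs a command in the background, immune to hangups, with stdout and
# stderr appended to the log file in $1 and stdin taken from /dev/null.
# Prints the PID of the background job.
start_detached() {
    log_file=$1
    shift
    nohup "$@" >> "$log_file" 2>&1 < /dev/null &
    echo $!
}
```

For example, `start_detached monitor.out ./monitor.sh` would start the monitor without leaving it tied to the terminal of the current shell; `monitor.out` is just a placeholder file name.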

 

The old version:

 

#!/bin/sh
set +x

source env.sh

PRMGRAM=scp_platform
FILE_NAME=scp_monitor.log

Current_Time=`date +"%Y-%m-%d %H:%M:%S.%N"`
echo "[${Current_Time}] monitor start...." 
echo "[${Current_Time}] monitor start...." >> ${WORK_DIR}/log/${FILE_NAME}

port=7905

TCPListeningnum=`netstat -an | grep ":$port " | awk '$1 == "tcp" && $NF == "LISTEN" {print $0}' | wc -l`

if [ $TCPListeningnum = 1 ]
then
{
    echo "[${Current_Time}] The $port is listening"
}
else
{
    echo "[${Current_Time}] The port is not listening"
}
fi



while [ 1 ]
do
    Current_Time=`date +"%Y-%m-%d %H:%M:%S.%N"`
    TCPListeningnum=`netstat -an | grep ":$port " | awk '$1 == "tcp" && $NF == "LISTEN" {print $0}' | wc -l`
    if [ $TCPListeningnum = 1 ]
    then
    {
        echo "[${Current_Time}] The ${port} is listening" >> ${WORK_DIR}/log/${FILE_NAME}
    }
    else
    {
        echo "[${Current_Time}] The  ${port} is not listening" >> ${WORK_DIR}/log/${FILE_NAME}
        echo "[${Current_Time}] killall  scp_platform now !" >> ${WORK_DIR}/log/${FILE_NAME}
        kscp
        echo "[${Current_Time}] check ${PRMGRAM} quit, now restart ${PRMGRAM} ..." >> ${WORK_DIR}/log/${FILE_NAME}
        scp_platform&
    }
    fi
    sleep 180
done

The new version:

start_monitor.sh # this script runs the monitor in the background

 

#!/bin/bash

#start monitor background  without console!!

nohup ./monitor.sh &


monitor.sh # the actual monitor program

 

 

#!/bin/bash
set -x

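# Load the environment into this shell. Running env.sh as a separate
# background process would not set WORK_DIR or PATH for this script.
. ./env.sh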

PRMGRAM=scp_platform
FILE_NAME=scp_monitor.log

Current_Time=`date +"%Y-%m-%d %H:%M:%S.%N"`
echo "[${Current_Time}] monitor start...." 
echo "[${Current_Time}] monitor start...." >> ${WORK_DIR}/log/${FILE_NAME}

port=7905

TCPListeningnum=`netstat -an | grep ":$port " | awk '$1 == "tcp" && $NF == "LISTEN" {print $0}' | wc -l`

if [ $TCPListeningnum = 1 ]
then
{
    echo "[${Current_Time}] The $port is listening"
}
else
{
    echo "[${Current_Time}] The port is not listening"
}
fi



while [ 1 ]
do
    Current_Time=`date +"%Y-%m-%d %H:%M:%S.%N"`
    TCPListeningnum=`netstat -an | grep ":$port " | awk '$1 == "tcp" && $NF == "LISTEN" {print $0}' | wc -l`
    if [ $TCPListeningnum = 1 ]
    then
    {
        echo "[${Current_Time}] The ${port} is listening" >> ${WORK_DIR}/log/${FILE_NAME}
    }
    else
    {
        echo "[${Current_Time}] The  ${port} is not listening" >> ${WORK_DIR}/log/${FILE_NAME}
        echo "[${Current_Time}] killall  scp_platform now !" >> ${WORK_DIR}/log/${FILE_NAME}
        killall scp_platform
        echo "[${Current_Time}] check ${PRMGRAM} quit, now restart ${PRMGRAM} ..." >> ${WORK_DIR}/log/${FILE_NAME}
        nohup scp_platform&
    }
    fi
    sleep 180
done


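The loop sleeps for 180 seconds because the program takes a while to load. Until loading has finished, there is no way to tell whether port 7905 is being listened on.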
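
A possible alternative to the fixed wait, which is not what the scripts above do, is to poll for a condition up to a timeout, so a fast start is noticed early while a slow start still gets the full grace period. `wait_until` is a hypothetical helper, sketched here under the assumption that the condition is an ordinary command that exits 0 on success:

```shell
#!/bin/sh
# Sketch only: wait_until is a hypothetical helper name.

# Repeats a command until it succeeds or the timeout is reached.
# $1: timeout in seconds, $2: polling interval in seconds,
# remaining arguments: the command to run.
# Returns 0 as soon as the command succeeds, 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    waited=0
    while ! "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

With a port check such as a function that greps the `netstat -an` output, a call like `wait_until 180 5 some_port_check` would give the program up to 180 seconds to come up before the watchdog treats it as failed; `some_port_check` is a placeholder for whatever check is used.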

 

 

 

The original env.sh # usable without modification
env.sh mainly sets environment variables and custom variables.
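
Since a file like this only affects the shell that reads it, how it is loaded matters. The small demonstration below uses a throwaway file and a made-up variable `DEMO_WORK_DIR` (neither is part of the project) to show that executing such a file as a child process leaves the caller's environment untouched, while sourcing it with `.` does set the variables:

```shell
#!/bin/sh
# Demonstration only: DEMO_WORK_DIR and the temporary file are made up.

unset DEMO_WORK_DIR
tmp_env=$(mktemp)
echo 'export DEMO_WORK_DIR=/tmp/demo' > "$tmp_env"

# Executing the file runs it in a child process; the export is lost
# when that process exits.
sh "$tmp_env"
after_exec=${DEMO_WORK_DIR:-unset}

# Sourcing the file runs it in the current shell; the variable stays.
. "$tmp_env"
after_source=${DEMO_WORK_DIR:-unset}

echo "after executing: $after_exec"    # prints "after executing: unset"
echo "after sourcing: $after_source"   # prints "after sourcing: /tmp/demo"

rm -f "$tmp_env"
```

The same applies to the aliases: they exist only in the shell that sourced the file, and bash expands aliases in non-interactive scripts only when `expand_aliases` is enabled or bash runs in POSIX mode.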

 

 

#!/bin/bash
export ROOT=/root/scp
export WORK_DIR=${ROOT}
export INCLUDE=${ROOT}/include
export OTL=${INCLUDE}/otl_mysql
export LD_LIBRARY_PATH=${ROOT}/lib:/usr/local/lib
export ACE_ROOT=${INCLUDE}
export ODBCINI=/usr/local/etc/odbc.ini
export ODBCSYSINI=/usr/local/etc
PATH=${PATH}:${ROOT}/bin
export PATH
odbcinst -j


alias wk='cd ${ROOT}'
alias bin='cd ${ROOT}/bin'
alias cfg='cd ${ROOT}/conf'
alias rmlog='rm -rf ${ROOT}/bin/log*.*; rm -rf ${ROOT}/log/*.*'
alias lis='netstat -an|grep -i 7905'
alias scp='${ROOT}/bin/scp_platform &'
alias moni='${ROOT}/bin/monitor.sh &'
alias myps='ps -fu root|grep -v grep|grep -i scp'
alias mymoni='ps -fu root|grep -v grep|grep -i moni'
alias kscp='killall -9 scp_platform'
alias kmoni='killall -9 monitor.sh'
isql
alias mynet='netstat -an | grep 7905'


ulimit -c unlimited
ulimit -n 65530
posted on 2020-11-19 12:06  DoubleLi