Detecting whether the obb program is working properly via log keywords
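In a multi-phone GSM sniffing setup built on C118 + Osmocom-bb, it often happens that after running for a while, one of the phones stops working on the arfcn it is monitoring.
Checking the log shows that it ends with several consecutive lines of: TOA AVG is not 16 qbits, correcting (got 15). After that the log stops updating entirely, no more sms can be captured, and the only fix is to restart the obb program.
It is not clear whether this is a bug in obb or is caused by the base station making adjustments at irregular times each day (some arfcn are not in service 24 hours a day and occasionally drop out for a short while).
Restarting the obb program is not complicated: flash the phone first (I have not tried a hard flash), then start sniffing.
A dedicated method can be written in smsweb that combines Python and shell commands to check the logs periodically (every 30 seconds) using the tail and diff commands. When obb is judged to be malfunctioning, it re-flashes the phone (for the hardware modification that enables fully automatic flashing, see the pinned post) and starts the sniffing program.
Reference code: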
import os
import re
import subprocess
import time
from multiprocessing import Process

# Database, logger, download1 and sniff are assumed to be defined elsewhere in smsweb.

def monitor_log():
    mysql = Database()
    while True:
        print("monitor log:")
        # Ask getusb.sh how many USB serial devices (phones) are attached
        getusb = subprocess.Popen(["./osmocom-bb/getusb.sh"],stderr=subprocess.PIPE,stdout=subprocess.PIPE)
        usbResult = getusb.communicate()
        getusb.wait()
        device = re.findall(r'\d',usbResult[0])[0]
        #find arfcn
        str_sql = "SELECT * FROM sniff limit 0," + str(device)
        data = mysql.query(str_sql)
        for row in data:
            arf = str(row['arfcn'])
            power = str(row['power'])
            sptype = str(row['sptype'])
            tty = str(row['tty'])
            # Rule 1: count how many of the last 3 log lines contain "AVG"
            counter = 0
            command = 'tail -n3 ./download_'+ tty +'.log'
            textlist = os.popen(command).readlines()
            for line in textlist:
                if "AVG" in line:
                    print("find got 15 in log! dangerous!")
                    counter = counter + 1
                    #logger.info("AVG counter:" + str(counter) + " " + str(tty) + " arfcn:" + str(arf) )
            if int(counter) == 3:
                print("found 3 got 15! restart osmocon and sniff!")
                #cur_time = time.strftime('%Y/%m/%d %H:%M:%S',time.localtime(time.time()))
                logger.info("got 15 mon:" + str(tty) + " arfcn:" + str(arf) )
                # Re-flash the phone, then restart sniffing
                ps1=Process(target=download1,args=(str(tty),))
                ps1.start()
                ps1.join(10)
                #time.sleep(10)
                ps2=Process(target=sniff,args=(str(tty),str(arf),))
                ps2.start()
                ps2.join(30)
                #time.sleep(30)
                #subprocess.Popen("./osmocom-bb/test.sh",shell = True)
            # Rule 2: check whether the log file has changed since the last pass
            cur_log = "download_" + tty + ".log"
            old_log = cur_log + ".old"
            getdiff = subprocess.Popen(["./diff.sh",cur_log,old_log],stderr=subprocess.PIPE,stdout=subprocess.PIPE)
            diffResult = getdiff.communicate()
            getdiff.wait()
            diff_ret = re.findall(r'\d',diffResult[0])[0]
            #logger.info("logchange mon:" + str(tty) + " arfcn:" + str(arf) + " diff_ret:" + str(diff_ret))
            if int(diff_ret) == 0:
                # print("log not change in 30secs! restart osmocon and sniff!")
                # #cur_time = time.strftime('%Y/%m/%d %H:%M:%S',time.localtime(time.time()))
                logger.info("log diff:" + str(tty) + " arfcn:" + str(arf) )
                ps1=Process(target=download1,args=(str(tty),))
                ps1.start()
                ps1.join(10)
                #time.sleep(10)
                ps2=Process(target=sniff,args=(str(tty),str(arf),))
                ps2.start()
                ps2.join(30)
                #time.sleep(30)
                #subprocess.Popen("./osmocom-bb/test.sh",shell = True)
        time.sleep(30)
diff.sh:
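#!/bin/bash
#diff ./download_0.log ./download_0.log.old
# $1 = current log, $2 = snapshot of the log from the previous check
diff "$1" "$2" >> "diff_$1"
#echo $?
if [ $? = 0 ];then
    #echo "no difference"
    echo "0"
else
    #echo "file has changed"
    rm -fr "$2"
    cp "$1" "$2"
    #echo "file synced successfully"
    echo "1"
fi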
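The change check that diff.sh performs can also be done from Python without calling a shell script. The following is only a minimal sketch under stated assumptions, not part of the original setup: the function name log_changed is hypothetical, and unlike diff.sh it does not append the diff output to a diff_ file.

```python
import filecmp
import os
import shutil


def log_changed(cur_log, old_log):
    """Return True if cur_log differs from the snapshot old_log.

    Mirrors diff.sh: when the files are identical, report "no change";
    otherwise (including when the snapshot does not exist yet) refresh
    the snapshot from the current log and report "changed".
    """
    # shallow=False forces a byte-by-byte comparison instead of trusting stat() data
    if os.path.exists(old_log) and filecmp.cmp(cur_log, old_log, shallow=False):
        return False
    shutil.copyfile(cur_log, old_log)
    return True
```

In monitor_log(), a call such as log_changed(cur_log, old_log) returning False would correspond to diff.sh printing "0", meaning the log has not moved since the previous pass.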
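Notes:
1. When the last three lines of the log all contain the keyword AVG, obb is considered to be malfunctioning, and the phone is re-flashed and sniffing restarted without hesitation.
2. When the log content is still identical to what it was 30 seconds earlier, that is also abnormal, so the phone is re-flashed and sniffing restarted.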
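The first rule can be checked in pure Python instead of running tail through os.popen. This is a rough sketch assuming Python 3; the function name tail_all_contain and its parameters are made up for illustration and do not appear in smsweb.

```python
from collections import deque


def tail_all_contain(path, keyword="AVG", n=3):
    """Return True if the log has at least n lines and each of the last n contains keyword."""
    # A deque with maxlen=n keeps only the final n lines while the file is streamed
    with open(path, "r", errors="replace") as fh:
        last_lines = deque(fh, maxlen=n)
    return len(last_lines) == n and all(keyword in line for line in last_lines)
```

For example, tail_all_contain("./download_0.log") would play the role of the counter == 3 test in monitor_log(). Requiring exactly n lines means a log that is still shorter than three lines never triggers a restart on this rule.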