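Script for analyzing download times in a log
Requirements:
The feedback queue has built up a large backlog. Looking into the bus log process_feedbacks_1.log, we found "time out" download errors, and a large number of derivative samples take too long to download.
Current requirement: compute the download time of every derivative sample recorded in process_feedbacks_1.log.
For example: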
2014-12-22 10:29:58,073 __main__ DEBUG on_task_success, task_id = 15145129
2014-12-22 10:29:58,097 __main__ DEBUG receive AUTO success event, begin to redispatch PROC task
2014-12-22 10:29:58,097 __main__ DEBUG begin to process derivative file[filename = 92F69BD61E4AD511A664CAF7852018C5.D6336163]
2014-12-22 10:29:58,097 __main__ DEBUG AUTO_DERIVATIVE_TASK_FILETYPE: sample
2014-12-22 10:29:58,097 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163
2014-12-22 10:30:02,388 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163
2014-12-22 10:30:02,414 __main__ DEBUG begin to process derivative file[filename = 83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B]
2014-12-22 10:30:02,415 __main__ DEBUG AUTO_DERIVATIVE_TASK_FILETYPE: sample
2014-12-22 10:30:02,415 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B
2014-12-22 10:30:03,806 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B
Compute the time difference between each download start line and its matching "download ok" line:
2014-12-22 10:29:58,097 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163
2014-12-22 10:30:02,388 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163
2014-12-22 10:30:02,415 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B
2014-12-22 10:30:03,806 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B
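As a worked example of the arithmetic, the two pairs above can be checked by hand: 10:30:02,388 minus 10:29:58,097 is 4.291 s, and 10:30:03,806 minus 10:30:02,415 is 1.391 s. The sketch below does the same with `datetime`, treating the digits after the comma as fractional seconds. The helper name `elapsed_seconds` is hypothetical and only used for illustration.

```python
from datetime import datetime

# Timestamp layout used by the log lines, e.g. "2014-12-22 10:29:58,097".
# %f reads the digits after the comma as a fraction of a second.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the seconds between two log timestamps (hypothetical helper)."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# The two start/finish pairs taken from the log excerpt.
first = elapsed_seconds("2014-12-22 10:29:58,097", "2014-12-22 10:30:02,388")
second = elapsed_seconds("2014-12-22 10:30:02,415", "2014-12-22 10:30:03,806")
print(first, second)
```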
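Write the results to a file named time.txt, one line per sample in the form:
92F69BD61E4AD511A664CAF7852018C5.D6336163,time interval
83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B,time interval
Also compute the average of the time differences recorded in time.txt.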
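The average can be taken over the second column of the result file. This is a minimal sketch, assuming each row has the form name,seconds and that a non-numeric header row may be present. `average_spend_time` is a hypothetical helper, and the two sample values are the differences worked out from the log excerpt above.

```python
def average_spend_time(rows):
    """Average the second comma-separated field of each row.

    Rows whose second field is not a number (for example a header line)
    are skipped. Returns None when no numeric row is found.
    """
    values = []
    for row in rows:
        parts = row.strip().split(",")
        if len(parts) != 2:
            continue
        try:
            values.append(float(parts[1]))
        except ValueError:
            continue  # header or malformed row
    if not values:
        return None
    return sum(values) / len(values)


# Sample rows in the layout described above (values from the log excerpt).
sample_rows = [
    "md5,spend_time(s)",
    "92F69BD61E4AD511A664CAF7852018C5.D6336163,4.291",
    "83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B,1.391",
]
average = average_spend_time(sample_rows)
print(average)
```

To run it on a real result file, pass the open file object instead of a list, since iterating over a file yields its lines.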
Script:
#!/usr/bin/env python
import sys
import time

# Path to the log file, e.g. process_feedbacks_1.log
filename = sys.argv[1]
# Results file (the requirements above call it time.txt)
f = open('spend_time.txt', 'w')
f.write("md5,spend_time(s)\n")

def time_calc(date1):
    # Convert "YYYY-mm-dd HH:MM:SS,mmm" to milliseconds since the epoch.
    date1_prefix, date1_suffix = date1.strip().split(',')
    sec = time.mktime(time.strptime(date1_prefix, '%Y-%m-%d %H:%M:%S'))
    return sec * 1000 + int(date1_suffix)

md5 = None  # no download is pending until a start line has been seen
for line in open(filename, 'r'):
    line = line.strip()
    if 'in __download_derivative_file__()' in line:
        start_time_1 = time_calc(line.split('__main__')[0])
        # Use the last path component as the sample name;
        # the earlier line[-32:] cut the 41-character name short.
        md5 = line.rsplit('/', 1)[-1]
        continue
    if md5 is not None and 'download ok' in line and md5 in line:
        end_time_1 = time_calc(line.split('__main__')[0])
        spend_time = (end_time_1 - start_time_1) / 1000  # seconds
        print("%s,%s" % (md5, spend_time))
        f.write("%s,%s\n" % (md5, spend_time))
        md5 = None  # this download is finished
f.close()
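The script above remembers only one pending download at a time, which fits the strictly sequential log excerpt. If start lines for several samples could ever appear before their "download ok" lines (an assumption, nothing in the excerpt shows this), one possible variant is to key the pending start times by file name. The sketch below is only an illustration: the function name `download_times` and both regular expressions are hypothetical and were written against the line shapes in the excerpt.

```python
import re
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
# Both patterns capture the timestamp and the last path component of the line.
START_RE = re.compile(r"^(\S+ \S+) .*in __download_derivative_file__\(\).*/([^/\s]+)$")
DONE_RE = re.compile(r"^(\S+ \S+) .*download ok.*/([^/\s]+)$")


def download_times(lines):
    """Return a list of (file name, seconds) pairs in completion order."""
    pending = {}  # file name -> start time of a download not yet finished
    results = []
    for line in lines:
        line = line.strip()
        m = START_RE.match(line)
        if m:
            pending[m.group(2)] = datetime.strptime(m.group(1), LOG_TIME_FORMAT)
            continue
        m = DONE_RE.match(line)
        if m:
            start = pending.pop(m.group(2), None)
            if start is not None:  # ignore completions with no recorded start
                end = datetime.strptime(m.group(1), LOG_TIME_FORMAT)
                results.append((m.group(2), (end - start).total_seconds()))
    return results


# Four lines copied from the log excerpt.
sample_log = [
    "2014-12-22 10:29:58,097 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163",
    "2014-12-22 10:30:02,388 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/92F69BD61E4AD511A664CAF7852018C5.D6336163",
    "2014-12-22 10:30:02,415 __main__ DEBUG in __download_derivative_file__(). output_file:/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B",
    "2014-12-22 10:30:03,806 __main__ DEBUG download ok, create file ok :%s/mnt/auto_derivative/83613B2FE6B780B3BBB08470171DBFD6.B16EDA9B",
]
for name, seconds in download_times(sample_log):
    print("%s,%s" % (name, seconds))
```

Samples that start but never log "download ok" simply stay in `pending` and produce no row, so a timed-out download does not skew the results.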