python-docx Data Statistics Report - Project G

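A new project recently entered its trial-run phase, and the operations team was brought in. I drafted an initial plan for what the team's routine inspections should cover and wrote a few tools for the colleagues doing them. One of these tools generates the data statistics report.
The tool aggregates statistics from ES (Elasticsearch) and writes them into a Word document. It also reshapes the same statistics into a new data structure to generate an analysis chart. The chart goes into the body of the email, and the Word report is sent as an attachment.
I won't paste all of the code, including the docx parts, which earlier posts already covered at length. Here are a few key fragments.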
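
For orientation, the delivery step described above (chart in the email body, Word report as attachment) might look roughly like this with the Python standard library. The mail code of the original tool is not shown in this post, so the function name `build_report_mail`, the subject line, and the addresses are all assumptions made for illustration.

```python
from email.message import EmailMessage
from email.utils import make_msgid

DOCX_SUBTYPE = "vnd.openxmlformats-officedocument.wordprocessingml.document"

def build_report_mail(sender, recipient, chart_bytes, report_bytes,
                      report_name="report.docx"):
    """Hypothetical helper: chart inline in the HTML body, report attached."""
    msg = EmailMessage()
    msg["Subject"] = "Daily inspection report"  # assumed subject
    msg["From"] = sender
    msg["To"] = recipient

    # Plain-text fallback for clients that do not render HTML
    msg.set_content("The inspection report is attached.")

    # HTML body that references the chart through a Content-ID
    chart_cid = make_msgid()
    html = '<html><body><img src="cid:{}"></body></html>'.format(chart_cid[1:-1])
    msg.add_alternative(html, subtype="html")
    msg.get_payload()[1].add_related(
        chart_bytes, maintype="image", subtype="jpeg", cid=chart_cid)

    # The Word report travels as a regular attachment
    msg.add_attachment(report_bytes, maintype="application",
                       subtype=DOCX_SUBTYPE, filename=report_name)
    return msg
```

The resulting message can be handed to `smtplib.SMTP(...).send_message(msg)`; the server address and credentials depend on the environment and are left out here.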

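Data collection logic

Besides being for our own use, the report is also sent to the client's representative. In addition, the operations acceptance criteria specify how many reports must be submitted. So a time-slicing function is needed: it splits the period from 00:00 today up to the moment the program runs at a fixed interval, producing a list of time points at which statistics are collected.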

import logging
import time
from datetime import datetime

def time_interval(self):
    # Timestamps at which data will be queried
    timelist = []
    # Current time as a 13-digit (millisecond) timestamp
    times = time.localtime()
    timeStamp_now = round(time.mktime(times)) * 1000
    logging.info('Inspection time: %s', self.formattimestamp(timeStamp_now))
    # Split the day into 12 points, one every 2 hours, and keep those earlier than the run time
    for i in range(0, 24, 2):
        date_stamp = datetime.now().replace(hour=i, minute=0, second=0, microsecond=0).timestamp()
        # Convert to a 13-digit timestamp
        date_stamp13 = round(date_stamp) * 1000
        # print('collection time %s:00, as timestamp %s' % (i, date_stamp13))

        if date_stamp13 < timeStamp_now:
            timelist.append(date_stamp13)
            # self.log.info('collection time point added: %s', date_stamp13)

    return timelist
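
The same slicing can be sketched as a standalone function, which makes it easier to check in isolation. This is only an illustration: the name `slice_day` and the `step_hours` parameter are not part of the original tool, and the current time is passed in as an argument instead of being read inside the function.

```python
from datetime import datetime

def slice_day(now, step_hours=2):
    """Return millisecond timestamps for 00:00, 02:00, ... strictly before `now`."""
    now_ms = int(now.timestamp() * 1000)
    points = []
    for hour in range(0, 24, step_hours):
        # Same calendar day as `now`, snapped to the start of the hour
        mark = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        mark_ms = int(mark.timestamp() * 1000)
        if mark_ms < now_ms:
            points.append(mark_ms)
    return points
```

For a run at 09:44 this yields five points (00:00, 02:00, 04:00, 06:00, 08:00). A run at exactly 08:00 leaves out the 08:00 point, because the comparison is strict.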

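Reading ES data

Note that the es client has many methods that map directly onto ES query syntax, for example es.count(). A look at the API documentation shows it is not that complicated.
Also, index is the index name and body is the query body. Many documents still pass doc_type, which is simply the ES mapping type; types are deprecated in Elasticsearch 7.x and no longer have to be specified.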

from elasticsearch import Elasticsearch, TransportError

def getesdata_news(self, date):
    # Elasticsearch([ip], http_auth=('elastic', 'password'), port=9200)
    # By default, creating the client does not connect; errors show up on the first request.
    es = Elasticsearch(["192.168.x.x"], http_auth=('elastic', 'password'), port=9200)

    # Count documents per media_name for everything whose "it" field is >= date
    body = {
        "query": {"range": {"it": {"gte": date}}},
        "aggs": {"group_by_ip_peer": {"terms": {"field": "media_name.keyword", "size": 100}}},
    }
    try:
        res = es.search(index="all_news_origin", body=body, _source='it')
    except TransportError as e:
        print('- ES request failed %s' % repr(e))
        return []

    resdata = res['aggregations']['group_by_ip_peer']['buckets']
    return resdata
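
Building the query body as a dict instead of a concatenated string keeps the JSON valid and lets the query be inspected without a running cluster. A minimal sketch follows; the helper names `build_media_count_query` and `buckets_to_counts` are made up for this example, and `"size": 0` is an addition that assumes only the aggregation is needed, not the matching documents.

```python
def build_media_count_query(since_ms, field="media_name.keyword", size=100):
    """Query body: documents with it >= since_ms, counted per media name."""
    return {
        "size": 0,  # assumption: skip the hits, only the aggregation is used
        "query": {"range": {"it": {"gte": since_ms}}},
        "aggs": {
            "group_by_ip_peer": {"terms": {"field": field, "size": size}}
        },
    }

def buckets_to_counts(response):
    """Flatten the terms-aggregation buckets into {media_name: doc_count}."""
    buckets = response["aggregations"]["group_by_ip_peer"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

With a client at hand, the dict would be passed the same way as in the function above, for example `es.search(index="all_news_origin", body=build_media_count_query(ts))`.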

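Merging table cells in docx

The fragment below is simple. Use merge to merge cells and a for loop to fill the rows. Watch the row count: here the number of rows is fixed when the table is created, and if it is too small you get an index-out-of-range error. Also note that although cells are merged, data is still inserted using the original coordinates. For example, after merging (0,1) and (0,2), writing to column 1 or column 2 targets the same cell.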

# Fetch the data retrieved from ES
res = datareport_gab().selectdata()
logging.debug('result set size: %s', len(res))
# Data rows start at index 2, below the two header rows
rown = 2


# Two header rows plus one row per result
table = self.doc.add_table(rows=len(res)+2, cols=9, style='Table Grid')
# First column spans both header rows; the other columns are merged in pairs
table.cell(0, 0).merge(table.cell(1, 0))
table.cell(0, 1).merge(table.cell(0, 2))
table.cell(0, 3).merge(table.cell(0, 4))
table.cell(0, 5).merge(table.cell(0, 6))
table.cell(0, 7).merge(table.cell(0, 8))

hdr_cells = table.rows[0].cells
hdr_cells[0].text = "Inspection time"
hdr_cells[1].text = "x"
hdr_cells[3].text = "x"
hdr_cells[5].text = "x"
hdr_cells[7].text = "x"

Simple analysis chart

import time
import matplotlib.pyplot as plt

def createjpg(self):
    res = self.selectdata()
    x_data = []  # x axis: collection times
    y_a = []     # trend A; the y-axis scale is derived from these values automatically

    for i in res:
        # Keep only the HH:MM:SS part of the formatted timestamp as the x-axis label
        newx = time.strftime('%H:%M:%S', time.strptime(self.formattimestamp(i), '%Y-%m-%d %H:%M:%S'))
        x_data.append(newx)

        y_a.append(res[i]['facebook_a'])

    plt.plot(x_data, y_a, color='red', linewidth=2.0, linestyle='--')
    # For several trends, define the extra y lists (y_b, y_c, ...) beforehand.
    #plt.plot(x_data, y_b, color='blue', linewidth=3.0, linestyle='-.')
    #plt.plot(x_data, y_c, color='yellow', linewidth=2.0, linestyle='--')
    #plt.plot(x_data, y_d, color='orange', linewidth=3.0, linestyle=':')

    plt.savefig(self.jpg_path)
    #plt.show()   # open the chart once the program finishes
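
The reshaping step (statistics keyed by timestamp turned into parallel x/y lists) can be separated from the plotting so that it is checkable without drawing anything. This is a sketch under assumptions: the result layout `{timestamp_ms: {series_name: value}}` is inferred from the loop above, and `build_series` and `save_trend_chart` are illustrative names, not functions of the original tool.

```python
from datetime import datetime

def build_series(res, key):
    """Turn {timestamp_ms: {key: value}} into time-ordered x labels and y values."""
    x_labels, y_values = [], []
    for ts_ms in sorted(res):
        x_labels.append(datetime.fromtimestamp(ts_ms / 1000).strftime("%H:%M:%S"))
        y_values.append(res[ts_ms][key])
    return x_labels, y_values

def save_trend_chart(res, keys, path):
    """Plot one dashed line per key and write the chart to `path`."""
    # Imported here so build_series stays usable where matplotlib is absent
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, suits scheduled jobs on servers
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for key in keys:
        x_labels, y_values = build_series(res, key)
        ax.plot(x_labels, y_values, linewidth=2.0, linestyle="--", label=key)
    ax.legend()
    fig.savefig(path)
    plt.close(fig)  # release the figure so repeated runs do not accumulate memory
```

A call such as `save_trend_chart(res, ["facebook_a"], "trend.jpg")` would then draw a single trend; further series only need their key added to the list.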

posted @ 2021-01-20 09:44 名字很长容易被惦记