Extracting text from a file and converting it to JSON (notes)

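This is just a naive way of doing it, and probably a very roundabout one.
I don't know much about how JSON is used yet.
The file contents look roughly like this: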

>>> f=open('222.txt','r')
>>> a=f.read()
>>> f.close()
>>> print a
appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176
appid:c50000103_t50004 flow:207.359 mo:CUCC ip:192.168.1.119
appid:c50000100_t50011 flow:5.72205e-06 mo:CUCC ip:192.168.1.19
appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10

First regex attempt:

>>> import re
>>> b=re.compile(r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(.*?)\n").findall(a)
>>> print b
[('c50000100_h50001', '0.00243473', 'CUCC', '192.168.1.176'), ('c50000103_t50004', '207.359', 'CUCC', '192.168.1.119'), ('c50000100_t50011', '5.72205e-06', 'CUCC', '192.168.1.19')]
#a list of tuples
>>> import json
>>> j=json.dumps(b)
>>> print j
[["c50000100_h50001", "0.00243473", "CUCC", "192.168.1.176"], ["c50000103_t50004", "207.359", "CUCC", "192.168.1.119"], ["c50000100_t50011", "5.72205e-06", "CUCC", "192.168.1.19"]]

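Problems:
1. The pattern requires a trailing newline, and the last line of the file does not end with one, so the last line was not matched.
2. The JSON comes out as a list of lists (JSON calls these arrays): [[]]. What I need is a list of dicts: a JSON object {} is the equivalent of a Python dict.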
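
The first problem can be reproduced without the file. A minimal sketch, assuming the last line really has no trailing newline (which would explain the missing fourth match); the `sample` string and the `\S+` pattern are illustrative assumptions, not part of the original session:

```python
import re

# Sample data copied from the file contents above; the last line has no
# trailing "\n" (an assumption that would explain the missing fourth match).
sample = (
    "appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176\n"
    "appid:c50000103_t50004 flow:207.359 mo:CUCC ip:192.168.1.119\n"
    "appid:c50000100_t50011 flow:5.72205e-06 mo:CUCC ip:192.168.1.19\n"
    "appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10"
)

# Pattern from the first attempt: every match must end with "\n",
# so a final line without one is skipped.
with_newline = re.findall(
    r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(.*?)\n", sample)

# Alternative: treat each field as a run of non-whitespace characters,
# so no line terminator is needed at all.
no_newline = re.findall(
    r"appid:(\S+)\s+flow:(\S+)\s+mo:(\S+)\s+ip:(\S+)", sample)

print(len(with_newline))  # 3
print(len(no_newline))    # 4
```

The `\S+` version does not care what the last field looks like, which makes it a bit more general than anchoring on one specific field format.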

Second regex attempt:

>>> b=re.compile(r"appid:(.*?)\s*flow:(.*?)\s*mo:(.*?)\s*ip:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})").findall(a)
>>> print b
[('c50000100_h50001', '0.00243473', 'CUCC', '192.168.1.176'), ('c50000103_t50004', '207.359', 'CUCC', '192.168.1.119'), ('c50000100_t50011', '5.72205e-06', 'CUCC', '192.168.1.19'), ('c50000100_h50000', '0.104045', 'CUCC', '192.168.1.10')]
#matching the trailing IP address directly, every line is now matched
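
One general detail about IP patterns: an unescaped `.` in a regex matches any character, so `\.` is the stricter way to require a literal dot. A small sketch of the difference; the names `loose` and `strict` and the test strings are made up for illustration, and neither pattern checks that each number is in the 0-255 range:

```python
import re

# Unescaped "." matches any character; "\." matches only a literal dot.
# The trailing "$" makes match() require the whole string to fit.
loose = re.compile(r"\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$")
strict = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")

good = "192.168.1.10"
bad = "192x168y1z10"  # not an IP address, made up for this example

print(strict.match(good) is not None)  # True
print(loose.match(bad) is not None)    # True
print(strict.match(bad) is not None)   # False
```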

>>> d={}
>>> l=[]
>>> for i in range(len(b)):
	d['appid']=b[i][0]
	d['flow']=b[i][1]
	d['mo']=b[i][2]
	d['ip']=b[i][3]
	l.append(d)
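	d={} #without this line every entry ends up holding the last record (the same dict gets appended each time)... took me ages, a bit embarrassing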

>>> print l
[{'ip': '192.168.1.176', 'mo': 'CUCC', 'flow': '0.00243473', 'appid': 'c50000100_h50001'}, {'ip': '192.168.1.119', 'mo': 'CUCC', 'flow': '207.359', 'appid': 'c50000103_t50004'}, {'ip': '192.168.1.19', 'mo': 'CUCC', 'flow': '5.72205e-06', 'appid': 'c50000100_t50011'}, {'ip': '192.168.1.10', 'mo': 'CUCC', 'flow': '0.104045', 'appid': 'c50000100_h50000'}]
#now we have a list of dicts
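
Why resetting `d` inside the loop matters: `l.append(d)` stores a reference to the dict, not a copy, so if `d` is never rebound, every list entry is the same object. A sketch of that effect, plus one possible way to build a fresh dict per tuple with `zip`; the names `rows`, `keys`, `shared`, `aliased` and `records` are assumptions for illustration:

```python
# Two of the tuples produced by findall above.
rows = [
    ("c50000100_h50001", "0.00243473", "CUCC", "192.168.1.176"),
    ("c50000103_t50004", "207.359", "CUCC", "192.168.1.119"),
]
keys = ("appid", "flow", "mo", "ip")

# Reusing one dict: both list entries refer to the same object,
# so both show whatever was written last.
shared = {}
aliased = []
for row in rows:
    shared["appid"] = row[0]
    aliased.append(shared)
print(aliased[0] is aliased[1])  # True
print(aliased[0]["appid"])       # c50000103_t50004

# Building a new dict per row avoids the problem.
records = [dict(zip(keys, row)) for row in rows]
print(records[0]["appid"])       # c50000100_h50001
```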
>>> import json
>>> j=json.dumps(l)
>>> print j
[{"ip": "192.168.1.176", "mo": "CUCC", "flow": "0.00243473", "appid": "c50000100_h50001"}, {"ip": "192.168.1.119", "mo": "CUCC", "flow": "207.359", "appid": "c50000103_t50004"}, {"ip": "192.168.1.19", "mo": "CUCC", "flow": "5.72205e-06", "appid": "c50000100_t50011"}, {"ip": "192.168.1.10", "mo": "CUCC", "flow": "0.104045", "appid": "c50000100_h50000"}]
#converting to JSON works fine too
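
In the output above every value, including `flow`, is a JSON string. If a numeric `flow` is wanted, one option is to convert it before dumping. A sketch under that assumption; `records` and `typed` are hypothetical names, and `sort_keys` and `indent` are standard `json.dumps` parameters used here only to make the output stable and readable:

```python
import json

records = [
    {"appid": "c50000100_h50001", "flow": "0.00243473", "mo": "CUCC", "ip": "192.168.1.176"},
    {"appid": "c50000100_t50011", "flow": "5.72205e-06", "mo": "CUCC", "ip": "192.168.1.19"},
]

# Copy each record, replacing the flow string with a float,
# so JSON stores a number instead of a string.
typed = [dict(r, flow=float(r["flow"])) for r in records]

text = json.dumps(typed, sort_keys=True, indent=2)
print(text)

# Round trip: json.loads gives back a list of dicts with float flow values.
back = json.loads(text)
print(back == typed)  # True
```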

Bonus: a rather wild way to write it:

>>> l=[]
>>> d={}
>>> f=open('222.txt','r')
>>> for s in f: #read each line from the one handle; no second open() needed
	s=s.replace('\n','')
	for i in range(4):
		d[(((s.split(' '))[i]).split(':'))[0]]=(((s.split(' '))[i]).split(':'))[1]
	l.append(d)
	d={}

>>> print l
[{'ip': '192.168.1.176', 'mo': 'CUCC', 'flow': '0.00243473', 'appid': 'c50000100_h50001'}, {'ip': '192.168.1.119', 'mo': 'CUCC', 'flow': '207.359', 'appid': 'c50000103_t50004'}, {'ip': '192.168.1.19', 'mo': 'CUCC', 'flow': '5.72205e-06', 'appid': 'c50000100_t50011'}, {'ip': '192.168.1.10', 'mo': 'CUCC', 'flow': '0.104045', 'appid': 'c50000100_h50000'}]
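
The same split-based idea can be written more compactly. A sketch, where `parse_line` is a hypothetical helper name and the input is assumed to be whitespace-separated `key:value` fields with no spaces inside a field:

```python
def parse_line(line):
    """Turn 'k1:v1 k2:v2 ...' into a dict.

    split(":", 1) splits on the first colon only, so a value that
    itself contains a colon stays intact.
    """
    return dict(field.split(":", 1) for field in line.split())

lines = [
    "appid:c50000100_h50001 flow:0.00243473 mo:CUCC ip:192.168.1.176\n",
    "appid:c50000100_h50000 flow:0.104045 mo:CUCC ip:192.168.1.10",
]

# Skip blank lines so a trailing empty line does not produce an empty dict.
parsed = [parse_line(line) for line in lines if line.strip()]
print(parsed[1]["ip"])  # 192.168.1.10
```

Because `line.split()` with no argument splits on any whitespace, the trailing newline is dropped without a separate `replace` call.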
posted @ 2015-08-27 02:01  清水牧鱼