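Parsing JSON Data from Douban with Python
Most APIs today return either XML or JSON, and JSON is the simpler of the two to parse.
Taking Douban's API as an example, here is the JSON returned for one book: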
https://api.douban.com/v2/book/1220562
{
"id":"1220562",
"alt":"http:\/\/book.douban.com\/book\/1220562",
"rating":{"max":10, "average":"7.0", "numRaters":282, "min":0},
"author":[{"name":"片山恭一"}, {"name":"豫人"}],
"alt_title":"",
"image":"http:\/\/img1.douban.com\/spic\/s1747553.jpg",
"title":"满月之夜白鲸现",
"mobile_link":"http:\/\/m.douban.com\/book\/subject\/1220562\/",
"summary":"那一年,是听莫扎特、钓鲈鱼和家庭破裂的一年。说到家庭破裂,母亲怪自己当初没有找到好男人,父亲则认为当时是被狐狸精迷住了眼,失常的是母亲,但出问题的是父亲……。",
"attrs":{
"publisher":["青岛出版社"],
"pubdate":["2005-01-01"],
"author":["片山恭一", "豫人"],
"price":["18.00元"],
"title":["满月之夜白鲸现"],
"binding":["平装(无盘)"],
"translator":["豫人"],
"pages":["180"]
},
"tags":[
{"count":106, "name":"片山恭一"},
{"count":50, "name":"日本"},
{"count":42, "name":"日本文学"},
{"count":30, "name":"满月之夜白鲸现"},
{"count":28, "name":"小说"},
{"count":10, "name":"爱情"},
{"count":7, "name":"純愛"},
{"count":6, "name":"外国文学"}
]
}
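As a rough sketch of how this response maps onto Python types, the snippet below parses a trimmed, hand-copied subset of the JSON above with the standard json module, so no network call is needed. The variable names sample and book are only illustrative.

```python
import json

# Trimmed copy of the response above; a raw string keeps the "\/" escapes intact.
sample = r'''
{
  "id": "1220562",
  "alt": "http:\/\/book.douban.com\/book\/1220562",
  "rating": {"max": 10, "average": "7.0", "numRaters": 282, "min": 0},
  "tags": [{"count": 106, "name": "片山恭一"}, {"count": 50, "name": "日本"}]
}
'''

book = json.loads(sample)

# JSON objects become dicts, arrays become lists, bare numbers become int/float.
print(type(book).__name__)                   # dict
print(type(book["tags"]).__name__)           # list
print(type(book["rating"]["max"]).__name__)  # int
print(type(book["id"]).__name__)             # str, because the value is quoted in the JSON
print(book["alt"])                           # the escaped "\/" is decoded to a plain "/"
```

Note that "id" and "average" are quoted in the response, so they arrive as strings and need an explicit int() or float() if a number is wanted.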
Now use Python to pull out the values we want, for example id, the max value inside rating, and the name of the first entry in tags:
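import json
from urllib.request import urlopen  # Python 3; in Python 2 this module was urllib2

# Fetch the book record and decode the JSON body into a dict
response = urlopen('https://api.douban.com/v2/book/1220562')
hjson = json.loads(response.read().decode('utf-8'))

print(hjson['id'])               # top-level field
print(hjson['rating']['max'])    # field inside a nested object
print(hjson['tags'][0]['name'])  # field of the first element in an array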
Result: for the response shown above, the script prints 1220562, 10, and 片山恭一, one per line.
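Beyond single lookups, the nested arrays can be walked with ordinary loops or comprehensions. The following is a minimal sketch against a trimmed copy of the same response. The "subtitle" key is a hypothetical field used only to show a safe lookup for a key that may be absent; it does not appear in the sample.

```python
import json

# Trimmed, hand-copied subset of the response shown earlier.
sample = '''
{
  "rating": {"max": 10, "average": "7.0", "numRaters": 282, "min": 0},
  "author": [{"name": "片山恭一"}, {"name": "豫人"}],
  "tags": [
    {"count": 106, "name": "片山恭一"},
    {"count": 50, "name": "日本"},
    {"count": 42, "name": "日本文学"}
  ]
}
'''

book = json.loads(sample)

# Collect every author name from the list of {"name": ...} objects.
author_names = [a["name"] for a in book["author"]]

# Build a tag -> count mapping from the tags array.
tag_counts = {t["name"]: t["count"] for t in book["tags"]}

# "average" is a quoted string in the JSON, so convert it before doing arithmetic.
average = float(book["rating"]["average"])

# dict.get returns a default instead of raising KeyError when a key is missing.
# "subtitle" is a hypothetical key that is not present in this sample.
subtitle = book.get("subtitle", "")

print(author_names)
print(tag_counts)
print(average)
print(repr(subtitle))
```

Using get with a default is one way to keep the script from failing when a record lacks a field; indexing with square brackets is fine when the field is known to be present.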