JSON and APIs: a weather API and blog word-frequency analysis

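1. JSON basics

1.1 Introduction to JSON

JSON has become a common format for exchanging data between programs and languages. At its core it is plain text: a string.

JSON has two forms:

    1. Dict-like: {k: v, k: v}

    2. List-like: []

Python's module for working with JSON: json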
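
As a minimal sketch of the two forms, the snippet below parses one JSON object and one JSON array with json.loads. The sample values are made up for illustration.

```python
import json

# A JSON object (dict-like) and a JSON array (list-like), both as plain text.
obj_text = '{"name": "louhui", "city": "hangzhou"}'
arr_text = '["power", "money", 3.5, true, null]'

obj = json.loads(obj_text)  # JSON object -> Python dict
arr = json.loads(arr_text)  # JSON array  -> Python list

print(type(obj).__name__, obj)
print(type(arr).__name__, arr)
```

Note that JSON's true and null come back as Python's True and None.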

1.2 JSON methods

The json module's methods closely mirror those of pickle. There are two kinds: file-level conversion and in-memory conversion.
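
To make the comparison with pickle concrete, here is a small sketch that runs the same dict through both modules. The function names match, but json produces text while pickle produces bytes. The sample dict is illustrative.

```python
import json
import pickle

d = {'city': 'hangzhou', 'hobby': ['power', 'money']}

json_text = json.dumps(d)        # human-readable str
pickle_bytes = pickle.dumps(d)   # Python-specific binary bytes

# Both round-trip back to an equal dict.
from_json = json.loads(json_text)
from_pickle = pickle.loads(pickle_bytes)

print(type(json_text).__name__, type(pickle_bytes).__name__)
```

Because JSON is plain text, other languages can read it too; pickle output is only meant for Python.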

1.2.1 File-level conversion

load: JSON text in a file -> dict

dump: dict -> JSON text written to a file

import json

# Write a dict to a file in JSON format
d = {
    'name': '娄辉',
    'city': 'hangzhou',
    'hobby': ['power', 'money', 'girl']
}
with open('1.json', 'w') as f:
    json.dump(d, f)

# Read it back from the JSON file
with open('1.json', 'r') as f:
    s = json.load(f)

print(type(s))
print(s)

1.2.2 In-memory conversion

import json

# dict to JSON
d = {
    'name': '娄辉',
    'city': 'hangzhou',
    'hobby': ['power', 'money', 'girl']
}

x = json.dumps(d)
print(x)
print(type(x))

# JSON to dict
# The JSON content is text, a string
js = '{"name": "娄辉", "city": "hangzhou", "hobby": ["power", "money", "girl"]}'
a = json.loads(js)
print(a, type(a))
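
One thing worth knowing about the dict -> JSON -> dict round trip is that it does not always give back exactly what went in. The sketch below, using made-up data, shows that non-string keys become strings and tuples become lists.

```python
import json

original = {1: ('a', 'b'), 'ok': True, 'nothing': None}

text = json.dumps(original)   # int key -> "1", tuple -> array
restored = json.loads(text)   # array -> list, key stays a string

print(text)
print(restored)
print(restored == original)
```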

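1.3 Pretty-printing JSON

Output can be pretty-printed when converting to JSON; both dump and dumps support this through the indent parameter.

Note: separators is a pair (item separator, key separator). If the key separator is anything other than a colon, as with the ';' in the call below, the output is no longer valid JSON and converting it back to a dict raises an error. In practice, keep the default separators: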

x = json.dumps(d, indent=4, separators=(',', ';'))
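
The following sketch contrasts three cases: pretty-printing with the default separators, a compact form that is still valid JSON, and the ';' key separator, which json.loads rejects. The sample dict is illustrative.

```python
import json

d = {'name': 'louhui', 'hobby': ['power', 'money']}

# Default separators with indentation: valid JSON, easy to read.
pretty = json.dumps(d, indent=4)
print(pretty)

# Compact separators are fine too, as long as the key separator is ':'.
compact = json.dumps(d, separators=(',', ':'))
print(compact)

# A ';' key separator produces text that is not JSON any more.
broken = json.dumps(d, indent=4, separators=(',', ';'))
try:
    json.loads(broken)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    print('not valid JSON:', exc)
```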

2. API

An application programming interface (API) is the agreed contract through which the different parts of a software system connect.

JSON is the format in which many APIs exchange information.
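
As a rough sketch of that exchange, the snippet below has one function play the server and another play the client, with a JSON string as the only thing passed between them. Both functions and the response fields are hypothetical; a real API would send the text over HTTP.

```python
import json


def handle_request(city):
    """Hypothetical server side: build a response dict and serialize it to JSON text."""
    response = {'status': 0, 'msg': 'ok', 'result': {'city': city, 'temp': '21'}}
    return json.dumps(response)


def call_api(city):
    """Hypothetical client side: receive JSON text and parse it back into a dict."""
    body = handle_request(city)  # stands in for an HTTP round trip
    return json.loads(body)


reply = call_api('hangzhou')
print(reply['status'], reply['result']['city'])
```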

3. Examples

3.1 Weather example

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# weather.py
# Weather lookup
# author: louhui

import requests
import json


def query_weather(city_name):
    'Fetch data from the Alibaba Cloud API marketplace and return the parsed JSON'
    url = 'http://jisutqybmf.market.alicloudapi.com/weather/query'
    city = {'city': city_name}  # GET query parameters
    headers = {'Authorization': 'APPCODE 4e593528152b461fb7f6c78ce0a41878'}  # request headers carrying the Alibaba Cloud authentication
    r = requests.get(url=url, headers=headers, params=city)
    print(r.status_code)
    if r.status_code == 200:
        return r.json()
    else:
        print('Request failed, status code:', r.status_code)


def save(data: dict):
    'Save the weather data: dict -> JSON text'
    with open('wea.json', 'w', encoding='utf-8') as f:  # set the encoding explicitly
        json.dump(data, f, ensure_ascii=False)  # with ensure_ascii=True, Chinese is written as \uXXXX escapes


def read():
    'Read the data back from the JSON file: JSON text -> dict'
    with open('wea.json', 'r', encoding='utf-8') as f:
        return json.load(f)


def main():
    # 1--- To avoid calling the API over and over, save the response to a file first
    # city = '杭州'
    # data = query_weather(city)
    # save(data)

    # 2--- Read straight from the JSON file and work with the data
    data = read()
    weather_list = data['result']['daily']
    for day in weather_list:
        print(day['date'], day['week'], day['night']['weather'])


if __name__ == '__main__':
    main()
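
Since query_weather returns None when the status code is not 200, and a response may lack some fields, it can help to read the data defensively. The helper below is a hypothetical sketch, not part of the script above; summarize_daily and the sample data are assumptions, shaped only after the fields that main() reads.

```python
def summarize_daily(data):
    """Return (date, week, night weather) tuples, skipping incomplete entries."""
    if not data:
        return []
    daily = (data.get('result') or {}).get('daily') or []
    rows = []
    for day in daily:
        night = day.get('night') or {}
        if 'date' in day and 'weather' in night:
            rows.append((day['date'], day.get('week', ''), night['weather']))
    return rows


# Hypothetical sample; the second entry has no night forecast.
sample = {
    'result': {'daily': [
        {'date': '2018-05-21', 'week': 'Monday', 'night': {'weather': 'cloudy'}},
        {'date': '2018-05-22', 'week': 'Tuesday'},
    ]}
}

print(summarize_daily(sample))
print(summarize_daily(None))
```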

3.2 Blog word-frequency analysis

from bs4 import BeautifulSoup  # parses the HTML
import requests
import jieba  # splits a string into words
import collections  # counts occurrences of items in a list


class BlogAnaly:
    def __init__(self, blog_url):
        # Download the blog page to analyze
        self.r = requests.get(blog_url)

    def trans_data(self):
        contents = BeautifulSoup(self.r.text, 'html.parser')  # build a bs4 object; the second argument names the parser

        all_p = contents.find_all('p')  # all <p> tags (the paragraphs), returned as a list

        all_text = ''
        for i in all_p:
            all_text += str(i.text)  # str() makes sure we concatenate strings

        # jieba.cut splits the text into words and returns a generator
        text_list = list(jieba.cut(all_text))

        return text_list
    
    def get_most_common(self, max_num=30):  # the data comes from trans_data() above
        ret = {'status': 0, 'statusText': 'ok', 'data': {}}
        try:
            # Use the counting support in collections
            counter = collections.Counter(self.trans_data())
            for key, v in counter.most_common(max_num):
                ret['data'][key] = v
        except Exception as e:
            ret['status'] = 1
            ret['statusText'] = str(e)  # keep the message as text so ret stays JSON-serializable
        return ret

def main():
    x = BlogAnaly('https://www.cnblogs.com/flame7/p/9110579.html')
    dic = x.get_most_common()  # this acts as the interface
    print(dic.get('status'))
    if dic.get('status') == 0:
        print(dic.get('data'))


if __name__ == '__main__':
    main()
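
The counting step can be tried without downloading a page or installing jieba. The sketch below applies collections.Counter to a hand-written token list and returns the same status/data shape as get_most_common. The function name most_common_words and the min_len filter, which drops punctuation and whitespace tokens, are assumptions added for illustration.

```python
import collections
import json


def most_common_words(words, max_num=3, min_len=2):
    """Count tokens and return the top max_num in a JSON-friendly dict."""
    ret = {'status': 0, 'statusText': 'ok', 'data': {}}
    try:
        # Drop tokens shorter than min_len (punctuation, stray spaces).
        kept = [w for w in words if len(w.strip()) >= min_len]
        for word, count in collections.Counter(kept).most_common(max_num):
            ret['data'][word] = count
    except Exception as e:
        ret['status'] = 1
        ret['statusText'] = str(e)
    return ret


tokens = ['json', 'api', ',', 'json', 'python', 'api', 'json', ' ']
result = most_common_words(tokens)
print(json.dumps(result))
```

Because every value in the result is a string, int, or dict, it can be passed to json.dumps directly, which is what lets a function like this serve as an API response.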

posted @ 2018-05-21 10:26 娄先生
