猫有九命

June 5, 2019

spider_Using parse and urlencode to scrape Douban Movies (building the URL for a GET request)

Summary:
    """Scrape Douban Movies with the urllib library; the list is loaded via AJAX (asynchronous refresh)"""
    from urllib import request, parse
    import chardet
    import json
    # Define the Douban URL
    url = "https://movie.douban.com/j/chart/top_list?"
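The excerpt stops right after the base URL is defined, so the following is only a minimal sketch of how the query string might be appended with `parse.urlencode`. The parameter names and values (`type`, `interval_id`, `action`, `start`, `limit`) and the helper names `build_top_list_url` and `fetch_top_list` are assumptions for illustration, not taken from the post.

```python
import json
from urllib import parse, request

BASE_URL = "https://movie.douban.com/j/chart/top_list?"


def build_top_list_url(start=0, limit=20, movie_type=24):
    """Append URL-encoded GET parameters to the base URL."""
    # Assumed parameter set; the real endpoint may expect different ones.
    params = {
        "type": movie_type,
        "interval_id": "100:90",
        "action": "",
        "start": start,
        "limit": limit,
    }
    return BASE_URL + parse.urlencode(params)


def fetch_top_list(start=0, limit=20):
    """Request one page of results and decode the JSON body."""
    req = request.Request(
        build_top_list_url(start, limit),
        headers={"User-Agent": "Mozilla/5.0"},
    )
    with request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Paging would then be a matter of calling `fetch_top_list` with increasing `start` values; whether the site answers without further headers is not something this sketch guarantees.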

posted @ 2019-06-05 10:27 猫有九命 Views (346) Comments (0) Recommendations (0)

spider_Using the urlencode function to build GET request parameters

Summary:
    """Use the urlencode function to build GET request parameters"""
    from urllib import parse, request
    # https://www.baidu.com/s?wd=端午节
    url = "http://www.baidu.com/s?"
    paramDic = {"wd": "端午节"}
    # ...
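As a rough illustration of the step the excerpt is about to take, `parse.urlencode` percent-encodes the non-ASCII value as UTF-8 so that it can be appended to the URL. The helper name `build_search_url` is hypothetical.

```python
from urllib import parse

BASE_URL = "http://www.baidu.com/s?"


def build_search_url(keyword):
    """Return the search URL with the keyword percent-encoded as UTF-8."""
    param_dic = {"wd": keyword}
    return BASE_URL + parse.urlencode(param_dic)


url = build_search_url("端午节")
print(url)  # http://www.baidu.com/s?wd=%E7%AB%AF%E5%8D%88%E8%8A%82
```

Decoding the query again with `parse.parse_qs` gives back the original keyword, which is a quick way to check that nothing was mangled.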

posted @ 2019-06-05 10:20 猫有九命 Views (350) Comments (0) Recommendations (0)

June 4, 2019

spider_Using User-Agent

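Summary:
    """1. Use the first anti-anti-scraping measure: User-Agent (disguise the client as a browser)"""
    import chardet
    import requests
    from urllib import request
    # Using etree: on Python 3.5 and above, etree is not supported,
    from lxml import html
    # 1. Get ...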
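The idea named in the docstring can be sketched with the standard library alone. This is not the post's full code (the excerpt also imports `chardet`, `requests`, and `lxml`); the User-Agent string and the helper names `build_request` and `fetch` are placeholders chosen for illustration.

```python
from urllib import request

# Placeholder browser identity; any current browser string would do.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url):
    """Create a request that identifies itself as a browser."""
    return request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Download the page and decode it with the charset the server declares."""
    with request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Without the header, urllib announces itself as `Python-urllib/<version>`, which some sites reject outright.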

posted @ 2019-06-04 09:19 猫有九命 Views (520) Comments (0) Recommendations (0)

spider_Simulating login with a cookie

Summary:
    """Simulate a logged-in session with a cookie (bypassing the login requirement)"""
    from urllib import request
    import chardet
    def baiDu():
        url = "https://www.baidu.com/"
        headers = {"User-Agent": "Mozilla/5.0 (Windows ...
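One common way to do this, shown here as a sketch rather than as the post's actual code, is to copy the `Cookie` header from a logged-in browser session and send it with every request. The cookie value below is a made-up placeholder and `build_logged_in_request` is a hypothetical helper name.

```python
from urllib import request

# Placeholder only: a real value would be copied from the browser's
# developer tools after logging in.
COOKIE = "SESSIONID=replace-with-your-own-cookie"


def build_logged_in_request(url, cookie=COOKIE):
    """Attach a browser User-Agent and a session cookie to the request."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Cookie": cookie,
    }
    return request.Request(url, headers=headers)


def fetch_logged_in(url, cookie=COOKIE, timeout=10):
    """Fetch a page as the user the cookie belongs to."""
    with request.urlopen(build_logged_in_request(url, cookie), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A copied cookie stops working once the session expires, so a longer-running scraper would probably need to log in itself, for example with `http.cookiejar.CookieJar` and `request.HTTPCookieProcessor`.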

posted @ 2019-06-04 09:17 猫有九命 Views (434) Comments (0) Recommendations (0)

spider_Using an IP proxy

Summary:
    """Access a website through an IP proxy (a way to avoid having your IP banned)"""
    from urllib import request
    import chardet
    class BaiDu(object):
        def baidu(self):
            url = "https://www.baidu.com/"
            headers = { " ...
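The excerpt ends before any proxy is configured, so what follows is only a guess at the usual urllib pattern: a `ProxyHandler` wrapped in an opener. The proxy address in the usage note is a placeholder, and the function names are hypothetical (the post itself uses a `BaiDu` class).

```python
from urllib import request


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS traffic through proxy_url."""
    proxy_handler = request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return request.build_opener(proxy_handler)


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch a page through the given proxy and return the raw bytes."""
    opener = build_proxy_opener(proxy_url)
    req = request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener.open(req, timeout=timeout) as resp:
        return resp.read()
```

A call might look like `fetch_via_proxy("https://www.baidu.com/", "http://127.0.0.1:8888")`, where the address stands in for a proxy you actually control or rent. Rotating over a list of proxies follows the same pattern with a different `proxy_url` per request.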

posted @ 2019-06-04 09:15 猫有九命 Views (458) Comments (0) Recommendations (0)
