Python Web Scraping (Part 2)

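Log in to Renren (renren.com) in a browser first, then reuse the cookie from that logged-in session to request pages that require login.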

Source code:

from get_and_post import get

url = 'http://www.renren.com/home'

# Reuse the cookie captured from the browser after logging in
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    # Left disabled, presumably so that the response body is not gzip-compressed
    # "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Connection": "keep-alive",
    "Cookie": "anonymid=jktdyohaezr4g8; depovince=GW; _r01_=1; JSESSIONID=abcmUVX3G0Fk5ek_Mv3uw; ick_login=51a5ee04-4c76-4f74-9eb3-e2093120235d; __utma=151146938.10724825.1534232198.1534232198.1534232198.1; __utmc=151146938; __utmz=151146938.1534232198.1.1.utmcsr=renren.com|utmccn=(referral)|utmcmd=referral|utmcct=/967453635/profile; ick=84910b4e-b43f-42db-8b6b-8ec43fed4620; first_login_flag=1; ln_uact=17600402735; ln_hurl=http://head.xiaonei.com/photos/0/0/men_main.gif; loginfrom=syshome; jebecookies=ce910b85-5b7f-47b6-8709-d006a2651dc3|||||; _de=72F17782D2D53D93EEBC8C94373D3782; p=38a1027149fcb649437c1596c73bc3c35; t=eb1c81d339267e12b533ba1fa405e2ec5; societyguester=eb1c81d339267e12b533ba1fa405e2ec5; id=967453635; xnsid=c88e7c65; wp_fold=0",
    "Host": "www.renren.com",
    "Referer": "http://zhibo.renren.com/top",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
}

# get() returns the raw response body as bytes
html_bytes = get(url, headers=headers)

# Save the page in binary mode so no decoding is needed
with open('renrenwang.html', 'wb') as f:
    f.write(html_bytes)

Note: the get_and_post module imported in the source code is described at https://www.cnblogs.com/zhxd-python/p/9471500.html
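For readers who do not have get_and_post at hand, the same idea can be sketched with only the standard library. This is a minimal sketch, not the implementation from the linked post: the helper names build_request and fetch, the shortened User-Agent, and the timeout value are all assumptions made for illustration. It only shows that sending a logged-in Cookie header with a GET request is what makes the server treat the request as authenticated.

```python
import urllib.request


def build_request(url, cookie, user_agent="Mozilla/5.0"):
    """Build a GET request that carries a logged-in session cookie.

    build_request is a hypothetical helper, not part of get_and_post.
    `cookie` is the raw Cookie header string copied from the browser.
    """
    headers = {
        "Cookie": cookie,
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def fetch(url, cookie, timeout=10):
    """Send the request and return the response body as bytes (hypothetical helper)."""
    req = build_request(url, cookie)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Usage (performs a real network request, so it is left commented out):
# html_bytes = fetch("http://www.renren.com/home", "<cookie string from the browser>")
# with open("renrenwang.html", "wb") as f:
#     f.write(html_bytes)
```

The cookie is tied to one login session, so once it expires the server will most likely return the login page instead, and a fresh cookie has to be copied from the browser.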

posted @ 2018-08-14 22:10  _积木城池