18. Study notes on cookies and sessions
This post records some small exercises and key points from learning about cookies and sessions.
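Key point 1: The origin of cookies and sessions
HTTP is a stateless protocol: once a request and its response are complete, the exchange is over, and the protocol itself keeps no record linking the next request to the previous one.
This creates a problem: on its own, the server can neither recognize a returning user nor remember anything about them. Cookies and sessions were introduced to solve this.
Cookies do more than enable automatic login (a cookie carries the session's ID). A site can also use cookies to record your browsing history and infer your preferences; combined with a recommendation algorithm, it can then push customized content to you.
Of course, a set of cookies is not valid forever. Cookies have an expiry time; once they expire, you simply obtain a fresh set.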
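To make the idea of "a cookie carries the session ID and has an expiry time" concrete, here is a minimal sketch using only the standard library module http.cookies. The cookie name sessionid, the value abc123, and the one-hour lifetime are assumptions chosen for illustration; real sites pick their own names and lifetimes.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent after a successful login.
outgoing = SimpleCookie()
outgoing["sessionid"] = "abc123"         # hypothetical session ID
outgoing["sessionid"]["path"] = "/"
outgoing["sessionid"]["max-age"] = 3600  # assumed lifetime: one hour

set_cookie_header = outgoing["sessionid"].OutputString()
print(set_cookie_header)

# Client side: on later requests the browser sends back only name=value.
incoming = SimpleCookie()
incoming.load("sessionid=abc123")
session_id = incoming["sessionid"].value
print(session_id)
```

The server looks up its stored session using the ID it reads back from the cookie, which is how it can tell one visitor from another even though HTTP itself is stateless.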
Key point 2: How to use cookies
Getting cookies
res = requests.post(url,headers=headers,data=data)
cookies = res.cookies
Using cookies
res = requests.post(url,headers=headers,data=data,cookies=cookies)
Note: data is passed as a dictionary.
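The following sketch shows what the cookies= and data= arguments turn into on the wire, without sending anything over the network. It builds the request with requests.Request and inspects the prepared result. The URL and the cookie and form values are placeholders, not taken from the exercise above.

```python
import requests

url = "https://example.com/wp-comments-post.php"  # placeholder URL
cookies = {"sessionid": "abc123"}                 # placeholder cookie
data = {"comment": "hello"}                       # form data as a dict

# prepare() builds the exact request that would be sent, but does not send it.
prepared = requests.Request("POST", url, data=data, cookies=cookies).prepare()

cookie_header = prepared.headers["Cookie"]
body = prepared.body
print(cookie_header)
print(body)
```

The dictionary passed as cookies becomes a Cookie request header, and the dictionary passed as data is form-encoded into the request body.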
Key point 3: Using a session, and the role cookies play along the way
session = requests.session()
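requests.session() creates a session object, which amounts to opening one specific conversation with the site. The session keeps cookies for us automatically, but at this point the cookie jar is still empty.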
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[]>
login = session.post(login_url,data=login_data,headers=headers)
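Send the login request with post inside this session. The request goes out with an empty cookie jar plus the username and password. Once the credentials are verified, the server sends back a set of cookies, and later requests can use those cookies to access the site without entering the username and password again.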
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ce293f09056a29f312ea6de87972ceac5d163d5ca00f12fc53cc2535294a7f7ae for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
comment = session.post(comment_url,data=comment_data,headers=headers)
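Send the comment request with post inside the same session. This request needs no username or password, because the session already holds the cookies the server returned on the previous request.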
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ce293f09056a29f312ea6de87972ceac5d163d5ca00f12fc53cc2535294a7f7ae for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
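The same behaviour can be reproduced without the WordPress site. The sketch below starts a toy HTTP server on 127.0.0.1 that sets a cookie on /login and only accepts /comment when that cookie comes back. The handler, the paths, and the cookie value are all made up for this demo; only the requests calls mirror the walkthrough above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server: /login sets a cookie, any other path requires it."""

    def do_POST(self):
        # Read and discard the request body.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/login":
            self.send_response(200)
            self.send_header("Set-Cookie", "sessionid=abc123; Path=/")
            body = b"logged in"
        elif "sessionid=abc123" in self.headers.get("Cookie", ""):
            self.send_response(200)
            body = b"comment accepted"
        else:
            self.send_response(403)
            body = b"please log in"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

session = requests.session()
session.trust_env = False  # ignore proxy settings for this local demo

before = len(session.cookies)  # the jar starts out empty
session.post(base + "/login", data={"log": "demo", "pwd": "demo"})
after = requests.utils.dict_from_cookiejar(session.cookies)
comment = session.post(base + "/comment", data={"comment": "hi"})

# A fresh session never logged in, so it has no cookie to send.
stranger = requests.session()
stranger.trust_env = False
anonymous = stranger.post(base + "/comment", data={"comment": "hi"})

server.shutdown()
print(before, after, comment.status_code, anonymous.status_code)
```

The session that logged in is accepted on the second request because it sends the stored cookie back automatically, while the fresh session is rejected.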
Key point 4: Saving cookies
session = requests.session()
login = session.post(login_url,data=login_data,headers=headers)
Get the cookies
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
Convert the cookies to a dictionary
print(type(cookies_dict))
-- <class 'dict'>
print(cookies_dict)
-- {'328dab9653f517ceea1f6dfce2255032': '2584219941bfcd0f4a161828d7340553', 'wordpress_logged_in_9927dadafec8b913479e6af0fba5e181': 'spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433', 'wordpress_test_cookie': 'WP+Cookie+check', 'wordpress_sec_9927dadafec8b913479e6af0fba5e181': 'spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e'}
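One detail worth noticing in the output above: the cookie jar lists five cookies, but the dictionary has only four keys, because two cookies share the name wordpress_sec_... and differ only in their path. The sketch below reproduces this with made-up names, values, and domain; it suggests that the dictionary form keeps names and values but not the domain or path.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
# Two cookies share a name but apply to different paths (values are made up).
jar.set("wordpress_sec_demo", "token-a", domain="example.com", path="/wp-admin")
jar.set("wordpress_sec_demo", "token-b", domain="example.com", path="/wp-content/plugins")
jar.set("wordpress_test_cookie", "WP+Cookie+check", domain="example.com", path="/")

as_dict = requests.utils.dict_from_cookiejar(jar)

print(len(jar))      # number of cookies in the jar
print(len(as_dict))  # number of distinct names left in the dict
```

For this exercise the loss does not matter, since both same-named cookies carry the same value, but it is something to keep in mind when a site sets different values per path.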
cookies_str = json.dumps(cookies_dict)
Convert the dictionary to a string
print(type(cookies_str))
-- <class 'str'>
print(cookies_str)
-- {"328dab9653f517ceea1f6dfce2255032": "2584219941bfcd0f4a161828d7340553", "wordpress_logged_in_9927dadafec8b913479e6af0fba5e181": "spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433", "wordpress_test_cookie": "WP+Cookie+check", "wordpress_sec_9927dadafec8b913479e6af0fba5e181": "spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e"}
with open('cookies.str','w',encoding='utf-8') as strfile:
    strfile.write(cookies_str)
Write the string to a file
The process of getting and saving cookies is as follows:
session = requests.session()

login = session.post(login_url,data=login_data,headers=headers)

cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)

cookies_str = json.dumps(cookies_dict)

with open('cookies.str','w',encoding='utf-8') as strfile:
    strfile.write(cookies_str)


'''cookies_str_read = open('cookies.str','r')

cookies_dict_read = json.loads(cookies_str_read.read())

cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)

session.cookies = cookies_read

comment = session.post(comment_url,data=comment_data,headers=headers)

print(comment)'''
Key point 5: Reading cookies
Reading cookies is exactly the reverse of saving them.
The process of reading and using cookies is as follows:
session = requests.session()


'''login = session.post(login_url,data=login_data,headers=headers)

cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)

cookies_str = json.dumps(cookies_dict)

with open('cookies.str','w',encoding='utf-8') as strfile:
    strfile.write(cookies_str)
'''


# Read the saved JSON string back and close the file when done.
with open('cookies.str','r',encoding='utf-8') as strfile:
    cookies_dict_read = json.loads(strfile.read())

cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)

session.cookies = cookies_read

comment = session.post(comment_url,data=comment_data,headers=headers)

print(comment)
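The save and read steps can be checked end to end without logging in anywhere. This sketch starts from a hand-made cookie jar (the names and values are placeholders), writes it to a temporary file as JSON, reads it back, and loads it into a new session.

```python
import json
import os
import tempfile

import requests

# Stand-in for the cookies a real login would have produced.
original = requests.utils.cookiejar_from_dict(
    {"sessionid": "abc123", "theme": "dark"}
)

path = os.path.join(tempfile.mkdtemp(), "cookies.json")

# Save: cookie jar -> dict -> JSON file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(requests.utils.dict_from_cookiejar(original), f)

# Read: JSON file -> dict -> cookie jar attached to a new session.
with open(path, "r", encoding="utf-8") as f:
    restored_dict = json.load(f)

session = requests.session()
session.cookies = requests.utils.cookiejar_from_dict(restored_dict)

restored = requests.utils.dict_from_cookiejar(session.cookies)
print(restored)
```

json.dump and json.load are used here in place of json.dumps plus write and read plus json.loads; the result is the same, with one fewer intermediate variable.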
The complete code is as follows:
import requests
import json

headers = {
    'Connection': 'keep-alive',
    'Pragma': 'no-cache',
    'Cache-Control': 'no-cache',
    'Origin': 'https://wordpress-edu-3autumn.localprod.forc.work',
    'Upgrade-Insecure-Requests': '1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9'
}

login_url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-login.php'
login_data = {
    'log': 'spiderman',
    'pwd': 'crawler334566',
    'wp-submit': '登录',
    'redirect_to': 'https://wordpress-edu-3autumn.localprod.forc.work',
    'testcookie': '1'
}

comment_url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-comments-post.php'
comment_data = {
    'comment': '最新的评论内容',
    'submit': '发表评论',
    'comment_post_ID': '15',
    'comment_parent': '0'
}

session = requests.session()

# Log in; the session stores the cookies returned by the server.
login = session.post(login_url,data=login_data,headers=headers)

# Save: cookie jar -> dict -> JSON string -> file.
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
cookies_str = json.dumps(cookies_dict)
with open('cookies.str','w',encoding='utf-8') as strfile:
    strfile.write(cookies_str)

# Read: file -> JSON string -> dict -> cookie jar.
with open('cookies.str','r',encoding='utf-8') as strfile:
    cookies_dict_read = json.loads(strfile.read())
cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)
session.cookies = cookies_read

# Post the comment using the restored cookies.
comment = session.post(comment_url,data=comment_data,headers=headers)
print(comment)
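Below is the teacher's complete code. The main pieces of functionality are wrapped in functions, which makes them very convenient to call.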
import requests, json
session = requests.session()
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'}

def cookies_read():
    with open('cookies.txt', 'r') as cookies_txt:
        cookies_dict = json.loads(cookies_txt.read())
    cookies = requests.utils.cookiejar_from_dict(cookies_dict)
    return cookies
# The function above reads the cookies.

def sign_in():
    url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-login.php'
    data = {'log': input('Please enter your username: '),
            'pwd': input('Please enter your password: '),
            'wp-submit': '登录',
            'redirect_to': 'https://wordpress-edu-3autumn.localprod.forc.work/wp-admin/',
            'testcookie': '1'}
    session.post(url, headers=headers, data=data)
    cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
    cookies_str = json.dumps(cookies_dict)
    with open('cookies.txt', 'w') as f:
        f.write(cookies_str)
# The last lines of the function above store the cookies.

def write_message():
    url_2 = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-comments-post.php'
    data_2 = {
        'comment': input('Please enter the comment you want to post: '),
        'submit': '发表评论',
        'comment_post_ID': '13',
        'comment_parent': '0'
    }
    return session.post(url_2, headers=headers, data=data_2)
# The function above posts a comment.

# Try the saved cookies first; log in only if no cookie file exists yet.
try:
    session.cookies = cookies_read()
except FileNotFoundError:
    sign_in()
    session.cookies = cookies_read()

num = write_message()
if num.status_code == 200:
    print('Success!')
else:
    # The saved cookies were rejected (for example, expired): log in again and retry.
    sign_in()
    session.cookies = cookies_read()
    num = write_message()
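The "use saved cookies if they exist, otherwise log in and save" pattern in the teacher's code can be isolated and tried out without any network access. In the sketch below, load_cookie_dict, get_cookie_dict, and fake_login are hypothetical helpers written for this note; fake_login stands in for a real login request. Unlike the teacher's version, it also treats a corrupt or empty file like a missing one, which is an added assumption.

```python
import json
import os
import tempfile


def load_cookie_dict(path):
    """Return the saved cookie dict, or None if the file is missing or unusable."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            loaded = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    return loaded if isinstance(loaded, dict) and loaded else None


def get_cookie_dict(path, login):
    """Use cached cookies when available; otherwise call login() and cache the result."""
    cached = load_cookie_dict(path)
    if cached is not None:
        return cached, "cache"
    fresh = login()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(fresh, f)
    return fresh, "login"


def fake_login():
    # Stand-in for session.post(login_url, ...) followed by dict_from_cookiejar.
    return {"sessionid": "abc123"}


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
first = get_cookie_dict(path, fake_login)   # no file yet, so it "logs in"
second = get_cookie_dict(path, fake_login)  # file exists now, so it is reused
print(first, second)
```

The returned dictionary would then be passed to requests.utils.cookiejar_from_dict and assigned to session.cookies, as in the code above.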