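Summary: Content for July 4: scraping Wandoujia.
Scraping Wandoujia:
1. Visit the games home page: https://www.wandoujia.com/category/6001
2. Click "View more" and watch the requests in the Network panel.
- Request URL:
  page1: https://www.wandoujia.com/wdjweb/a...
posted @ 2019-07-04 09:31 Auraro997 Views (166) Comments (0) Recommendations (0)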
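The pagination seen in the Network panel can be sketched as a small URL builder. This is a minimal sketch, not the post's own code: the endpoint path and query parameters come from the full request URL quoted in the neighbouring post on this page, the `ctoken` value is session-specific and may have expired, and `build_page_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint copied from the request URL observed in the Network panel.
API_URL = "https://www.wandoujia.com/wdjweb/api/category/more"


def build_page_url(page, ctoken, cat_id=6001, sub_cat_id=0):
    """Return the 'View more' request URL for one page (hypothetical helper)."""
    params = {
        "catId": cat_id,
        "subCatId": sub_cat_id,
        "page": page,
        "ctoken": ctoken,
    }
    # urlencode keeps the dict order, so the URL matches the observed one.
    return f"{API_URL}?{urlencode(params)}"


# The token below is the one quoted on this page; treat it as a placeholder.
for page_number in range(1, 4):
    print(build_page_url(page_number, "vbw9lj1sRQsRddx0hD-XqCNF"))
```

Only the `page` value changes between requests, which is why a loop over page numbers is enough to walk the listing.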
Summary:
'''
Scrape Wandoujia app data
- Request URL:
    page1: https://www.wandoujia.com/wdjweb/api/category/more?catId=6001&subCatId=0&page=2&ctoken=vbw9lj1sRQsRddx0hD-XqCNF
'''
# 1. Send the request
imp...
posted @ 2019-07-04 09:29 Auraro997 Views (365) Comments (0) Recommendations (0)
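Step 1, sending the request, could look roughly like the sketch below. It assumes the `requests` library used by the other posts on this page. `fetch_page` and its `get` parameter are hypothetical names, added so the function can be exercised without network access. The shape of the response body is not shown in the excerpt, so the raw text is returned unparsed.

```python
API_URL = "https://www.wandoujia.com/wdjweb/api/category/more"


def fetch_page(page, ctoken, get=None, timeout=10):
    """Request one page of the category listing and return the response text.

    `get` defaults to requests.get; any callable with the same signature
    can be passed instead, which makes offline testing straightforward.
    """
    if get is None:
        import requests  # imported lazily so the module loads without it
        get = requests.get

    params = {"catId": 6001, "subCatId": 0, "page": page, "ctoken": ctoken}
    response = get(API_URL, params=params, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text
```

Passing `params` as a dict lets the library handle URL encoding, and the explicit `timeout` keeps a stalled connection from hanging the scraper.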
Summary:
import requests
from bs4 import BeautifulSoup

# Download the category page and parse it with the lxml parser
web = 'https://www.wandoujia.com/category/6001'
web_g = requests.get(web)
web_code = BeautifulSoup(web_g.text, 'lxml')
# Collect the <li> items by class
name = web_code.find_all(name='li', class_...
posted @ 2019-07-03 20:36 Auraro997 Views (214) Comments (0) Recommendations (0)
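The `find_all(name='li', class_=...)` pattern in the post above can be tried offline against a small stand-in document. This is a sketch under assumptions: the class name is cut off in the excerpt, so `card`, the sample markup, and the helper `extract_apps` are all hypothetical, and the built-in `html.parser` is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Stand-in markup: the real page structure and class names are not shown
# in the excerpt, so everything here is illustrative.
SAMPLE_HTML = """
<ul>
  <li class="card"><a href="/apps/1">App One</a></li>
  <li class="card"><a href="/apps/2">App Two</a></li>
  <li class="other">Not an app</li>
</ul>
"""


def extract_apps(html, item_class="card"):
    """Return (name, link) pairs for every <li> carrying `item_class`."""
    soup = BeautifulSoup(html, "html.parser")
    apps = []
    for li in soup.find_all(name="li", class_=item_class):
        link = li.find("a")
        if link is not None:  # skip items without a link
            apps.append((link.get_text(strip=True), link.get("href")))
    return apps


for app_name, href in extract_apps(SAMPLE_HTML):
    print(app_name, href)
```

The keyword is spelled `class_` because `class` is a reserved word in Python.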
Summary:
from selenium import webdriver
import time

driver = webdriver.Chrome(r'C:\Users\Auraro\Desktop/chromedriver.exe')
try:
    # Wait up to 20 seconds when locating elements
    driver.implicitly_wait(20)
    driver.get('https://www.wandoujia.com/categ...
posted @ 2019-07-03 20:17 Auraro997 Views (290) Comments (0) Recommendations (0)
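Summary:
'''
find: returns one match
find_all: returns all matches
Looking up by tag and by attribute:
Tags:
- String filter: matches the whole string exactly
    name: match on the name argument
    attrs: match on attributes
    text: match on text
- Regex filter: matches with the re module
- List filter
...
posted @ 2019-07-03 18:24 Auraro997 Views (433) Comments (0) Recommendations (0)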
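The filters listed in the post above can be compared side by side on a small document. This is a minimal sketch, not the post's own code: the markup is a stand-in modeled on the Beautiful Soup documentation sample, and `string=` is used for text matching (older Beautiful Soup releases call the same argument `text=`, as the post does).

```python
import re

from bs4 import BeautifulSoup

# Stand-in markup, modeled on the Beautiful Soup documentation sample.
SAMPLE_HTML = """
<html><body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">
  <a href="http://example.com/elsie" id="link1">Elsie</a>
  <a href="http://example.com/lacie" id="link2">Lacie</a>
</p>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find returns the first match (or None); find_all returns every match.
first_link = soup.find(name="a")
all_links = soup.find_all(name="a")

# String filter: the tag name must equal the string exactly.
paragraphs = soup.find_all(name="p")

# Attribute match via attrs, and text match via string.
link2 = soup.find(attrs={"id": "link2"})
elsie_text = soup.find(string="Elsie")

# Regex filter: any tag whose name matches the pattern.
b_tags = soup.find_all(name=re.compile(r"^b"))

# List filter: any tag whose name appears in the list.
a_and_b = soup.find_all(name=["a", "b"])

print([tag.name for tag in b_tags])
print([tag.name for tag in a_and_b])
```

The regex filter matches both `body` and `b` here, which is a common surprise when a pattern is less specific than intended.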
Summary:
html_doc = '''
The Dormouse's story
$37
Once upon a time there were three little sisters; and their names were Elsie, Lacie and Tillie; and they lived at the bottom of a well.
...
'''
from bs4 imp...
posted @ 2019-07-03 18:21 Auraro997 Views (485) Comments (0) Recommendations (0)
Summary:
'''
Install the parser:
pip3 install lxml
Install the parsing library:
pip3 install bs4
'''
html_doc = '''
The Dormouse's story
$37
Once upon a time there were three little sisters; and their names were Elsie, Lacie and Tillie; and...
posted @ 2019-07-03 18:19 Auraro997 Views (852) Comments (0) Recommendations (0)
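Once both packages are installed, a parse can be sketched as below. The HTML tags of `html_doc` did not survive in the excerpt, so the one-line markup here is a stand-in, and `make_soup` is a hypothetical helper. It falls back to Python's built-in parser when lxml is missing, which is an addition of this sketch rather than something the post describes.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(html):
    """Parse with lxml when available, else fall back to html.parser."""
    try:
        return BeautifulSoup(html, "lxml")
    except FeatureNotFound:
        # Raised by Beautiful Soup when the requested parser is not installed.
        return BeautifulSoup(html, "html.parser")


# Stand-in markup; the original tags were lost from the excerpt.
soup = make_soup("<p class='title'><b>The Dormouse's story</b></p>")

print(soup.p.b.get_text())  # text inside the first <b> of the first <p>
print(soup.p["class"])      # class is multi-valued, so this is a list
```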
Summary:
''' Basic version '''
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome(r'C:\Users\Auraro\Desktop/chromedriver.exe')
num = 1
try:
    drive...
posted @ 2019-07-03 18:17 Auraro997 Views (1463) Comments (0) Recommendations (0)
Summary:
import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get("https://www.baidu.com/")
browser.get("https://www.taobao.com/")
browser.get("https://www.sina.com/")
# Go back
brows...
posted @ 2019-07-03 18:15 Auraro997 Views (135) Comments (0) Recommendations (0)
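The history navigation in the post above can be sketched as a function that takes the browser as an argument. The excerpt is cut off after the "go back" comment, so the forward step, the helper `tour_history`, and the `demo` wrapper are assumptions of this sketch. `get`, `back`, `forward`, and `quit` are standard WebDriver methods.

```python
import time


def tour_history(browser, urls, pause=1.0):
    """Open each URL in turn, then step back and forward once."""
    for url in urls:
        browser.get(url)
    browser.back()       # return to the previous page in the history
    time.sleep(pause)    # give the page a moment to render
    browser.forward()    # move forward again to the last page
    time.sleep(pause)


def demo():
    """Run the tour in a real Chrome window (needs Chrome and its driver)."""
    from selenium import webdriver

    browser = webdriver.Chrome()
    try:
        tour_history(browser, [
            "https://www.baidu.com/",
            "https://www.taobao.com/",
            "https://www.sina.com/",
        ])
    finally:
        browser.quit()  # always release the browser process
```

Calling `demo()` opens the three sites from the post. Wrapping the calls in `try`/`finally` ensures the browser closes even if a page fails to load.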
Summary:
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.keys import Keys  # keyboard key operations
import time

driver = webdriver.Chrome(r'C:\Users\Auraro\Desktop/c...
posted @ 2019-07-03 17:59 Auraro997 Views (435) Comments (0) Recommendations (0)