hank-li - cnblogs (博客园)

May 4, 2019

17.splash_case06_ScrapySplashTest-master

Summary: taobao.py items.py middlewares.py pipelines.py settings.py

posted @ 2019-05-04 13:30 hank-li Views (189) Comments (0) Recommendations (0)

Summary:
```
# Run a Lua script from Python
import requests
from urllib.parse import quote

lua = '''
function main(splash)
    return 'hello'
end
'''

url = 'http://localhost:8050/execute?lua_source=' + quote(lua)
response...
```

posted @ 2019-05-04 11:13 hank-li Views (91) Comments (0) Recommendations (0)
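The excerpt above sends a Lua script to the Splash `execute` endpoint by percent-encoding it into the `lua_source` query parameter. A minimal sketch of just that URL-building step follows, with no network call. The helper name `build_execute_url` is hypothetical; the host, port, and script are taken from the excerpt.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Endpoint taken from the excerpt; adjust to wherever Splash is running.
SPLASH_EXECUTE = 'http://localhost:8050/execute'


def build_execute_url(lua_source, base=SPLASH_EXECUTE):
    """Return the execute URL with the Lua script percent-encoded.

    Hypothetical helper, not part of Splash or requests.
    """
    return base + '?lua_source=' + quote(lua_source)


lua = '''
function main(splash)
    return 'hello'
end
'''

url = build_execute_url(lua)

# Decoding the query string should give back the original script unchanged.
decoded = parse_qs(urlsplit(url).query)['lua_source'][0]
```

Without `quote`, the newlines, spaces, and quotes in the script would not form a valid query string, which is presumably why the excerpt encodes it before sending the request.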

17.splash_case02

Summary:
```
# Scrape Douban comments for "Dying to Survive" (我不是药神)
import csv
import time
import requests
from lxml import etree

fw = open('douban_comments.csv', 'w')
writer = csv.writer(fw)
writer.writerow(['comment_time','comment_content'])
...
```

posted @ 2019-05-04 10:57 hank-li Views (98) Comments (0) Recommendations (0)
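The Douban excerpt writes a header row to a CSV file before the truncated scraping logic. A minimal sketch of that writing pattern is below. Since the scraping code is not shown, the rows are made-up placeholders, and `write_comments` is a hypothetical helper. It writes to an in-memory buffer so it can run without touching the disk.

```python
import csv
import io


def write_comments(fw, rows):
    """Write the header used in the excerpt, then one CSV row per comment.

    Hypothetical helper; `fw` is any text file object opened with newline=''.
    """
    writer = csv.writer(fw)
    writer.writerow(['comment_time', 'comment_content'])
    writer.writerows(rows)


# Made-up sample data standing in for scraped comments.
sample_rows = [
    ('2018-07-05', 'Very moving'),
    ('2018-07-06', 'Funny, then heartbreaking'),  # the comma forces quoting
]

buffer = io.StringIO(newline='')
write_comments(buffer, sample_rows)
csv_text = buffer.getvalue()

# Reading the text back shows that the quoted comma survives the round trip.
parsed = list(csv.reader(io.StringIO(csv_text, newline='')))
```

To write to a real file instead, something like `with open('douban_comments.csv', 'w', newline='', encoding='utf-8') as fw: write_comments(fw, sample_rows)` should work. The `newline=''` argument follows the `csv` module's recommendation, and the explicit encoding is an assumption that helps with Chinese comment text.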

17.splash_case01

Summary:
```
# Scrape Toutiao and compare the results with and without rendering
import requests
from lxml import etree

# url = 'http://localhost:8050/render.html?url=https://www.toutiao.com&timeout=30&wait=0.5'
url = 'https://www.toutiao.com'
response...
```

posted @ 2019-05-04 10:36 hank-li Views (108) Comments (0) Recommendations (0)
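The commented-out URL in the excerpt routes the request through the Splash `render.html` endpoint with `url`, `timeout`, and `wait` parameters. The sketch below builds the same kind of URL with `urlencode`, so that a target URL carrying its own query string would be escaped correctly. The helper name `build_render_url` is hypothetical; the endpoint and parameter values come from the excerpt.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_render_url(target, timeout=30, wait=0.5,
                     base='http://localhost:8050/render.html'):
    """Return a render.html URL for `target` (hypothetical helper)."""
    params = {'url': target, 'timeout': timeout, 'wait': wait}
    return base + '?' + urlencode(params)


render_url = build_render_url('https://www.toutiao.com')

# Parse the query back out to check what Splash would receive.
params = parse_qs(urlsplit(render_url).query)
```

Fetching `render_url` rather than the site URL directly is what lets the excerpt compare rendered and unrendered HTML.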

16.ajax_case09

Summary:
```
import requests
import json
import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.sup...
```

posted @ 2019-05-04 10:32 hank-li Views (153) Comments (0) Recommendations (0)

16.ajax_case08

Summary:
```
# Scrape the total view count of a Jianshu blog
# https://www.jianshu.com/u/130f76596b02
import requests
import json
import re
from lxml import etree

header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,...
```

posted @ 2019-05-04 10:05 hank-li Views (182) Comments (0) Recommendations (0)

May 3, 2019

16.ajax_case07

Summary:
```
# Scrape contract addresses from etherscan through its search endpoint
# https://etherscan.io/
import requests
import re

header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent...
```

posted @ 2019-05-03 23:47 hank-li Views (128) Comments (0) Recommendations (0)

16.ajax_case06

Summary:
```
# Scrape live news flashes from Wallstreetcn (华尔街见闻)
# https://wallstreetcn.com/live/global?from=navbar
import requests
import json

header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',...
```

posted @ 2019-05-03 23:45 hank-li Views (102) Comments (0) Recommendations (0)

16.ajax_case05

Summary:
```
# Scrape 36Kr (36氪) news flashes
# https://36kr.com/newsflashes
import requests
import json

header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agen...
```

posted @ 2019-05-03 23:31 hank-li Views (118) Comments (0) Recommendations (0)

May 2, 2019

14.data.js

Summary:
```
dict_data = {
    "_id":1,
    name:"王五",
    age:55,
    gender:true
}
db.stu.insert(dict_data)
db.stu.insert({_id:1,name:"李四",age:38,gender:true,like:"🐶🐶"})
d...
```

posted @ 2019-05-02 17:36 hank-li Views (431) Comments (0) Recommendations (0)

Hank

There is no smooth road in the pursuit of learning, and no shortcut in mastering a craft.